fix(central-ui): resolve CentralUI-007..014 — nav authz, UTC date filters, disposal guards, N+1 fix, async script analysis

This commit is contained in:
Joseph Doherty
2026-05-16 20:58:03 -04:00
parent 738e67acc5
commit 71b90ba499
21 changed files with 976 additions and 81 deletions

View File

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 |
| Reviewer | claude-agent |
| Commit reviewed | `9c60592` |
| Open findings | 15 |
| Open findings | 7 |
## Summary
@@ -269,7 +269,17 @@ fixed 30-minute model. The code and the documented decision must agree.
**Resolution**
_Unresolved._
_Unresolved — requires a cross-module change plus a design decision, both out of
scope for a CentralUI-only fix._ Verified 2026-05-16: the discrepancy is real.
The sliding-expiration mechanism, however, is owned by the cookie
authentication middleware configured in **`ScadaLink.Security`**
(`ServiceCollectionExtensions.AddCookie` — currently sets neither
`ExpireTimeSpan` nor `SlidingExpiration`); `AuthEndpoints` (CentralUI) only sets
the absolute `ExpiresUtc`/`expires_at`. Implementing "15-minute sliding token"
means editing `ScadaLink.Security`, which this module's review cannot touch, and
the alternative — amending the documented decision to a fixed 30-minute model —
is a design decision, not a code fix. Left Open and surfaced for a follow-up
that spans CentralUI + Security, or a design-doc amendment.
### CentralUI-006 — Deployment status page polls every 10s despite the documented SignalR-push design
@@ -300,7 +310,16 @@ If polling is kept as a fallback, fetch only changed/in-progress records.
**Resolution**
_Unresolved._
_Unresolved — a genuine SignalR-push fix requires an event source in another
module._ Verified 2026-05-16: `Deployments.razor` does poll every 10s, contrary
to the design doc. But a real push implementation needs the **Deployment
Manager** module (`ScadaLink.DeploymentManager``DeploymentService` /
`ArtifactDeploymentService` write the `DeploymentRecord` rows) to raise a
status-change event/observable that the page subscribes to; there is no such
event today and no CentralUI-only seam to subscribe to. Building that event
source is out of scope for a CentralUI-only review. Left Open and surfaced for a
follow-up that adds a deployment-status broadcaster in DeploymentManager (or a
design-doc amendment acknowledging the polling fallback).
### CentralUI-007 — Monitoring nav links to Deployment-only pages are shown to all roles
@@ -308,7 +327,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Layout/NavMenu.razor:69-78`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/EventLogs.razor:2`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/ParkedMessages.razor:2` |
**Description**
@@ -333,7 +352,17 @@ all-roles (it is, per the design).
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed: both `EventLogs.razor` and
`ParkedMessages.razor` carried a bare `[Authorize]`, so any authenticated user
could query site event logs and retry/discard parked messages — contrary to the
design doc's Deployment-Role classification. Both pages now use
`[Authorize(Policy = AuthorizationPolicies.RequireDeployment)]`, and the
"Event Logs" / "Parked Messages" nav links were moved out of the all-roles
Monitoring block into an `<AuthorizeView Policy="RequireDeployment">` (Health
Dashboard stays all-roles, as the design intends). Regression tests
`MonitoringAuthorizationTests.{EventLogsPage,ParkedMessagesPage}_RequiresDeploymentPolicy`
fail against the pre-fix code and pass after;
`HealthDashboard_IsIntentionallyAllAuthenticatedRoles` guards the all-roles page.
### CentralUI-008 — Audit-log date filters treat browser-local datetimes as UTC
@@ -341,7 +370,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Monitoring/AuditLog.razor:242-243` |
**Description**
@@ -364,7 +393,19 @@ time-range filters.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed: `FetchPage` wrapped the
`datetime-local` value with `new DateTimeOffset(value, TimeSpan.Zero)`,
relabelling the browser-local wall-clock value as UTC and shifting the audit
query window by the user's offset. Added a pure helper
`Components/BrowserTime.LocalInputToUtc(DateTime?, int)` that converts a
local-input value to UTC using the browser's `Date.getTimezoneOffset()`;
`AuditLog.razor` now fetches that offset once via JS interop in
`OnAfterRenderAsync` (defaulting to 0/UTC on prerender or a disconnected
circuit) and runs both `from`/`to` filters through the helper. Regression suite
`BrowserTimeTests` (5 tests) fails against the naive relabelling and passes
after — including `LocalInputToUtc_NonUtcBrowser_DoesNotEqualNaiveRelabelling`,
which pins the exact pre-fix bug. `EventLogs.razor` was checked and has no
time-range filters, so it is unaffected.
### CentralUI-009 — `DebugView` stream callbacks touch a possibly-disposed `ToastNotification`
@@ -372,7 +413,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/DebugView.razor:400-409,538-544` |
**Description**
@@ -395,7 +436,21 @@ callbacks should no-op once disposed.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed: the `onEvent`/`onTerminated`
callbacks captured `this` and `_toast` and ran on an Akka/gRPC thread with no
disposal guard. Added a `volatile bool _disposed` flag, set first thing in
`Dispose()` before the stream is stopped. Every callback now checks `_disposed`
and no-ops if set; the render dispatch goes through a new `SafeInvokeAsync`
helper that re-checks the flag and swallows `ObjectDisposedException` should the
component be disposed between the guard and the dispatch. Regression tests
`DebugViewDisposalTests.{DebugView_HasDisposalGuardField,
DebugView_Dispose_SetsDisposedFlag_AndIsIdempotent}` pin the observable contract
(the guard field exists; `Dispose()` sets it and is idempotent) — the first
fails against the pre-fix code, which had no `_disposed` field. The Akka-thread
timing race itself is not deterministically reproducible in a unit test:
`DebugStreamService` is a non-virtual concrete class with no seam to inject and
later fire the callbacks, so the closest meaningful tests pin the guard
mechanism rather than the race window.
### CentralUI-010 — `ToastNotification` auto-dismiss continuation runs after component disposal
@@ -403,7 +458,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/ToastNotification.razor:62-71,90` |
**Description**
@@ -424,7 +479,21 @@ body in a try/catch for `ObjectDisposedException`.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed: `AddToast` scheduled
`Task.Delay(...).ContinueWith(...)` with no cancellation and `Dispose()` was an
empty body, so the continuation ran `InvokeAsync(StateHasChanged)` against a
disposed component. Added a `CancellationTokenSource _disposalCts` cancelled in
`Dispose()`; the auto-dismiss is now an `AutoDismissAsync` method that awaits
`Task.Delay(dismissMs, token)`, returns on `OperationCanceledException`, and
wraps the post-delay `InvokeAsync(StateHasChanged)` in a try/catch for
`ObjectDisposedException`. `AddToast` also short-circuits if the component is
already disposed. Regression tests:
`ToastNotificationTests.ShowToast_AfterDisposal_IsNoOp_AndSchedulesNothing`
fails against the pre-fix code (which still added the toast / mis-scheduled
after disposal) and passes after;
`AutoDismiss_AfterDisposal_DoesNotThrowUnobservedException` and
`AutoDismiss_BeforeDisposal_StillRemovesToast` guard the no-throw and
still-works behaviours.
### CentralUI-011 — `DiffDialog` leaves a dangling `TaskCompletionSource` when disposed while open
@@ -432,7 +501,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/DiffDialog.razor:89-95,151-157` |
**Description**
@@ -452,7 +521,15 @@ awaiter completes deterministically.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed: `OpenAsync` returned
`_tcs.Task`, completed only by `Close()`; `DisposeAsync` never touched the TCS,
so disposing the dialog while open left the awaiting caller suspended forever.
`DisposeAsync` now calls `_tcs?.TrySetResult(false)` before unlocking the body,
so a dialog disposed while open resolves its caller to `false` (not confirmed).
Regression test `DiffDialogTests.DisposeAsync_WhileOpen_CompletesPendingTask`
fails against the pre-fix code (the pending task stays `WaitingForActivation`)
and passes after; `Close_CompletesPendingTaskWithTrue` guards the normal close
path.
### CentralUI-012 — N+1 query loading data connections for the Sites page
@@ -460,7 +537,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Performance & resource management |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Admin/Sites.razor:196-205` |
**Description**
@@ -479,7 +556,15 @@ summary in a single query.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed: `LoadDataAsync` looped
`GetDataConnectionsBySiteIdAsync(site.Id)` once per site (N+1). `ISiteRepository`
already exposes `GetAllDataConnectionsAsync()` and `DataConnection` carries a
`SiteId`, so the loop was replaced with a single `GetAllDataConnectionsAsync()`
call grouped client-side by `SiteId` — one query regardless of site count, on
every load and post-delete refresh. Regression tests
`SitesPageTests.{LoadData_FetchesAllConnectionsInOneQuery_NoPerSiteQueries,
LoadData_GroupsConnectionsBySite_AndRendersThem}` fail against the pre-fix code
(`GetDataConnectionsBySiteIdAsync` was called per site) and pass after.
### CentralUI-013 — `ScriptAnalysisService` blocks on async shared-script lookups
@@ -487,8 +572,8 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Open |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:951-952` |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:951-952` (actual call at `:975`) |
**Description**
@@ -509,15 +594,27 @@ them synchronously would remove the blocking call.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed (the sync-over-async call is at
`:975`, not `:951-952` as originally cited — `ResolveCalledShape`'s
`Scripts.CallShared` branch). Took the recommended root-cause fix: `Hover` and
`SignatureHelp` are now `async Task<...>` and `ResolveCalledShape` is
`async Task<ScriptShape?>` which `await`s `_sharedScripts.GetShapesAsync()`
instead of `.GetAwaiter().GetResult()`. The two minimal-API endpoints
(`/hover`, `/signature-help`) were updated to `await` the methods. Regression
suite `ScriptAnalysisAsyncResolveTests` (3 tests): the structural test
`HoverAndSignatureHelp_AreAsync_NotSyncOverAsync` fails against the pre-fix
synchronous signatures, and two behavioural tests resolve shared-script shapes
through a catalog that only completes after `Task.Yield()` (a genuinely async
source). The five existing `Hover`/`SignatureHelp` tests in
`ScriptAnalysisServiceTests` were updated to `await` the now-async methods.
### CentralUI-014 — Test Run side effects (HTTP/SQL/SMTP) fire against production services
| | |
|--|--|
| Severity | Medium |
| Severity | ~~Medium~~ → Low (re-triaged 2026-05-16 — see Resolution) |
| Category | Error handling & resilience |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:254-259`; `src/ScadaLink.CentralUI/ScriptAnalysis/SandboxHostHelpers.cs:26-117` |
**Description**
@@ -541,7 +638,25 @@ dry-run mode that stubs the helpers, defaulting to dry-run.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending) — **re-triaged**. Re-verified against the
reviewed commit `9c60592`: the finding's premise that "the blast radius is not
surfaced to the user" is **inaccurate**. Both Test Run surfaces that can produce
real side effects — `SharedScriptForm.razor` and the script Test Run in
`TemplateEdit.razor` — already carry a prominent `Real I/O` badge on the panel
header and an `alert-warning` block stating `External`/`Database`/`Notify` calls
"fire for real … real HTTP, real SQL, real emails. Side effects are permanent"
(present since commit `2951507`, an ancestor of the reviewed commit, confirmed
via `git merge-base`). `ApiMethodForm.razor` (Inbound API kind) has **no**
real-I/O surface at all — `SandboxInboundScriptHost` exposes only
`Parameters`/`Route` (Route throws) — and correctly omits the badge while still
warning. Revealing the panel ("Test Run" toggle) then clicking "Run" is itself a
two-step explicit opt-in. The minimum recommendation is therefore already met;
the optional dry-run mode is a separate feature decision the design doc does not
mandate. Severity re-triaged Medium → Low (intentional, documented, clearly
warned behaviour — not a bug). Regression suite `TestRunWarningTests` (3 tests)
pins the `Real I/O` badge + warning text in `SharedScriptForm`/`TemplateEdit`
and the deliberate absence of the badge in `ApiMethodForm`, so the warning
cannot silently regress.
### CentralUI-015 — `DialogService` continuations resolve off the render thread