lmxopcua/code-reviews/Admin/findings.md
Joseph Doherty 2b33b64a58 fix(admin): resolve Low code-review findings (Admin-010,011,012)
- Admin-010: vendor Bootstrap 5.3.3 (CSS + JS bundle + maps + provenance
  README) under wwwroot/lib/bootstrap and reference local paths from
  App.razor — Admin no longer pulls Bootstrap from jsDelivr.
- Admin-011: swap FleetStatusPoller's three plain dictionaries for
  ConcurrentDictionary so ResetCache can't race a poll tick.
- Admin-012: drop the EquipmentId column from EquipmentCsvImporter (per
  admin-ui.md — equipment id is system-derived from EquipmentUuid);
  EquipmentImportBatchService and the textarea placeholder updated to
  match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:24:07 -04:00


# Code Review — Admin
| Field | Value |
|---|---|
| Module | `src/Server/ZB.MOM.WW.OtOpcUa.Admin` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 0 |
## Checklist coverage
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Admin-005 |
| 2 | OtOpcUa conventions | Admin-010 |
| 3 | Concurrency & thread safety | Admin-011 |
| 4 | Error handling & resilience | Admin-008, Admin-013 |
| 5 | Security | Admin-001, Admin-002, Admin-003, Admin-004, Admin-006 |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | Admin-007, Admin-012 |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Admin-009 |
| 10 | Documentation & comments | Admin-012 |
## Findings
### Admin-001
| Field | Value |
|---|---|
| Severity | Critical |
| Category | Security |
| Location | `Components/Routes.razor:4-11`, `Program.cs:150` |
| Status | Resolved |
**Description:** The router uses a plain `RouteView` (not `AuthorizeRouteView`), and `MapRazorComponents<App>()` is registered without `.RequireAuthorization()`. A page-level `[Authorize]` attribute on a routable Razor component is only enforced when the router is `AuthorizeRouteView` — with `RouteView` the attribute is inert. Consequently every page in the app, including those that carry `@attribute [Authorize]` (`ClusterDetail`, `DraftEditor`, `Reservations`, `RoleGrants`, `Certificates`, `VirtualTags`, `ScriptedAlarms`, `ScriptLog`, `DiffViewer`, `ImportEquipment`, `Account`), is reachable by a fully unauthenticated user. There is no authentication gate anywhere in the pipeline. An anonymous browser can read the full fleet configuration, audit log, certificates and ACLs, and exercise mutating pages (see Admin-002).
**Recommendation:** Replace `RouteView` with `AuthorizeRouteView` in `Routes.razor` (with a `<NotAuthorized>` slot that redirects to `/login`), or call `.RequireAuthorization()` on the `MapRazorComponents` endpoint with `/login` and `/auth/*` explicitly allowed anonymous. Add a fallback policy (`AddAuthorizationBuilder().SetFallbackPolicy(...)`) so new pages are secure-by-default. Re-verify every page after the gate is in place.
**Resolution:** Resolved 2026-05-22 — `Routes.razor` switched to `AuthorizeRouteView` with a `NotAuthorized` slot routing unauthenticated callers to `/login` via a new `RedirectToLogin` component; `AddAuthorizationBuilder().SetFallbackPolicy(RequireAuthenticatedUser())` makes pages secure-by-default; `Login.razor` opts out with `[AllowAnonymous]` so the login page and static assets stay anonymous. Covered by `PageAuthorizationTests` (verified failing pre-fix, passing post-fix).
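
The secure-by-default part of this fix can be sketched as a minimal `Program.cs`. This is an illustration under assumptions, not the module's actual startup code: it assumes cookie authentication with `/login` as the login path, and the two `MapGet` endpoints are placeholders standing in for the Razor component endpoints.

```csharp
// Program.cs sketch (ASP.NET Core 8, Web SDK implicit usings assumed).
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.Authorization;

var builder = WebApplication.CreateBuilder(args);

// Assumption: cookie auth that challenges anonymous callers by redirecting to /login.
builder.Services
    .AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    .AddCookie(options => options.LoginPath = "/login");

// Fallback policy: any endpoint that carries no authorization metadata
// of its own requires an authenticated user.
builder.Services.AddAuthorizationBuilder()
    .SetFallbackPolicy(new AuthorizationPolicyBuilder()
        .RequireAuthenticatedUser()
        .Build());

var app = builder.Build();
app.UseAuthentication();
app.UseAuthorization();

// Placeholder endpoints: the login page must opt out explicitly,
// everything else is covered by the fallback policy.
app.MapGet("/login", () => "login form placeholder").AllowAnonymous();
app.MapGet("/", () => "protected page placeholder");

app.Run();
```

With this shape a newly added endpoint is protected unless it opts out, which is why only the login page needs `AllowAnonymous`.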
### Admin-002
| Field | Value |
|---|---|
| Severity | Critical |
| Category | Security |
| Location | `Components/Pages/Clusters/NewCluster.razor:1-7`, `Home.razor`, `Fleet.razor`, `Hosts.razor`, `AlarmsHistorian.razor`, `Clusters/ClustersList.razor`, `Clusters/Generations.razor`, `Drivers/FocasDetail.razor` |
| Status | Resolved |
**Description:** Several routable pages carry no authorization attribute at all. Most critically `NewCluster` (`/clusters/new`) is a mutating page — its `CreateAsync` writes a new `ServerCluster` row and a draft generation. Combined with Admin-001 (the router does not enforce `[Authorize]` either), an unauthenticated user can create clusters and seed config-DB rows. `Home`, `Fleet`, `Hosts`, `AlarmsHistorian`, `ClustersList`, `Generations` and `FocasDetail` likewise expose fleet topology, host status, historian diagnostics and generation history to anonymous callers.
**Recommendation:** Add `@attribute [Authorize(...)]` to every routable page with the role/policy appropriate to its function (`NewCluster` and other write surfaces -> `CanPublish`/`CanEdit`; read pages -> an authenticated-user policy). A solution-wide fallback policy (see Admin-001) is the durable fix; per-page attributes remain the explicit declaration of intent.
**Resolution:** Resolved 2026-05-22 — `@attribute [Authorize]` added to every unprotected routable page (`Home`, `Fleet`, `Hosts`, `AlarmsHistorian`, `ClustersList`, `FocasDetail`, `ModbusAddressPreview`, `ModbusDiagnostics`); `NewCluster` gated with `[Authorize(Policy = "CanPublish")]` per the admin-ui.md FleetAdmin cluster-create flow. Re-triage note: `Clusters/Generations.razor` carries no `@page` directive — it is a child component of `ClusterDetail`, not a routable page, so it needs no attribute (it inherits the parent route's gate). The Admin-001 fallback policy is the durable secure-by-default backstop; the per-page attributes are the explicit declaration of intent. Covered by `PageAuthorizationTests`.
### Admin-003
| Field | Value |
|---|---|
| Severity | High |
| Category | Security |
| Location | `Program.cs:137-139`, `Hubs/FleetStatusHub.cs:11`, `Hubs/AlertHub.cs:10`, `Hubs/ScriptLogHub.cs:30` |
| Status | Resolved |
**Description:** All three SignalR hubs (`/hubs/fleet`, `/hubs/alerts`, `/hubs/script-log`) are mapped with no `[Authorize]` attribute and no `.RequireAuthorization()` on the `MapHub` call. Any unauthenticated client can open a hub connection: `FleetStatusHub.SubscribeFleet()` streams every node generation/role/resilience state, `AlertHub` pushes all fleet alerts (including failure detail text), and `ScriptLogHub.TailLogAsync` streams the contents of the server `scripts-*.log` files. This is an unauthenticated information-disclosure channel that bypasses the (already broken — see Admin-001) page auth entirely.
**Recommendation:** Add `[Authorize]` to each `Hub` class, or chain `.RequireAuthorization()` onto each `MapHub(...)` call in `Program.cs`. The hub `SubscribeCluster`/`TailLogAsync` methods should additionally validate that the caller's claims permit the requested cluster/script scope.
**Resolution:** Resolved 2026-05-22 — `[Authorize]` added to `FleetStatusHub`, `AlertHub` and `ScriptLogHub`, and `.RequireAuthorization()` chained onto all three `MapHub(...)` calls in `Program.cs` as a belt-and-braces backstop, so an anonymous client can no longer open any hub connection. Covered by `AuthEndpointsTests.Anonymous_hub_negotiate_is_rejected`.
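
A minimal sketch of the two layers described above, attribute on the hub class plus `.RequireAuthorization()` on the mapping. `FleetStatusHubSketch` is a hypothetical stand-in for the real hub, and the cookie scheme is an assumption.

```csharp
// Program.cs sketch (ASP.NET Core 8, Web SDK implicit usings assumed).
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.Authorization;
using Microsoft.AspNetCore.SignalR;

var builder = WebApplication.CreateBuilder(args);
builder.Services
    .AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    .AddCookie();
builder.Services.AddAuthorization();
builder.Services.AddSignalR();

var app = builder.Build();
app.UseAuthentication();
app.UseAuthorization();

// Layer 2: endpoint-level requirement, so the negotiate request is rejected
// even if someone later removes the attribute from the hub class.
app.MapHub<FleetStatusHubSketch>("/hubs/fleet").RequireAuthorization();

app.Run();

// Layer 1: class-level attribute. Hypothetical hub used only for illustration.
[Authorize]
public sealed class FleetStatusHubSketch : Hub
{
}
```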
### Admin-004
| Field | Value |
|---|---|
| Severity | High |
| Category | Security |
| Location | `appsettings.json:3,13-14` |
| Status | Resolved |
**Description:** The checked-in `appsettings.json` contains live-looking secrets in plaintext: the `ConfigDb` connection string with `User Id=sa;Password=OtOpcUaDev_2026!` and the LDAP `ServiceAccountPassword: "serviceaccount123"`. It also sets `Encrypt=False` and `AllowInsecureLdap: true`, so the SQL and LDAP credentials travel unencrypted on the wire. Committing the `sa` account password and a service-account password to source control is a credential-exposure risk; `sa` additionally grants full server control, conflicting with the `ClusterService` doc comment that production should connect with a least-privilege grant.
**Recommendation:** Move all secrets out of the committed file — use user-secrets for dev and environment variables / a secret store for production; leave only non-secret placeholders in `appsettings.json`. Use a least-privilege SQL login rather than `sa`. Enable TLS for both SQL (`Encrypt=True`) and LDAP (`UseTls=true`, `AllowInsecureLdap=false`) for any non-loopback deployment, and document the dev-only exception.
**Resolution:** Resolved 2026-05-22 — the `sa` connection string and the LDAP `ServiceAccountPassword` were replaced with empty placeholders in `appsettings.json`; a `_secrets` note documents that they are supplied via user-secrets (dev) or the `ConnectionStrings__ConfigDb` / `Authentication__Ldap__ServiceAccountPassword` environment variables (prod), and that the connection string must use `Encrypt=True` and a least-privilege SQL login. A `UserSecretsId` was added to the Admin csproj, and `Program.cs` now fails fast with a clear message when `ConfigDb` is empty/missing. Covered by `AppSettingsSecretHygieneTests`.
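
The fail-fast check could look roughly like the helper below. `ConfigDbGuard` and its message text are hypothetical; only the `ConfigDb` connection-string name and the `ConnectionStrings__ConfigDb` environment variable come from the finding.

```csharp
using System;
using Microsoft.Extensions.Configuration;

// Hypothetical helper: returns the ConfigDb connection string or throws at startup.
public static class ConfigDbGuard
{
    public static string RequireConfigDb(IConfiguration configuration)
    {
        // Reads ConnectionStrings:ConfigDb, which user-secrets or the
        // ConnectionStrings__ConfigDb environment variable would populate.
        var connectionString = configuration.GetConnectionString("ConfigDb");

        if (string.IsNullOrWhiteSpace(connectionString))
        {
            throw new InvalidOperationException(
                "ConnectionStrings:ConfigDb is empty. Supply it via user-secrets (dev) " +
                "or the ConnectionStrings__ConfigDb environment variable (prod).");
        }

        return connectionString;
    }
}
```

Calling the helper once during startup, before any service that needs the database is registered, turns a missing secret into an immediate, readable failure instead of a later connection error.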
### Admin-005
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `Components/Pages/Login.razor:15,107-110` |
| Status | Resolved |
**Description:** `Login.razor` is an interactive component (the project default render mode is interactive server; the page declares no `@rendermode` but uses `EditForm`/`InputText` interactive binding and runs `SignInAsync` from an event handler). It calls `HttpContext.SignInAsync(...)` followed by `ctx.Response.Redirect("/")` from within a SignalR circuit callback. Writing auth cookies and HTTP redirect headers requires a live, unstarted HTTP response; in an interactive circuit the original HTTP response has long completed, so the cookie is typically not emitted and the redirect is ineffective (or throws "response has already started"). `admin-ui.md` section "Operator authentication" explicitly specifies the login as a static server-rendered HTML form POSTing to a `/auth/login` minimal-API endpoint with `data-enhance="false"` — that endpoint is not implemented and is not mapped in `Program.cs`.
**Recommendation:** Implement the login as designed: a static-rendered form (`@rendermode` none, `data-enhance="false"`) posting to a `MapPost("/auth/login", ...)` minimal-API handler that does the LDAP bind, grant resolution, `SignInAsync` and redirect while the HTTP response is still owned by the endpoint. Do not perform `SignInAsync` from an interactive circuit.
**Resolution:** Resolved 2026-05-22 — `Login.razor` rewritten as a static-rendered plain HTML `<form method="post" action="/auth/login" data-enhance="false">` (no `@rendermode`, no `EditForm`/`SignInAsync` in a circuit); the LDAP bind, grant resolution, cookie `SignInAsync` and redirect now run in a new `AuthEndpoints.MapAuthEndpoints()` minimal-API handler (`/auth/login`, `/auth/logout`) while the endpoint still owns the HTTP response. The handler is `AllowAnonymous`, carries an open-redirect guard on `returnUrl`, and surfaces bind errors back to the login page via a query-string. Covered by `AuthEndpointsTests` (valid login issues the cookie, invalid login redirects with error, open-redirect rejected, logout clears the cookie).
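
The finding does not show how the open-redirect guard on `returnUrl` is written. One common shape, sketched here with a hypothetical `ReturnUrlGuard` helper, is to accept only app-relative paths and fall back to `/` for everything else.

```csharp
// Hypothetical helper illustrating an open-redirect guard for a returnUrl value.
public static class ReturnUrlGuard
{
    // True only for app-relative URLs such as "/clusters/cluster-dev".
    public static bool IsSafeLocalUrl(string? url)
    {
        // Must be non-empty and start with a single forward slash.
        if (string.IsNullOrEmpty(url) || url[0] != '/')
        {
            return false;
        }

        // "//host" and "/\host" are treated by browsers as protocol-relative URLs.
        if (url.Length > 1 && (url[1] == '/' || url[1] == '\\'))
        {
            return false;
        }

        // Reject control characters (for example CR/LF used in header injection).
        foreach (var c in url)
        {
            if (char.IsControl(c))
            {
                return false;
            }
        }

        return true;
    }

    // Returns the requested URL when it is safe, otherwise the fallback.
    public static string Sanitize(string? returnUrl, string fallback = "/") =>
        IsSafeLocalUrl(returnUrl) ? returnUrl! : fallback;
}
```

A login handler would call `ReturnUrlGuard.Sanitize(returnUrl)` just before issuing its redirect, so an attacker-supplied absolute URL collapses to the home page.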
### Admin-006
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | `Components/Layout/MainLayout.razor:47-49`, `Program.cs:129,131-135` |
| Status | Resolved |
**Description:** `app.UseAntiforgery()` is enabled, but the Sign-out form (`<form method="post" action="/auth/logout">`) renders no antiforgery token, and the `MapPost("/auth/logout", ...)` endpoint does not call `.DisableAntiforgery()` or otherwise opt out. Depending on framework version this either makes logout fail with a 400 for legitimate users, or — if the endpoint is treated as exempt — leaves logout as an unprotected state-changing POST (CSRF logout). The same concern applies to the login form once Admin-005 is addressed.
**Recommendation:** Emit an antiforgery token in the logout form and let `UseAntiforgery()` validate it; or explicitly and deliberately mark the endpoint `.DisableAntiforgery()` if a tokenless logout is intended. Verify login/logout round-trips after the change.
**Resolution:** Resolved 2026-05-22 — `<AntiforgeryToken />` added to the sign-out form in `MainLayout.razor` and `.DisableAntiforgery()` removed from the `/auth/logout` endpoint so `UseAntiforgery()` validates the token; a tokenless POST now returns 400, preventing CSRF-logout. The login endpoint retains `.DisableAntiforgery()` (login is not a state-changing operation CSRF can abuse). `AuthEndpointsTests.Logout_without_antiforgery_token_is_rejected` regression-guards this.
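
The resolution relies on `UseAntiforgery()` to validate the token. As an alternative illustration only, the same check can be made explicit inside the handler through `IAntiforgery`, which makes the 400 behaviour visible in code. The endpoint body below is an assumption, not the module's `AuthEndpoints` implementation.

```csharp
// Program.cs sketch (ASP.NET Core 8, Web SDK implicit usings assumed).
using Microsoft.AspNetCore.Antiforgery;
using Microsoft.AspNetCore.Authentication;
using Microsoft.AspNetCore.Authentication.Cookies;

var builder = WebApplication.CreateBuilder(args);
builder.Services
    .AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    .AddCookie();
builder.Services.AddAntiforgery();

var app = builder.Build();
app.UseAuthentication();

app.MapPost("/auth/logout", async (HttpContext ctx, IAntiforgery antiforgery) =>
{
    // Explicit validation: a POST without a valid token is refused
    // before any state changes.
    if (!await antiforgery.IsRequestValidAsync(ctx))
    {
        return Results.BadRequest("Missing or invalid antiforgery token.");
    }

    await ctx.SignOutAsync(CookieAuthenticationDefaults.AuthenticationScheme);
    return Results.Redirect("/login");
});

app.Run();
```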
### Admin-007
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Design-document adherence |
| Location | `Components/Pages/Clusters/NewCluster.razor:91,95-96` |
| Status | Resolved |
**Description:** `NewCluster.CreateAsync` hardcodes `CreatedBy = "admin-ui"` (both on the `ServerCluster` row and the draft generation) instead of the signed-in operator's principal name. `admin-ui.md` section "Audit" requires that "the operator principal" be recorded on every write. The audit trail therefore cannot attribute cluster creation to a person. The same literal would also be recorded for any anonymous creation that Admin-001/002 currently permit.
**Recommendation:** Pass the authenticated user identity (`ClaimTypes.Name` / `NameIdentifier` from the cascaded `AuthenticationState`) as `createdBy`. Apply the same pattern to every other Admin write path that records a `CreatedBy`/`PublishedBy`/`ReleasedBy` field.
**Resolution:** Resolved 2026-05-22 — `NewCluster.razor` and `ClusterDetail.razor` (the two pages that call `ClusterService.CreateAsync` / `GenerationService.CreateDraftAsync` with a hardcoded literal) now resolve `ClaimTypes.Name` / `ClaimTypes.NameIdentifier` from the cascaded `AuthenticationState` and pass the operator principal name as `createdBy`; the fallback is `"unknown"` (defensive, should never occur on an `[Authorize]`-gated page).
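
The claim lookup with its defensive fallback can be sketched as a small helper. `OperatorPrincipal.Resolve` is a hypothetical name; the claim types and the `"unknown"` fallback are the ones named in the resolution.

```csharp
using System.Security.Claims;

// Hypothetical helper: resolves the name to record in CreatedBy-style audit fields.
public static class OperatorPrincipal
{
    public static string Resolve(ClaimsPrincipal? user) =>
        // Prefer the display/login name, then the stable identifier.
        user?.FindFirst(ClaimTypes.Name)?.Value
        ?? user?.FindFirst(ClaimTypes.NameIdentifier)?.Value
        // Defensive fallback; should not occur on an [Authorize]-gated page.
        ?? "unknown";
}
```

In a Blazor page the `ClaimsPrincipal` would come from the cascaded `AuthenticationState` (its `User` property), and the result would be passed as `createdBy`.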
### Admin-008
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `Services/ReservationService.cs:28-37` |
| Status | Resolved |
**Description:** `ReservationService.ReleaseAsync` calls `sp_ReleaseExternalIdReservation` with only `@Kind`, `@Value`, `@ReleaseReason`. `admin-ui.md` section "Release an external-ID reservation" specifies the proc sets `ReleasedBy` to the FleetAdmin who performed the release, and the action is the only path that allows ZTag/SAPID reuse and "requires explicit FleetAdmin action with a documented reason." The service does not capture or pass the operator principal, so the compliance audit trail for a release records no actor (unless the proc derives it from the DB session login, which would be the shared service account, not the operator).
**Recommendation:** Add an operator-principal parameter to `ReleaseAsync`, pass it to the stored proc as `@ReleasedBy`, and have callers supply the signed-in user. Confirm the proc signature accepts it.
**Resolution:** Resolved 2026-05-22 — a new EF migration (`20260522000001_AddReleasedByToReleaseExternalIdReservation`) adds `@ReleasedBy nvarchar(128)` to `sp_ReleaseExternalIdReservation` and uses it for both `ExternalIdReservation.ReleasedBy` and `ConfigAuditLog.Principal` (replacing `SUSER_SNAME()`); `ReservationService.ReleaseAsync` gains a `releasedBy` parameter with a guard; `Reservations.razor` resolves `ClaimTypes.Name` / `ClaimTypes.NameIdentifier` from the cascaded `AuthenticationState` and passes the operator principal to the service.
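
A rough sketch of a release call that refuses to run without an actor. It uses plain ADO.NET (`DbConnection`) so it stays provider-neutral; the real service may well go through EF Core instead. `ReservationReleaseSketch` is hypothetical, while the procedure and parameter names are those given in the finding.

```csharp
using System;
using System.Data;
using System.Data.Common;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of ReleaseAsync with the operator-principal guard.
public static class ReservationReleaseSketch
{
    public static async Task ReleaseAsync(
        DbConnection connection,
        string kind,
        string value,
        string releaseReason,
        string releasedBy,
        CancellationToken ct = default)
    {
        // Guard first: no database work happens without a named operator.
        if (string.IsNullOrWhiteSpace(releasedBy))
        {
            throw new ArgumentException(
                "An operator principal is required to release a reservation.",
                nameof(releasedBy));
        }

        await using var command = connection.CreateCommand();
        command.CommandType = CommandType.StoredProcedure;
        command.CommandText = "sp_ReleaseExternalIdReservation";
        AddParameter(command, "@Kind", kind);
        AddParameter(command, "@Value", value);
        AddParameter(command, "@ReleaseReason", releaseReason);
        AddParameter(command, "@ReleasedBy", releasedBy);

        if (connection.State != ConnectionState.Open)
        {
            await connection.OpenAsync(ct);
        }

        await command.ExecuteNonQueryAsync(ct);
    }

    private static void AddParameter(DbCommand command, string name, object value)
    {
        var parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}
```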
### Admin-009
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Admin` (whole module) |
| Status | Resolved |
**Description:** The module's most security-critical behaviours have no enforced test coverage at the boundary that matters. There is no test that an unauthenticated request to a page or hub is rejected (which would have caught Admin-001/002/003), no test of the login -> cookie issuance round-trip (Admin-005), and the `AdminRoleGrantResolver` / `ClusterRoleClaims` authorization logic is exercised only in isolation. `InternalsVisibleTo` points at `ZB.MOM.WW.OtOpcUa.Admin.Tests`, but the auth pipeline itself is not asserted end-to-end. Per `REVIEW-PROCESS.md` category 9 these are untested critical paths.
**Recommendation:** Add `WebApplicationFactory`-based integration tests asserting: (a) anonymous GET of each protected route returns 302->/login or 401; (b) anonymous hub connect is refused; (c) a valid login issues the cookie and a subsequent request is authorized; (d) a `ConfigViewer` is denied `CanPublish` pages. Wire the check into the `*.Admin.Tests` suite.
**Resolution:** Resolved 2026-05-22 — (a) covered by existing `PageAuthorizationTests`; (b) covered by existing `AuthEndpointsTests.Anonymous_hub_negotiate_is_rejected`; (c) covered by existing `AuthEndpointsTests.Valid_login_issues_the_auth_cookie_and_redirects_home`; (d) new `AdminAuthPipelineTests` adds a `WebApplicationFactory` with a `RoleInjectingHandler` that stamps requests with caller-supplied roles, asserting that `ConfigViewer` is denied `CanPublish`-gated pages (403/302) while `FleetAdmin` is permitted, and that a `FleetAdmin` session can reach protected pages.
### Admin-010
| Field | Value |
|---|---|
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `Components/App.razor:9,16` |
| Status | Resolved |
**Description:** `App.razor` loads Bootstrap CSS and JS from the `cdn.jsdelivr.net` CDN. `admin-ui.md` section "Tech Stack" specifies "Bootstrap 5 vendored under `wwwroot/lib/bootstrap/`" precisely so the Admin app has no third-party runtime dependency. A CDN reference makes the UI fail in air-gapped / locked-down fleet deployments (a stated deployment target), introduces an uncontrolled third-party origin, and is not covered by a Subresource Integrity hash.
**Recommendation:** Vendor Bootstrap under `wwwroot/lib/bootstrap/` and reference the local copies, as the design doc requires. If a CDN is retained for any asset, add `integrity` + `crossorigin` SRI attributes.
**Resolution:** Resolved 2026-05-23 — Bootstrap 5.3.3 (CSS + JS bundle, plus their source maps) vendored under `src/Server/ZB.MOM.WW.OtOpcUa.Admin/wwwroot/lib/bootstrap/{css,js}/`; `App.razor` now references the local copies (`lib/bootstrap/css/bootstrap.min.css`, `lib/bootstrap/js/bootstrap.bundle.min.js`); a README under the vendor directory records provenance + upgrade steps. Covered by `BootstrapVendoringTests` (asserts no `cdn.jsdelivr.net`/`cdnjs`/`unpkg` references in `App.razor`, that the vendored files exist with non-trivial sizes, and that `App.razor` references the vendored paths) — verified failing pre-fix, passing post-fix.
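
The "no CDN references" assertion can be approximated by a plain substring scan over the markup. `VendoringCheck` is a hypothetical helper and not the code of `BootstrapVendoringTests`; the host fragments are the three listed above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: lists the CDN host fragments that appear in a markup string.
public static class VendoringCheck
{
    private static readonly string[] CdnHosts = { "cdn.jsdelivr.net", "cdnjs", "unpkg" };

    public static IReadOnlyList<string> FindCdnReferences(string markup) =>
        CdnHosts
            .Where(host => markup.Contains(host, StringComparison.OrdinalIgnoreCase))
            .ToList();
}
```

A test would read `App.razor` from disk, pass its text to `FindCdnReferences`, and assert that the returned list is empty.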
### Admin-011
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `Hubs/FleetStatusPoller.cs:24-26,98-103` |
| Status | Resolved |
**Description:** `FleetStatusPoller` keeps three plain `Dictionary<>` fields (`_last`, `_lastRole`, `_lastResilience`) mutated from `PollOnceAsync`. The poller's `ExecuteAsync` loop is single-threaded so the steady-state poll path is safe, but `ResetCache()` (exposed `internal` for tests) clears those same dictionaries with no synchronization. If a test (or any caller) invokes `ResetCache()` while a poll tick is mid-iteration, the `Dictionary` enumeration/mutation race can throw `InvalidOperationException` or corrupt state.
**Recommendation:** Either document `ResetCache()` as "only safe when the poller is stopped" and have tests stop the service first, or guard the three dictionaries with a lock / swap them atomically. Using `ConcurrentDictionary` (as the sibling `ResilientLdapGroupRoleMappingService` does) would make the intent explicit.
**Resolution:** Resolved 2026-05-23 — `_last`, `_lastRole`, and `_lastResilience` swapped from plain `Dictionary<,>` to `ConcurrentDictionary<,>` so concurrent `ResetCache()` / poll-tick mutations are safe by construction (the recommendation's "explicit intent" form). Covered by `FleetStatusPollerConcurrencyTests` — one test guards the structural choice via reflection so a future refactor cannot silently revert; the other stress-runs concurrent mutate + `ResetCache()` via reflection, verifying the race throws no exception (verified failing pre-fix with `Dictionary<,>`).
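
The race can be reproduced in miniature. The sketch below is an assumption-laden stand-in for the stress test, not the poller itself: one task plays the poll tick (write, then enumerate), another plays `ResetCache()`. With `ConcurrentDictionary` both may overlap freely; the same code over a plain `Dictionary` is not safe.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stress sketch: concurrent "poll tick" and "reset" on one cache.
public static class PollerCacheRaceSketch
{
    // Returns the number of entries left in the cache after both tasks finish.
    public static int Run(int iterations)
    {
        var last = new ConcurrentDictionary<string, int>();

        var pollTick = Task.Run(() =>
        {
            long observed = 0;
            for (var i = 0; i < iterations; i++)
            {
                // Mutate, as a poll tick would when a node's state changes.
                last[$"node-{i % 16}"] = i;

                // Enumerate while the other task may be clearing; this is safe
                // for ConcurrentDictionary and does not throw.
                foreach (var entry in last)
                {
                    observed += entry.Value;
                }
            }
        });

        var reset = Task.Run(() =>
        {
            for (var i = 0; i < iterations; i++)
            {
                last.Clear(); // stands in for ResetCache()
            }
        });

        Task.WaitAll(pollTick, reset);
        return last.Count;
    }
}
```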
### Admin-012
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `Services/EquipmentCsvImporter.cs:18-19,33-37,229,232` |
| Status | Resolved |
**Description:** `EquipmentCsvImporter` declares `EquipmentId` as a required CSV column and parses it into a `required` field. `admin-ui.md` section "Equipment CSV import" (revised after adversarial review finding #4) is explicit: "No `EquipmentId` column — operator-supplied EquipmentId would mint duplicate equipment identity on typos ... never accepted from CSV imports." `EquipmentId` is system-derived (`EQ-` plus first 12 hex chars of `EquipmentUuid`). Accepting it from CSV either contradicts the design or silently lets an import set an identity field the doc says is un-settable. The XML doc on the class also cites the column as required per "decision #117", so either the code or the design doc is stale. `EquipmentImportBatchService.StageRowsAsync` propagates `row.EquipmentId` into the staging row, so any change must cover the finalize path.
**Recommendation:** Reconcile with the design: drop `EquipmentId` from `RequiredColumns` and the `EquipmentCsvRow` shape (deriving it from `EquipmentUuid` at finalize time), or — if accepting it is a deliberate reversal — update `admin-ui.md` and the decision log so the two agree.
**Resolution:** Resolved 2026-05-23 — code reconciled with the design: `EquipmentId` dropped from `EquipmentCsvImporter.RequiredColumns`, `BuildRow`, `GetCell`, and the `EquipmentCsvRow` shape; the class XML doc now records the admin-ui.md "No EquipmentId column" rule. The finalize path is covered: `EquipmentImportBatchService.StageRowsAsync` now derives the staging-row's `EquipmentId` via `DraftValidator.DeriveEquipmentId(equipmentUuid)`, and `FinaliseBatchAsync` re-derives it from the UUID that actually lands in the `Equipment` row (so a blank/invalid staged UUID that gets replaced by `Guid.NewGuid()` no longer leaves `EquipmentId` and `EquipmentUuid` out of sync). `ImportEquipment.razor`'s textarea placeholder updated to the new header shape. Covered by `EquipmentCsvNoEquipmentIdColumnTests` (five tests guarding `RequiredColumns`/`OptionalColumns`/`EquipmentCsvRow` shape and asserting CSVs with an `EquipmentId` column are rejected as unknown while CSVs without are accepted) — verified failing pre-fix, passing post-fix. The existing `EquipmentCsvImporterTests` + `EquipmentImportBatchServiceTests` were updated to the new header shape and pass green (DB-backed suite ran against `10.100.0.35,14330`).
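
The derivation rule stated in the description (`EQ-` plus the first 12 hex characters of `EquipmentUuid`) can be sketched as below. This is not the source of `DraftValidator.DeriveEquipmentId`; in particular, the lowercase hex casing is an assumption, since the finding does not specify it.

```csharp
using System;

// Hypothetical sketch of the EquipmentId derivation rule.
public static class EquipmentIdSketch
{
    public static string DeriveEquipmentId(Guid equipmentUuid)
    {
        // "N" format: 32 hex digits, no hyphens, lowercase (casing is an assumption here).
        var hex = equipmentUuid.ToString("N");
        return "EQ-" + hex.Substring(0, 12);
    }
}
```

For example, the UUID `01234567-89ab-cdef-0123-456789abcdef` would map to `EQ-0123456789ab` under these assumptions. Because the id is a pure function of the UUID, re-deriving it at finalize time keeps the two fields in step even when the staged UUID is replaced.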
### Admin-013
| Field | Value |
|---|---|
| Severity | High |
| Category | Error handling & resilience |
| Location | `Components/Pages/Clusters/ClusterDetail.razor:180-197`, `Components/Pages/Clusters/AclsTab.razor`, `Components/Pages/Clusters/RedundancyTab.razor`, `Components/Pages/RoleGrants.razor`, `Components/Pages/Hosts.razor`, `Components/Pages/ScriptLog.razor`, `Program.cs:157-159` |
| Status | Resolved |
**Description:** The Admin-003 fix gated all three SignalR hubs with `[Authorize]` plus `.RequireAuthorization()`, but the six pages that open a client `HubConnection` to those hubs were never updated to authenticate. A server-side Blazor `HubConnection` runs inside the interactive circuit and has no access to the browser's HttpOnly `OtOpcUa.Admin` auth cookie, so the hub `negotiate` request returns 401. Four pages (`ClusterDetail`, `AclsTab`, `RedundancyTab`, `RoleGrants`) called `HubConnection.StartAsync()` with no `try`/`catch`, so the 401 surfaced as an unhandled exception — a full HTTP 500 page for the prerendered `/clusters/{ClusterId}` route (the core cluster-config surface) and a faulted circuit for the others. `Hosts` and `ScriptLog` already wrapped the connect in `try`/`catch`, so they did not crash, but the SignalR live-update feature was non-functional Admin-wide regardless. The Admin-003 hardening was therefore incomplete: it secured the hub server side without giving the in-process clients any way to present credentials. Discovered during a post-review browser smoke test of `/clusters/cluster-dev`.
**Recommendation:** Two parts. (1) Stop the crash: guard every `HubConnection.StartAsync()` in `try`/`catch`, matching the best-effort pattern already documented in `Hosts.razor` — a hub hiccup must degrade live updates, not fault the page. (2) Restore the feature: give the hub clients a real credential. Cookie forwarding is not viable (the HttpOnly cookie is unreachable from the interactive circuit and persisting it into page state would leak it), so add a token scheme — mint a short-lived token for the circuit's authenticated user and supply it via `HttpConnectionOptions.AccessTokenProvider`, with a matching server-side authentication handler on the hub endpoints.
**Resolution:** Resolved 2026-05-22 — (1) `StartAsync`/`SendAsync` wrapped in `try`/`catch` on `ClusterDetail`, `AclsTab`, `RedundancyTab` and `RoleGrants` so a hub failure degrades gracefully. (2) Added a bearer-token auth path: `HubTokenService` mints/validates short-lived tokens using ASP.NET Core Data Protection (no signing-key management, no new packages); `HubTokenAuthenticationHandler` is a custom `HubToken` scheme reading the token from the `Authorization: Bearer` header (negotiate) or the `access_token` query parameter (WebSocket upgrade); the `HubClients` authorization policy runs both the cookie and `HubToken` schemes and is applied via `RequireAuthorization("HubClients")` on all three `MapHub` calls; `AdminHubConnectionFactory` builds connections with an `AccessTokenProvider` that re-mints a token for the circuit's authenticated user on every (re)connect, and all six hub-consuming pages resolve their connections through it. Verified end-to-end in the browser: hub `negotiate` returns 200 and the WebSocket upgrades (101) where it previously 401'd.
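
The token mint/validate step can be illustrated with Data Protection's time-limited protector. This is a sketch under assumptions, not the `HubTokenService` source: the class name, the purpose string, and the choice to carry only the user name in the payload are all hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using Microsoft.AspNetCore.DataProtection;

// Hypothetical sketch of a short-lived hub token built on Data Protection.
public sealed class HubTokenSketch
{
    private readonly ITimeLimitedDataProtector _protector;

    public HubTokenSketch(IDataProtectionProvider provider)
    {
        // The purpose string isolates these tokens from other protected payloads.
        _protector = provider
            .CreateProtector("HubTokenSketch.v1")
            .ToTimeLimitedDataProtector();
    }

    // Produces an opaque token that expires after the given lifetime.
    public string Mint(string userName, TimeSpan lifetime) =>
        _protector.Protect(userName, lifetime);

    // Returns false for tampered, malformed, or expired tokens.
    public bool TryValidate(string token, out string? userName)
    {
        try
        {
            userName = _protector.Unprotect(token);
            return true;
        }
        catch (CryptographicException)
        {
            userName = null;
            return false;
        }
    }
}
```

On the client side, a connection factory would hand `Mint(...)` to `HttpConnectionOptions.AccessTokenProvider` so a fresh token is produced on each (re)connect, and the server-side authentication handler would call `TryValidate` on the bearer value.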