Files
wwtools/mbproxy/codereviews/2026-05-16/Frontend.md
T
Joseph Doherty b222362ce0 mbproxy: remediate the 2026-05-16 code-review findings
Fixes every finding from the codereviews/2026-05-16 multi-agent review
(2 Critical, 20 Major, 38 Minor) and adds that review to the repo.

Highlights: dashboard XSS escape; response cache invalidated on the
write request (not just the response); ReloadValidator now runs at
startup so port collisions / duplicate names / malformed Resilience
profiles fail fast; AdminPort 0 genuinely disables the admin endpoint;
PlcListener accept-loop faults propagate to the supervisor's faulted
path; reconciler Restart builds before removing; Resilience pipelines
are restart-only from a frozen snapshot; multiplexer connect-race leak,
watchdog party-list snapshot, backend-response and FC16 framing
validation; frontend reconnect retry and util.js load guard; plus the
log-event/doc drift sweep and test-port hygiene.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 18:08:06 -04:00

14 KiB
Raw Blame History

Frontend / Live Web Dashboard — Code Review

Scope: commit 0308490 ("close out the dashboard code-review minor findings"). Files reviewed under src/Mbproxy/Admin/wwwroot/: index.html, plc.html, dashboard.js, detail.js, util.js, theme.css, dashboard.css, detail.css. Vendored assets out of scope. Cross-checked against docs/Operations/StatusPage.md, src/Mbproxy/Admin/StatusDto.cs, src/Mbproxy/Admin/DebugDto.cs, src/Mbproxy/Admin/StatusBroadcaster.cs, and src/Mbproxy/Admin/StatusHub.cs. Prior findings from codereviews/2026-05-15/Frontend.md were verified: the XSS on detail.js (C1 in that review) and the shortTime/escapeHtml gap are confirmed fixed. The cold-start retry loop, URL-decode guard, NaN guards, accessibility attributes, and util.js extraction are all verified resolved. This review covers the current code fresh.

Finding counts: 1 Critical, 1 Major, 4 Minor.


Critical Findings

C1 — dashboard.js stateChip renders listener.state into innerHTML without escaping

src/Mbproxy/Admin/wwwroot/dashboard.js:48

What is wrong. stateChip(state) in dashboard.js returns:

return `<span class="chip ${cls}">${state}</span>`;   // line 48

The state value is a server-supplied string from plc.listener.state, interpolated raw into an innerHTML-bound template string with no escapeHtml call. It flows into innerHTML at line 166:

<td>${stateChip(plc.listener.state)}</td>

Inconsistency with detail.js. The identical stateChip function in detail.js was fixed by the prior review and now correctly uses escapeHtml(state) at line 76:

return `<span class="chip ${cls}">${escapeHtml(state)}</span>`;

The fix was applied only to detail.js and was not propagated back to dashboard.js.

Impact. listener.state is a string on the SignalR wire. Its values in the current implementation are the closed set "bound" / "recovering" / "stopped", produced by the supervisor state machine — so the immediate exploitability is limited. However, the frontend must not rely on server-side type discipline for its own XSS safety (same rationale as the prior C1 finding). Under the same threat model as the prior review — no authentication on the admin port, PLC names and other state fields originate from operator-editable appsettings.json — any path that injects an unexpected string value into listener.state (serialization change, a future hot-reload edge case, a compromised service) would execute arbitrary script in every open fleet-dashboard tab. The fleet dashboard is more exposed than the detail page because it is the persistent long-lived view.

Fix. One character change: replace ${state} with ${escapeHtml(state)} at dashboard.js:48. escapeHtml is already imported from window.mbproxyUtil at line 18.


Major Findings

M1 — onreconnected silently swallows subscribe failure; connected pill is misleading when the subscribe invoke is rejected

src/Mbproxy/Admin/wwwroot/dashboard.js:254, src/Mbproxy/Admin/wwwroot/detail.js:285

What is wrong. Both pages' onreconnected callbacks fire after withAutomaticReconnect restores the transport. They call the hub method (SubscribeFleet / SubscribePlc) and swallow any error with .catch(() => {}):

// dashboard.js:252-255
connection.onreconnected(() => {
  setConn('connected');
  connection.invoke('SubscribeFleet').catch(() => {});
});

If the hub method throws (hub overloaded, exception in OnConnectedAsync, network jitter between the reconnect and the invoke), the connection is live but the subscription is silently dropped. The pill shows connected (green) while the page receives zero further updates. There is no retry, no error indication to the operator, and no watchdog that would notice the dead feed. The cold-start path (connect()) retries with backoff and guards against re-entering on a live socket; the warm-reconnect path has no equivalent safety net.

Impact. On a typical LAN deployment with stable infrastructure this will be rare, but it is the kind of silent failure that is hard to diagnose. An operator watching a page that appears connected but shows stale counter values has no indication anything is wrong. The detail page additionally does not re-arm armSnapshotWatchdog() on the reconnect path, so the 6-second "no data" notice will not fire either if the subscribe silently fails.

Fix. Add a minimal retry for the onreconnected subscribe. The simplest form: invoke, and on rejection set the pill to 'disconnected'/'retrying' and call connect() (the cold-start function already guards against re-starting a non-Disconnected connection). Alternatively, factor the subscribe into a shared helper used by both the cold-start and warm-reconnect paths, so the retry discipline is written once.


Minor Findings

N1 — detail.js does not destructure escapeAttr from window.mbproxyUtil; the attribute-escaping helper is silently unavailable

src/Mbproxy/Admin/wwwroot/detail.js:23

What is wrong. detail.js imports only escapeHtml:

const { escapeHtml } = window.mbproxyUtil;  // line 23 — escapeAttr not imported

escapeAttr is defined in util.js (line 15) and is available on window.mbproxyUtil, but detail.js never references it. All current attribute-context class interpolations in detail.js (the cls slots at lines 65, 76; dirCls at line 222; stale.trim() at line 224) happen to be computed from literal strings, so there is no current XSS. The risk is latent: any future change that passes a server-derived value into a class or other attribute slot in detail.js will produce an unescaped-attribute injection with no indication at the call site that the escaping tool is absent. The inconsistency was directly implicated in the prior review's C1 finding (the fix added escapeHtml to detail.js but did not import escapeAttr).

Recommendation. Change line 23 to:

const { escapeHtml, escapeAttr } = window.mbproxyUtil;

No other changes needed — the helper just needs to be in scope so future reviewers and the linter can verify attribute contexts are correctly escaped.


N2 — prevPdu and rateByName Maps are never pruned; stale entries accumulate across hot-reload PLC removal

src/Mbproxy/Admin/wwwroot/dashboard.js:1112, 6678

What is wrong. updateRates(snapshot) (lines 6678) only writes entries for PLCs present in the current snapshot; it never deletes entries for PLCs removed by a hot-reload. Over a long session with PLC churn, prevPdu and rateByName grow without bound. Concretely: rateByName.size (line 102) is used to decide whether the fleet PDU/s card shows a rate or ; after a PLC is removed its stale entry keeps rateByName.size > 0, so the card shows 0 instead of the arguably more correct for a fleet that has shrunk to zero active PLCs. renderAggregates already iterates only s.plcs (line 91) so the stale rate values are never summed — this is correct — but the Map still grows.

Impact. Negligible in the current 54-PLC deployment with low PLC churn rate. Becomes visible if PLCs are repeatedly added and removed via hot-reload over a long-lived browser session. The rateByName.size false-positive is the most user-visible symptom.

Recommendation. At the end of updateRates, prune both Maps to the current snapshot's PLC name set:

const currentNames = new Set(snapshot.plcs.map(p => p.name));
for (const k of prevPdu.keys())   { if (!currentNames.has(k)) prevPdu.delete(k); }
for (const k of rateByName.keys()) { if (!currentNames.has(k)) rateByName.delete(k); }

N3 — Empty class="" attribute emitted on non-stale debug rows; stale.trim() call is a no-op

src/Mbproxy/Admin/wwwroot/detail.js:223224

What is wrong. The stale/non-stale class is computed as:

const stale = (t.ageSeconds || 0) > 30 ? ' stale' : '';   // line 223
return `<tr class="${stale.trim()}">                        // line 224

When the row is not stale, stale is '', stale.trim() is '', and the emitted markup is <tr class="">. The .trim() call is defensive against the leading space in ' stale', but a simpler and cleaner expression that also avoids the empty-attribute emission is:

const staleCls = (t.ageSeconds || 0) > 30 ? 'stale' : '';
return `<tr${staleCls ? ` class="${staleCls}"` : ''}>

Or simply omit .trim() and use class="${stale}" — the leading space is harmless in CSS class matching. This is cosmetic.

Impact. Zero functional impact. The empty class="" attribute is parsed and ignored by every browser. Flagged as a code-cleanliness observation.


N4 — window.mbproxyUtil is destructured at script-top-level; a util.js load failure aborts the entire page silently

src/Mbproxy/Admin/wwwroot/dashboard.js:18, src/Mbproxy/Admin/wwwroot/detail.js:23

What is wrong. Both scripts destructure window.mbproxyUtil at the top of their IIFE, before any DOM or error-handling setup:

const { escapeHtml, escapeAttr } = window.mbproxyUtil;  // dashboard.js:18
const { escapeHtml } = window.mbproxyUtil;              // detail.js:23

If util.js fails to load (404 after a publish error, embedded-asset routing bug, script tag typo), window.mbproxyUtil is undefined, the destructuring throws a TypeError at the first const, and the remaining script — including DOMContentLoaded, error notices, and the SignalR connection — never runs. The page renders its static HTML indefinitely with no visible error. The HTML <noscript> fallback is absent.

Impact. Unlikely in production given the assets are embedded in the binary and verified by EmbeddedAssetsTests. Worth noting because the failure mode is silent — no browser console error surfaces to the operator, and the page appears partially rendered rather than broken.

Recommendation. Add a null-guard:

if (!window.mbproxyUtil) {
  document.body.innerHTML = '<p style="padding:2rem;color:red">Admin UI failed to load — missing util.js. Check the browser console.</p>';
  throw new Error('window.mbproxyUtil not defined');
}
const { escapeHtml, escapeAttr } = window.mbproxyUtil;

Or equivalently move the destructuring inside DOMContentLoaded after a check. The throw intentionally aborts the script — at least with a visible error — rather than letting it limp along with broken escaping.


What Looks Good

  • All prior C1 and M-series findings from codereviews/2026-05-15/Frontend.md are confirmed fixed. Specifically: t.direction, t.rawHex, t.name, shortTime(c.connectedAtUtc), and l.lastBindError in detail.js are all now correctly wrapped in escapeHtml. The shortTime fallback now goes through escapeHtml at the call site (line 105). The plcNameError guard prevents the script from silently failing on a malformed URL. The util.js shared helper was introduced and is loaded before both page scripts. The armSnapshotWatchdog addresses the "unknown PLC sits forever on waiting" scenario. Keyboard/Enter/Space sort handlers and aria-sort on <th> elements are in place. aria-live="polite" on the connection pill and aria-label on the toolbar inputs are present. The PLC name is a real <a> link with rel="noopener" — no more window.open.

  • detail.js stateChip is correctly escaped. The fix from the prior review was applied to detail.js:76 (escapeHtml(state)). Only dashboard.js has the regression (C1 above).

  • detail.js card() v slot is safe through consistent caller discipline. All values passed as the second element of card rows are either stateChip() output (internally escaped), num() output (numeric-only), ratioText() output (n% or ), or explicitly escapeHtml-wrapped. No server string lands unescaped in the v slot.

  • SignalR lifecycle is correct. onreconnected re-invokes SubscribeFleet/SubscribePlc — essential for group re-subscription after a transport reconnect. onclose does not re-subscribe (correct — withAutomaticReconnect owns the warm path). tabId is stable per page-load (not per ConnectionId) so transport reconnects do not leak armed captures server-side. gotSnapshot() correctly clears the watchdog timer on the first data arrival.

  • Cold-start retry is well-structured. The connect() function guards connection.state === Disconnected before calling start(), preventing a subscribe-failure from re-starting a live socket. Capped exponential backoff (retryMs = Math.min(retryMs * 2, 30000)) prevents infinite tight retries.

  • All DTO field names are correct. detail.js and dashboard.js read coalescedHitCount, cacheHitCount, backendHeartbeatsSent, backendHeartbeatsFailed, backendIdleDisconnects, disconnectCascades, queueDepth, txIdWraps, inFlight, maxInFlight, invalidBcdWarnings, exceptionsByCode.codeOther, etc. — all matching the camelCase JSON policy applied by StatusJsonContext/JsonKnownNamingPolicy.CamelCase. No field name typos found.

  • escapeHtml and escapeAttr in util.js are correct. & is replaced before </> (correct order), and escapeAttr adds " on top of escapeHtml. Both functions are pure and side-effect-free.

  • No CDN dependencies. All <script>/<link> tags reference /assets/… — consistent with the firewalled-network design intent documented in StatusPage.md.

  • formatUptime and formatAge both guard Number.isFinite and < 0. NaN/negative clock skew returns rather than NaN text.

  • tagCell(t) correctly coerces t.address with Number(). The address is rendered as a pure decimal or hex numeric, never as a raw server string.

  • debug-rows empty-state and colspan counts are consistent with the HTML <thead>. The debug table has 6 columns; all static empty-row <td colspan="6"> and the no-traffic span layout (1+1+3+1) sum to 6. The KPI table has 10 columns; colspan="10" matches. No cross-file DOM-id inconsistencies found.

  • CSS is clean. focus-visible outlines on sortable <th> elements and the PLC-name <a> provide keyboard focus indicators. No !important abuse. Design tokens are consistent across the three CSS files.