Reviewed the new SignalR dashboard and fixed its two top findings: a stored XSS on the connection-detail page (unescaped tag name / direction / timestamp rendered into innerHTML) and FC03/FC04 cache hits bypassing the debug-view capture, which left cached tags frozen while their age climbed. Also adds an optional human-friendly Name to BCD tags surfaced on the debug view, and loads the real fleet config from tags.txt (12 named BCD tags, PLC Z28061) so the published appsettings.json is deploy-ready. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
17 KiB
Frontend / Live Web Dashboard — Code Review
Scope: commit e719dd5 ("replace status page with a live SignalR web dashboard"). Files reviewed — all under src/Mbproxy/Admin/wwwroot/:
index.html, plc.html, dashboard.js, detail.js, theme.css, dashboard.css, detail.css. Vendored assets (bootstrap.min.css, bootstrap.bundle.min.js, signalr.min.js, *.woff2) explicitly out of scope. Cross-checked against docs/Operations/StatusPage.md (wire contract), CLAUDE.md (mbproxy), and the server side StatusHub.cs / StatusBroadcaster.cs for the SignalR contract.
Summary
- The dashboard is well-built vanilla JS: thoughtful escaping helpers, clean rate computation, sensible reconnection wiring, and a correct subscribe-on-reconnect pattern that keeps the on-demand tag capture armed/disarmed correctly.
- The single most important finding is a real, exploitable XSS hole on the detail page:
detail.jsrenderst.rawHex,t.address,t.width,t.direction,c.remote, andplc.hoststraight intoinnerHTMLwithout escaping.escapeHtmlexists in the file but is applied inconsistently — several attacker/PLC-influenceable fields bypass it. The fleet table (dashboard.js) is escaped correctly throughout; the detail page is not. This is a Critical finding. - A second Major issue: on a cold SignalR failure the page enters a
setTimeout(start, 3000)retry loop that builds a brand-newHubConnectionon every retry — no, it reuses the same connection object, but it never tears down handlers, and combined withwithAutomaticReconnectthe failure/retry semantics are muddled. There is also no upper bound and no UX past "retrying". - DOM updates are full-
innerHTMLre-renders of the whole table body every ~1 s. Correct and flicker-free for 54 rows, but it blows away focus/selection and is wasteful; acceptable at this scale, flagged.
Critical findings
C1. Stored/reflected XSS on the connection-detail page — multiple unescaped fields rendered via innerHTML. detail.js defines escapeHtml (line 23) and uses it for some fields but omits it for several others, all of which are interpolated into template strings later assigned to .innerHTML:
detail.js:194and:203— the debug-row builder:<td>${t.address} <span class="ratio-sub">${hex4(t.address)}</span></td> <td>${t.width}-bit</td> ... <td><span class="dir-tag ${dirCls}">${t.direction}</span></td>t.directionis a server string ("read"/"write"per the schema) interpolated raw into both an attribute-ish context and element text.t.address/t.widthare typedintin the documented DTO, so they are lower risk — but the code does not coerce them (Number(...)) or escape them, so it relies entirely on the server DTO type.t.directionis a freestringon the wire and is not escaped.detail.js:86— the client list:`<div class="client-line">${escapeHtml(c.remote)}` + `<span class="pdu"> · ${num(c.pdusForwarded)} PDUs · since ${shortTime(c.connectedAtUtc)}</span></div>`c.remoteis escaped (good), butshortTime(c.connectedAtUtc)is not — andshortTime(line 36) has acatch { return iso; }branch that returns the raw, unescapedconnectedAtUtcstring verbatim whennew Date(iso)parsing throws or the value is non-ISO. A malformed/attacker-controlled timestamp string therefore lands unescaped insideinnerHTML.detail.js:203—escapeHtml(t.rawHex)is applied (good), andnum(t.decodedValue)is numeric-safe. SorawHexis the one debug field that is handled correctly. (Correcting the brief:rawHexis escaped;directionand the timestamp path are the live holes.)detail.js:65—$('plc-sub').textContent =${plc.host}:${plc.listenPort};is safe (textContent). Butdetail.js:14–16does the same forplcNameviatextContent— also safe. So the identity header is fine; the holes are specifically theinnerHTMLcard/table builders.
Severity rationale: per docs/Operations/StatusPage.md the admin endpoint binds IPAddress.Any with no authentication ("Authentication lives at the network layer"). PLC name/host and backend-derived strings come off the Modbus wire / appsettings.json and are operator- or device-influenceable. A PLC named or a backend that returns a crafted string can inject <img src=x onerror=...> into any operator's browser session on the trusted segment. Read-only UI does not mean low impact: the injected script runs with the operator's origin.
Concrete fix: route every dynamic value through escapeHtml before it enters an innerHTML template, with no exceptions:
detail.js:194/:203: wrapt.directioninescapeHtml(...); coercet.address/t.widthwithNumber(...)(or escape) rather than trusting the DTO type.detail.js:86: wrap theshortTime(...)result inescapeHtml(...), and/or changeshortTime's fallback toescapeHtml(iso)/ return'—'.- Add a single
escapeAttrhelper (asdashboard.jshas at line 179) and use it for thedirCls/class interpolation at:197ifdirClsever becomes data-derived (currently it is a literal, so low priority — butcard()'sclsparameter atdetail.js:45is interpolated intoclass="v ${cls||''}"and is caller-supplied; today all callers pass literals, so it is safe now but fragile). - Better still: stop hand-building HTML. Build rows with
document.createElement+textContent/dataset, which makes escaping structural rather than a discipline that one missed call defeats.
Major findings
M1. Cold-start retry loop layers a manual setTimeout retry on top of withAutomaticReconnect, with no bound and a misleading pill. dashboard.js:252–263 and the identical detail.js:236–245:
async function start() {
try { ...; await connection.start(); await connection.invoke('SubscribeFleet'); ... }
catch { setConn('disconnected','retrying'); setTimeout(start, 3000); }
}
withAutomaticReconnect only covers a connection that was established and then dropped — it does not retry the initial start(). So the manual loop is needed. But: (a) it retries forever every 3 s with no backoff and no cap — if the hub is permanently gone the browser hammers it indefinitely; (b) if connection.start() succeeds but invoke('SubscribeFleet') throws, the catch calls start() again on an already-started connection, which will throw "cannot start a connection that is not in the Disconnected state" and wedge the retry loop; (c) the pill shows disconnected/retrying which is reasonable, but there is no terminal "giving up" state and no way for an operator to know the difference between "server down" and "server slow". Fix: separate "start the connection" from "subscribe" so a subscribe failure doesn't restart the socket; add capped exponential backoff; guard start() against re-entry when connection.state !== 'Disconnected'.
M2. Detail page never handles "PLC name not in fleet" vs. "hub never delivers". detail.js subscribes with SubscribePlc(plcName) where plcName is taken from location.pathname. If the name is wrong (typo, stale bookmark, or a name that never existed), StatusHub.SubscribePlc happily adds the connection to a plc:{bogus} group and _captureRegistry.Arm is a documented no-op — so the server never sends a plc message and onDetail never fires. The page sits forever on "Waiting for first snapshot…" with a green connected pill. There is no timeout, no "unknown PLC" state. The renderMissing() path (detail.js:168) only triggers when the server does push a payload with detail.plc === null (PLC removed by hot-reload) — it cannot fire for a name that was never configured because nothing is pushed at all. Fix: after connected, start a watchdog (e.g. 3× the push interval); if no plc message arrives, show an "unknown or unreachable PLC" notice.
M3. Fleet PDU/s rate is wrong for the first snapshot after any reconnect, and silently leaks prevPdu/rateByName entries for removed PLCs. dashboard.js:65–77 updateRates keys prevPdu/rateByName by plc.name and never removes entries. On hot-reload removal of a PLC, its stale rate stays in rateByName forever and is still summed into the fleet rate? — no, renderAggregates only iterates s.plcs, so a removed PLC drops out of the sum; but the Map still grows unboundedly across the process lifetime if PLCs churn. More importantly, after a SignalR reconnect the counters are cumulative since service start, so cur - prev.forwarded across a multi-second reconnect gap produces a correct (if coarse) rate — that part is fine. The real bug: performance.now() is monotonic per-page, so that's fine too. Net: low-grade Map leak on PLC churn; prune prevPdu/rateByName to the current snapshot.plcs name set each cycle.
M4. Full-table-body innerHTML re-render on every ~1 s push destroys transient UI state. dashboard.js:155–173 rebuilds the entire <tbody> string and assigns tbody.innerHTML. Consequences at the 1 s cadence: any text selection inside the table is lost every second; :hover is re-evaluated (minor); the browser re-parses ~54 rows of HTML each tick. It is not a flicker source (synchronous replace, no intermediate paint) and 54 rows is cheap, so this is acceptable — but it is the kind of thing that becomes a problem if columns/rows grow. A keyed diff (update existing <tr> cells in place, add/remove only on set change) would also fix the selection-loss annoyance. Flagged as Major because of the selection-loss UX regression on a screen operators stare at; the brief explicitly asked about this.
Minor findings
N1. detail.js has no escapeAttr and dashboard.js's is duplicated. Both files define their own escapeHtml (dashboard.js:176, detail.js:23) — identical code, copy-pasted. escapeAttr exists only in dashboard.js:179. Since both files are separately served and there is no shared module, duplication is the pragmatic choice for a no-build-step project, but the inconsistency (detail.js lacking escapeAttr) is exactly what enabled C1. Recommend a tiny shared util.js served from /assets/, or at minimum make the two escapeHtml definitions and an escapeAttr identical and present in both.
N2. Accessibility gaps.
- The sortable
<th>elements (index.html:72–81) are clickable via a JSclickhandler but are not keyboard-focusable and carry norole="button"/tabindex="0"/aria-sort. A keyboard-only operator cannot sort the table. Addtabindex="0",aria-sortreflectingsorted-asc/sorted-desc, and akeydown(Enter/Space) handler. - Table rows are clickable (
dashboard.js:232) to open a detail page but are<tr>withcursor:pointeronly — not keyboard-reachable and not announced as interactive. Consider making the PLC-name cell an actual<a href="/plc/...">so it is focusable, middle-click/ctrl-click works natively, and screen readers announce it. This would also remove the need for the JSwindow.openhandler. - The connection-state pill (
#conn) updates visually but has noaria-liveregion, so a screen-reader user is never told the hub dropped. Addaria-live="polite"to#connor to#conn-text. <input type="search" id="f-search">has aplaceholderbut no associated<label>(visible oraria-label). Same for#f-state. Addaria-label.
N3. Row click always window.open(..., '_blank') — no modifier-key respect, popup-blocker exposure. dashboard.js:235 unconditionally calls window.open. A plain click should arguably open in the same tab or respect the user's intent; programmatic window.open not in direct response to a trusted click on an anchor can be caught by popup blockers in some configs. Tied to N2's suggestion: render the PLC name as a real <a target="_blank" rel="noopener"> and delete the handler. (rel="noopener" also matters — window.open without it leaves window.opener live; here the opened page is same-origin so impact is low, but it is still best practice.)
N4. detail.js plcName parsing is brittle. detail.js:10: decodeURIComponent(location.pathname.replace(/^\/plc\//, '')). If the route is ever served under a path prefix, or the name itself contains an encoded /, this misparses. Also if decodeURIComponent throws (malformed % sequence in the URL) the whole script aborts at line 10 before connect() is ever reached — the page is then blank with no error. Wrap in try/catch and fall back to a visible error state.
N5. dashboard.js:101 and :106 — fleet PDU/s shows — until the second snapshot. Expected (rate needs two samples) and correctly handled, but the aggregate card shows — for the first ~1 s while the table already shows per-row — rates. Cosmetic; no fix required, noted for completeness.
N6. No console.log, no hardcoded absolute URLs, no obvious dead code. All asset/hub URLs are root-relative (/assets/..., /hub/status, /plc/...) — correct, survives any host/port. card()'s extra parameter and the cls third element of card rows are lightly-used but not dead. Clean on this axis.
N7. formatUptime / formatAge silently misrender negative or NaN input. dashboard.js:199 and detail.js:28: if uptimeSeconds/ageSeconds ever arrive negative (clock skew) or non-numeric, Math.floor yields NaN and the card shows NaN. Low risk given server types; a Number.isFinite guard returning '—' is cheap insurance.
What looks good
- Reconnect → re-subscribe is correct.
dashboard.js:249anddetail.js:233both re-invoke('SubscribeFleet'/'SubscribePlc')insideonreconnected. This is essential and easy to forget: SignalR auto-reconnect gives a newConnectionId, server-side group membership does not survive, and the detail page's tag capture is armed per-connection — without the re-subscribe the capture would silently disarm on every transient drop. Verified againstStatusHub.SubscribePlc/OnDisconnectedAsyncandStatusBroadcastergroup targeting — the lifecycle is sound. onclosecorrectly does not re-subscribe — it just sets the pill;withAutomaticReconnect's own loop owns recovery. No double-retry on the warm path.- Single
HubConnectionper page, closed implicitly on navigation. Opening a detail page in a new tab (window.open) means each tab owns exactly one hub connection; closing the tab firesOnDisconnectedAsyncserver-side which disarms the capture. No connection leak across navigation. - The fleet table escapes correctly.
dashboard.js:161–163routesplc.namethroughescapeAttrfor thedata-nameattribute andescapeHtmlfor cell text;plc.hostthroughescapeHtml;plc.listener.lastBindErrorthroughescapeAttrfor thetitle=attribute (line 160). This is the right discipline — it is onlydetail.jsthat fails to match it (C1). escapeHtmlis a correct minimal implementation —&,<,>cover element-text contexts;escapeAttradds". Order matters (&first) and is correct.- Rate computation is robust.
dashboard.js:70guardsnow > prev.t,Math.max(0, ...)clamps a counter reset/reconnect to a non-negative rate, andperformance.now()(monotonic) is the right clock for deltas — notDate.now(). - Filter is correct and cheap.
visiblePlcs(dashboard.js:130) filters a.slice()copy (never mutates the snapshot), search is case-folded once, and the sort has a stablelocaleComparetiebreaker by name. At 54 rows the filter+sort+render per keystroke is sub-millisecond. It correctly survives live updates becauserender()always re-derives fromlatest+ currentfilterstate. renderMissing()hot-reload path. The detail page genuinely handles a PLC disappearing mid-session (detail.js:168) —noticeshown, cards cleared andhidden. Good defensive UX for the hot-reload scenario (the gap is the never-existed name, see M2).- No CDN dependencies — every
<script>/<link>is/assets/..., consistent with the firewalled-network design goal inStatusPage.md. - CSS is clean: design tokens via custom properties,
font-display: swapon all@font-face, responsiveagg-gridbreakpoints,prefers-free but no egregious issues.tr.stale/tr.no-trafficstyling gives the debug view real legibility. No!importantabuse beyond two justified.empty-rowoverrides.
Open questions
- C1 exploitability depends on whether
t.directionandconnectedAtUtccan actually carry attacker bytes.directionis server-derived from the FC, so today it is effectively a closed set — but it is typedstringon the wire anddetail.jstrusts it.connectedAtUtcis aDateTimeOffsetserialized server-side, so it too is well-formed today. The finding stands because the frontend must not depend on server-side type discipline for its own XSS safety — but if the threat model formally excludes a compromised service, C1 could be re-rated Major. Recommend fixing regardless: the cost is threeescapeHtmlcalls. - Does
StatusBroadcasterever push aplcpayload for a PLC that has zero configured BCD tags?detail.js:183handlesdebug.tagsempty → "No BCD tags configured". Confirmed handled; noted only to flag that the empty-state is covered. - Should the detail page cap how long it waits before declaring the PLC unknown (M2)? The server has no "unknown PLC" rejection in
SubscribePlc— it silently accepts any name. A client-side watchdog is the only place this can be surfaced without a server change.