# Frontend / Live Web Dashboard — Code Review
Scope: commit `e719dd5` ("replace status page with a live SignalR web dashboard"). Files reviewed — all under `src/Mbproxy/Admin/wwwroot/`:
`index.html`, `plc.html`, `dashboard.js`, `detail.js`, `theme.css`, `dashboard.css`, `detail.css`. Vendored assets (`bootstrap.min.css`, `bootstrap.bundle.min.js`, `signalr.min.js`, `*.woff2`) explicitly out of scope. Cross-checked against `docs/Operations/StatusPage.md` (wire contract), `CLAUDE.md` (mbproxy), and the server side `StatusHub.cs` / `StatusBroadcaster.cs` for the SignalR contract.
## Summary
- The dashboard is well-built vanilla JS: thoughtful escaping helpers, clean rate computation, sensible reconnection wiring, and a correct subscribe-on-reconnect pattern that keeps the on-demand tag capture armed/disarmed correctly.
- **The single most important finding is a real, exploitable XSS hole on the detail page**: `detail.js` renders `t.rawHex`, `t.address`, `t.width`, `t.direction`, `c.remote`, and `plc.host` straight into `innerHTML` **without escaping**. `escapeHtml` exists in the file but is applied inconsistently — several attacker/PLC-influenceable fields bypass it. The fleet table (`dashboard.js`) is escaped correctly throughout; the detail page is not. This is a Critical finding.
- A second Major issue: on a *cold* SignalR failure the page enters a `setTimeout(start, 3000)` retry loop that **builds a brand-new `HubConnection` on every retry** — no, it reuses the same connection object, but it never tears down handlers, and combined with `withAutomaticReconnect` the failure/retry semantics are muddled. There is also no upper bound and no UX past "retrying".
- DOM updates are full-`innerHTML` re-renders of the whole table body every ~1 s. Correct and flicker-free for 54 rows, but it blows away focus/selection and is wasteful; acceptable at this scale, flagged.
## Critical findings
**C1. Stored/reflected XSS on the connection-detail page — multiple unescaped fields rendered via `innerHTML`.** `detail.js` defines `escapeHtml` (line 23) and uses it for *some* fields but omits it for several others, all of which are interpolated into template strings later assigned to `.innerHTML`:
- **`detail.js:194` and `:203`** — the debug-row builder:
```js
${t.address} ${hex4(t.address)}
${t.width}-bit
...
${t.direction}
```
`t.direction` is a server string (`"read"`/`"write"` per the schema) interpolated raw into both an attribute-ish context and element text. `t.address`/`t.width` are typed `int` in the documented DTO, so they are lower risk — but the code does not coerce them (`Number(...)`) or escape them, so it relies entirely on the server DTO type. `t.direction` is a free `string` on the wire and is **not escaped**.
- **`detail.js:86`** — the client list:
```js
`
${escapeHtml(c.remote)}` +
` · ${num(c.pdusForwarded)} PDUs · since ${shortTime(c.connectedAtUtc)}
`
```
`c.remote` *is* escaped (good), but `shortTime(c.connectedAtUtc)` is **not** — and `shortTime` (line 36) has a `catch { return iso; }` branch that returns the **raw, unescaped `connectedAtUtc` string** verbatim when `new Date(iso)` parsing throws or the value is non-ISO. A malformed/attacker-controlled timestamp string therefore lands unescaped inside `innerHTML`.
- **`detail.js:203`** — `escapeHtml(t.rawHex)` *is* applied (good), and `num(t.decodedValue)` is numeric-safe. So `rawHex` is the one debug field that is handled correctly. (Correcting the brief: `rawHex` is escaped; `direction` and the timestamp path are the live holes.)
- **`detail.js:65`** — `$('plc-sub').textContent = `${plc.host}:${plc.listenPort}`;` is **safe** (`textContent`). But `detail.js:14–16` does the same for `plcName` via `textContent` — also safe. So the identity header is fine; the holes are specifically the `innerHTML` card/table builders.
Severity rationale: per `docs/Operations/StatusPage.md` the admin endpoint binds `IPAddress.Any` with **no authentication** ("Authentication lives at the network layer"). PLC `name`/`host` and backend-derived strings come off the Modbus wire / `appsettings.json` and are operator- or device-influenceable. A PLC named or a backend that returns a crafted string can inject `` into any operator's browser session on the trusted segment. Read-only UI does not mean low impact: the injected script runs with the operator's origin.
Concrete fix: route **every** dynamic value through `escapeHtml` before it enters an `innerHTML` template, with **no exceptions**:
- `detail.js:194` / `:203`: wrap `t.direction` in `escapeHtml(...)`; coerce `t.address`/`t.width` with `Number(...)` (or escape) rather than trusting the DTO type.
- `detail.js:86`: wrap the `shortTime(...)` result in `escapeHtml(...)`, and/or change `shortTime`'s fallback to `escapeHtml(iso)` / return `'—'`.
- Add a single `escapeAttr` helper (as `dashboard.js` has at line 179) and use it for the `dirCls`/class interpolation at `:197` if `dirCls` ever becomes data-derived (currently it is a literal, so low priority — but `card()`'s `cls` parameter at `detail.js:45` is interpolated into `class="v ${cls||''}"` and is caller-supplied; today all callers pass literals, so it is safe *now* but fragile).
- Better still: stop hand-building HTML. Build rows with `document.createElement` + `textContent`/`dataset`, which makes escaping structural rather than a discipline that one missed call defeats.
## Major findings
**M1. Cold-start retry loop layers a manual `setTimeout` retry on top of `withAutomaticReconnect`, with no bound and a misleading pill.** `dashboard.js:252–263` and the identical `detail.js:236–245`:
```js
async function start() {
try { ...; await connection.start(); await connection.invoke('SubscribeFleet'); ... }
catch { setConn('disconnected','retrying'); setTimeout(start, 3000); }
}
```
`withAutomaticReconnect` only covers a connection that *was* established and then dropped — it does **not** retry the initial `start()`. So the manual loop is needed. But: (a) it retries **forever** every 3 s with no backoff and no cap — if the hub is permanently gone the browser hammers it indefinitely; (b) if `connection.start()` succeeds but `invoke('SubscribeFleet')` throws, the `catch` calls `start()` again on an **already-started** connection, which will throw "cannot start a connection that is not in the Disconnected state" and wedge the retry loop; (c) the pill shows `disconnected`/`retrying` which is reasonable, but there is no terminal "giving up" state and no way for an operator to know the difference between "server down" and "server slow". Fix: separate "start the connection" from "subscribe" so a subscribe failure doesn't restart the socket; add capped exponential backoff; guard `start()` against re-entry when `connection.state !== 'Disconnected'`.
**M2. Detail page never handles "PLC name not in fleet" vs. "hub never delivers".** `detail.js` subscribes with `SubscribePlc(plcName)` where `plcName` is taken from `location.pathname`. If the name is wrong (typo, stale bookmark, or a name that never existed), `StatusHub.SubscribePlc` happily adds the connection to a `plc:{bogus}` group and `_captureRegistry.Arm` is a documented no-op — so the server **never sends a `plc` message** and `onDetail` never fires. The page sits forever on "Waiting for first snapshot…" with a green `connected` pill. There is no timeout, no "unknown PLC" state. The `renderMissing()` path (`detail.js:168`) only triggers when the server *does* push a payload with `detail.plc === null` (PLC removed by hot-reload) — it cannot fire for a name that was never configured because nothing is pushed at all. Fix: after `connected`, start a watchdog (e.g. 3× the push interval); if no `plc` message arrives, show an "unknown or unreachable PLC" notice.
**M3. Fleet PDU/s rate is wrong for the first snapshot after any reconnect, and silently leaks `prevPdu`/`rateByName` entries for removed PLCs.** `dashboard.js:65–77` `updateRates` keys `prevPdu`/`rateByName` by `plc.name` and never removes entries. On hot-reload removal of a PLC, its stale rate stays in `rateByName` forever and is still summed into the fleet rate? — no, `renderAggregates` only iterates `s.plcs`, so a removed PLC drops out of the *sum*; but the Map still grows unboundedly across the process lifetime if PLCs churn. More importantly, after a SignalR reconnect the counters are *cumulative since service start*, so `cur - prev.forwarded` across a multi-second reconnect gap produces a correct (if coarse) rate — that part is fine. The real bug: `performance.now()` is monotonic per-page, so that's fine too. Net: low-grade Map leak on PLC churn; prune `prevPdu`/`rateByName` to the current `snapshot.plcs` name set each cycle.
**M4. Full-table-body `innerHTML` re-render on every ~1 s push destroys transient UI state.** `dashboard.js:155–173` rebuilds the entire `` string and assigns `tbody.innerHTML`. Consequences at the 1 s cadence: any text selection inside the table is lost every second; `:hover` is re-evaluated (minor); the browser re-parses ~54 rows of HTML each tick. It is *not* a flicker source (synchronous replace, no intermediate paint) and 54 rows is cheap, so this is acceptable — but it is the kind of thing that becomes a problem if columns/rows grow. A keyed diff (update existing `
` cells in place, add/remove only on set change) would also fix the selection-loss annoyance. Flagged as Major because of the selection-loss UX regression on a screen operators stare at; the brief explicitly asked about this.
## Minor findings
**N1. `detail.js` has no `escapeAttr` and `dashboard.js`'s is duplicated.** Both files define their own `escapeHtml` (`dashboard.js:176`, `detail.js:23`) — identical code, copy-pasted. `escapeAttr` exists only in `dashboard.js:179`. Since both files are separately served and there is no shared module, duplication is the pragmatic choice for a no-build-step project, but the *inconsistency* (detail.js lacking `escapeAttr`) is exactly what enabled C1. Recommend a tiny shared `util.js` served from `/assets/`, or at minimum make the two `escapeHtml` definitions and an `escapeAttr` identical and present in both.
**N2. Accessibility gaps.**
- The sortable `
` elements (`index.html:72–81`) are clickable via a JS `click` handler but are not keyboard-focusable and carry no `role="button"`/`tabindex="0"`/`aria-sort`. A keyboard-only operator cannot sort the table. Add `tabindex="0"`, `aria-sort` reflecting `sorted-asc`/`sorted-desc`, and a `keydown` (Enter/Space) handler.
- Table rows are clickable (`dashboard.js:232`) to open a detail page but are `