feat(ui): Audit KPI tiles on Health dashboard (#23 M7)

Adds three KPI tiles to the central Health dashboard for the Audit channel:
volume (rows in the last hour), error rate (Failed/Parked/Discarded over
total), and backlog (sum of SiteAuditBacklog.PendingCount across all sites).

Repo + service:
- IAuditLogRepository.GetKpiSnapshotAsync(window, nowUtc) — single aggregate
  SELECT over the trailing window returning total + error counts; nowUtc is
  optional for production callers and pinned by integration tests against the
  shared MSSQL fixture so the global counts are deterministic.
- AuditLogQueryService.GetKpiSnapshotAsync() — composes the repo aggregate
  with a sum of SiteAuditBacklog.PendingCount read from ICentralHealthAggregator.
- AuditLogKpiSnapshot record in Commons/Types/.

UI:
- New AuditKpiTiles Blazor component (Components/Health/) — three Bootstrap
  card-tiles, click navigates to /audit/log with the matching pre-filter.
- Health.razor wires the tiles in alongside the existing Notification Outbox
  KPIs; LoadAuditKpis() runs on every 10s refresh tick and degrades to em
  dashes + inline error if the query fails.
- AuditLogPage extended to parse ?status= so the error-rate tile drill-in
  (?status=Failed) auto-loads the grid.

Tests:
- AuditLogRepositoryTests: GetKpiSnapshotAsync mixed-status + empty-window
  cases against the MSSQL migration fixture.
- AuditLogQueryServiceTests: forwarding + backlog composition; sites with
  null SiteAuditBacklog contribute zero.
- AuditKpiTilesTests: 9 bUnit tests covering tile render, error-rate maths
  with safe zero-events handling, em-dash unavailable path, click-through
  navigation, and warning/danger border thresholds.
- HealthPageTests: new Renders_AuditKpiTiles_WithValues plus IAuditLogQueryService
  stub registration in the constructor so existing outbox tests still pass.
- AuditLogPageScaffoldTests: ?status=Failed auto-load + unknown status drop.
Joseph Doherty
2026-05-20 20:43:57 -04:00
parent 38fc9b4102
commit 943c2ced39
18 changed files with 969 additions and 7 deletions
@@ -0,0 +1,59 @@
@*
    Audit Log (#23) M7 Bundle E (T13): three Health-dashboard KPI tiles for the
    Audit channel: Volume / Error rate / Backlog. Renders Bootstrap card tiles in
    a single row, each acting as a navigation link to a pre-filtered Audit Log
    view. The component is purely presentational; the parent page owns the
    refresh loop and passes the latest snapshot via the Snapshot parameter.
*@
@namespace ScadaLink.CentralUI.Components.Health
@inject NavigationManager Navigation

<div class="d-flex justify-content-between align-items-center mb-2">
    <h6 class="text-muted mb-0">Audit</h6>
    <a class="small" href="/audit/log">View details &rarr;</a>
</div>

<div class="row g-3 mb-3">
    @* ── Volume tile ───────────────────────────────────────────────────────── *@
    <div class="col-lg-4 col-md-6 col-12">
        <button type="button"
                class="card h-100 w-100 text-start border-0 shadow-none p-0 audit-kpi-tile"
                data-test="audit-kpi-volume"
                @onclick="NavigateToVolume">
            <div class="card-body text-center">
                <h3 class="mb-0">@VolumeDisplay</h3>
                <small class="text-muted">Audit volume (last hour)</small>
            </div>
        </button>
    </div>

    @* ── Error rate tile ───────────────────────────────────────────────────── *@
    <div class="col-lg-4 col-md-6 col-12">
        <button type="button"
                class="card h-100 w-100 text-start border-0 shadow-none p-0 audit-kpi-tile @ErrorRateBorderClass"
                data-test="audit-kpi-error-rate"
                @onclick="NavigateToErrors">
            <div class="card-body text-center">
                <h3 class="mb-0 @ErrorRateTextClass">@ErrorRateDisplay</h3>
                <small class="text-muted">Audit error rate (last hour)</small>
            </div>
        </button>
    </div>

    @* ── Backlog tile ──────────────────────────────────────────────────────── *@
    <div class="col-lg-4 col-md-6 col-12">
        <button type="button"
                class="card h-100 w-100 text-start border-0 shadow-none p-0 audit-kpi-tile @BacklogBorderClass"
                data-test="audit-kpi-backlog"
                @onclick="NavigateToBacklog">
            <div class="card-body text-center">
                <h3 class="mb-0 @BacklogTextClass">@BacklogDisplay</h3>
                <small class="text-muted">Audit backlog (sites pending)</small>
            </div>
        </button>
    </div>
</div>

@if (!IsAvailable && !string.IsNullOrEmpty(ErrorMessage))
{
    <div class="text-muted small mb-3">Audit KPIs unavailable: @ErrorMessage</div>
}
@@ -0,0 +1,157 @@
using Microsoft.AspNetCore.Components;
using ScadaLink.Commons.Types;

namespace ScadaLink.CentralUI.Components.Health;

/// <summary>
/// Audit Log (#23) M7 Bundle E (T13) code-behind for <see cref="AuditKpiTiles"/>.
/// Renders three KPI tiles — volume, error rate, backlog — from a
/// <see cref="AuditLogKpiSnapshot"/> the parent page supplies. Tiles act as
/// drill-in links: clicking navigates to <c>/audit/log</c> with the relevant
/// query-string filter pre-applied (Bundle D already parses these params).
/// </summary>
/// <remarks>
/// <para>
/// <b>Why purely presentational.</b> The Health dashboard already owns a 10s
/// auto-refresh loop and an "as-of" timestamp display; pushing those concerns
/// into the tile component would either duplicate them (one timer per tile) or
/// awkwardly couple back to the page. The parent passes a fresh
/// <see cref="AuditLogKpiSnapshot"/> every refresh and the tile component
/// re-renders.
/// </para>
/// <para>
/// <b>Error rate division.</b> When <c>TotalEventsLastHour == 0</c> we render
/// "0.0%" rather than "—": the snapshot itself is available, the system just had
/// no audit traffic to evaluate. This avoids a divide-by-zero and keeps the
/// "0% errors" reading semantically true. The em dash is reserved for
/// <see cref="IsAvailable"/> = <c>false</c>, which represents a failed snapshot
/// query (a different signal from a "quiet hour").
/// </para>
/// </remarks>
public partial class AuditKpiTiles
{
    /// <summary>
    /// Latest KPI snapshot. <c>null</c> means the parent has not loaded it yet
    /// or the load failed; the tiles render em dashes in that case.
    /// </summary>
    [Parameter] public AuditLogKpiSnapshot? Snapshot { get; set; }

    /// <summary>
    /// True when <see cref="Snapshot"/> is a successful query result. False
    /// when the parent's refresh threw and the displayed values should be
    /// rendered as em dashes with an error explanation underneath.
    /// </summary>
    [Parameter] public bool IsAvailable { get; set; }

    /// <summary>
    /// Optional error message to render underneath the tiles when
    /// <see cref="IsAvailable"/> is false. Mirrors how the Notification Outbox
    /// section on the Health dashboard surfaces transient KPI failures.
    /// </summary>
    [Parameter] public string? ErrorMessage { get; set; }

    // ── Volume tile ─────────────────────────────────────────────────────────
    private string VolumeDisplay =>
        IsAvailable && Snapshot is not null
            ? Snapshot.TotalEventsLastHour.ToString("N0")
            : "—";

    private void NavigateToVolume()
    {
        // Volume is "all audit rows in the last hour", so there is no status
        // filter; the page's existing instance-search seam is enough for
        // drill-in. We rely on the page's default render which omits a
        // time-range constraint and shows the newest rows first.
        Navigation.NavigateTo("/audit/log");
    }
    // ── Error rate tile ─────────────────────────────────────────────────────
    /// <summary>
    /// Percentage of error rows (Failed/Parked/Discarded) over the trailing
    /// hour. Returns 0 when the snapshot is unavailable OR when total events
    /// is zero (rather than evaluating 0/0, which yields NaN for doubles).
    /// The display layer renders "—" for the unavailable case and "0.0%" for
    /// the zero-events case.
    /// </summary>
    internal double ErrorRatePercent
    {
        get
        {
            if (!IsAvailable || Snapshot is null || Snapshot.TotalEventsLastHour <= 0)
            {
                return 0;
            }

            return 100.0 * Snapshot.ErrorEventsLastHour / Snapshot.TotalEventsLastHour;
        }
    }

    private string ErrorRateDisplay
    {
        get
        {
            if (!IsAvailable || Snapshot is null)
            {
                return "—";
            }

            // Format to one decimal so a 1-error-in-2000 rate doesn't round to 0%.
            return $"{ErrorRatePercent:0.0}%";
        }
    }

    // Border + text colour bracket the tile visually: any nonzero error count
    // gets a warning border; a rate of 10% or above bumps it to danger. The
    // warning/danger escalation mirrors the Notification Outbox tile pattern
    // (border-warning when Stuck > 0, border-danger when Parked > 0).
    private string ErrorRateBorderClass =>
        !IsAvailable || Snapshot is null || Snapshot.ErrorEventsLastHour == 0
            ? string.Empty
            : (ErrorRatePercent >= 10 ? "border-danger" : "border-warning");

    private string ErrorRateTextClass =>
        !IsAvailable || Snapshot is null || Snapshot.ErrorEventsLastHour == 0
            ? string.Empty
            : (ErrorRatePercent >= 10 ? "text-danger" : "text-warning");

    private void NavigateToErrors()
    {
        // Drill in pre-filtered to Failed, the most common error class.
        // (The Audit Log page also accepts ?status=Parked / =Discarded for
        // operators who want to see those specifically; the tile picks Failed
        // as the primary surface since it's the only synchronous-failure
        // status. Parked + Discarded both still appear in the unfiltered grid.)
        Navigation.NavigateTo("/audit/log?status=Failed");
    }
    // ── Backlog tile ────────────────────────────────────────────────────────
    private string BacklogDisplay =>
        IsAvailable && Snapshot is not null
            ? Snapshot.BacklogTotal.ToString("N0")
            : "—";

    // Backlog above zero is itself a signal: sites should normally drain to
    // empty. We render warning when there's a backlog at all; a hard danger
    // threshold could be added later if ops want it but the on-call playbook
    // for "backlog > 0" is the same as "backlog > 1000": check why the site
    // isn't draining.
    private string BacklogBorderClass =>
        IsAvailable && Snapshot is not null && Snapshot.BacklogTotal > 0
            ? "border-warning"
            : string.Empty;

    private string BacklogTextClass =>
        IsAvailable && Snapshot is not null && Snapshot.BacklogTotal > 0
            ? "text-warning"
            : string.Empty;

    private void NavigateToBacklog()
    {
        // The audit-log page itself doesn't carry a per-site backlog grid;
        // the Health dashboard already shows that per-site card. The natural
        // drill-in for "the system has a backlog" is the unfiltered Audit Log
        // page sorted by newest, so an operator can see the most recent rows
        // and judge whether the queue is moving.
        Navigation.NavigateTo("/audit/log");
    }
}
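
The error-rate rules in the code-behind above can be exercised without Blazor. What follows is a minimal sketch, not the component itself: `KpiSnapshotSketch` and `TileMath` are hypothetical names introduced for illustration, and the `long` property types are an assumption, since the real `AuditLogKpiSnapshot` record is not shown in this diff. It reproduces the three states the tile distinguishes: unavailable, quiet hour, and a nonzero rate on either side of the 10% boundary.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical stand-in for AuditLogKpiSnapshot. Property names follow the
// component above; the long types are an assumption.
public sealed record KpiSnapshotSketch(
    long TotalEventsLastHour,
    long ErrorEventsLastHour,
    long BacklogTotal);

public static class TileMath
{
    // Returns 0 for a missing snapshot or a zero-event window, so the
    // division below never evaluates 0/0.
    public static double ErrorRatePercent(KpiSnapshotSketch? s) =>
        s is null || s.TotalEventsLastHour <= 0
            ? 0
            : 100.0 * s.ErrorEventsLastHour / s.TotalEventsLastHour;

    // Em dash (U+2014) when unavailable, otherwise one decimal place.
    // Invariant culture is used here only to keep the sketch's output stable.
    public static string ErrorRateDisplay(KpiSnapshotSketch? s, bool isAvailable) =>
        !isAvailable || s is null
            ? "\u2014"
            : ErrorRatePercent(s).ToString("0.0", CultureInfo.InvariantCulture) + "%";

    // No border without errors, warning below 10%, danger at 10% or above.
    public static string ErrorRateBorder(KpiSnapshotSketch? s, bool isAvailable) =>
        !isAvailable || s is null || s.ErrorEventsLastHour == 0
            ? string.Empty
            : (ErrorRatePercent(s) >= 10 ? "border-danger" : "border-warning");
}

public static class Program
{
    public static void Main()
    {
        var quiet = new KpiSnapshotSketch(0, 0, 0);
        var warn = new KpiSnapshotSketch(200, 10, 0);
        var danger = new KpiSnapshotSketch(200, 20, 3);

        Console.WriteLine(TileMath.ErrorRateDisplay(quiet, true));   // 0.0%
        Console.WriteLine(TileMath.ErrorRateDisplay(warn, true));    // 5.0%
        Console.WriteLine(TileMath.ErrorRateBorder(warn, true));     // border-warning
        Console.WriteLine(TileMath.ErrorRateBorder(danger, true));   // border-danger
        Console.WriteLine(TileMath.ErrorRateDisplay(danger, false)); // em dash
    }
}
```

Keeping the rate calculation separate from the display string is what lets the quiet-hour case read "0.0%" while a failed query still shows the em dash.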