docs(requirements): add Site Call Audit KPIs to Health Monitoring
@@ -64,11 +64,23 @@ The Notification Outbox is a **central** component, so its KPIs are **central-co
These are distinct from the site-reported **Store-and-forward buffer depth** notification metric, which now covers the **site→central leg** — notifications still buffered in a site's Store-and-Forward Engine awaiting forward to central — and remains part of the site health report.
## Site Call Audit KPIs
The Site Call Audit is a **central** component, so its KPIs — like the Notification Outbox's — are **central-computed** rather than collected from sites and carried in the site health report:
- The dashboard surfaces Site Call Audit **headline** KPI tiles alongside the existing Notification Outbox tiles.
- The Site Call Audit component computes these on demand from the central `SiteCalls` table, **global and per-source-site**; the health dashboard polls it for the headline tiles.
- The KPI set is **buffered count** (`Pending` + `Retrying`), **parked count** (`Parked`), **failed (last interval)**, **delivered (last interval)**, **oldest pending age**, and **stuck count** (`Pending` / `Retrying` rows older than the configurable stuck-age threshold).
- **Stuck** is `Pending` / `Retrying` rows older than a configurable threshold (default **10 minutes**) — **display-only** (KPI count plus a row badge), with no escalation or alerting, consistent with the Notification Outbox stuck metric.
- Site Call Audit KPIs are **point-in-time**, computed on demand from the `SiteCalls` table. There is no time-series store, consistent with Health Monitoring's "current status only" philosophy.

Unlike the Notification Outbox, the Site Call Audit is **not a dispatcher**: cached calls are delivered by each site's Store-and-Forward Engine, and the `SiteCalls` table is an eventually consistent central mirror of site-owned status.
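
The KPI set listed above can be sketched as a single pass over `SiteCalls` rows. This is a minimal illustration, not the component's actual implementation. The row fields (`source_site`, `status`, `created_at`, `updated_at`), the `Failed` and `Delivered` status names, and the 5-minute interval are assumptions made for the example. Only the `Pending`, `Retrying`, and `Parked` statuses and the 10-minute stuck default come from the requirements. The sketch also assumes that "oldest pending age" covers both `Pending` and `Retrying` rows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class SiteCallRow:
    """Hypothetical shape of one `SiteCalls` row (field names are assumed)."""
    source_site: str
    status: str           # "Pending", "Retrying", "Parked", "Failed", "Delivered"
    created_at: datetime  # when the cached call was first recorded
    updated_at: datetime  # when the status last changed


@dataclass(frozen=True)
class SiteCallKpis:
    buffered: int
    parked: int
    failed_last_interval: int
    delivered_last_interval: int
    oldest_pending_age: Optional[timedelta]
    stuck: int


def compute_kpis(
    rows: Iterable[SiteCallRow],
    now: datetime,
    interval: timedelta = timedelta(minutes=5),    # assumed interval length
    stuck_age: timedelta = timedelta(minutes=10),  # default from the requirements
    source_site: Optional[str] = None,             # None means global
) -> SiteCallKpis:
    """Compute point-in-time KPIs, globally or for one source site."""
    scoped = [r for r in rows if source_site is None or r.source_site == source_site]
    in_flight = [r for r in scoped if r.status in ("Pending", "Retrying")]
    window_start = now - interval

    def recent(status: str) -> int:
        # Rows that reached `status` within the last interval.
        return sum(1 for r in scoped if r.status == status and r.updated_at >= window_start)

    ages = [now - r.created_at for r in in_flight]
    return SiteCallKpis(
        buffered=len(in_flight),
        parked=sum(1 for r in scoped if r.status == "Parked"),
        failed_last_interval=recent("Failed"),
        delivered_last_interval=recent("Delivered"),
        oldest_pending_age=max(ages, default=None),
        stuck=sum(1 for age in ages if age > stuck_age),
    )
```

Calling `compute_kpis(rows, now)` yields the global figures, and passing `source_site="..."` narrows them to one site. Nothing is cached between calls, which matches the point-in-time requirement.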
## Central Storage
- Health metrics are held **in memory** at the central cluster for display in the UI.
- No historical health data is persisted — the dashboard shows current/latest status only.
- Notification Outbox KPIs are not stored by Health Monitoring; they are computed point-in-time from the central `Notifications` table each time the dashboard refreshes — consistent with the current-status-only philosophy.
- Notification Outbox and Site Call Audit KPIs are not stored by Health Monitoring; they are computed point-in-time from the central `Notifications` and `SiteCalls` tables respectively each time the dashboard refreshes — consistent with the current-status-only philosophy.
- Site connectivity history (online/offline transitions) may optionally be logged via the Audit Log or a separate mechanism if needed in the future.
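
The "current status only" storage model described above can be illustrated with a small in-memory store that keeps one entry per site. This is a sketch under assumptions: `SiteHealthReport`, `LatestHealthStore`, and their fields are hypothetical names, and the rule that an older report never overwrites a newer one is an assumption not stated in the requirements.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass(frozen=True)
class SiteHealthReport:
    """Hypothetical site health report; only the latest one per site is kept."""
    site_id: str
    received_at: datetime
    metrics: Dict[str, float]


class LatestHealthStore:
    """In-memory store holding the most recent report per site, with no history."""

    def __init__(self) -> None:
        self._latest: Dict[str, SiteHealthReport] = {}

    def record(self, report: SiteHealthReport) -> None:
        current = self._latest.get(report.site_id)
        # Assumption: an out-of-order (older) report is ignored.
        if current is None or report.received_at >= current.received_at:
            self._latest[report.site_id] = report

    def latest(self, site_id: str) -> Optional[SiteHealthReport]:
        return self._latest.get(site_id)

    def snapshot(self) -> Dict[str, SiteHealthReport]:
        # Copy so the dashboard cannot mutate the store's internal state.
        return dict(self._latest)
```

Because each new report replaces the previous one, a restart of the central cluster simply starts with an empty store until sites report again.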
## No Alerting
@@ -84,6 +96,7 @@ These are distinct from the site-reported **Store-and-forward buffer depth** not
- **Store-and-Forward Engine (site)**: Provides buffer depth metrics, including the notification backlog awaiting forward to central.
- **Cluster Infrastructure (site)**: Provides node role status.
- **Notification Outbox (central)**: Provides central-computed outbox KPIs — queue depth, stuck count, parked count — for the headline dashboard tiles.
- **Site Call Audit (central)**: Provides central-computed cached-call KPIs — buffered count, parked count, failed/delivered (last interval), oldest pending age, stuck count — for the headline dashboard tiles.
## Interactions