docs(audit): apply cross-bundle review fixes before merge

Final cross-bundle reviewer identified 7 inconsistencies that the per-bundle
reviewers couldn't see; all fixed in one logical commit.

Critical:
- HighLevelReqs AL-3: drop 'then upsert-on-newer-status'; AuditLog is
  strictly append-only (upsert is correct for SiteCalls/Notifications,
  wrong for the immutable AuditLog shadow).
- Component-AuditLog Error rate KPI: align with HealthMonitoring's
  exclusion list (Success/Delivered/Enqueued) rather than just non-Success;
  otherwise every Delivered notification or Enqueued cached call would be
  counted as an error.

Important:
- Component-AuditLog line 154: ISiteAuditWriter -> IAuditWriter (canonical
  name per Commons and the rest of this doc).
- Component-AuditLog Central direct-write paragraph: convert remaining
  slash notation (ApiInbound/Completed, Notification/Attempt,
  Notification/Terminal) to dot notation used everywhere else.
- Component-ClusterInfrastructure: scope SiteCallAuditActor to
  reconciliation + KPIs + Retry/Discard relay; cached-telemetry ingest is
  AuditLogIngestActor's role per Combined Telemetry contract.
- Component-CentralUI Audit Log page: state the OperationalAudit read
  permission and the read-vs-export split (matching CLI doc).
- Component-NotificationOutbox: add never-fail-the-action invariant for
  dispatcher audit writes.

Minor:
- Component-InboundAPI: 'Non-blocking semantics' was ambiguous (could be
  read as async); reword to 'Fail-soft' — the write is still synchronous
  before flush, but failures are caught and don't change the response.
- Component-CLI: realign audit-query/audit-export flags with the Central
  UI Audit Log filter set (channel, kind, status, site, instance, target,
  actor, correlation-id, errors-only); drop --user and --entity-id, which
  are IAuditService concepts, not Audit Log columns.
- Component-AuditLog KPI tile names: 'Volume/Error rate/Backlog' ->
  'Audit volume/Audit error rate/Audit backlog' (matches Central UI and
  Health Monitoring); drop the two orphan KPIs (Top inbound callers, Top
  outbound 5xx) that were never surfaced anywhere.
- Component-AuditLog Interactions: re-attribute DbOutbound emissions to
  ESG (where Database.* lives) with a note that Site Runtime is the API
  surface for scripts.
- HighLevelReqs AL-12: drop 'and reconciliation operations' (CLI has no
  reconcile command; reconciliation is an internal self-healing pull).
  Add note that verify-chain becomes operational once AL-11's hash chain
  ships.

author Joseph Doherty
date 2026-05-20 09:00:11 -04:00
parent 34ea97bae9
commit c929562e41
7 changed files with 20 additions and 21 deletions

@@ -151,7 +151,7 @@ writers — all idempotent on `EventId`.
 The component completing a script-trust-boundary action (External System
 Gateway, Database layer, Store-and-Forward Engine) builds an `AuditEvent` with a
 fresh `EventId` (Guid v4) and `OccurredAtUtc = UtcNow`, then appends it to the
-site-local `AuditLog` SQLite via `ISiteAuditWriter` with
+site-local `AuditLog` SQLite via `IAuditWriter` with
 `ForwardState = 'Pending'`. The append is a single-statement INSERT and is
 durable in microseconds; control returns to the script with no central
 round-trip on the hot path.
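
The site-local append described in the hunk above can be sketched as follows. This is a minimal Python illustration, not the project's implementation: the table layout, the `Kind` column, the helper names `open_site_audit_db` and `append_audit_event`, and the use of SQLite's `INSERT OR IGNORE` to obtain idempotency on `EventId` are all assumptions made for the example.

```python
import sqlite3
import uuid
from datetime import datetime, timezone


def open_site_audit_db(path=":memory:"):
    """Open (or create) a hypothetical site-local AuditLog store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS AuditLog (
            EventId       TEXT PRIMARY KEY,  -- idempotency key
            OccurredAtUtc TEXT NOT NULL,
            Kind          TEXT NOT NULL,     -- assumed column, e.g. 'ApiOutbound.SyncCall'
            Status        TEXT NOT NULL,
            ForwardState  TEXT NOT NULL
        )
        """
    )
    return conn


def append_audit_event(conn, kind, status, event_id=None):
    """Append one event with ForwardState = 'Pending'.

    Returns (event_id, inserted). A replay of an existing EventId is
    ignored, so the call is idempotent on EventId.
    """
    event_id = event_id or str(uuid.uuid4())  # fresh Guid v4 by default
    occurred_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO AuditLog "
            "(EventId, OccurredAtUtc, Kind, Status, ForwardState) "
            "VALUES (?, ?, ?, ?, 'Pending')",
            (event_id, occurred_at, kind, status),
        )
    return event_id, cur.rowcount == 1
```

Replaying the same `EventId` leaves a single row in the table, which is the property the idempotent writers depend on.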
@@ -178,10 +178,10 @@ pattern as Site Call Audit's reconciliation of `SiteCalls`.
 ### Central direct-write (central-originated events)
 Events originating at central never touch site SQLite. Inbound API writes one
-`ApiInbound`/`Completed` row via `ICentralAuditWriter` synchronously inside the
+`ApiInbound.Completed` row via `ICentralAuditWriter` synchronously inside the
 request-handler middleware, before the HTTP response is flushed. The
-Notification Outbox dispatcher writes `Notification`/`Attempt` per delivery
-attempt and `Notification`/`Terminal` on terminal status. Central direct-writes
+Notification Outbox dispatcher writes `Notification.Attempt` per delivery
+attempt and `Notification.Terminal` on terminal status. Central direct-writes
 use the same insert-if-not-exists semantics keyed on `EventId`.
 ## Cached Operations — Combined Telemetry
@@ -291,11 +291,9 @@ MS SQL for direct-write events). Unredacted secrets never persist.
 Point-in-time, computed from the central `AuditLog` table; global and per-site.
-- **Volume** — events/min.
-- **Error rate** — % non-`Success` rows, rolling 5 min.
-- **Backlog** — sum of `Pending` site rows across sites.
-- **Top inbound callers** — top-10 `Actor` by request count, last 1 h.
-- **Top outbound 5xx** — top-10 `Target` by 5xx-status count, last 1 h.
+- **Audit volume** — events/min landing in the central `AuditLog`; global plus per-site sparkline.
+- **Audit error rate** — % of central `AuditLog` rows with `Status` NOT IN (`Success`, `Delivered`, `Enqueued`) over a rolling 5-minute window. This is the operational error rate of audited operations (HTTP 5xx, transient failures, parked deliveries) — NOT audit-writer health, which surfaces separately via `CentralAuditWriteFailures` and `AuditRedactionFailure`.
+- **Audit backlog** — sum of `Pending` site rows across sites; click drills into a per-site breakdown.
 [Notification Outbox](Component-NotificationOutbox.md) and
 [Site Call Audit](Component-SiteCallAudit.md) KPIs are unaffected — they remain
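
The effect of the revised error-rate definition in the hunk above can be illustrated with a small Python sketch. It assumes the caller has already selected the `Status` values of the rows inside the rolling 5-minute window. The function names and the `Failed` status value are hypothetical and used only for the example.

```python
# Statuses that the revised KPI does not count as errors (from the diff above).
NON_ERROR_STATUSES = frozenset({"Success", "Delivered", "Enqueued"})


def audit_error_rate(statuses):
    """Percent of rows whose Status is outside the exclusion list."""
    rows = list(statuses)
    if not rows:
        return 0.0  # assumption: an empty window reports 0 %
    errors = sum(1 for s in rows if s not in NON_ERROR_STATUSES)
    return 100.0 * errors / len(rows)


def legacy_error_rate(statuses):
    """Previous definition: every non-Success row counts as an error."""
    rows = list(statuses)
    if not rows:
        return 0.0
    return 100.0 * sum(1 for s in rows if s != "Success") / len(rows)


window = ["Success", "Delivered", "Enqueued", "Failed"]  # 'Failed' is hypothetical
print(audit_error_rate(window))   # 25.0
print(legacy_error_rate(window))  # 75.0
```

With the old definition, the `Delivered` and `Enqueued` rows inflate the rate from 25 % to 75 %, which is the miscount the commit message describes.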
@@ -355,9 +353,7 @@ global value in v1; per-channel overrides are deferred to v1.x.
 emits `ApiOutbound.SyncCall` rows on every sync `Call()`. For `CachedCall`,
 emits the combined cached telemetry packet (audit row + operational update)
 per Cached Operations — Combined Telemetry.
-- **[Site Runtime (#3)](Component-SiteRuntime.md) — Database layer** — emits
-`DbOutbound.SyncWrite`, `DbOutbound.SyncRead`, and the cached-write variants
-via the same combined-telemetry path.
+- **[External System Gateway (#7)](Component-ExternalSystemGateway.md) — Database layer** — the database access modes inside ESG emit `DbOutbound.SyncWrite` and `DbOutbound.SyncRead` on script-initiated `Connection()` calls; `Database.CachedWrite` emits the cached-write lifecycle rows via the combined-telemetry packet (same path as `ApiOutbound.Cached*`). Site Runtime is the API surface that exposes the `Database.*` calls to scripts; the audit emission itself lives in ESG.
 - **[Inbound API (#14)](Component-InboundAPI.md)** — emits one
 `ApiInbound.Completed` row per request from request-handler middleware,
 written directly to central via `ICentralAuditWriter` before the response is