feat(health): SiteAuditWriteFailures counter + AuditLog bridge (#23)

Bundle G of Audit Log #23 M2. Bridges the FallbackAuditWriter primary-
failure counter into the Site Health Monitoring report payload so a
sustained audit-write outage surfaces on /monitoring/health instead of
disappearing into a NoOp sink.

- SiteHealthReport: add SiteAuditWriteFailures (defaulted, additive).
- ISiteHealthCollector + SiteHealthCollector: new
  IncrementSiteAuditWriteFailures() counter, per-interval reset
  semantics matching ScriptErrorCount / DeadLetterCount.
- HealthMetricsAuditWriteFailureCounter: adapter forwarding
  IAuditWriteFailureCounter.Increment() to the collector.
- AddAuditLogHealthMetricsBridge(): swaps the NoOp default
  registration for the real bridge; called from
  SiteServiceRegistration after AddSiteHealthMonitoring + AddAuditLog.
- Existing host-wiring test updated: site composition now resolves
  HealthMetricsAuditWriteFailureCounter (not NoOp).

Tests: HealthMonitoring 60 -> 63 (3 new), AuditLog 56 -> 59 (3 new),
full solution green.
This commit is contained in:
Joseph Doherty
2026-05-20 13:22:25 -04:00
parent 82a8bbf225
commit dd3351da93
11 changed files with 261 additions and 4 deletions

View File

@@ -20,7 +20,12 @@ public record SiteHealthReport(
IReadOnlyDictionary<string, string>? DataConnectionEndpoints = null,
IReadOnlyDictionary<string, TagQualityCounts>? DataConnectionTagQuality = null,
int ParkedMessageCount = 0,
IReadOnlyList<NodeStatus>? ClusterNodes = null);
IReadOnlyList<NodeStatus>? ClusterNodes = null,
// Audit Log (#23) M2 Bundle G: per-interval count of FallbackAuditWriter
// primary failures (SQLite throws routed to the drop-oldest ring). Surfaces
// a sustained audit-write outage on /monitoring/health. Defaults to 0 so
// existing producers / tests that don't construct the field stay valid.
int SiteAuditWriteFailures = 0);
/// <summary>
/// Broadcast wrapper used between central nodes to keep per-node