Admin /hosts red-badge + resilience columns + Polly telemetry observer (#164) #132

Merged
dohertj2 merged 1 commit from admin-hosts-refresh-red-badge into v2 2026-04-19 21:38:12 -04:00

Closes task #164 (Phase 6.1 Stream E.3 remaining). DriverResiliencePipelineBuilder gains optional DriverResilienceStatusTracker + OnRetry/OnOpened/OnClosed wiring. HostStatusService extended with 4 resilience fields + LEFT JOIN + FailureFlagThreshold=3/IsFlagged predicate. Hosts.razor red banner + 3 columns + Flagged per-row badge. 14/14 resilience-builder tests passing (3 new). Admin 72/72. SignalR live push + browser visual review deferred as follow-up.
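
As a rough illustration of the LEFT JOIN defaults and the flag predicate described above, here is a minimal in-memory LINQ sketch. It is not the actual HostStatusService query (which presumably runs against the database). The record shapes, `HostStatusSketch`, `Join`, and the `>=` comparison inside `IsFlagged` are assumptions made for illustration; only the four resilience field names, the `(DriverInstanceId, HostName)` key, and `FailureFlagThreshold = 3` come from this PR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for illustration; not the real entity classes.
public sealed record HostRow(string DriverInstanceId, string HostName, string State);

public sealed record ResilienceRow(
    string DriverInstanceId,
    string HostName,
    int ConsecutiveFailures,
    DateTime? LastCircuitBreakerOpenUtc,
    int CurrentBulkheadDepth,
    DateTime? LastRecycleUtc);

public sealed record HostStatusRow(
    string DriverInstanceId,
    string HostName,
    string State,
    int ConsecutiveFailures,
    DateTime? LastCircuitBreakerOpenUtc,
    int CurrentBulkheadDepth,
    DateTime? LastRecycleUtc);

public static class HostStatusSketch
{
    public const int FailureFlagThreshold = 3;

    // Assumption: a host is flagged once its consecutive failures reach the threshold.
    public static bool IsFlagged(HostStatusRow row) =>
        row.ConsecutiveFailures >= FailureFlagThreshold;

    public static List<HostStatusRow> Join(
        IEnumerable<HostRow> hosts, IEnumerable<ResilienceRow> resilience) =>
        (from h in hosts
         join r in resilience
             on new { h.DriverInstanceId, h.HostName }
             equals new { r.DriverInstanceId, r.HostName } into matches
         // LEFT JOIN: match is null for hosts that have not been sampled yet.
         from match in matches.DefaultIfEmpty()
         select new HostStatusRow(
             h.DriverInstanceId, h.HostName, h.State,
             match?.ConsecutiveFailures ?? 0,      // missing row => zero failures
             match?.LastCircuitBreakerOpenUtc,     // missing row => breaker never opened
             match?.CurrentBulkheadDepth ?? 0,
             match?.LastRecycleUtc)).ToList();
}
```

A host without a resilience row comes back with zero failures and null timestamps, so it is never flagged, which matches the default the description calls correct.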

dohertj2 added 1 commit 2026-04-19 21:38:03 -04:00
Admin /hosts red-badge + resilience columns + Polly telemetry observer.

Closes task #164 (the remaining slice of Phase 6.1 Stream E.3 after the earlier publisher + hub PR). Three cooperating pieces wired together so the operator-facing /hosts table actually reflects the live Polly counters that the pipeline builder is producing.

DriverResiliencePipelineBuilder gains an optional DriverResilienceStatusTracker ctor param: when non-null, every built pipeline wires Polly's OnRetry/OnOpened/OnClosed strategy-options callbacks into the tracker. OnRetry → tracker.RecordFailure (so ConsecutiveFailures climbs per retry), OnOpened → tracker.RecordBreakerOpen (stamps LastCircuitBreakerOpenUtc), OnClosed → tracker.RecordSuccess (resets the failure counter once the target recovers). Absent tracker = silent, preserving the unit-test constructor path + any deployment that doesn't care about resilience observability. Cancellation stays excluded from the failure count via the existing ShouldHandle predicate.

HostStatusService.HostStatusRow extends with four new fields (ConsecutiveFailures, LastCircuitBreakerOpenUtc, CurrentBulkheadDepth, LastRecycleUtc) populated via a second LEFT JOIN onto DriverInstanceResilienceStatuses keyed on (DriverInstanceId, HostName). LEFT JOIN because brand-new hosts haven't been sampled yet; a missing row means zero failures + never-opened breaker, which is the correct default. New FailureFlagThreshold constant (=3, matches plan decision #143's conservative half-of-breaker convention) + IsFlagged predicate so the UI can pre-warn before the breaker actually trips.

Hosts.razor paints three new columns between State and Last-transition: Fail# (bold red when flagged), In-flight (bulkhead-depth proxy), Breaker-opened (relative age). Per-row "Flagged" red badge alongside State when IsFlagged is true. Above the first cluster table, a red alert banner summarises the flagged-host count when ≥1 host is flagged, so operators see the problem before scanning rows.

Three new tests in DriverResiliencePipelineBuilderTests: Tracker_RecordsFailure_OnEveryRetry verifies ConsecutiveFailures reaches RetryCount after a transient-forever operation, Tracker_StampsBreakerOpen_WhenBreakerTrips verifies LastBreakerOpenUtc is set after threshold failures on a Write pipeline, Tracker_IsolatesCounters_PerHost verifies one dead host does not leak failure counts into a healthy sibling.

Full suite: Core.Tests 14/14 resilience-builder tests passing (11 existing + 3 new), Admin.Tests 72/72 passing, Admin project builds 0 errors.

SignalR live push of status changes + browser visual review are deliberately left to a follow-up. This PR keeps the structural change minimal (polling refresh already exists in the page's 10s timer; SignalR would be a structural add that touches hub registration + client subscription).

d06cc01a48
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
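
For readers unfamiliar with the callback wiring this commit describes, the sketch below shows one way an optional tracker could be attached to Polly v8 strategy options (it needs the Polly.Core package). It is not the code from this PR: `TrackerSketch`, `StatusSnapshot`, `PipelineSketch.Build`, the retry and breaker settings, and the retry-outside-breaker ordering are all hypothetical. Only the OnRetry/OnOpened/OnClosed to RecordFailure/RecordBreakerOpen/RecordSuccess mapping, the silent null-tracker path, and the cancellation exclusion are taken from the commit message.

```csharp
using System;
using System.Collections.Concurrent;
using Polly;
using Polly.CircuitBreaker;
using Polly.Retry;

// Hypothetical stand-in for the tracker's per-host state.
public sealed record StatusSnapshot(int ConsecutiveFailures, DateTime? LastCircuitBreakerOpenUtc);

// Hypothetical stand-in for DriverResilienceStatusTracker, keyed per (driver, host).
public sealed class TrackerSketch
{
    private readonly ConcurrentDictionary<(string Driver, string Host), StatusSnapshot> _status = new();

    public void RecordFailure(string driver, string host) =>
        Update(driver, host, s => s with { ConsecutiveFailures = s.ConsecutiveFailures + 1 });

    public void RecordBreakerOpen(string driver, string host, DateTime utcNow) =>
        Update(driver, host, s => s with { LastCircuitBreakerOpenUtc = utcNow });

    public void RecordSuccess(string driver, string host) =>
        Update(driver, host, s => s with { ConsecutiveFailures = 0 });

    public StatusSnapshot Get(string driver, string host) =>
        _status.TryGetValue((driver, host), out var s) ? s : new StatusSnapshot(0, null);

    private void Update(string driver, string host, Func<StatusSnapshot, StatusSnapshot> change) =>
        _status.AddOrUpdate(
            (driver, host),
            _ => change(new StatusSnapshot(0, null)),
            (_, current) => change(current));
}

public static class PipelineSketch
{
    public static ResiliencePipeline Build(string driver, string host, TrackerSketch? tracker)
    {
        // Cancellation is not treated as a failure, so it never reaches the callbacks.
        var handled = new PredicateBuilder()
            .Handle<Exception>(ex => ex is not OperationCanceledException);

        // Illustrative settings only; the real values are not stated in this PR.
        var retry = new RetryStrategyOptions
        {
            ShouldHandle = handled,
            MaxRetryAttempts = 3,
            Delay = TimeSpan.Zero,
        };
        var breaker = new CircuitBreakerStrategyOptions
        {
            ShouldHandle = handled,
            FailureRatio = 1.0,
            MinimumThroughput = 6,
            SamplingDuration = TimeSpan.FromSeconds(30),
            BreakDuration = TimeSpan.FromSeconds(15),
        };

        // A null tracker leaves the callbacks unset, so the pipeline stays silent.
        if (tracker is not null)
        {
            retry.OnRetry = _ => { tracker.RecordFailure(driver, host); return default; };
            breaker.OnOpened = _ => { tracker.RecordBreakerOpen(driver, host, DateTime.UtcNow); return default; };
            breaker.OnClosed = _ => { tracker.RecordSuccess(driver, host); return default; };
        }

        // Assumed ordering: retry outermost, breaker innermost.
        return new ResiliencePipelineBuilder()
            .AddRetry(retry)
            .AddCircuitBreaker(breaker)
            .Build();
    }
}
```

Because each pipeline closes over its own `(driver, host)` pair, a failing host only increments its own counter, which is the isolation property the third new test checks.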
dohertj2 merged commit 49f6c9484e into v2 2026-04-19 21:38:12 -04:00

Reference: dohertj2/lmxopcua#132