Commit Graph

3 Commits

Author SHA1 Message Date
Joseph Doherty
ae8f226e45 Phase 6.1 Stream E.3 partial — in-flight counter feeds CurrentBulkheadDepth
Closes the observer half of #162 that was flagged as "persisted as 0 today"
in PR #105. The Admin /hosts column refresh + FleetStatusHub SignalR push
+ red-badge visual still belong to the visual-compliance pass.

Core.Resilience:
- DriverResilienceStatusTracker gains RecordCallStart + RecordCallComplete
  + CurrentInFlight field on the snapshot record. Concurrent-safe via the
  same ConcurrentDictionary.AddOrUpdate pattern as the other recorder methods.
  Clamps to zero on over-decrement so a stray Complete-without-Start can't
  drive the counter negative.
- CapabilityInvoker gains an optional statusTracker ctor parameter. When
  wired, every ExecuteAsync / ExecuteAsync(void) wraps the pipeline call
  in try / finally that records start/complete — so the counter advances
  cleanly whether the call succeeds, cancels, or throws. Null tracker keeps
  the pre-Phase-6.1 Stream E.3 behaviour exactly.

Server.Hosting:
- ResilienceStatusPublisherHostedService persists CurrentInFlight as the
  DriverInstanceResilienceStatus.CurrentBulkheadDepth column (was 0 before
  this PR). One-line fix on both the insert + update branches.

The in-flight counter is a pragmatic proxy for Polly's internal bulkhead
depth — a future PR wiring Polly telemetry would replace it with the real
value. The shape of the column + the publisher + the Admin /hosts query
doesn't change, so the follow-up is invisible to consumers.

Tests (8 new InFlightCounterTests, all pass):
- Start+Complete nets to zero.
- Nested starts sum; Complete decrements.
- Complete-without-Start clamps to zero.
- Different hosts track independently.
- Concurrent starts (500 parallel) don't lose count.
- CapabilityInvoker observed-mid-call depth == 1 during a pending call.
- CapabilityInvoker exception path still decrements (try/finally).
- CapabilityInvoker without tracker doesn't throw.

Full solution dotnet test: 1243 passing (was 1235, +8). Pre-existing
Client.CLI Subscribe flake unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:02:34 -04:00
Joseph Doherty
016122841b Phase 6.1 Stream E.2 partial — ResilienceStatusPublisherHostedService persists tracker snapshots to DB
Closes the HostedService half of Phase 6.1 Stream E.2 flagged as a follow-up
when the DriverResilienceStatusTracker shipped in PR #82. The Admin /hosts
column refresh + SignalR push + red-badge visual (Stream E.3) remain
deferred to the visual-compliance pass — this PR owns the persistence
story alone.

Server.Hosting:
- ResilienceStatusPublisherHostedService : BackgroundService. Samples the
  DriverResilienceStatusTracker every TickInterval (default 5 s) and upserts
  each (DriverInstanceId, HostName) counter pair into
  DriverInstanceResilienceStatus via EF. New rows on first sight; in-place
  updates on subsequent ticks.
- PersistOnceAsync extracted public so tests drive one tick directly —
  matches the ScheduledRecycleHostedService pattern for deterministic
  timing.
- Best-effort persistence: a DB outage logs a warning + continues; the next
  tick retries. Never crashes the app on sample failure. Cancellation
  propagates through cleanly.
- Tracks the bulkhead depth / recycle / footprint columns the entity was
  designed for. CurrentBulkheadDepth currently persisted as 0 — the tracker
  doesn't yet expose live bulkhead depth; a narrower follow-up wires the
  Polly bulkhead-depth observer into the tracker.

Tests (6 new in ResilienceStatusPublisherHostedServiceTests):
- Empty tracker → tick is a no-op, zero rows written.
- Single-host counters → upsert a new row with ConsecutiveFailures + breaker
  timestamp + sampled timestamp.
- Second tick updates the existing row in place (not a second insert).
- Multi-host pairs persist independently.
- Footprint counters (Baseline + Current) round-trip.
- TickCount advances on every PersistOnceAsync call.

Full solution dotnet test: 1225 passing (was 1219, +6). Pre-existing
Client.CLI Subscribe flake unchanged.

Production wiring (Program.cs) example:
  builder.Services.AddSingleton<DriverResilienceStatusTracker>();
  builder.Services.AddHostedService<ResilienceStatusPublisherHostedService>();
  // Tracker gets wired into CapabilityInvoker via OtOpcUaServer resolution
  // + the existing Phase 6.1 layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:36:00 -04:00
Joseph Doherty
c4a92f424a Phase 6.1 Stream B.4 follow-up — ScheduledRecycleHostedService drives registered schedulers on a fixed tick
Turns the Phase 6.1 Stream B.4 pure-logic ScheduledRecycleScheduler (shipped
in PR #79) into a running background feature. A Tier C driver registers its
scheduler at startup; the hosted service ticks every TickInterval (default
1 min) and invokes TickAsync on each registered scheduler.

Server.Hosting:
- ScheduledRecycleHostedService : BackgroundService. AddScheduler(s) must be
  called before StartAsync — registering post-start throws
  InvalidOperationException to avoid "some ticks saw my scheduler, some
  didn't" races. ExecuteAsync loops on Task.Delay(TickInterval, _timeProvider,
  stoppingToken) + delegates to a public TickOnceAsync method for one tick.
- TickOnceAsync extracted as the unit-of-work so tests drive it directly
  without needing to synchronize with FakeTimeProvider + BackgroundService
  timing semantics.
- Exception isolation: if one scheduler throws, the loop logs + continues
  to the next scheduler. A flaky supervisor can't take down the tick for
  every other Tier C driver.
- Diagnostics: TickCount + SchedulerCount properties for tests + logs.

Tests (7 new ScheduledRecycleHostedServiceTests, all pass):
- TickOnce before interval doesn't fire; TickCount still advances.
- TickOnce at/after interval fires the underlying scheduler exactly once.
- Multiple ticks accumulate count.
- AddScheduler after StartAsync throws.
- Throwing scheduler doesn't poison its neighbours (logs + continues).
- SchedulerCount matches registrations.
- Empty scheduler list ticks cleanly (no-op + counter advances).

Full solution dotnet test: 1193 passing (was 1186, +7). Pre-existing
Client.CLI Subscribe flake unchanged.

Production wiring (Program.cs):
  builder.Services.AddSingleton<ScheduledRecycleHostedService>();
  builder.Services.AddHostedService(sp => sp.GetRequiredService<ScheduledRecycleHostedService>());
  // During DI configuration, once Tier C drivers + their ScheduledRecycleSchedulers
  // are resolved, call host.AddScheduler(scheduler) for each.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:42:08 -04:00