Phase 6.1 Stream E.3 partial - in-flight counter feeds CurrentBulkheadDepth #107

Merged
dohertj2 merged 1 commit from phase-6-1-stream-e3-inflight-counter into v2 2026-04-19 15:04:30 -04:00

CapabilityInvoker now records start/complete against the optional tracker; ResilienceStatusPublisherHostedService persists the counter as CurrentBulkheadDepth (was 0 before).

Summary

  • DriverResilienceStatusTracker gains RecordCallStart/Complete + CurrentInFlight snapshot field. Clamps to zero on over-decrement.
  • CapabilityInvoker gains an optional statusTracker ctor param and wraps every ExecuteAsync in try/finally, so the counter decrements on success, exception, or cancellation.
  • Publisher persists CurrentInFlight as CurrentBulkheadDepth. A future Polly-telemetry observer can replace the proxy without changing the column shape.
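
As a rough illustration of the clamp-on-over-decrement behaviour described above, here is a minimal sketch. It assumes the tracker is keyed by a host string and that the snapshot is an immutable record; `ResilienceSnapshot`, `InFlightTrackerSketch`, and `GetCurrentInFlight` are hypothetical names for this example, not the actual types in the PR.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical snapshot shape; the real record presumably carries more fields.
public sealed record ResilienceSnapshot(int CurrentInFlight);

public sealed class InFlightTrackerSketch
{
    private readonly ConcurrentDictionary<string, ResilienceSnapshot> _byHost = new();

    // Adds a snapshot with count 1, or increments the existing count.
    public void RecordCallStart(string host) =>
        _byHost.AddOrUpdate(
            host,
            _ => new ResilienceSnapshot(1),
            (_, s) => s with { CurrentInFlight = s.CurrentInFlight + 1 });

    // Decrements, but never below zero, so a stray Complete cannot go negative.
    public void RecordCallComplete(string host) =>
        _byHost.AddOrUpdate(
            host,
            _ => new ResilienceSnapshot(0),
            (_, s) => s with { CurrentInFlight = Math.Max(0, s.CurrentInFlight - 1) });

    // Returns 0 for hosts that have never been seen.
    public int GetCurrentInFlight(string host) =>
        _byHost.TryGetValue(host, out var s) ? s.CurrentInFlight : 0;
}
```

Because the update delegates are pure functions of the previous snapshot, `AddOrUpdate` can safely retry them under contention without losing an increment.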

Test plan

  • 8 new InFlightCounterTests: start/complete parity, nested starts, over-decrement clamp, per-host independence, 500-parallel concurrent safety, mid-call observation == 1, exception path decrements, null-tracker no-op.
  • Full solution dotnet test: 1243 passing (was 1235, +8).

🤖 Generated with Claude Code

dohertj2 added 1 commit 2026-04-19 15:04:28 -04:00
Closes the observer half of #162 that was flagged as "persisted as 0 today"
in PR #105. The Admin /hosts column refresh + FleetStatusHub SignalR push
+ red-badge visual still belong to the visual-compliance pass.

Core.Resilience:
- DriverResilienceStatusTracker gains RecordCallStart + RecordCallComplete
  + CurrentInFlight field on the snapshot record. Concurrent-safe via the
  same ConcurrentDictionary.AddOrUpdate pattern as the other recorder methods.
  Clamps to zero on over-decrement so a stray Complete-without-Start can't
  drive the counter negative.
- CapabilityInvoker gains an optional statusTracker ctor parameter. When
  wired, every ExecuteAsync / ExecuteAsync(void) wraps the pipeline call
  in try / finally that records start/complete — so the counter advances
  cleanly whether the call succeeds, cancels, or throws. Null tracker keeps
  the pre-Phase-6.1 Stream E.3 behaviour exactly.
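
A minimal sketch of the try / finally wrapping described above, under stated assumptions: `CountingTracker` and `InvokerSketch` are hypothetical stand-ins, the tracker here is a single counter without per-host keys or clamping, and the real invoker would run the call through its resilience pipeline rather than invoke the delegate directly.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in tracker: one counter, no per-host keys, no clamp.
public sealed class CountingTracker
{
    private int _inFlight;
    public int CurrentInFlight => Volatile.Read(ref _inFlight);
    public void RecordCallStart() => Interlocked.Increment(ref _inFlight);
    public void RecordCallComplete() => Interlocked.Decrement(ref _inFlight);
}

public sealed class InvokerSketch
{
    private readonly CountingTracker? _statusTracker;

    // A null tracker leaves the call path unchanged.
    public InvokerSketch(CountingTracker? statusTracker = null) =>
        _statusTracker = statusTracker;

    public async ValueTask<T> ExecuteAsync<T>(
        Func<CancellationToken, ValueTask<T>> call,
        CancellationToken ct = default)
    {
        _statusTracker?.RecordCallStart();
        try
        {
            // Assumption: the real code would execute the pipeline here.
            return await call(ct).ConfigureAwait(false);
        }
        finally
        {
            // Runs on success, exception, and cancellation alike.
            _statusTracker?.RecordCallComplete();
        }
    }
}
```

Placing `RecordCallStart` before the `try` keeps the pair balanced: if the start itself were to throw, no matching complete would be recorded.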

Server.Hosting:
- ResilienceStatusPublisherHostedService persists CurrentInFlight as the
  DriverInstanceResilienceStatus.CurrentBulkheadDepth column (was 0 before
  this PR). One-line fix on both the insert + update branches.
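
To make the "both branches" point concrete, here is a small sketch of an upsert that copies the in-flight count into the column on insert and on update. It uses an in-memory dictionary in place of the database; `StatusRow` and `PublisherSketch` are hypothetical, and only the column named in this PR is modelled.

```csharp
using System.Collections.Generic;

// Hypothetical row shape standing in for DriverInstanceResilienceStatus.
public sealed class StatusRow
{
    public string Host { get; init; } = "";
    public int CurrentBulkheadDepth { get; set; }
}

public static class PublisherSketch
{
    // Writes currentInFlight into CurrentBulkheadDepth for the given host.
    public static void Upsert(
        IDictionary<string, StatusRow> table, string host, int currentInFlight)
    {
        if (table.TryGetValue(host, out var existing))
        {
            // Update branch: overwrite the previously persisted depth.
            existing.CurrentBulkheadDepth = currentInFlight;
        }
        else
        {
            // Insert branch: the new row starts with the observed depth.
            table[host] = new StatusRow
            {
                Host = host,
                CurrentBulkheadDepth = currentInFlight,
            };
        }
    }
}
```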

The in-flight counter is a pragmatic proxy for Polly's internal bulkhead
depth — a future PR wiring Polly telemetry would replace it with the real
value. The shape of the column, the publisher, and the Admin /hosts query
don't change, so the follow-up is invisible to consumers.

Tests (8 new InFlightCounterTests, all pass):
- Start+Complete nets to zero.
- Nested starts sum; Complete decrements.
- Complete-without-Start clamps to zero.
- Different hosts track independently.
- Concurrent starts (500 parallel) don't lose count.
- CapabilityInvoker observed-mid-call depth == 1 during a pending call.
- CapabilityInvoker exception path still decrements (try/finally).
- CapabilityInvoker without tracker doesn't throw.
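
The 500-parallel check can be approximated as follows. This is a sketch of the idea only, not the test in the PR: `ConcurrentStartCheck` and the "host-a" key are hypothetical, and a bare `ConcurrentDictionary<string, int>` stands in for the tracker.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentStartCheck
{
    // Fires `calls` parallel increments at one key and returns the final count.
    public static int RunParallelStarts(int calls)
    {
        var counts = new ConcurrentDictionary<string, int>();
        Parallel.For(0, calls, i =>
            counts.AddOrUpdate("host-a", 1, (key, n) => n + 1));
        return counts["host-a"];
    }
}
```

A test would then assert that `RunParallelStarts(500)` returns 500, which holds because `AddOrUpdate` applies each increment atomically.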

Full solution dotnet test: 1243 passing (was 1235, +8). Pre-existing
Client.CLI Subscribe flake unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 merged commit 2172d49d2e into v2 2026-04-19 15:04:30 -04:00
dohertj2 referenced this issue from a commit 2026-04-30 08:21:24 -04:00
Harden v2 design against the four findings from the 2026-04-17 Codex adversarial review of the db schema and admin UI:

(1) DriverInstance.NamespaceId now enforces a same-cluster invariant in three layers (sp_ValidateDraft cross-table check using the new UX_Namespace_Generation_LogicalId_Cluster composite index, server-side namespace-selection API scoping that prevents bypass via crafted requests, and audit-log entries on cross-cluster attempts) so a draft for cluster A can no longer bind to cluster B's namespace and leak its URI into A's endpoint;

(2) the Namespace table moves from cluster-level to generation-versioned with append-only logical-ID identity and locked NamespaceUri/Kind across generations so admins can no longer disable a namespace that a published driver depends on outside the publish/diff/rollback flow, the cluster-create workflow opens an initial draft containing the default namespaces instead of writing namespace rows directly, and the Admin UI Namespaces tab becomes hybrid (read-only over published, click-to-edit opens draft) like the UNS Structure tab;

(3) ZTag/SAPID fleet-wide uniqueness moves from per-generation indexes (which silently allow rollback or re-enable to reintroduce duplicates) into a new ExternalIdReservation table that sits outside generation versioning, with sp_PublishGeneration reserving atomically via MERGE under transaction lock so a different EquipmentUuid attempting the same active value rolls the whole publish back, a FleetAdmin-only sp_ReleaseExternalIdReservation as the only path to free a value for reuse with audit trail, and a corresponding Release-reservation operator workflow in the Admin UI;

(4) Equipment.EquipmentId is now system-generated as 'EQ-' + first 12 hex chars of EquipmentUuid, never operator-supplied or editable, removed from the Equipment CSV import schema entirely (rows match by EquipmentUuid for updates or create new equipment with auto-generated identifiers when no UUID is supplied), with a new Merge-or-Rebind-equipment operator workflow handling the rare case where two UUIDs need to be reconciled. This closes the corruption path where typos and bulk-import renames were minting duplicate identities and breaking downstream UUID-keyed lineage.

New decisions #122-125 with explicit "supersedes" notes for the earlier #107 (cluster-level namespace) and #116 (operator-set EquipmentId) frames they revise.