Closes out Stream B per docs/v2/implementation/phase-6-1-resilience-and-observability.md.
Core.Abstractions:
- IDriverSupervisor — process-level supervisor contract a Tier C driver's
out-of-process topology provides (Galaxy Proxy/Supervisor implements this in
a follow-up Driver.Galaxy wiring PR). Concerns: DriverInstanceId + RecycleAsync.
Tier A/B drivers don't implement this; Stream B code asserts tier == C before
ever calling it.
Core.Stability:
- MemoryRecycle — companion to MemoryTracking. On HardBreach, invokes the
supervisor IFF tier == C AND a supervisor is wired. Tier A/B HardBreach logs
a promotion-to-Tier-C recommendation and returns false. Soft/None/Warming
never triggers a recycle at any tier.
- ScheduledRecycleScheduler — Tier C opt-in periodic recycler per decision #67.
Ctor throws for Tier A/B (structural guard — scheduled recycle on an
in-process driver would kill every OPC UA session and every co-hosted
driver). TickAsync(now) advances the schedule by one interval per fire;
RequestRecycleNowAsync drives an ad-hoc recycle without shifting the cron.
- WedgeDetector — demand-aware per decision #147. Classify(state, demand, now)
returns:
* NotApplicable when driver state != Healthy
* Idle when Healthy + no pending work (bulkhead=0 && monitored=0 && historic=0)
* Healthy when Healthy + pending work + progress within threshold
* Faulted when Healthy + pending work + no progress within threshold
Threshold clamps to min 60 s. DemandSignal.HasPendingWork ORs the three counters.
The three false-wedge cases the plan calls out all stay Healthy: idle
subscription-only, slow historian backfill making progress, write-only burst
with drained bulkhead.
Tests (22 new, all pass):
- MemoryRecycleTests (7): Tier C hard-breach requests recycle; Tier A/B
hard-breach never requests; Tier C without supervisor no-ops; soft-breach
at every tier never requests; None/Warming never request.
- ScheduledRecycleSchedulerTests (6): ctor throws for A/B; zero/negative
interval throws; tick before due no-ops; tick at/after due fires once and
advances; RequestRecycleNow fires immediately without shifting schedule;
multiple fires across ticks advance one interval each.
- WedgeDetectorTests (9): threshold clamp to 60 s; unhealthy driver always
NotApplicable; idle subscription stays Idle; pending+fresh progress stays
Healthy; pending+stale progress is Faulted; MonitoredItems active but no
publish is Faulted; MonitoredItems active with fresh publish stays Healthy;
historian backfill with fresh progress stays Healthy; write-only burst with
empty bulkhead is Idle; HasPendingWork theory for any non-zero counter.
Full solution dotnet test: 989 passing (baseline 906, +83 for Phase 6.1 so far).
Pre-existing Client.CLI Subscribe flake unchanged.
Stream B complete. Next up: Stream C (health endpoints + structured logging).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stream B.1 — registry invariant:
- DriverTypeMetadata gains a required `DriverTier Tier` field. Every registered
driver type must declare its stability tier so the downstream MemoryTracking,
MemoryRecycle, and resilience-policy layers can resolve the right defaults.
Stamped-at-registration-time enforcement makes the "every driver type has a
non-null Tier" compliance check structurally impossible to fail.
- DriverTypeRegistry API unchanged; one new property on the record.
Stream B.2 — MemoryTracking (Core.Stability):
- Tier-agnostic tracker per decision #146: captures baseline as the median of
samples collected during a post-init warmup window (default 5 min), then
classifies each subsequent sample with the hybrid formula
`soft = max(multiplier × baseline, baseline + floor)`, `hard = 2 × soft`.
- Per-tier constants wired: Tier A mult=3 floor=50 MB, Tier B mult=3 floor=100 MB,
Tier C mult=2 floor=500 MB.
- Never kills. Hard-breach action returns HardBreach; the supervisor that acts
on that signal (MemoryRecycle) is Tier C only per decisions #74, #145 and
lands in the next B.3 commit on this branch.
- Two phases: WarmingUp (samples collected, Warming returned) and Steady
(baseline captured, soft/hard checks active). Transition is automatic when
the warmup window elapses.
Tests (15 new, all pass):
- Warming phase returns Warming until the window elapses.
- Window-elapsed captures median baseline + transitions to Steady.
- Per-tier constants match decision #146 table exactly.
- Soft threshold uses max() — small baseline → floor wins; large baseline →
multiplier wins.
- Hard = 2 × soft.
- Sample below soft = None; at soft = SoftBreach; at/above hard = HardBreach.
- DriverTypeRegistry: theory asserts Tier round-trips for A/B/C.
Full solution dotnet test: 963 passing (baseline 906, +57 net for Phase 6.1
Stream A + Stream B.1/B.2). Pre-existing Client.CLI Subscribe flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>