diff --git a/docs/v2/multi-host-dispatch.md b/docs/v2/multi-host-dispatch.md
new file mode 100644
index 0000000..8e8081f
--- /dev/null
+++ b/docs/v2/multi-host-dispatch.md
@@ -0,0 +1,46 @@
+# Multi-host dispatch — per-PLC circuit breakers
+
+Phase 6.1 decision #144 / task #135. Motivation: a single DriverInstance that fronts N PLCs (Modbus with multiple slaves, AB CIP with multiple ControlLogix chassis, etc.) must not let one dead PLC trip the resilience breaker for its healthy siblings.
+
+This note documents the shipped contract so future driver authors don't re-derive it.
+
+## Contract
+
+The resilience pipeline keys on `(DriverInstanceId, HostName, DriverCapability)`. One dead PLC opens only the pipeline keyed on its HostName; healthy sibling PLCs keep their own pipelines intact.
+
+Three participants:
+
+1. **`DriverResiliencePipelineBuilder.GetOrCreate(driverInstanceId, hostName, capability, options)`** — the pipeline cache. First call per key builds a Polly pipeline (timeout → retry → breaker). Subsequent calls return the cached instance. Covered by `DriverResiliencePipelineBuilderTests.Pipeline_IsIsolated_PerHost`.
+
+2. **`CapabilityInvoker.ExecuteAsync(capability, hostName, callSite, ct)`** — takes `hostName` per-call. Threads it straight through to the pipeline builder. Covered by `CapabilityInvokerTests`.
+
+3. **`IPerCallHostResolver.ResolveHost(fullReference)`** — an optional interface a multi-device driver implements. `DriverNodeManager.ResolveHostFor` calls it on every capability dispatch so the host flowing into the invoker comes from the tag's per-PLC metadata, not the driver instance. Single-device drivers don't implement it — `DriverNodeManager` falls back to `DriverInstanceId` as the hostname, which still flows through the same `(instance, host, capability)` key shape (one pipeline per single-device instance).
+
+The end-to-end `dead PLC, healthy PLC` scenario is proven by `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`.
+
+## Driver author checklist
+
+To light up per-PLC circuit breakers on a multi-device driver:
+
+1. **Options model** — extend the driver's options type with an explicit device list. See `AbCipDriverOptions.Devices : IReadOnlyList`.
+2. **Tag → device mapping** — parse the tag's `DeviceId` from `TagConfig`. The driver's per-tag definition records the device HostAddress alongside the wire address. See `AbCipTagDefinition.DeviceHostAddress`.
+3. **`IPerCallHostResolver`** — implement it on the driver. `ResolveHost(fullReference)` looks up the tag's definition and returns the device HostAddress. Unknown references should return a deterministic fallback (e.g. the first configured device's host) rather than throw; the invoker handles the failed lookup at the capability level when the actual read surfaces `BadNodeIdUnknown`.
+4. **Health surface** — `IHostConnectivityProbe.GetHostStatuses()` returns one `HostConnectivityStatus` per configured device so the Admin UI fleet page shows each PLC's status separately.
+5. **Transport per device** — one network connection per PLC, serialized per device via `SemaphoreSlim` (or equivalent). Do not share a transport across PLCs; if they share a queue, a stalled PLC blocks calls to its healthy siblings and the breaker-isolation guarantee disappears.
+
+## Current fleet status (2026-04-24)
+
+| Driver | Per-tag device | `IPerCallHostResolver` | Per-PLC breaker isolation |
+|---|---|---|---|
+| AB CIP | ✅ `DeviceId` | ✅ | ✅ live |
+| AB Legacy | 1 device / instance | — (not needed) | trivial |
+| Modbus | 1 device / instance today | — | trivial — multi-device refactor tracked separately |
+| S7 | 1 device / instance today | — | trivial — same |
+| TwinCAT | 1 device / instance today | — | trivial — same |
+| FOCAS | 1 CNC / instance | — (not needed) | trivial |
+| Galaxy | 1 Galaxy Host / instance | — (not needed) | trivial — Host recycle runs per instance |
+| OPC UA Client | 1 upstream / instance | — (not needed) | trivial |
+
+"Trivial" above means the pipeline key ends up as `(DriverInstanceId, DriverInstanceId, capability)` via `DriverNodeManager.ResolveHostFor`'s fallback — one pipeline per driver instance, which is correct for single-device drivers.
+
+Extending Modbus / S7 / TwinCAT to multi-device follows the AB CIP template verbatim; it's per-driver surgery (schema row + options model + resolver implementation + transport fan-out) rather than shared-infrastructure work.
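
The key shape and the resolver fallback described in this note can be sketched in a few lines of C# using only the base class library. This is an illustration under stated assumptions, not the shipped code: `ToyBreaker` is a hypothetical stand-in for the Polly pipeline, and `ToyCapability`, `ToyPipelineCache`, `ToyMultiDeviceDriver`, and `Demo` are invented names. Only `IPerCallHostResolver.ResolveHost` and the `(instance, host, capability)` key come from the note itself.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for the real DriverCapability type.
public enum ToyCapability { Read, Write }

// Hypothetical stand-in for a Polly pipeline: opens after a fixed number of failures.
// Not thread-safe; it exists only to make isolation observable.
public sealed class ToyBreaker
{
    private readonly int _threshold;
    private int _failures;
    public ToyBreaker(int threshold) => _threshold = threshold;
    public bool IsOpen => _failures >= _threshold;
    public void RecordFailure() => _failures++;
}

// Interface name and method come from the note; everything else here is assumed.
public interface IPerCallHostResolver
{
    string ResolveHost(string fullReference);
}

// A multi-device driver that maps each tag reference to its PLC host.
public sealed class ToyMultiDeviceDriver : IPerCallHostResolver
{
    private readonly IReadOnlyDictionary<string, string> _tagToHost;
    private readonly string _fallbackHost;

    public ToyMultiDeviceDriver(IReadOnlyDictionary<string, string> tagToHost, string fallbackHost)
    {
        _tagToHost = tagToHost;
        _fallbackHost = fallbackHost;
    }

    // Unknown references return a deterministic fallback instead of throwing.
    public string ResolveHost(string fullReference) =>
        _tagToHost.TryGetValue(fullReference, out var host) ? host : _fallbackHost;
}

public sealed class ToyPipelineCache
{
    // One breaker per (instance, host, capability) key.
    private readonly ConcurrentDictionary<(string Instance, string Host, ToyCapability Cap), ToyBreaker> _cache = new();

    public ToyBreaker GetOrCreate(string driverInstanceId, string hostName, ToyCapability capability) =>
        _cache.GetOrAdd((driverInstanceId, hostName, capability), _ => new ToyBreaker(threshold: 3));

    // Drivers without a resolver fall back to the instance id as the host name.
    public static string ResolveHostFor(object driver, string driverInstanceId, string fullReference) =>
        driver is IPerCallHostResolver resolver ? resolver.ResolveHost(fullReference) : driverInstanceId;
}

public static class Demo
{
    // Fails three reads against one PLC and reports both breakers' states.
    public static (bool DeadOpen, bool HealthyOpen) Run()
    {
        var driver = new ToyMultiDeviceDriver(
            new Dictionary<string, string> { ["Line1.Speed"] = "plc-a", ["Line2.Speed"] = "plc-b" },
            fallbackHost: "plc-a");
        var cache = new ToyPipelineCache();

        var deadHost = ToyPipelineCache.ResolveHostFor(driver, "abcip-1", "Line1.Speed");
        var dead = cache.GetOrCreate("abcip-1", deadHost, ToyCapability.Read);
        for (var i = 0; i < 3; i++) dead.RecordFailure();

        var healthyHost = ToyPipelineCache.ResolveHostFor(driver, "abcip-1", "Line2.Speed");
        var healthy = cache.GetOrCreate("abcip-1", healthyHost, ToyCapability.Read);

        return (dead.IsOpen, healthy.IsOpen);
    }
}
```

`Demo.Run()` returns `(true, false)`: the three failures land on the breaker keyed on `plc-a`, while the tag on `plc-b` resolves to a different key and gets its own untouched breaker. Passing a driver that does not implement `IPerCallHostResolver` to `ResolveHostFor` yields the instance id, which gives the "trivial" one-pipeline-per-instance key from the fleet table.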