From e5d1c9c9b98ca5618ebb464816d0608265695e3f Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Fri, 24 Apr 2026 19:01:47 -0400
Subject: [PATCH] =?UTF-8?q?Phase=206.1=20multi-host=20dispatch=20=E2=80=94?=
 =?UTF-8?q?=20document=20shipped=20contract=20+=20per-driver=20status?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Task #127 / decision #144. The resilience infrastructure for per-PLC circuit breakers is shipped and fully tested — the task description's "current pipeline keys on DriverInstanceId only" was stale. The actual state:

- `DriverResiliencePipelineBuilder` keys on `(DriverInstanceId, HostName, DriverCapability)`.
- `CapabilityInvoker.ExecuteAsync` takes `hostName` per call.
- `IPerCallHostResolver` is the driver-side hook; AB CIP implements it.
- `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver` proves the end-to-end isolation.

Remaining work is per-driver adoption, not shared infrastructure:

- AB CIP: live + tested
- Galaxy / FOCAS / OPC UA Client / AB Legacy: 1 device per instance by design, trivially isolated
- Modbus / S7 / TwinCAT: single-device today; multi-device refactor is per-driver surgery (Device row + options + resolver + transport fan-out), not a shared-infra change

Shipping docs/v2/multi-host-dispatch.md as the canonical reference: contract + driver-author checklist + current fleet-wide status table. Future driver authors follow the AB CIP template.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/v2/multi-host-dispatch.md | 46 ++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 docs/v2/multi-host-dispatch.md

diff --git a/docs/v2/multi-host-dispatch.md b/docs/v2/multi-host-dispatch.md
new file mode 100644
index 0000000..8e8081f
--- /dev/null
+++ b/docs/v2/multi-host-dispatch.md
@@ -0,0 +1,46 @@
+# Multi-host dispatch — per-PLC circuit breakers
+
+Phase 6.1 decision #144 / task #135. Motivation: a single DriverInstance that fronts N PLCs (Modbus with multiple slaves, AB CIP with multiple ControlLogix chassis, etc.) must not let one dead PLC trip the resilience breaker for its healthy siblings.
+
+This note documents the shipped contract so future driver authors don't re-derive it.
+
+## Contract
+
+The resilience pipeline keys on `(DriverInstanceId, HostName, DriverCapability)`. One dead PLC opens only the pipeline keyed on its HostName; healthy sibling PLCs keep their own pipelines intact.
+
+Three participants:
+
+1. **`DriverResiliencePipelineBuilder.GetOrCreate(driverInstanceId, hostName, capability, options)`** — the pipeline cache. First call per key builds a Polly pipeline (timeout → retry → breaker). Subsequent calls return the cached instance. Covered by `DriverResiliencePipelineBuilderTests.Pipeline_IsIsolated_PerHost`.
+
+2. **`CapabilityInvoker.ExecuteAsync(capability, hostName, callSite, ct)`** — takes `hostName` per-call. Threads it straight through to the pipeline builder. Covered by `CapabilityInvokerTests`.
+
+3. **`IPerCallHostResolver.ResolveHost(fullReference)`** — an optional interface a multi-device driver implements. `DriverNodeManager.ResolveHostFor` calls it on every capability dispatch so the host flowing into the invoker comes from the tag's per-PLC metadata, not the driver instance. Single-device drivers don't implement it — `DriverNodeManager` falls back to `DriverInstanceId` as the hostname, which still flows through the same `(instance, host, capability)` key shape (one pipeline per single-device instance).
+
+End-to-end `dead PLC, healthy PLC` scenario proven by `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`.
+
+## Driver author checklist
+
+To light up per-PLC circuit breakers on a multi-device driver:
+
+1. **Options model** — extend the driver's options type with an explicit device list. See `AbCipDriverOptions.Devices : IReadOnlyList`.
+2. **Tag → device mapping** — parse the tag's `DeviceId` from `TagConfig`. The driver's per-tag definition records the device HostAddress alongside the wire address. See `AbCipTagDefinition.DeviceHostAddress`.
+3. **`IPerCallHostResolver`** — implement it on the driver. `ResolveHost(fullReference)` looks up the tag's definition and returns the device HostAddress. Unknown references should return a deterministic fallback (e.g. the first configured device's host) rather than throw — the invoker handles the mislookup at capability level when the actual read surfaces `BadNodeIdUnknown`.
+4. **Health surface** — `IHostConnectivityProbe.GetHostStatuses()` returns one `HostConnectivityStatus` per configured device so the Admin UI fleet page lights the per-PLC status distinctly.
+5. **Transport per device** — one network connection per PLC, serialized per device via `SemaphoreSlim` (or equivalent). Do not share a transport across PLCs; the breaker-isolation guarantee disappears if they share a queue.
+
+## Current fleet status (2026-04-24)
+
+| Driver | Per-tag device | `IPerCallHostResolver` | Per-PLC breaker isolation |
+|---|---|---|---|
+| AB CIP | ✅ `DeviceId` | ✅ | ✅ live |
+| AB Legacy | 1 device / instance | — (not needed) | trivial |
+| Modbus | 1 device / instance today | — | trivial — multi-device refactor tracked separately |
+| S7 | 1 device / instance today | — | trivial — same |
+| TwinCAT | 1 device / instance today | — | trivial — same |
+| FOCAS | 1 CNC / instance | — (not needed) | trivial |
+| Galaxy | 1 Galaxy Host / instance | — (not needed) | trivial — Host recycle runs per instance |
+| OPC UA Client | 1 upstream / instance | — (not needed) | trivial |
+
+"Trivial" above means the pipeline key ends up as `(DriverInstanceId, DriverInstanceId, capability)` via `DriverNodeManager.ResolveHostFor`'s fallback — one pipeline per driver instance, which is correct for single-device drivers.
+
+Extending Modbus / S7 / TwinCAT to multi-device follows the AB CIP template verbatim; it's per-driver surgery (schema row + options model + resolver implementation + transport fan-out) rather than shared-infrastructure work.
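
The key shape and the resolver fallback described in the doc can be sketched in a few lines. This is a language-neutral illustration in Python, not the project's C# code. Every name here (`Pipeline`, `PipelineCache`, `FirstDeviceFallbackResolver`, `resolve_host_for`, `read_tag`) is hypothetical, and the toy pipeline models only a consecutive-failure breaker, whereas the doc says the shipped pipeline is a Polly timeout → retry → breaker chain.

```python
from typing import Callable, Dict, List, Optional, Tuple


class BreakerOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class Pipeline:
    """Toy stand-in for a resilience pipeline: consecutive-failure breaker only."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self.failures = 0
        self.is_open = False

    def execute(self, call: Callable[[], object]) -> object:
        if self.is_open:
            raise BreakerOpenError("breaker is open")
        try:
            result = call()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.is_open = True
            raise
        self.failures = 0  # a success resets the consecutive-failure count
        return result


class PipelineCache:
    """Caches one pipeline per (driver instance, host, capability) key."""

    def __init__(self) -> None:
        self._pipelines: Dict[Tuple[str, str, str], Pipeline] = {}

    def get_or_create(self, driver_instance_id: str, host_name: str,
                      capability: str) -> Pipeline:
        key = (driver_instance_id, host_name, capability)
        if key not in self._pipelines:
            self._pipelines[key] = Pipeline()
        return self._pipelines[key]


class FirstDeviceFallbackResolver:
    """Maps a tag reference to its device host; unknown tags fall back
    deterministically to the first configured device instead of raising."""

    def __init__(self, tag_hosts: Dict[str, str], device_hosts: List[str]) -> None:
        self._tag_hosts = tag_hosts
        self._device_hosts = device_hosts

    def resolve_host(self, full_reference: str) -> str:
        return self._tag_hosts.get(full_reference, self._device_hosts[0])


def resolve_host_for(driver_instance_id: str,
                     resolver: Optional[FirstDeviceFallbackResolver],
                     full_reference: str) -> str:
    """Single-device drivers have no resolver, so the instance id is the host."""
    if resolver is None:
        return driver_instance_id
    return resolver.resolve_host(full_reference)


def read_tag(cache: PipelineCache, driver_instance_id: str,
             resolver: Optional[FirstDeviceFallbackResolver],
             full_reference: str, call: Callable[[], object]) -> object:
    """Resolve the host per call, then run the call through that host's pipeline."""
    host = resolve_host_for(driver_instance_id, resolver, full_reference)
    return cache.get_or_create(driver_instance_id, host, "Read").execute(call)
```

With two devices under one driver instance, three failed reads against the first device open only the pipeline keyed on that device's host, and reads routed to the second device keep succeeding. Without a resolver, the key collapses to `(instance, instance, capability)`, which is the "trivial" case in the status table.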