Phase 6.1 multi-host dispatch — document shipped contract + per-driver status

Task #127 / decision #144. The resilience infrastructure for per-PLC
circuit breakers is shipped and fully tested — the task description's
"current pipeline keys on DriverInstanceId only" was stale. The actual
state:

- `DriverResiliencePipelineBuilder` keys on
  `(DriverInstanceId, HostName, DriverCapability)`.
- `CapabilityInvoker.ExecuteAsync` takes `hostName` per call.
- `IPerCallHostResolver` is the driver-side hook; AB CIP implements it.
- `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`
  proves the end-to-end isolation.

Remaining work is per-driver adoption, not shared infrastructure:
- AB CIP: live + tested
- Galaxy / FOCAS / OPC UA Client / AB Legacy: 1 device per instance by
  design, trivially isolated
- Modbus / S7 / TwinCAT: single-device today; multi-device refactor is
  per-driver surgery (Device row + options + resolver + transport
  fan-out), not a shared-infra change

Shipping docs/v2/multi-host-dispatch.md as the canonical reference:
contract + driver-author checklist + current fleet-wide status table.
Future driver authors follow the AB CIP template.

No code change in this commit — doc-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-04-24 19:01:47 -04:00
parent bd6568bcbd
commit e5d1c9c9b9

# Multi-host dispatch — per-PLC circuit breakers
Phase 6.1 decision #144 / task #135. Motivation: a single DriverInstance that fronts N PLCs (Modbus with multiple slaves, AB CIP with multiple ControlLogix chassis, etc.) must not let one dead PLC trip the resilience breaker for its healthy siblings.
This note documents the shipped contract so future driver authors don't re-derive it.
## Contract
The resilience pipeline keys on `(DriverInstanceId, HostName, DriverCapability)`. One dead PLC opens only the pipeline keyed on its HostName; healthy sibling PLCs keep their own pipelines intact.
Three participants:
1. **`DriverResiliencePipelineBuilder.GetOrCreate(driverInstanceId, hostName, capability, options)`** — the pipeline cache. First call per key builds a Polly pipeline (timeout → retry → breaker). Subsequent calls return the cached instance. Covered by `DriverResiliencePipelineBuilderTests.Pipeline_IsIsolated_PerHost`.
2. **`CapabilityInvoker.ExecuteAsync(capability, hostName, callSite, ct)`** — takes `hostName` per-call. Threads it straight through to the pipeline builder. Covered by `CapabilityInvokerTests`.
3. **`IPerCallHostResolver.ResolveHost(fullReference)`** — an optional interface a multi-device driver implements. `DriverNodeManager.ResolveHostFor` calls it on every capability dispatch so the host flowing into the invoker comes from the tag's per-PLC metadata, not the driver instance. Single-device drivers don't implement it — `DriverNodeManager` falls back to `DriverInstanceId` as the hostname, which still flows through the same `(instance, host, capability)` key shape (one pipeline per single-device instance).
The end-to-end "dead PLC, healthy PLC" scenario is proven by `PerCallHostResolverDispatchTests.DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver`.
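The key-isolation behavior of the pipeline cache can be sketched as follows. This is a minimal illustration, not the shipped `DriverResiliencePipelineBuilder`: the `(DriverInstanceId, HostName, DriverCapability)` key shape comes from the contract above, while the class name, parameter types, and the `Func<object>` build hook are assumptions, with the actual Polly timeout → retry → breaker construction elided.

```csharp
using System;
using System.Collections.Concurrent;

// Illustrative pipeline cache keyed on (instance, host, capability).
// First call per key builds; later calls return the cached instance,
// so one dead host's breaker never shares state with its siblings.
sealed class PipelineCacheSketch
{
    private readonly ConcurrentDictionary<(string InstanceId, string HostName, string Capability), object> _cache = new();

    public object GetOrCreate(string instanceId, string hostName, string capability, Func<object> buildPipeline)
        => _cache.GetOrAdd((instanceId, hostName, capability), _ => buildPipeline());
}
```

Two calls that differ only in `hostName` yield distinct cached entries, which is exactly why a breaker opened for one PLC leaves its healthy siblings' pipelines closed.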
## Driver author checklist
To light up per-PLC circuit breakers on a multi-device driver:
1. **Options model** — extend the driver's options type with an explicit device list. See `AbCipDriverOptions.Devices : IReadOnlyList<AbCipDeviceConfig>`.
2. **Tag → device mapping** — parse the tag's `DeviceId` from `TagConfig`. The driver's per-tag definition records the device HostAddress alongside the wire address. See `AbCipTagDefinition.DeviceHostAddress`.
3. **`IPerCallHostResolver`** — implement it on the driver. `ResolveHost(fullReference)` looks up the tag's definition and returns the device HostAddress. Unknown references should return a deterministic fallback (e.g. the first configured device's host) rather than throw — the invoker handles the mislookup at capability level when the actual read surfaces `BadNodeIdUnknown`.
4. **Health surface** — `IHostConnectivityProbe.GetHostStatuses()` returns one `HostConnectivityStatus` per configured device so the Admin UI fleet page lights the per-PLC status distinctly.
5. **Transport per device** — one network connection per PLC, serialized per device via `SemaphoreSlim` (or equivalent). Do not share a transport across PLCs; the breaker-isolation guarantee disappears if they share a queue.
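Checklist steps 2–3 can be sketched on a hypothetical driver. Only `IPerCallHostResolver.ResolveHost(fullReference)` mirrors the shipped contract; the device record, the tag-lookup dictionary, and the driver class here are illustrative stand-ins for the real `AbCipDriverOptions.Devices` / `AbCipTagDefinition.DeviceHostAddress` plumbing.

```csharp
using System;
using System.Collections.Generic;

// Shape mirrors the shipped resolver hook; everything else is illustrative.
interface IPerCallHostResolver
{
    string ResolveHost(string fullReference);
}

sealed record DeviceConfig(string DeviceId, string HostAddress);

sealed class ExampleMultiDeviceDriver : IPerCallHostResolver
{
    private readonly IReadOnlyList<DeviceConfig> _devices;
    private readonly Dictionary<string, string> _hostByTagReference; // tag reference → device HostAddress

    public ExampleMultiDeviceDriver(
        IReadOnlyList<DeviceConfig> devices,
        Dictionary<string, string> hostByTagReference)
    {
        _devices = devices;
        _hostByTagReference = hostByTagReference;
    }

    // Unknown references fall back deterministically to the first configured
    // device's host rather than throwing; the invoker surfaces the mislookup
    // at capability level when the read returns BadNodeIdUnknown.
    public string ResolveHost(string fullReference)
        => _hostByTagReference.TryGetValue(fullReference, out var host)
            ? host
            : _devices[0].HostAddress;
}
```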
## Current fleet status (2026-04-24)
| Driver | Per-tag device | `IPerCallHostResolver` | Per-PLC breaker isolation |
|---|---|---|---|
| AB CIP | ✅ `DeviceId` | ✅ | ✅ live |
| AB Legacy | 1 device / instance | — (not needed) | trivial |
| Modbus | 1 device / instance today | — | trivial — multi-device refactor tracked separately |
| S7 | 1 device / instance today | — | trivial — same |
| TwinCAT | 1 device / instance today | — | trivial — same |
| FOCAS | 1 CNC / instance | — (not needed) | trivial |
| Galaxy | 1 Galaxy Host / instance | — (not needed) | trivial — Host recycle runs per instance |
| OPC UA Client | 1 upstream / instance | — (not needed) | trivial |
"Trivial" above means the pipeline key ends up as `(DriverInstanceId, DriverInstanceId, capability)` via `DriverNodeManager.ResolveHostFor`'s fallback — one pipeline per driver instance, which is correct for single-device drivers.
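The fallback path described above can be sketched as a single branch. The `IPerCallHostResolver` name is the shipped one; the free-standing `ResolveHostFor` signature here is an assumed simplification of `DriverNodeManager.ResolveHostFor`, shown only to make the key collapse concrete.

```csharp
using System;

// Shipped interface name; method shape below is an illustrative reduction.
interface IPerCallHostResolver
{
    string ResolveHost(string fullReference);
}

static class DispatchSketch
{
    // Multi-device drivers resolve per tag; single-device drivers fall back
    // to the instance id, so the pipeline key collapses to
    // (DriverInstanceId, DriverInstanceId, capability) — one pipeline per instance.
    public static string ResolveHostFor(object driver, string driverInstanceId, string fullReference)
        => driver is IPerCallHostResolver resolver
            ? resolver.ResolveHost(fullReference)
            : driverInstanceId;
}
```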
Extending Modbus / S7 / TwinCAT to multi-device follows the AB CIP template verbatim; it's per-driver surgery (schema row + options model + resolver implementation + transport fan-out) rather than shared-infrastructure work.