# OtOpcUa — FixedTree-injection follow-ups (design) **Date:** 2026-06-26 **Status:** ✅ Implemented (2026-06-26) — all five follow-ups (A–E) built via subagent-driven development (16 commits `c2c368dc`..`0074f37a` on `feat/focas-fixedtree-equipment-injection`; every task spec+code reviewed, high-risk tasks with serial Opus reviews). Offline suites green: Runtime.Tests 331, OpcUaServer.Tests 319, FOCAS 248 + AbLegacy/composer additions; `dotnet build` 0 errors, production `src/` 0 warnings (TreatWarningsAsErrors). No DB migration and no deployment-artifact wire-format change were needed (E is projection-only — the columns + `Devices` array were already serialized). Live wonder re-validation of the single-device FOCAS path is optional/user-gated (the base feature's live path is unchanged by these follow-ups). **Companion to:** [`2026-06-26-otopcua-fixedtree-equipment-injection-design.md`](2026-06-26-otopcua-fixedtree-equipment-injection-design.md) (the base feature — ✅ built + live-validated on `wonder-app-vd03`). This design works through the follow-ups that feature's review chain surfaced. **Branch:** continue on `feat/focas-fixedtree-equipment-injection` (stacked on `fix/focas-poll-io-serialization`, local/unpushed — standing rule is "commit/push only when asked"). --- ## Scope (locked with the user 2026-06-26) The user selected **all five** items below. The base feature's explicit non-goals — discovered-**alarm** injection and **writable** discovered nodes — remain out of scope (locked design decisions, untouched). | # | Follow-up | Size | Notes | |---|---|---|---| | A | Hardcoded 30 s discovery timeout → injectable | trivial | behavior-preserving | | B | Re-discovery opt-in/policy gate per driver | moderate | back-compat default | | C | Config-unchanged driver→equipment rebind re-triggers discovery | moderate | reverses a deliberate `won't-fix` | | D | De-dup the double `SetDesiredSubscriptions` during redeploy | small | one extra unsub/resub blip today | | E | Lift the ≥1-authored-tag requirement + multi-device-per-driver | largest | **projection-only, no DB migration** | ## Key discovery that shapes E The Config-DB **already** models the equipment→driver(→device) association as first-class data — no schema change is needed for E: - `Equipment.DriverInstanceId` (`string?`, made nullable by migration `20260608104706_NullableEquipmentDriverInstanceId`) - `Equipment.DeviceId` (`string?`) — FK to a multi-device driver's device - `Device` is a first-class entity (`DeviceId`, `DriverInstanceId`, schemaless `DeviceConfig` JSON with host) - The AdminUI equipment editor already exposes an optional driver pick (`EquipmentInput.DriverInstanceId`) The **only** gap is the runtime projection: `EquipmentNode` is `(EquipmentId, DisplayName, UnsLineId)` and drops `DriverInstanceId`/`DeviceId`, so the injector (`DriverHostActor.HandleDiscoveredNodes`) can only resolve the equipment by inferring it from authored `EquipmentTags` — hence the ≥1-tag requirement. E closes that gap in the projection + resolver, not the schema. --- ## A. Discovery timeout → injectable `DriverInstanceActor.HandleRediscoverAsync` hardcodes `new CancellationTokenSource(TimeSpan.FromSeconds(30))` while the rediscover interval + attempt-cap are already constructor parameters. Add a `rediscoverDiscoverTimeout` (`TimeSpan`, default `TimeSpan.FromSeconds(30)`) to the ctor and the `Props` factory; use the field instead of the literal. Pure consistency fix; default preserves behavior. ## B. Re-discovery opt-in / policy gate **Problem:** `StartDiscovery()` runs the bounded retry loop for **every** `ITagDiscovery` driver on every (re)connect. FOCAS needs it (its `FixedTreeCache` fills ~0–2 s *after* connect, so a single early pass would capture an empty/partial tree). A driver that browses its full shape **synchronously** inside `DiscoverAsync` (OpcUaClient, TwinCAT, AB) needs at most **one** pass — the 15×2 s retry is wasted (potentially heavy) network I/O. **Decision:** a per-driver **policy**, declared in code (driver "heaviness" is a property of the driver *type*, needs no DB/AdminUI plumbing, lowest risk). Add a default-implemented member to `ITagDiscovery`: ```csharp public enum DiscoveryRediscoverPolicy { UntilStable, Once, Never } public interface ITagDiscovery { DiscoveryRediscoverPolicy RediscoverPolicy => DiscoveryRediscoverPolicy.UntilStable; // default = today's behavior Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken cancellationToken); } ``` - **`UntilStable`** (default, unchanged) — today's loop: retry every `_rediscoverInterval` up to `_rediscoverMaxAttempts` or until the captured signature is non-empty and stable. - **`Once`** — kick exactly one discovery pass on connect, emit one `DiscoveredNodesReady`, then stop. - **`Never`** — no post-connect discovery kick at all. **Driver assignments:** - **FOCAS** → `UntilStable` (explicit; it genuinely needs the retry). - **OpcUaClient, TwinCAT, AbCip, AbLegacy** → `Once` (they discover synchronously in `DiscoverAsync`; one pass on connect injects their tree, the retry loop only added cost). Any driver not overriding the default keeps `UntilStable`, so this is a strict no-regression change. **Mechanism:** `DriverInstanceActor.StartDiscovery` reads `((ITagDiscovery)_driver).RediscoverPolicy`. `Never` → return without scheduling. `Once`/`UntilStable` → schedule the first `RediscoverTick`; `HandleRediscoverAsync` stops after the first pass when the policy is `Once` (instead of evaluating stop-on-stable). **Alternative considered (rejected):** a per-instance JSON flag parsed by the host + AdminUI. More flexible but adds artifact/AdminUI plumbing for a knob whose correct value is type-uniform. ## C. Config-unchanged rebind re-triggers discovery **Problem:** when a redeploy rebinds a driver to a new equipment **without** a `DriverConfig` change, `PushDesiredSubscriptions`' re-inject tail correctly **drops** the stale cached plan (a stale `EQ-1`-scoped graft under `EQ-2` would be worse), but `ReconcileDrivers` only restarts a child on a `DriverConfig` change — so a config-unchanged child is never reconnected and the FixedTree stays absent under the new equipment until the driver's next natural reconnect/restart. The base feature deliberately did **not** add a re-trigger here, to avoid coupling the subscription pass to driver-lifecycle control. This follow-up reverses that — but cleanly, because the trigger is a **discovery** action, not lifecycle control (no stop/restart), and it is idempotent. **Decision:** add a `DriverInstanceActor.TriggerRediscovery` message. In the re-inject tail, the two branches that `Remove` a cached plan because of a rebind/loss also `Tell` that driver's child `TriggerRediscovery`. The child kicks a fresh `RediscoverTick` (current `_initGeneration`) **iff it is in the `Connected` state**; otherwise it no-ops (its eventual reconnect re-discovers anyway). The discovery pass re-emits `DiscoveredNodesReady`, which resolves against the **new** composition (`_lastComposition`) and grafts under the new equipment. The re-trigger honors B's policy (`Never` drivers do not re-discover; `Once`/`UntilStable` run their normal pass(es)). Update the inline comment at the drop site and the follow-up note in the base design doc to record the new behavior. ## D. De-dup the double `SetDesiredSubscriptions` **Problem:** during an in-process redeploy, a cached driver receives two `SetDesiredSubscriptions`: the bulk authored-only send in `PushDesiredSubscriptions`, then the authored∪discovered union from `ApplyDiscoveredPlan` (the re-inject tail). The first send forces the child to unsubscribe the whole handle (authored tags included) then the second re-subscribes — one extra blip per cached driver per redeploy. **Decision:** in the bulk loop, **skip** the send for any driver that has a `_discoveredByDriver` entry — the re-inject tail sends their complete union. **Critical fallback:** the re-inject tail can still *drop* a cached plan (rebind/loss, see C); when it does, it must send the **authored-only** set for that driver so its authored subscriptions are not lost. Net invariant: every driver receives exactly **one** `SetDesiredSubscriptions` per redeploy. ## E. Lift the ≥1-authored-tag requirement + multi-device No DB migration — projection + resolver only. **E1 — projection (`AddressSpaceComposer`).** Extend `EquipmentNode`: ```csharp public sealed record EquipmentNode( string EquipmentId, string DisplayName, string UnsLineId, string? DriverInstanceId = null, // from Equipment.DriverInstanceId string? DeviceId = null, // from Equipment.DeviceId string? DeviceHost = null); // resolved at projection time: Equipment.DeviceId -> Device.DeviceConfig host ``` `DriverInstanceId`/`DeviceId` are copied straight off the `Equipment` row. `DeviceHost` is resolved by joining `Equipment.DeviceId → Device` and parsing the host out of that `Device`'s schemaless `DeviceConfig` JSON, so the resolver can match it against a discovered device-host folder without re-reading the DB. All three are nullable; existing single-equipment behavior is unaffected when they're null. **E2 — resolver (`DriverHostActor.HandleDiscoveredNodes`, and the redeploy re-inject tail).** Replace tag-only equipment resolution with: > **candidates** = { equipments where `EquipmentNode.DriverInstanceId == driverId` } **∪** > { equipments inferred from authored `EquipmentTags` for `driverId` } (keeps today's path working) - **0 candidates** → log Info, skip (unchanged). - **1 candidate** → graft all discovered nodes under it (today's single-device behavior, now also works with **zero** authored tags because the equipment-level `DriverInstanceId` resolves it). The device-host folder is still collapsed (single device). - **>1 candidates (multi-device)** → partition discovered nodes by their device-host folder segment (`DiscoveredNode.FolderPathSegments[1]`) and graft each device's subtree under the equipment whose `DeviceHost` matches that segment. A device-host with no matching equipment is **warn-skipped** (its subtree is not grafted) rather than mis-grafted. The mapper's existing device-host collapse already disables itself when ≥2 distinct device-host segments are present, so multi-device paths retain the device-host level and don't collide. **⚠️ Implementation risk (E2 multi-device only):** the partition join is a **host string** match — the driver's emitted device-host folder segment (FOCAS uses `device.HostAddress`) must equal the equipment's projected `DeviceHost` (parsed from `DeviceConfig`). Both ultimately derive from the same device configuration, but the string forms must be normalized to match (e.g. `host:port`). The warn-skip fallback makes a mismatch *safe* (no mis-graft, authored tags + single-device paths unaffected); a normalization helper + a unit test pin the formats together. Single-device deployments (the validated FOCAS `z-34184` case) take the "1 candidate" path and are **not** exposed to this risk. --- ## Data flow (unchanged) E changes only *which equipment* a discovered node is grafted under and *whether* an equipment with no authored tags participates. Once the NodeId is assigned, the materialize → subscribe → poll → push value path is exactly the base feature's path; B/C/D change *when/how often* discovery runs and *how many* subscription pushes occur, not the value path. ## Error handling - B `Never` driver → no discovery, authored tags unaffected. - C re-trigger on a non-`Connected` child → no-op (safe; reconnect re-discovers). - D dropped-plan fallback → authored-only send, so a rebind/loss never strands a driver's authored subscriptions. - E multi-device unmatched device-host → Warning + skip that device's subtree; other devices + authored tags unaffected. >1 candidate with no `DeviceHost` data anywhere → falls back to the base feature's warn+skip (no regression). ## Testing - **A:** ctor/`Props` wires the timeout; default is 30 s (assert via a short injected timeout in an existing rediscover test). - **B:** `Never` → no `DiscoveredNodesReady`; `Once` → exactly one even when the captured set would keep growing; `UntilStable` → today's loop (regression). FOCAS reports `UntilStable`; the four network drivers report `Once`. - **C:** rebind drop branch `Tell`s `TriggerRediscovery`; `Connected` child re-discovers and re-emits; non-`Connected` child no-ops; re-trigger respects a `Never` policy. - **D:** single-send invariant — a cached driver gets exactly one `SetDesiredSubscriptions` on redeploy (union when applied; authored-only when the plan is dropped). - **E1:** `EquipmentNode` projection carries `DriverInstanceId`/`DeviceId`/`DeviceHost`; `DeviceHost` resolves via the `Device` join + `DeviceConfig` host parse; nulls when unset. - **E2:** tag-less graft (driver-level link, 0 authored tags); single-candidate unchanged (collapse retained); multi-device partition maps each device-host to the right equipment; unmatched device-host → warn-skip; host-string normalization. - **Regression:** Runtime.Tests, OpcUaServer.Tests, and the FOCAS suite stay green; the validated single-device FOCAS injection path is unchanged. - Live wonder re-validation of the single-device FOCAS path is **optional** and user-gated (the base feature is already live-validated; these follow-ups don't alter that path's runtime behavior). ## Scope / non-goals - **In:** A–E above. - **Out (still locked):** discovered-**alarm** injection; **writable** discovered nodes. ## Touched code (anticipated) - `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ITagDiscovery.cs` — `DiscoveryRediscoverPolicy` enum + default member. - Driver classes (`FocasDriver`, `OpcUaClientDriver`, `TwinCATDriver`, `AbCipDriver`, `AbLegacyDriver`) — override `RediscoverPolicy`. - `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs` — injectable timeout (A); policy-gated `StartDiscovery`/`HandleRediscoverAsync` (B); `TriggerRediscovery` message (C). - `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs` — resolver union + multi-device partition (E2); re-trigger on rebind drop (C); bulk-send skip + dropped-plan fallback (D). - `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/AddressSpaceComposer.cs` — `EquipmentNode` projection (E1). - `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DiscoveredNodeMapper.cs` — multi-device partition support (E2). - Tests under `tests/.../Runtime.Tests` and `tests/.../OpcUaServer.Tests`. ## Task tracking Implementation tasks to be generated by writing-plans from this design.