OtOpcUa — dynamic injection of driver-discovered FixedTree nodes into the Equipment projection (design)

Date: 2026-06-26
Status: Implemented (2026-06-26): 11 tasks, offline-complete on branch feat/focas-fixedtree-equipment-injection (solution build 0 errors / 0 warnings; Runtime.Tests 312, OpcUaServer.Tests 304, FOCAS 247 plus an end-to-end injection+value-flow test, all green). Live wonder validation pending.

Follow-ups surfaced during the review chain — ALL RESOLVED 2026-06-26 (design 2026-06-26-otopcua-fixedtree-followups-design.md, plan 2026-06-26-otopcua-fixedtree-followups.md; 16 commits c2c368dc..0074f37a on this branch, every task spec+code reviewed; full offline suite green):

  • Config-unchanged driver→equipment rebind now re-triggers discovery (follow-up C): the redeploy re-inject tail drops the stale plan AND Tells the driver child a new DriverInstanceActor.TriggerRediscovery (a discovery action — not lifecycle control — idempotent, child no-ops if not Connected), so the FixedTree re-grafts under the new equipment on the next pass instead of waiting for the next natural reconnect.
  • Multi-device-per-driver mapping implemented (follow-up E): EquipmentNode now carries DriverInstanceId/DeviceId/DeviceHost (projection-only — the columns + the Devices array were already in the artifact, no DB migration / no wire change), so equipment resolves via the driver binding without authored tags (≥1-tag requirement lifted), and a driver bound to multiple devices partitions its discovered tree by normalized device-host folder, grafting each device's subtree under the equipment whose DeviceHost matches (unmatched hosts warn-skip, never mis-graft).
  • Per-(re)connect re-discovery is now policy-gated (follow-up B): ITagDiscovery.RediscoverPolicy (UntilStable/Once/Never, default UntilStable) — FOCAS stays UntilStable (its FixedTree cache fills asynchronously after connect); the synchronous-discovery drivers (OpcUaClient/TwinCAT/AbCip/AbLegacy/Modbus/S7/Galaxy) are Once, dropping the wasteful 15× retry. The hardcoded 30 s per-pass discovery timeout is now injectable too (follow-up A).
  • The OPC-node-layer seed→serve gap (recording-sink-only e2e) was closed by the live wonder deploy of the base feature (validated 2026-06-26; see the deployment record).

Companion to: 2026-06-25-otopcua-equipment-dataplane-investigation.md (symptom #1, live FOCAS values: FIXED + deployed; this design addresses symptom #2).
Base branch: fix/focas-poll-io-serialization (this feature builds on the now-deployed driver-host bootstrap re-spawn + FOCAS I/O fixes; that branch is ahead of master and not yet merged).

Problem

Deployed FOCAS equipment serves only its authored config tags (parts-count/parts-required). The driver's FixedTree (Identity / Axes / Spindle / Program / Timers — the auto-discovered CNC structure) never appears under the served Equipment/UNS address space.

Root cause (confirmed in the investigation, H2): the served Equipment tree is built purely from Config-DB entities (AddressSpaceComposer.Compose → AddressSpaceApplier → node manager). The only code that emits FixedTree nodes is ITagDiscovery.DiscoverAsync (each driver implements it), reachable only through GenericDriverNodeManager.BuildAddressSpaceAsync, which has no runtime caller (its referenced host method OpcUaApplicationHost.PopulateAddressSpaces no longer exists). So DiscoverAsync/ITagDiscovery is dead for serving: every served node is config-driven, and nothing surfaces a driver's discovered hierarchy.

Surfacing FixedTree under the Equipment node is therefore a new dynamic-node-injection capability, and it must solve a timing problem: composition runs at deploy/apply time (before the driver connects), but the FixedTree shape (axis count, spindle presence, which sections exist) is capability-discovered ~0-2 s after the driver connects (FocasDriver populates state.FixedTreeCache in its bootstrap loop).

Goal

After a driver connects, dynamically graft its discovered FixedTree nodes into the served Equipment projection under a driver-named subfolder, e.g.:

ns=2;s=EQ-3686c0272279                         (equipment "z-34184")
  ├── parts-count          (authored config tag — unchanged)
  ├── parts-required       (authored config tag — unchanged)
  └── FOCAS                 (NEW — driver-named discovered subfolder)
       ├── Identity/{SeriesNumber, Version, MaxAxes, CncType, MtType, AxisCount}
       ├── Axes/{<axis>/{AbsolutePosition, MachinePosition, RelativePosition, DistanceToGo}, FeedRate/Actual, SpindleSpeed/Actual}
       ├── Spindle/{<name>/{Load, MaxRpm}}      (capability-gated)
       ├── Program/{Name, ONumber, Number, MainNumber, Sequence, BlockCount}   (capability-gated)
       ├── OperationMode/{Mode, ModeText}        (capability-gated)
       └── Timers/{PowerOnSeconds, OperatingSeconds, CuttingSeconds, CycleSeconds}  (capability-gated)

The injected nodes are read-only value nodes that carry live values (e.g. EQ-…/FOCAS/Axes/X/AbsolutePosition reads with Good status).

Decisions (locked with the user 2026-06-26)

| Decision | Choice |
| --- | --- |
| Driver scope | Generic: keyed off the shared ITagDiscovery interface (FOCAS, Galaxy, Modbus all implement it). FOCAS is the first/test consumer; others get it for free. Zero per-driver code changes. |
| Tree placement | Under a driver-named subfolder: EQ-…/FOCAS/… (collision-safe vs. authored tags; self-describing). |
| Device-host folder | Collapse the single device-host level → EQ-…/FOCAS/Identity/… (not EQ-…/FOCAS/10.201.31.5:8193/Identity/…), valid because today's deployment is strictly 1:1 driver↔equipment↔device. |
| Model-change notification | Emit GeneralModelChangeEvent after a runtime add so already-connected OPC UA clients can refresh their browse. |
| Multi-device-per-driver | Deferred at base-feature time; implemented as follow-up E (2026-06-26) via the EquipmentNode.DeviceHost partition. |
| Discovered alarms | Out of scope: this feature surfaces value nodes only; alarms continue to come via the config path. |
| Writable discovered nodes | Out of scope: FixedTree is read-only CNC state. |
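
The multi-device row (follow-up E) partitions a driver's discovered tree by normalized device host and skips hosts with no matching equipment. A minimal sketch of that idea follows. The DeviceHostPartition class, the tuple shapes, the sample hosts and equipment ids, and the trim + lower-case normalization are all assumptions made for illustration; the real normalization rule is not specified in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical equipment bindings, keyed by normalized DeviceHost.
var equipmentByHost = new Dictionary<string, string>
{
    [DeviceHostPartition.Normalize("10.201.31.5:8193")] = "EQ-A",
    [DeviceHostPartition.Normalize("CNC-02:8193")] = "EQ-B",
};

// Discovered nodes, each tagged with the device-host folder it was found under.
var discovered = new[]
{
    (Host: "10.201.31.5:8193", Path: "Identity/SeriesNumber"),
    (Host: " cnc-02:8193", Path: "Identity/SeriesNumber"),
    (Host: "10.9.9.9:8193", Path: "Identity/SeriesNumber"),
};

var (grafts, skippedHosts) = DeviceHostPartition.Partition(discovered, equipmentByHost);
foreach (var g in grafts) Console.WriteLine(g);
foreach (var h in skippedHosts) Console.WriteLine($"warn: no equipment for device host '{h}', skipped");
// prints:
//   EQ-A/FOCAS/Identity/SeriesNumber
//   EQ-B/FOCAS/Identity/SeriesNumber
//   warn: no equipment for device host '10.9.9.9:8193', skipped

public static class DeviceHostPartition
{
    // Assumed normalization (trim + lower-case); the real rule may differ.
    public static string Normalize(string host) => host.Trim().ToLowerInvariant();

    public static (List<string> Grafts, List<string> SkippedHosts) Partition(
        IEnumerable<(string Host, string Path)> nodes,
        IReadOnlyDictionary<string, string> equipmentByNormalizedHost)
    {
        var grafts = new List<string>();
        var skipped = new List<string>();
        foreach (var group in nodes.GroupBy(n => Normalize(n.Host)))
        {
            if (!equipmentByNormalizedHost.TryGetValue(group.Key, out var equipmentId))
            {
                skipped.Add(group.Key);   // warn-skip: never graft under a guessed equipment
                continue;
            }
            // The device-host level is collapsed; "FOCAS" stands in for the driver subfolder.
            grafts.AddRange(group.Select(n => $"{equipmentId}/FOCAS/{n.Path}"));
        }
        return (grafts, skipped);
    }
}
```

Keying the lookup by the normalized host means a host with no equipment falls out as a miss rather than being matched to the nearest candidate.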

Approach (chosen): runtime post-connect injection via the actor pipeline

Treat discovered FixedTree nodes as "synthetic equipment tags" injected at runtime, reusing the existing materialize → subscribe → poll → push pipeline end-to-end. Only three new pieces; no driver changes (each driver's existing DiscoverAsync is reused verbatim via a capturing builder).

Rejected alternatives:

  • Composition-time pre-projection — can't author the right nodes before the driver discovers capabilities; defeats the purpose.
  • Resurrect GenericDriverNodeManager as a 2nd namespace (ns=3) — puts FixedTree in a separate tree (not under the equipment node), and that namespace's value-routing is also dead; more dead code to revive, wrong location.
  • Cheap baseline: author a Config-DB Tag row per FixedTree signal — no new code, but static (can't adapt to per-CNC capabilities) and per-signal × per-machine manual authoring. User chose to build the dynamic feature instead.

Components

1. CapturingAddressSpaceBuilder (new — runtime)

An IAddressSpaceBuilder implementation that records the streamed tree instead of creating OPC UA nodes. After a driver's DiscoverAsync(builder) returns, it exposes a flat IReadOnlyList<DiscoveredNode>:

DiscoveredNode {
  IReadOnlyList<string> FolderPathSegments,   // e.g. ["FOCAS", "<deviceHost>", "Identity"]
  string BrowseName, string DisplayName,
  string FullReference,                        // == DriverAttributeInfo.FullName (the driver ref + routing key)
  DriverDataType DataType, bool IsArray, uint? ArrayDim,
  bool Writable, bool IsHistorized
}
  • Folder(browse, display) returns a child capturing scope; Variable(...) records a node and returns an IVariableHandle whose FullReference is DriverAttributeInfo.FullName.
  • MarkAsAlarmCondition(...) returns a no-op sink; AddProperty(...) is ignored — value nodes only.
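
As a rough illustration of the capture idea, the sketch below records folders and variables into a flat list instead of creating nodes. It assumes trimmed, hypothetical stand-ins (IAddressSpaceBuilderSketch, VariableHandleSketch, DiscoveredNodeSketch) because the real IAddressSpaceBuilder and DriverAttributeInfo shapes are not shown in this document; the alarm and property members are omitted, and the sample reference string is illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: a fake discovery streams a small tree into the capturing builder.
var root = new CapturingAddressSpaceBuilder();
root.Folder("FOCAS", "FOCAS")
    .Folder("10.201.31.5:8193", "10.201.31.5:8193")
    .Folder("Identity", "Identity")
    .Variable("SeriesNumber", "SeriesNumber", "10.201.31.5:8193/Identity/SeriesNumber");

foreach (var node in root.Nodes)
{
    var path = string.Join("/", node.FolderPathSegments);
    Console.WriteLine($"{path}/{node.BrowseName} -> {node.FullReference}");
}
// prints: FOCAS/10.201.31.5:8193/Identity/SeriesNumber -> 10.201.31.5:8193/Identity/SeriesNumber

// Hypothetical, trimmed stand-in for the real builder interface.
public interface IAddressSpaceBuilderSketch
{
    IAddressSpaceBuilderSketch Folder(string browseName, string displayName);
    VariableHandleSketch Variable(string browseName, string displayName, string fullName);
}

public sealed record VariableHandleSketch(string FullReference);

public sealed record DiscoveredNodeSketch(
    IReadOnlyList<string> FolderPathSegments,
    string BrowseName,
    string DisplayName,
    string FullReference);

public sealed class CapturingAddressSpaceBuilder : IAddressSpaceBuilderSketch
{
    private readonly List<DiscoveredNodeSketch> _sink;   // shared by every child scope
    private readonly string[] _path;                     // folder path of this scope

    public CapturingAddressSpaceBuilder()
        : this(new List<DiscoveredNodeSketch>(), Array.Empty<string>()) { }

    private CapturingAddressSpaceBuilder(List<DiscoveredNodeSketch> sink, string[] path)
    {
        _sink = sink;
        _path = path;
    }

    public IReadOnlyList<DiscoveredNodeSketch> Nodes => _sink;

    // A folder creates no node; it only returns a child scope with a longer path.
    public IAddressSpaceBuilderSketch Folder(string browseName, string displayName)
        => new CapturingAddressSpaceBuilder(_sink, _path.Append(browseName).ToArray());

    // A variable is recorded with its folder path; the handle carries the routing key.
    public VariableHandleSketch Variable(string browseName, string displayName, string fullName)
    {
        _sink.Add(new DiscoveredNodeSketch(_path, browseName, displayName, fullName));
        return new VariableHandleSketch(fullName);
    }
}
```

Sharing one sink across all child scopes is what turns the streamed tree into a flat list without a second traversal.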

2. DriverInstanceActor — post-connect discovery (bounded retry)

On entering Connected, kick a bounded re-discovery:

  1. Run DiscoverAsync(capturingBuilder) against the live IDriver it owns.
  2. Tell the parent DriverHostActor a new message DiscoveredNodesReady(DriverInstanceId, IReadOnlyList<DiscoveredNode>).
  3. Because FOCAS suppresses FixedTree until FixedTreeCache populates (~0-2 s), retry every ~2 s up to a cap (~30 s) or until the captured set stops growing, then stop. DiscoverAsync reads the in-memory cache (no extra wire I/O), so retries are cheap. Re-runs on every reconnect (downstream is idempotent).

(Drivers whose discovery is ready immediately — e.g. Galaxy/Modbus — satisfy this on the first attempt.)
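
The bounded retry above could look roughly like the following. This is a sketch under stated assumptions, not the actor code: BoundedRediscovery is a hypothetical helper, discovery is reduced to a delegate returning reference strings, and "stops growing" is interpreted as "non-empty and no larger than the previous pass" so that an initially empty FOCAS cache does not end the loop early. The interval and cap are shortened so the example runs quickly.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Fake discovery: empty for the first two passes (cache not filled yet), then 3 refs.
var pass = 0;
Func<CancellationToken, Task<IReadOnlyList<string>>> discover = _ =>
{
    pass++;
    IReadOnlyList<string> refs = pass < 3
        ? Array.Empty<string>()
        : new[] { "Identity/SeriesNumber", "Axes/X/AbsolutePosition", "Timers/CycleSeconds" };
    return Task.FromResult(refs);
};

var result = await BoundedRediscovery.RunAsync(
    discover,
    refs => Console.WriteLine($"DiscoveredNodesReady: {refs.Count} node(s)"),
    TimeSpan.FromMilliseconds(10),   // ~2 s in the design
    TimeSpan.FromSeconds(1),         // ~30 s in the design
    CancellationToken.None);

Console.WriteLine($"stopped after {pass} passes with {result.Count} node(s)");
// prints:
//   DiscoveredNodesReady: 3 node(s)
//   stopped after 4 passes with 3 node(s)

public static class BoundedRediscovery
{
    public static async Task<IReadOnlyList<string>> RunAsync(
        Func<CancellationToken, Task<IReadOnlyList<string>>> discover,
        Action<IReadOnlyList<string>> publish,
        TimeSpan interval,
        TimeSpan cap,
        CancellationToken ct)
    {
        IReadOnlyList<string> last = Array.Empty<string>();
        var clock = Stopwatch.StartNew();

        while (true)
        {
            var current = await discover(ct);
            if (current.Count > last.Count)
            {
                last = current;
                publish(current);   // downstream is idempotent, so re-sending a larger set is safe
            }
            else if (last.Count > 0)
            {
                return last;        // non-empty and no longer growing: treat as stable
            }

            if (clock.Elapsed + interval > cap) return last;   // cap reached, give up quietly
            await Task.Delay(interval, ct);
        }
    }
}
```

A driver whose discovery is ready immediately publishes on the first pass and stops on the second, which is the cost the later RediscoverPolicy follow-up trims further.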

3. DriverHostActor — injection handler

On DiscoveredNodesReady(id, nodes):

  1. Find the equipment bound to the driver instance: composition.EquipmentNodes where DriverInstanceId == id.
    • 0 matches → log Info, skip. >1 match → log Warning, skip (multi-device follow-up).
  2. Dedup discovered FullReferences against authored EquipmentTags for that driver (never double-create parts-count, etc.).
  3. Map each remaining node to a NodeId EQ-…/FOCAS/<collapsed-path>/<name> via EquipmentNodeIds.Variable(...) (collapse the single device-host folder level).
  4. Cache the mapped result in _discoveredByDriver[id] (survives redeploys — see Lifecycle).
  5. Update _nodeIdByDriverRef[(id, FullReference)] for each.
  6. Tell OpcUaPublishActor a new MaterialiseDiscoveredNodes(equipmentId, "FOCAS", nodes).
  7. Merge the new refs into the driver's desired set and re-Tell DriverInstanceActor.SetDesiredSubscriptions(union, interval, alarmRefs) — the existing live path immediately re-subscribes (the actor self-Tells Subscribe when already Connected).
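
Steps 1 to 3 of the handler (resolve the single bound equipment, dedup against authored refs, collapse the device-host level while building the NodeId) can be sketched as below. InjectionMapper and the Discovered record are hypothetical; the real mapping goes through EquipmentNodeIds.Variable(...), whose signature is not shown here, so a plain "/" join stands in for it. The sketch also assumes the driver folder and device host sit at path indices 0 and 1, and the driver id and authored reference string are made-up sample values.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var equipment = new[] { (EquipmentId: "EQ-3686c0272279", DriverInstanceId: "focas-1") };
var authoredRefs = new HashSet<string> { "10.201.31.5:8193/parts-count" };   // illustrative ref
var discovered = new[]
{
    new Discovered(new[] { "FOCAS", "10.201.31.5:8193", "Identity" }, "SeriesNumber",
        "10.201.31.5:8193/Identity/SeriesNumber"),
    new Discovered(new[] { "FOCAS", "10.201.31.5:8193", "Axes", "X" }, "AbsolutePosition",
        "10.201.31.5:8193/Axes/X/AbsolutePosition"),
    new Discovered(new[] { "FOCAS", "10.201.31.5:8193" }, "parts-count",
        "10.201.31.5:8193/parts-count"),
};

var equipmentId = InjectionMapper.ResolveEquipment(equipment, "focas-1");
if (equipmentId is null)
{
    Console.WriteLine("0 or >1 equipment bound to the driver, skipping injection");
    return;
}

foreach (var (fullRef, nodeId) in InjectionMapper.Map(equipmentId, "FOCAS", discovered, authoredRefs))
    Console.WriteLine($"{fullRef} => {nodeId}");
// prints:
//   10.201.31.5:8193/Identity/SeriesNumber => EQ-3686c0272279/FOCAS/Identity/SeriesNumber
//   10.201.31.5:8193/Axes/X/AbsolutePosition => EQ-3686c0272279/FOCAS/Axes/X/AbsolutePosition

public sealed record Discovered(
    IReadOnlyList<string> FolderPathSegments, string BrowseName, string FullReference);

public static class InjectionMapper
{
    // Step 1: exactly one equipment must be bound; otherwise the caller logs and skips.
    public static string? ResolveEquipment(
        IEnumerable<(string EquipmentId, string DriverInstanceId)> nodes, string driverId)
    {
        var matches = nodes.Where(e => e.DriverInstanceId == driverId)
                           .Select(e => e.EquipmentId).Take(2).ToList();
        return matches.Count == 1 ? matches[0] : null;
    }

    public static IReadOnlyList<(string FullReference, string NodeId)> Map(
        string equipmentId, string driverFolder,
        IEnumerable<Discovered> nodes, ISet<string> authoredRefs)
    {
        var result = new List<(string FullReference, string NodeId)>();
        foreach (var n in nodes)
        {
            if (authoredRefs.Contains(n.FullReference)) continue;   // step 2: never double-create

            // Step 3: drop the discovered driver folder and device-host segments
            // (assumed to be indices 0 and 1), then re-root under EQ-…/<driverFolder>.
            var tail = n.FolderPathSegments.Skip(2);
            var segments = new[] { equipmentId, driverFolder }.Concat(tail).Append(n.BrowseName);
            result.Add((n.FullReference, string.Join("/", segments)));
        }
        return result;
    }
}
```

The returned pairs are exactly what steps 4 and 5 need: the value side feeds materialization, and the (driver id, FullReference) key feeds the routing map.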

4. OpcUaPublishActor / node manager — incremental materialize

New message MaterialiseDiscoveredNodes(equipmentId, driverSubfolder, nodes):

  • Idempotent EnsureFolder / EnsureVariable calls (the node manager already supports incremental add under Lock via AddChild + AddPredefinedNode; EnsureVariable early-returns if the node exists).
  • Variables materialize read-only (no OnWriteValue).
  • After adding, emit a GeneralModelChangeEvent so connected clients can refresh their browse (the full-rebuild path does not emit one; runtime adds should).
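
The idempotent materialize step can be illustrated with an in-memory stand-in for the node manager. AddressSpaceSketch is hypothetical: string sets replace real OPC UA nodes, a plain C# event replaces GeneralModelChangeEvent, and raising the event only when something was actually added is an assumption of this sketch rather than a stated requirement.

```csharp
using System;
using System.Collections.Generic;

var space = new AddressSpaceSketch();
space.ModelChanged += added => Console.WriteLine($"model change: {added.Count} node(s) added");

var paths = new[] { "Identity/SeriesNumber", "Identity/Version" };
var first = space.Materialise("EQ-1", "FOCAS", paths);    // adds 2 folders + 2 variables
var second = space.Materialise("EQ-1", "FOCAS", paths);   // everything exists: no-op, no event
Console.WriteLine($"first pass added {first}, second pass added {second}");
// prints:
//   model change: 4 node(s) added
//   first pass added 4, second pass added 0

public sealed class AddressSpaceSketch
{
    private readonly object _gate = new();
    private readonly HashSet<string> _folders = new();
    private readonly HashSet<string> _variables = new();

    // Stand-in for emitting a GeneralModelChangeEvent to connected clients.
    public event Action<IReadOnlyList<string>> ModelChanged = _ => { };

    public int Materialise(string equipmentId, string driverSubfolder, IEnumerable<string> relativePaths)
    {
        var added = new List<string>();
        lock (_gate)
        {
            foreach (var rel in relativePaths)
            {
                var parts = rel.Split('/');
                var current = $"{equipmentId}/{driverSubfolder}";
                if (_folders.Add(current)) added.Add(current);        // EnsureFolder (driver subfolder)

                foreach (var folder in parts[..^1])
                {
                    current = $"{current}/{folder}";
                    if (_folders.Add(current)) added.Add(current);    // EnsureFolder (nested)
                }

                var nodeId = $"{current}/{parts[^1]}";
                if (_variables.Add(nodeId)) added.Add(nodeId);        // EnsureVariable (read-only)
            }
        }

        if (added.Count > 0) ModelChanged(added);   // notify once per batch, outside the lock
        return added.Count;
    }
}
```

Because HashSet.Add reports whether the element was new, replaying the same message after a reconnect or redeploy costs a lookup per node and triggers no client refresh.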

Data flow (value path — fully reused)

SetDesiredSubscriptions(union)  →  DriverInstanceActor subscribes the FixedTree refs
  →  PollGroupEngine polls each ref via FocasDriver.ReadAsync
  →  TryReadFixedTree (cache lookup, NO extra wire I/O)
  →  onChange → AttributeValuePublished(FullReference)
  →  DriverHostActor.ForwardToMux
  →  _nodeIdByDriverRef[(driverId, ref)] → AttributeValueUpdate(nodeId, value, quality, ts)
  →  OtOpcUaNodeManager writes the node value

The routing key is consistent by construction: the capturing builder records handle.FullReference, which is exactly the ref the driver publishes (AttributeValuePublished.FullReference) and the ref TryReadFixedTree matches (reference.StartsWith(state.Options.HostAddress + "/")).
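
The routing step in the flow above reduces to a dictionary lookup keyed by (driver id, FullReference). The sketch below assumes simplified, hypothetical message records (the real AttributeValuePublished and AttributeValueUpdate also carry quality and timestamp) and a made-up driver id and sample value.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var routes = new Dictionary<(string DriverId, string FullReference), string>
{
    [("focas-1", "10.201.31.5:8193/Axes/X/AbsolutePosition")] =
        "EQ-3686c0272279/FOCAS/Axes/X/AbsolutePosition",
};

var published = new AttributeValuePublishedSketch(
    "focas-1", "10.201.31.5:8193/Axes/X/AbsolutePosition", 125);
var update = Router.Forward(routes, published);
Console.WriteLine(update is null ? "dropped: no route" : $"{update.NodeId} <- {update.Value}");
// prints: EQ-3686c0272279/FOCAS/Axes/X/AbsolutePosition <- 125

var unknown = published with { FullReference = "10.201.31.5:8193/Unknown" };
Console.WriteLine(Router.Forward(routes, unknown) is null ? "dropped: no route" : "routed");
// prints: dropped: no route

// Hypothetical, trimmed versions of the messages named in the data flow.
public sealed record AttributeValuePublishedSketch(string DriverId, string FullReference, object Value);
public sealed record AttributeValueUpdateSketch(string NodeId, object Value);

public static class Router
{
    // The key is the same string the capturing builder recorded, so no translation is needed.
    public static AttributeValueUpdateSketch? Forward(
        IReadOnlyDictionary<(string DriverId, string FullReference), string> nodeIdByDriverRef,
        AttributeValuePublishedSketch msg)
        => nodeIdByDriverRef.TryGetValue((msg.DriverId, msg.FullReference), out var nodeId)
            ? new AttributeValueUpdateSketch(nodeId, msg.Value)
            : null;
}
```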

Lifecycle / re-injection robustness (the timing problem, solved)

  • First connect: driver connects → ~0-2 s later FixedTreeCache populates → bounded re-discovery catches it → inject.
  • Redeploy with a structural RebuildAddressSpace: the full teardown wipes injected nodes and PushDesiredSubscriptions rebuilds _nodeIdByDriverRef from authored tags only. Fix: after every PushDesiredSubscriptions, DriverHostActor re-applies its cached _discoveredByDriver (re-materialize + re-map + re-merge refs) — so FixedTree survives redeploys without re-querying the driver.
  • Process restart: _discoveredByDriver is lost, but RestoreApplied re-spawns drivers → each reconnects → post-connect re-discovery re-injects (same ~0-2 s delay). Consistent with the symptom-#1 restore behavior already deployed.
  • Idempotent throughout: EnsureFolder/EnsureVariable early-return if present; _nodeIdByDriverRef is set-based; SetDesiredSubscriptions is idempotent.
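
The redeploy case can be sketched as a cache that is replayed at the tail of every PushDesiredSubscriptions. DriverHostSketch is a hypothetical reduction of DriverHostActor to its two maps; the real handler would also re-materialize nodes and re-merge the subscription set, which is only noted in a comment here. The driver id and reference strings are sample values.

```csharp
using System;
using System.Collections.Generic;

var authored = new Dictionary<(string DriverId, string FullReference), string>
{
    [("focas-1", "parts-count-ref")] = "EQ-1/parts-count",
};

var host = new DriverHostSketch();
host.PushDesiredSubscriptions(authored);                       // initial deploy
host.OnDiscoveredNodesReady("focas-1", new Dictionary<string, string>
{
    ["host/Identity/Version"] = "EQ-1/FOCAS/Identity/Version",
});
Console.WriteLine(host.RouteCount);                            // prints: 2

host.PushDesiredSubscriptions(authored);                       // redeploy: rebuild, then re-inject
Console.WriteLine(host.HasRoute("focas-1", "host/Identity/Version"));   // prints: True

public sealed class DriverHostSketch
{
    private readonly Dictionary<(string DriverId, string FullReference), string> _nodeIdByDriverRef = new();
    private readonly Dictionary<string, IReadOnlyDictionary<string, string>> _discoveredByDriver = new();

    public int RouteCount => _nodeIdByDriverRef.Count;

    public bool HasRoute(string driverId, string fullRef)
        => _nodeIdByDriverRef.ContainsKey((driverId, fullRef));

    public void PushDesiredSubscriptions(
        IReadOnlyDictionary<(string DriverId, string FullReference), string> authoredRoutes)
    {
        // The rebuild knows about authored tags only, so injected routes are wiped here.
        _nodeIdByDriverRef.Clear();
        foreach (var (key, nodeId) in authoredRoutes) _nodeIdByDriverRef[key] = nodeId;

        // Re-inject tail: replay the cached discovery without re-querying the driver.
        foreach (var (driverId, mapped) in _discoveredByDriver) Apply(driverId, mapped);
    }

    public void OnDiscoveredNodesReady(string driverId, IReadOnlyDictionary<string, string> mapped)
    {
        _discoveredByDriver[driverId] = mapped;   // survives redeploys, lost on process restart
        Apply(driverId, mapped);
    }

    private void Apply(string driverId, IReadOnlyDictionary<string, string> mapped)
    {
        foreach (var (fullRef, nodeId) in mapped) _nodeIdByDriverRef[(driverId, fullRef)] = nodeId;
        // The real handler would also re-materialize the nodes and re-merge the desired refs.
    }
}
```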

Error handling

  • Discovery throws / driver not ready → bounded retry, then give up quietly (Info); authored tags unaffected.
  • No equipment bound to the driver instance, or ambiguous (multi-equipment) → Warning, skip injection.
  • A FixedTree ref that fails to read at poll time → flows the same recoverable BadCommunicationError push as any equipment tag (the symptom-#1 fix) — observable, not silent.

Testing

  • Unit:
    • CapturingAddressSpaceBuilder records the tree + refs from a fake ITagDiscovery (folders, nested variables, no-op alarm sink, ignored properties).
    • Injector mapping: discovered nodes → EQ-…/FOCAS/… NodeIds; dedup against authored tags; device-host-folder collapse.
    • DriverInstanceActor bounded post-connect re-discovery (set becomes non-empty on the Nth attempt; stops on cap / no-growth).
    • DriverHostActor DiscoveredNodesReady handling + re-inject-after-PushDesiredSubscriptions.
    • Read-only materialization (no write callback).
  • Integration (docker-dev): a fake ITagDiscovery driver exposing a delayed discovery set → assert nodes appear under the equipment and carry values; verify survival across a redeploy + a process restart.
  • Live (wonder, following the symptom-#1 pattern): deploy the current Host + this change, browse EQ-3686c0272279/FOCAS/Identity/SeriesNumber and …/Axes/X/AbsolutePosition, confirm Good values. The live deploy is not blocking for the build (macro/axes values may be 0 on the idle machine — assert status, not magnitude); confirm the live-deploy step with the user at execution time.

Scope / non-goals

  • In: read-only value nodes for any ITagDiscovery driver; 1:1 driver↔equipment; survives redeploy/restart; generic mechanism with FOCAS as the first consumer.
  • Out (documented follow-ups): discovered alarms injection; multi-device-per-driver-instance mapping; writable discovered nodes.

Touched code (anticipated)

  • src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs: DiscoveredNodesReady handler, _discoveredByDriver cache, re-inject after PushDesiredSubscriptions, desired-set merge.
  • src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs: post-connect bounded re-discovery + new message.
  • src/Server/ZB.MOM.WW.OtOpcUa.Runtime/OpcUa/OpcUaPublishActor.cs: MaterialiseDiscoveredNodes receive.
  • src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OtOpcUaNodeManager.cs: GeneralModelChangeEvent emit on runtime add (verify existing helper).
  • New: CapturingAddressSpaceBuilder + DiscoveredNode DTO (runtime), EquipmentNodeIds reuse for mapping.
  • Tests under tests/...Runtime.Tests / tests/...OpcUaServer.Tests and a fake ITagDiscovery test double.

Task tracking

Umbrella native task #14 (FixedTree feature). Implementation tasks to be generated by writing-plans from this design.