# OtOpcUa: dynamic injection of driver-discovered FixedTree nodes into the Equipment projection (design)

**Date:** 2026-06-26

**Status:** ✅ Implemented (2026-06-26): 11 tasks, offline-complete on branch `feat/focas-fixedtree-equipment-injection` (solution build 0 errors / 0 warnings; Runtime.Tests 312, OpcUaServer.Tests 304, FOCAS 247 + an end-to-end injection+value-flow test, all green). Live wonder validation pending.

**Follow-ups surfaced during the review chain (not blocking):**

- Config-unchanged driver→equipment **rebind** drops the cached plan but does not itself re-trigger discovery (`ReconcileDrivers` only restarts a child on a `DriverConfig` change), so the FixedTree subtree is absent under the new equipment until the driver's next reconnect/restart re-discovers it.
- **Multi-device-per-driver** equipment mapping is deferred (today strictly 1:1; the equipment is resolved from authored `EquipmentTags`, so a driver needs ≥1 authored tag for its FixedTree to graft, because `EquipmentNode` carries no `DriverInstanceId`).
- Per-(re)connect re-discovery runs for **every `ITagDiscovery` driver** (Galaxy/OpcUaClient/TwinCAT too), bounded by stop-on-stable + an attempt cap; narrowing/opt-in for heavy network drivers is a follow-up.
- The end-to-end test asserts the recording-sink contract, not the real `OtOpcUaNodeManager` `BadWaitingForInitialData`→Good seed-overwrite at the OPC node layer; that is covered by the live wonder deploy.

**Companion to:** [`2026-06-25-otopcua-equipment-dataplane-investigation.md`](2026-06-25-otopcua-equipment-dataplane-investigation.md) (symptom #1, live FOCAS values, is FIXED + deployed; this design addresses **symptom #2**).

**Base branch:** `fix/focas-poll-io-serialization` (this feature builds on the now-deployed driver-host bootstrap re-spawn + FOCAS I/O fixes; that branch is ahead of `master` and not yet merged).

---

## Problem

Deployed FOCAS equipment serves only its **authored** config tags (`parts-count`/`parts-required`). The driver's
**FixedTree** (Identity / Axes / Spindle / Program / Timers, the auto-discovered CNC structure) **never appears** under
the served Equipment/UNS address space.

**Root cause (confirmed in the investigation, H2):** the served Equipment tree is built **purely from Config-DB entities**
(`AddressSpaceComposer.Compose` → `AddressSpaceApplier` → node manager). The only code that emits FixedTree nodes is
`ITagDiscovery.DiscoverAsync` (each driver implements it), reachable **only** through `GenericDriverNodeManager.BuildAddressSpaceAsync`,
which has **no runtime caller** (its referenced host method `OpcUaApplicationHost.PopulateAddressSpaces` no longer exists).
So `DiscoverAsync`/`ITagDiscovery` is **dead for serving**: every served node is config-driven, and nothing surfaces a
driver's discovered hierarchy.

Surfacing FixedTree under the Equipment node is therefore a **new dynamic-node-injection capability**, and it must solve a
**timing problem**: composition runs at deploy/apply time (before the driver connects), but the FixedTree shape
(axis count, spindle presence, which sections exist) is **capability-discovered ~0–2 s after the driver connects**
(`FocasDriver` populates `state.FixedTreeCache` in its bootstrap loop).

## Goal

After a driver connects, dynamically graft its discovered FixedTree nodes into the served Equipment projection under a
driver-named subfolder, e.g.:

```
ns=2;s=EQ-3686c0272279   (equipment "z-34184")
├── parts-count          (authored config tag, unchanged)
├── parts-required       (authored config tag, unchanged)
└── FOCAS                (NEW: driver-named discovered subfolder)
    ├── Identity/{SeriesNumber, Version, MaxAxes, CncType, MtType, AxisCount}
    ├── Axes/{<axis>/{AbsolutePosition, MachinePosition, RelativePosition, DistanceToGo}, FeedRate/Actual, SpindleSpeed/Actual}
    ├── Spindle/{<name>/{Load, MaxRpm}}                                            (capability-gated)
    ├── Program/{Name, ONumber, Number, MainNumber, Sequence, BlockCount}          (capability-gated)
    ├── OperationMode/{Mode, ModeText}                                             (capability-gated)
    └── Timers/{PowerOnSeconds, OperatingSeconds, CuttingSeconds, CycleSeconds}    (capability-gated)
```

The injected nodes are read-only value nodes carrying live values (e.g. `EQ-…/FOCAS/Axes/X/AbsolutePosition` reads Good).

## Decisions (locked with the user 2026-06-26)

| Decision | Choice |
|---|---|
| Driver scope | **Generic**: keyed off the shared `ITagDiscovery` interface (FOCAS, Galaxy, Modbus all implement it). FOCAS is the first/test consumer; others get it for free. **Zero per-driver code changes.** |
| Tree placement | **Under a driver-named subfolder**: `EQ-…/FOCAS/…` (collision-safe vs. authored tags; self-describing). |
| Device-host folder | **Collapse** the single device-host level → `EQ-…/FOCAS/Identity/…` (not `EQ-…/FOCAS/10.201.31.5:8193/Identity/…`), valid because today's deployment is strictly 1:1 driver↔equipment↔device. |
| Model-change notification | **Emit `GeneralModelChangeEvent`** after a runtime add so already-connected OPC UA clients can refresh their browse. |
| Multi-device-per-driver | **Deferred** (documented follow-up); today is 1:1. |
| Discovered alarms | **Out of scope**: this feature surfaces value nodes only; alarms continue to come via the config path. |
| Writable discovered nodes | **Out of scope**: FixedTree is read-only CNC state. |

## Approach (chosen): runtime post-connect injection via the actor pipeline

Treat discovered FixedTree nodes as **"synthetic equipment tags" injected at runtime**, reusing the existing
materialize → subscribe → poll → push pipeline end-to-end. Only a few new pieces are needed (see Components), with
**no driver changes**: each driver's existing `DiscoverAsync` is reused verbatim via a capturing builder.

**Rejected alternatives:**

- *Composition-time pre-projection*: can't author the right nodes before the driver discovers capabilities; defeats the purpose.
- *Resurrect `GenericDriverNodeManager` as a 2nd namespace (ns=3)*: puts FixedTree in a separate tree (not **under** the equipment node), and that namespace's value-routing is also dead; more dead code to revive, wrong location.
- *Cheap baseline: author a Config-DB Tag row per FixedTree signal*: no new code, but static (can't adapt to per-CNC capabilities) and requires manual authoring per signal × per machine. User chose to build the dynamic feature instead.

## Components

### 1. `CapturingAddressSpaceBuilder` (new, runtime)

An `IAddressSpaceBuilder` implementation that **records** the streamed tree instead of creating OPC UA nodes. After a
driver's `DiscoverAsync(builder)` returns, it exposes a flat `IReadOnlyList<DiscoveredNode>`:

```
DiscoveredNode {
  IReadOnlyList<string> FolderPathSegments,   // e.g. ["FOCAS", "<deviceHost>", "Identity"]
  string BrowseName, string DisplayName,
  string FullReference,                       // == DriverAttributeInfo.FullName (the driver ref + routing key)
  DriverDataType DataType, bool IsArray, uint? ArrayDim,
  bool Writable, bool IsHistorized
}
```

- `Folder(browse, display)` returns a child capturing scope; `Variable(...)` records a node and returns an
  `IVariableHandle` whose `FullReference` is `DriverAttributeInfo.FullName`.
- `MarkAsAlarmCondition(...)` returns a **no-op** sink; `AddProperty(...)` is **ignored**, because only value nodes are captured.

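The recording behaviour can be illustrated with a minimal sketch. The real `IAddressSpaceBuilder` signature is not shown in this document, so `ISketchBuilder`, `CapturedNode`, and `CapturingSketchBuilder` below are hypothetical stand-ins that only demonstrate the "child scope shares one sink" idea; alarms, properties, and variable handles are left out.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the real builder interface (assumption).
public interface ISketchBuilder
{
    ISketchBuilder Folder(string browseName, string displayName);
    void Variable(string browseName, string displayName, string fullReference);
}

// Subset of the DiscoveredNode fields, enough to show path capture.
public sealed record CapturedNode(
    IReadOnlyList<string> FolderPathSegments,
    string BrowseName,
    string FullReference);

public sealed class CapturingSketchBuilder : ISketchBuilder
{
    private readonly List<CapturedNode> _sink;     // shared by every child scope
    private readonly IReadOnlyList<string> _path;  // folder path of this scope

    public CapturingSketchBuilder() : this(new List<CapturedNode>(), new List<string>()) { }

    private CapturingSketchBuilder(List<CapturedNode> sink, IReadOnlyList<string> path)
    {
        _sink = sink;
        _path = path;
    }

    public IReadOnlyList<CapturedNode> Nodes => _sink;

    // Entering a folder returns a child scope: same sink, path extended by one segment.
    public ISketchBuilder Folder(string browseName, string displayName) =>
        new CapturingSketchBuilder(_sink, _path.Append(browseName).ToList());

    // A variable is only recorded; no OPC UA node is created here.
    public void Variable(string browseName, string displayName, string fullReference) =>
        _sink.Add(new CapturedNode(_path, browseName, fullReference));
}
```

Because every scope writes into the same list, the root builder sees a flat result regardless of how deeply the driver nests its folders.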
### 2. `DriverInstanceActor`: post-connect discovery (bounded retry)

On entering `Connected`, kick a bounded re-discovery:

1. Run `DiscoverAsync(capturingBuilder)` against the live `IDriver` it owns.
2. `Tell` the parent `DriverHostActor` a new message `DiscoveredNodesReady(DriverInstanceId, IReadOnlyList<DiscoveredNode>)`.
3. Because FOCAS suppresses FixedTree until `FixedTreeCache` populates (~0–2 s), **retry** every ~2 s up to a cap
   (~30 s) **or until the captured set stops growing**, then stop. `DiscoverAsync` reads the in-memory cache (no extra
   wire I/O), so retries are cheap. Discovery re-runs on every reconnect (downstream is idempotent).

*(Drivers whose discovery is ready immediately, e.g. Galaxy/Modbus, satisfy this on the first attempt.)*

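The stop-on-stable rule with an attempt cap can be sketched as a plain async loop. This is an illustration only: `BoundedDiscovery`, `discover`, `delay`, and `maxAttempts` are hypothetical names, the actor messaging is omitted, and the sketch returns the final set rather than `Tell`-ing each result to the parent.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedDiscovery
{
    // Re-runs `discover` until a non-empty result stops growing or the cap is reached.
    // Returns the largest set seen (empty if discovery never produced anything).
    public static async Task<IReadOnlyList<string>> RunAsync(
        Func<CancellationToken, Task<IReadOnlyList<string>>> discover,
        TimeSpan delay,
        int maxAttempts,
        CancellationToken ct = default)
    {
        IReadOnlyList<string> best = Array.Empty<string>();
        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            IReadOnlyList<string> current;
            try { current = await discover(ct); }
            catch (OperationCanceledException) { throw; }
            catch (Exception) { current = Array.Empty<string>(); } // driver not ready: treat as empty

            var grew = current.Count > best.Count;
            if (grew) best = current;

            // Stable: something was found and this attempt added nothing new.
            if (!grew && best.Count > 0) break;

            if (attempt < maxAttempts) await Task.Delay(delay, ct);
        }
        return best;
    }
}
```

With a 2 s delay and a cap of 15 attempts this approximates the ~30 s bound described above; a driver whose discovery is ready immediately costs one extra attempt to confirm stability.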
### 3. `DriverHostActor`: injection handler

On `DiscoveredNodesReady(id, nodes)`:

1. Find the equipment bound to the driver instance: `composition.EquipmentNodes` where `DriverInstanceId == id`.
   (As implemented, `EquipmentNode` carries no `DriverInstanceId`, so the equipment is resolved from the driver's
   authored `EquipmentTags`; see the follow-ups at the top of this document.)
   - 0 matches → log Info, skip. >1 match → log Warning, skip (multi-device follow-up).
2. **Dedup** discovered `FullReference`s against authored `EquipmentTags` for that driver (never double-create
   `parts-count`, etc.).
3. Map each remaining node to a NodeId `EQ-…/FOCAS/<collapsed-path>/<name>` via `EquipmentNodeIds.Variable(...)`
   (collapse the single device-host folder level).
4. **Cache** the mapped result in `_discoveredByDriver[id]` (survives redeploys; see Lifecycle).
5. Update `_nodeIdByDriverRef[(id, FullReference)]` for each.
6. `Tell` `OpcUaPublishActor` a new `MaterialiseDiscoveredNodes(equipmentId, "FOCAS", nodes)`.
7. Merge the new refs into the driver's desired set and re-`Tell`
   `DriverInstanceActor.SetDesiredSubscriptions(union, interval, alarmRefs)`. The existing **live path** immediately
   re-subscribes (the actor self-`Tell`s `Subscribe` when already `Connected`).

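Steps 2 and 3 (dedup, device-host collapse, NodeId mapping) amount to a small pure function. The sketch below assumes the captured path has the shape `[driverFolder, deviceHost, ...rest]` and builds the NodeId as a `/`-joined string; the real code goes through `EquipmentNodeIds.Variable(...)`, whose signature is not shown here, so `InjectionPlanSketch` and `Discovered` are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed record Discovered(
    IReadOnlyList<string> FolderPathSegments,
    string BrowseName,
    string FullReference);

public static class InjectionPlanSketch
{
    // Returns FullReference -> NodeId string for every discovered node not already authored.
    public static IReadOnlyDictionary<string, string> Plan(
        string equipmentNodeId,
        IEnumerable<Discovered> nodes,
        ISet<string> authoredRefs)
    {
        var plan = new Dictionary<string, string>();
        foreach (var n in nodes)
        {
            if (authoredRefs.Contains(n.FullReference)) continue; // dedup against authored tags

            var segs = n.FolderPathSegments;
            // Keep segment 0 (driver folder), drop segment 1 (device host), keep the rest.
            var collapsed = segs.Count >= 2 ? segs.Take(1).Concat(segs.Skip(2)) : segs;

            var nodeId = string.Join("/",
                new[] { equipmentNodeId }.Concat(collapsed).Append(n.BrowseName));

            plan[n.FullReference] = nodeId; // keyed by ref: re-running overwrites, never duplicates
        }
        return plan;
    }
}
```

Dropping the device-host segment unconditionally is only safe under the 1:1 driver↔equipment↔device assumption stated in the Decisions table; with several devices per driver the collapsed paths would collide.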
### 4. `OpcUaPublishActor` / node manager: incremental materialize

New message `MaterialiseDiscoveredNodes(equipmentId, driverSubfolder, nodes)`:

- Idempotent `EnsureFolder` / `EnsureVariable` calls (the node manager already supports incremental add under `Lock`
  via `AddChild` + `AddPredefinedNode`; `EnsureVariable` early-returns if the node exists).
- Variables materialize **read-only** (no `OnWriteValue`).
- After adding, emit a `GeneralModelChangeEvent` so connected clients can refresh their browse (the full-rebuild path
  does not emit one; runtime adds should).

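The idempotent-add-then-notify pattern can be sketched without the OPC UA SDK. `NodeStoreSketch` and its `ModelChanged` event are hypothetical stand-ins for the node manager and `GeneralModelChangeEvent`; raising the event only when a batch actually added nodes is an assumption of this sketch, not something the design above prescribes.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public sealed class NodeStoreSketch
{
    private readonly object _lock = new();
    private readonly HashSet<string> _nodeIds = new();

    // Stand-in for GeneralModelChangeEvent: carries the NodeIds added by one batch.
    public event Action<IReadOnlyList<string>>? ModelChanged;

    public void Materialise(IEnumerable<string> nodeIds)
    {
        var added = new List<string>();
        lock (_lock)
        {
            foreach (var id in nodeIds)
            {
                // HashSet.Add returns false when the id is already present (the "ensure" early return).
                if (_nodeIds.Add(id)) added.Add(id);
            }
        }

        // Notify outside the lock, and only when the address space really changed.
        if (added.Count > 0) ModelChanged?.Invoke(added);
    }
}
```

Replaying the same batch, as happens on reconnect or re-injection, is then a no-op for both the address space and connected clients.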
## Data flow (value path, fully reused)

```
SetDesiredSubscriptions(union) → DriverInstanceActor subscribes the FixedTree refs
  → PollGroupEngine polls each ref via FocasDriver.ReadAsync
  → TryReadFixedTree (cache lookup, NO extra wire I/O)
  → onChange → AttributeValuePublished(FullReference)
  → DriverHostActor.ForwardToMux
  → _nodeIdByDriverRef[(driverId, ref)] → AttributeValueUpdate(nodeId, value, quality, ts)
  → OtOpcUaNodeManager writes the node value
```

The routing key is **consistent by construction**: the capturing builder records `handle.FullReference`, which is exactly
the ref the driver publishes (`AttributeValuePublished.FullReference`) and the ref `TryReadFixedTree` matches
(`reference.StartsWith(state.Options.HostAddress + "/")`).

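A small sketch of the lookup shows why the shared key matters: the published ref must be byte-for-byte the string that was recorded at capture time, or the value is dropped. `RefRouterSketch` is a hypothetical reduction of `_nodeIdByDriverRef` to a tuple-keyed dictionary, and the example refs are illustrative.

```csharp
#nullable enable
using System.Collections.Generic;

public sealed class RefRouterSketch
{
    // (driverInstanceId, FullReference) -> NodeId
    private readonly Dictionary<(string DriverId, string Ref), string> _nodeIdByDriverRef = new();

    // Called at injection time with the ref recorded by the capturing builder.
    public void Map(string driverId, string fullReference, string nodeId) =>
        _nodeIdByDriverRef[(driverId, fullReference)] = nodeId;

    // Called on the value path with the ref the driver published.
    public bool TryRoute(string driverId, string publishedRef, out string? nodeId) =>
        _nodeIdByDriverRef.TryGetValue((driverId, publishedRef), out nodeId);
}
```

Any transformation applied on only one side (trimming the host prefix, changing case) would make `TryRoute` miss, which is why the collapse of the device-host folder is applied to the NodeId and never to the ref itself.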
## Lifecycle / re-injection robustness (the timing problem, solved)

- **First connect:** driver connects → ~0–2 s later `FixedTreeCache` populates → bounded re-discovery catches it → inject.
- **Redeploy with a structural `RebuildAddressSpace`:** the full teardown wipes injected nodes and `PushDesiredSubscriptions`
  rebuilds `_nodeIdByDriverRef` from authored tags only. **Fix:** after every `PushDesiredSubscriptions`, `DriverHostActor`
  **re-applies its cached `_discoveredByDriver`** (re-materialize + re-map + re-merge refs), so FixedTree survives
  redeploys without re-querying the driver.
- **Process restart:** `_discoveredByDriver` is lost, but `RestoreApplied` re-spawns drivers → each reconnects →
  post-connect re-discovery re-injects (same ~0–2 s delay). Consistent with the symptom-#1 restore behavior already
  deployed.
- **Idempotent throughout:** `EnsureFolder`/`EnsureVariable` early-return if present; `_nodeIdByDriverRef` is set-based;
  `SetDesiredSubscriptions` is idempotent.

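The redeploy fix reduces to "rebuild from authored, then overlay the cache". A minimal sketch, assuming both inputs are `FullReference → NodeId` maps for one driver; `ReinjectSketch` and its parameter names are hypothetical, and the re-materialize and re-subscribe steps are omitted.

```csharp
using System.Collections.Generic;

public static class ReinjectSketch
{
    // Rebuilds one driver's routing map after PushDesiredSubscriptions:
    // authored tags first, then the cached discovered plan on top.
    public static Dictionary<string, string> RebuildRouting(
        IReadOnlyDictionary<string, string> authored,
        IReadOnlyDictionary<string, string> cachedDiscovered)
    {
        var map = new Dictionary<string, string>();

        foreach (var kv in authored)
            map[kv.Key] = kv.Value;

        foreach (var kv in cachedDiscovered)
        {
            // Authored tags win on a clash, mirroring the dedup rule at injection time.
            if (!map.ContainsKey(kv.Key)) map[kv.Key] = kv.Value;
        }

        return map;
    }
}
```

Because the result depends only on its two inputs, calling it after every `PushDesiredSubscriptions` is safe; the desired subscription set is then the key set of the returned map.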
## Error handling

- Discovery throws / driver not ready → bounded retry, then give up quietly (Info); authored tags unaffected.
- No equipment bound to the driver instance (Info), or an ambiguous multi-equipment binding (Warning) → skip injection.
- A FixedTree ref that fails to read at poll time → flows the same recoverable `BadCommunicationError` push as any
  equipment tag (the symptom-#1 fix), so the failure is observable, not silent.

## Testing

- **Unit:**
  - `CapturingAddressSpaceBuilder` records the tree + refs from a fake `ITagDiscovery` (folders, nested variables,
    no-op alarm sink, ignored properties).
  - Injector mapping: discovered nodes → `EQ-…/FOCAS/…` NodeIds; dedup against authored tags; device-host-folder collapse.
  - `DriverInstanceActor` bounded post-connect re-discovery (set becomes non-empty on the Nth attempt; stops on cap / no-growth).
  - `DriverHostActor` `DiscoveredNodesReady` handling + re-inject-after-`PushDesiredSubscriptions`.
  - Read-only materialization (no write callback).
- **Integration (docker-dev):** a fake `ITagDiscovery` driver exposing a *delayed* discovery set → assert nodes appear
  under the equipment and carry values; verify survival across a redeploy + a process restart.
- **Live (wonder, following the symptom-#1 pattern):** deploy the current Host + this change, browse
  `EQ-3686c0272279/FOCAS/Identity/SeriesNumber` and `…/Axes/X/AbsolutePosition`, confirm Good values. The live deploy is
  **not** blocking for the build (macro/axes values may be 0 on the idle machine, so assert status, not magnitude); confirm
  the live-deploy step with the user at execution time.

## Scope / non-goals

- **In:** read-only value nodes for any `ITagDiscovery` driver; 1:1 driver↔equipment; survives redeploy/restart; generic
  mechanism with FOCAS as the first consumer.
- **Out (documented follow-ups):** discovered **alarms** injection; multi-device-per-driver-instance mapping; writable
  discovered nodes.

## Touched code (anticipated)

- `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs`: `DiscoveredNodesReady` handler, `_discoveredByDriver`
  cache, re-inject after `PushDesiredSubscriptions`, desired-set merge.
- `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs`: post-connect bounded re-discovery + new message.
- `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/OpcUa/OpcUaPublishActor.cs`: `MaterialiseDiscoveredNodes` receive.
- `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OtOpcUaNodeManager.cs`: `GeneralModelChangeEvent` emit on runtime add (verify
  existing helper).
- New: `CapturingAddressSpaceBuilder` + `DiscoveredNode` DTO (runtime), `EquipmentNodeIds` reuse for mapping.
- Tests under `tests/...Runtime.Tests` / `tests/...OpcUaServer.Tests` and a fake `ITagDiscovery` test double.

## Task tracking

Umbrella native task **#14** (FixedTree feature). Implementation tasks to be generated by writing-plans from this design.