docs(otopcua): record FixedTree-under-Equipment injection feature (design, plan, status)

Joseph Doherty
2026-06-26 09:11:40 -04:00
parent 25ccd25b6b
commit 37cac5dee5
5 changed files with 1002 additions and 3 deletions
@@ -73,8 +73,10 @@ Read live `OtOpcUaConfig` on `wonder-sql-vd03` (query run on-box so the SQL pass
**Refuted by this read:** prime-suspect `DriverInstanceId` attribution mismatch (matches exactly) and H5 blank-`deviceHostAddress` (present). The deployed config is **clean**. ⇒ symptom #1 is a pure value-flow-plumbing break. New live leads: **`PollGroupId=NULL`** on both tags (is a poll group required to subscribe/poll?) and the **resolver-registration** path (equipment-tag refs are "resolver-produced, not seeded at `InitializeAsync`" per `FocasDriver.cs:247` — does poll-time `TryResolve` of the JSON-blob ref ever succeed?). A second offline subagent trace of DriverHostActor↔DriverInstanceActor↔PollGroupEngine↔resolver is running to pin the exact broken link.
### ⚠️ FixedTree feature (symptom #2 — user chose "build the feature") — ARCHITECTURE REALITY
Mapped the composition pipeline. Two address-space paths exist: (1) **Equipment/UNS projection** `AddressSpaceComposer.Compose` (config entities only) → `AddressSpaceApplier.MaterialiseEquipmentTags` → the served `ns=2` tree where `EQ-…` lives; (2) **raw-driver namespace** `GenericDriverNodeManager.BuildAddressSpaceAsync` → `driver.DiscoverAsync(IAddressSpaceBuilder)`. **Path 2 is DEAD: `BuildAddressSpaceAsync` has no runtime caller and `OpcUaApplicationHost.PopulateAddressSpaces` (its referenced caller) no longer exists.** Even `GalaxyDriver.DiscoverAsync` (`:588`) is reachable only via that dead path: Galaxy surfaces its hierarchy by being **authored as config equipment/tags**, not via discovery. ⇒ In the current Equipment-kind model **every served node is config-driven; `ITagDiscovery`/`DiscoverAsync` is legacy/dead for serving.** So "build the FixedTree feature" is NOT re-wiring an existing path; it's a **new dynamic-node-injection capability** into the Equipment projection, and it must solve a **timing problem**: composition runs at deploy/apply (before the driver connects), but FixedTree data only exists after the driver's async `FixedTreeCache` bootstrap. The far cheaper alternative that yields the same visible result is to **author FixedTree signals as config Tag rows** (each bound to a FOCAS fixed-tree reference), the same mechanism every other equipment tag uses. **Recommend re-confirming scope with the user given this cost delta before building.**
### FixedTree feature (symptom #2): BUILT 2026-06-26 (architecture reality below)
Mapped the composition pipeline. Two address-space paths exist: (1) **Equipment/UNS projection** `AddressSpaceComposer.Compose` (config entities only) → `AddressSpaceApplier.MaterialiseEquipmentTags` → the served `ns=2` tree where `EQ-…` lives; (2) **raw-driver namespace** `GenericDriverNodeManager.BuildAddressSpaceAsync` → `driver.DiscoverAsync(IAddressSpaceBuilder)`. **Path 2 is DEAD: `BuildAddressSpaceAsync` has no runtime caller and `OpcUaApplicationHost.PopulateAddressSpaces` (its referenced caller) no longer exists.** Even `GalaxyDriver.DiscoverAsync` (`:588`) is reachable only via that dead path: Galaxy surfaces its hierarchy by being **authored as config equipment/tags**, not via discovery. ⇒ In the current Equipment-kind model **every served node is config-driven; `ITagDiscovery`/`DiscoverAsync` is legacy/dead for serving.** So "build the FixedTree feature" is NOT re-wiring an existing path; it's a **new dynamic-node-injection capability** into the Equipment projection, and it must solve a **timing problem**: composition runs at deploy/apply (before the driver connects), but FixedTree data only exists after the driver's async `FixedTreeCache` bootstrap. The far cheaper alternative that yields the same visible result is to **author FixedTree signals as config Tag rows** (each bound to a FOCAS fixed-tree reference), the same mechanism every other equipment tag uses. (The user chose to **build the dynamic feature** over the config-rows alternative.)
**✅ BUILT (2026-06-26).** Implemented as a generic **post-connect `ITagDiscovery` injection pipeline**: when a driver reaches `Connected`, `DriverInstanceActor` runs bounded re-discovery into a capturing `IAddressSpaceBuilder` and ships `DiscoveredNodesReady` to `DriverHostActor`, which maps the nodes under the equipment (`EQ-…/FOCAS/…`, read-only), extends the `_nodeIdByDriverRef` routing map, and tells `OpcUaPublishActor` to incrementally materialise them — reusing the existing materialize→subscribe→poll→push pipeline (no full rebuild). Survives redeploys (re-applied at the tail of `PushDesiredSubscriptions`) and restarts (re-discovered on reconnect). Design: [`2026-06-26-otopcua-fixedtree-equipment-injection-design.md`](2026-06-26-otopcua-fixedtree-equipment-injection-design.md); implementation plan (11 bite-sized tasks, all green): [`2026-06-26-otopcua-fixedtree-equipment-injection.md`](2026-06-26-otopcua-fixedtree-equipment-injection.md). **Offline-complete** on branch `feat/focas-fixedtree-equipment-injection` (solution build 0 errors / 0 warnings; Runtime.Tests 312, OpcUaServer.Tests 304, FOCAS 247 + an end-to-end injection+value-flow test, all green). The review chain caught + fixed three real defects (a `DriverDataType→OPC-UA-type` string mismatch, a `Server.ReportEvent`-under-lock deadlock, and a `ConfigureAwait(false)` off-actor-context crash for async drivers). **Live wonder validation pending** (deploy the current host + browse `ns=2;s=EQ-3686c0272279/FOCAS/Identity/SeriesNumber`, `…/FOCAS/Axes/X/AbsolutePosition`).
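The post-connect flow described above can be pictured with a small, language-neutral sketch. It is written in Python purely for illustration and is not the project's C# code: `CapturingBuilder`, `FakeDriver`, `Host`, and `on_connected` are hypothetical stand-ins for the capturing `IAddressSpaceBuilder`, a driver, `DriverHostActor`, and the `Connected` handler. The bound on re-discovery and the read-only flag are left out, and the node-id format is assumed from the browse paths quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class CapturingBuilder:
    """Hypothetical capturing builder: records discovered paths, creates no nodes."""
    paths: list = field(default_factory=list)

    def add_variable(self, path: str) -> None:
        self.paths.append(path)


class FakeDriver:
    """Hypothetical driver whose fixed tree is only known after it connects."""

    def discover(self, builder: CapturingBuilder) -> None:
        builder.add_variable("Identity/SeriesNumber")
        builder.add_variable("Axes/X/AbsolutePosition")


@dataclass
class Host:
    """Hypothetical host: maps discovered nodes under one equipment node."""
    equipment_id: str
    node_id_by_driver_ref: dict = field(default_factory=dict)
    materialised: list = field(default_factory=list)

    def on_discovered_nodes_ready(self, driver_name: str, paths: list) -> None:
        for path in paths:
            if path in self.node_id_by_driver_ref:
                continue  # already injected: a reconnect must not duplicate nodes
            node_id = f"ns=2;s={self.equipment_id}/{driver_name}/{path}"
            self.node_id_by_driver_ref[path] = node_id  # extend the routing map
            self.materialised.append(node_id)           # incremental, no full rebuild


def on_connected(driver: FakeDriver, host: Host, driver_name: str) -> None:
    """Runs after the driver reports Connected, i.e. after composition has finished."""
    builder = CapturingBuilder()
    driver.discover(builder)
    host.on_discovered_nodes_ready(driver_name, builder.paths)
```

Calling `on_connected` a second time (for example after a reconnect) leaves the routing map unchanged in this sketch, which mirrors the requirement that injection survive restarts without duplicating nodes.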
### 🎯 ROOT CAUSE — symptom #1 (CONFIRMED, 2026-06-25, 2nd subagent trace + code verify)
**The FOCAS poll read hangs forever because (1) all wire I/O for a device shares one socket with NO serialization, and (2) the steady-state read has NO timeout.**
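Both gaps named in this root cause have a conventional remedy: one lock per device so that only one request uses the socket at a time, and a deadline on every read. The sketch below illustrates that pattern in Python with `asyncio`; it is not the fix applied in the driver, and `SerializedDeviceLink`, `read_timeout_s`, and `do_wire_read` are hypothetical names.

```python
import asyncio


class SerializedDeviceLink:
    """Hypothetical wrapper around one device socket.

    Construct it inside a running event loop so the lock binds to that loop.
    """

    def __init__(self, read_timeout_s: float) -> None:
        self._lock = asyncio.Lock()          # one in-flight request per device
        self._read_timeout_s = read_timeout_s

    async def read(self, do_wire_read):
        """Run one wire read exclusively, failing instead of hanging forever."""
        async with self._lock:
            # wait_for cancels the read and raises asyncio.TimeoutError at the deadline;
            # the lock is released on the way out, so later reads are not blocked.
            return await asyncio.wait_for(do_wire_read(), timeout=self._read_timeout_s)
```

A timed-out read can leave a partial response on the socket, so a real driver would probably also need to reset or reconnect the link before the next request; that step is outside this sketch.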
@@ -149,7 +151,7 @@ After the self-contained overlay (current Host) + two light single-DLL FOCAS swa
4. **`FlexibleStringConverter`** on the FOCAS config `Series` — the AdminUI persists the enum as a number (`"series":6`); the factory now tolerates number-or-string instead of throwing → stub.
5. **Scheme-less host tolerance** in `FocasHostAddress.TryParse` — the AdminUI persists `hostAddress` as a bare `ip:port`; `TryParse` now accepts it (canonical `focas://` unchanged) instead of failing init.
- FOCAS test suite **247 green**; each fix carries a regression test.
- **Follow-up (product quality):** the AdminUI authors FOCAS configs (`series` as number, `hostAddress` without `focas://`) that the driver only now tolerates — the AdminUI↔driver config-format mismatch is worth reconciling at the source. Also: the shared `AddZbSerilog` not setting static `Serilog.Log.Logger` is a latent gap across all 3 apps. And the FixedTree-under-Equipment feature (task #14) is still pending.
- **Follow-up (product quality):** the AdminUI authors FOCAS configs (`series` as number, `hostAddress` without `focas://`) that the driver only now tolerates — the AdminUI↔driver config-format mismatch is worth reconciling at the source. Also: the shared `AddZbSerilog` not setting static `Serilog.Log.Logger` is a latent gap across all 3 apps. The FixedTree-under-Equipment feature (task #14) is now **BUILT** (offline-complete; see the 2026-06-26 design + implementation-plan docs above) — live wonder validation pending.
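The two config tolerances from items 4 and 5 can be illustrated with a short Python sketch. It is not the C# `FlexibleStringConverter` or `FocasHostAddress.TryParse`; `flexible_string` and `try_parse_host` are hypothetical names, and requiring an explicit port is an assumption of this sketch rather than a documented rule of the driver.

```python
from urllib.parse import urlsplit


def flexible_string(value) -> str:
    """Accept a JSON number or string (e.g. "series": 6 or "series": "6") as text."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError(f"expected a number or string, got {type(value).__name__}")
    return str(value)


def try_parse_host(text: str):
    """Return (host, port) for `focas://host:port` or bare `host:port`, else None."""
    candidate = text if "://" in text else f"focas://{text}"  # tolerate a missing scheme
    try:
        parts = urlsplit(candidate)
        port = parts.port        # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return None
    if parts.scheme != "focas" or not parts.hostname or port is None:
        return None              # assumption: an explicit port is required
    return parts.hostname, port
```

Normalising the bare form to the canonical scheme before parsing keeps a single parsing path, so the canonical `focas://` form behaves exactly as it did before the tolerance was added.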
## Phase 2 — Get OtOpcUa runtime logs on wonder
Make the Host emit driver-level logs so the data plane is observable. Options (least invasive first): point the service at a Serilog file sink via config/env, or temporarily run with `DOTNET_ENVIRONMENT=Development` (file sink + dev errors — cf. MxGateway note), or add a console capture. Preserve `appsettings*`/`data\`; restore the env after. Then read: did `InitializeAsync` start the FixedTree loop, does the bootstrap throw (and on which call), is `ReadAsync` invoked for the equipment tags, what does it return.