FixedTree → Equipment injection — RESUME / work-left handoff
Date: 2026-06-26
Purpose: survive a context compaction; let a fresh session continue without re-deriving state.
TL;DR
The FixedTree-under-Equipment dynamic-injection feature is BUILT, offline-complete, AND
✅ LIVE-VALIDATED on wonder (2026-06-26) — 11 tasks, all reviewed, full offline suite green, final
integration review = ready to merge, and the real OPC injection confirmed on wonder-app-vd03 (57 nodes
grafted under EQ-3686c0272279, all reading Good live values). It lives on a local, unpushed branch.
The only substantive thing left is the user's decision on push/PR/merge (§1). A few documented non-blocking
follow-ups remain (§3).
Git state (exact)
- Branch: feat/focas-fixedtree-equipment-injection (in the main working dir /Users/dohertj2/Desktop/OtOpcUa, NOT a worktree).
- Base: branched off fix/focas-poll-io-serialization (the symptom-#1 data-plane fix; itself ahead of master, pushed to gitea with its own open PR, NOT merged). So this feature stacks on an unmerged branch.
- Commits: 14, range da55c69..37cac5de (10 task commits + 4 review-fix/docs commits). All local; nothing pushed.
- User decision (2026-06-26): finishing-a-development-branch → "Keep as-is." Do NOT push/merge/discard without an explicit new go-ahead. Standing rule: commit/push only when asked.
- Untouched pre-existing working-tree edits (leave alone; never stage): CLAUDE.md, docker-dev/docker-compose.yml, pending.md, stillpending.md, docs/plans/2026-06-19-followups-batch.md.tasks.json.
- This RESUME doc itself is currently uncommitted (a working artifact).
What the feature does
Generic post-connect ITagDiscovery injection (NOT FOCAS-special-cased). On driver Connect:
DriverInstanceActor runs bounded re-discovery (Timers single-tick, generation-guarded, stop-on-stable +
attempt cap, re-kicks on reconnect) into a capturing IAddressSpaceBuilder → ships DiscoveredNodesReady
→ DriverHostActor resolves the equipment via authored EquipmentTags, maps the nodes under
EQ-…/FOCAS/… (read-only; single device-host folder collapsed) via DiscoveredNodeMapper, extends
_nodeIdByDriverRef, caches the plan, Tells OpcUaPublishActor.MaterialiseDiscoveredNodes →
AddressSpaceApplier → sink EnsureFolder/EnsureVariable + RaiseNodesAddedModelChange (NodeAdded), and
re-sends SetDesiredSubscriptions(authored ∪ FixedTree refs) so values flow through the existing
poll→push path. Survives redeploys (re-applied at the tail of PushDesiredSubscriptions from the cache)
and restarts (re-discovered on reconnect).
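As a rough illustration of the mapping and subscription steps described above, here is a minimal Python sketch. It is not the project's C# DiscoveredNodeMapper; the function names, the tuple-of-segments path representation, and the rule used to decide when a device-host folder is "single" are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

Path = Tuple[str, ...]


def collapse_single_host_folder(paths: List[Path]) -> List[Path]:
    """Drop the leading folder when every discovered path shares the same one.

    Assumption: the driver reports paths as (device_host, *rest). With a
    single device host that extra level adds nothing, so it is collapsed.
    """
    if not paths or any(len(p) < 2 for p in paths):
        return list(paths)
    first_segments = {p[0] for p in paths}
    if len(first_segments) == 1:
        return [p[1:] for p in paths]
    return list(paths)


def map_under_equipment(equipment_id: str, driver_folder: str,
                        paths: Iterable[Path]) -> List[str]:
    """Build node ids of the form <equipment>/<driver folder>/<path...>."""
    collapsed = collapse_single_host_folder(list(paths))
    return ["/".join((equipment_id, driver_folder) + p) for p in collapsed]


def desired_subscriptions(authored: Set[str], discovered: Set[str]) -> Set[str]:
    """Authored refs plus discovered FixedTree refs (authored ∪ FixedTree)."""
    return authored | discovered
```

With one device host, a discovered path such as ("host-a", "Identity", "SeriesNumber") would land at EQ-3686c0272279/FOCAS/Identity/SeriesNumber; with two or more hosts the host level is kept so the subtrees stay distinct.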
Verification (offline) — all green as of 2026-06-26
- dotnet build ZB.MOM.WW.OtOpcUa.slnx → 0 errors, 0 warnings (TreatWarningsAsErrors on).
- dotnet test … --filter "FullyQualifiedName~Runtime.Tests" → 312 passed.
- dotnet test … --filter "FullyQualifiedName~OpcUaServer.Tests" → 304 passed.
- dotnet test … --filter "FullyQualifiedName~FOCAS" → 324 passed, 10 skipped (the skips are live-wire integration tests needing the physical CNC; expected).
- Final integration review: ready to merge (3 non-blocking Minors; see Follow-ups).
- Known env limitation (not a failure): the net48 Driver.Historian.Wonderware.Tests can't run its testhost on macOS; run the filtered suites above, not a full-solution dotnet test.
Key files / anchors
- Design: docs/plans/2026-06-26-otopcua-fixedtree-equipment-injection-design.md (status = Implemented; has the follow-ups).
- Plan + task journal: docs/plans/2026-06-26-otopcua-fixedtree-equipment-injection.md (+ .md.tasks.json, all tasks completed).
- Investigation plan (symptom #2 marked BUILT): docs/plans/2026-06-25-otopcua-equipment-dataplane-investigation.md.
- Deployment doc (FixedTree section added): docs/deployments/wonder-app-vd03-makino-z-34184.md.
- New code:
  - src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DiscoveredNode.cs, CapturingAddressSpaceBuilder.cs, DiscoveredNodeMapper.cs
  - src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/DiscoveredInjection.cs (DTOs)
  - modified: DriverInstanceActor.cs, DriverHostActor.cs, OpcUaPublishActor.cs, AddressSpaceApplier.cs, OtOpcUaNodeManager.cs, IOpcUaAddressSpaceSink.cs (+ SdkAddressSpaceSink.cs, DeferredAddressSpaceSink.cs)
  - tests: tests/Server/…Runtime.Tests/Drivers/{CapturingAddressSpaceBuilderTests,DiscoveredNodeMapperTests,DriverInstanceActorDiscoveryTests,DriverHostActorDiscoveryTests,DiscoveryInjectionEndToEndTests}.cs, …OpcUaServer.Tests/NodeManagerModelChangeOnAddTests.cs, edits to AddressSpaceApplierTests.cs / OpcUaPublishActorTests.cs.
- Memory: …/memory/wonder-otopcua-focas-and-akka-roles.md (RESUME-ANCHOR bullet updated to record this feature; read it for the broader wonder/FOCAS context + box-access recipe).
WORK LEFT (prioritized)
1. Decide the git endgame (user-gated)
Pick one, only on explicit user go-ahead:
- Push + PR: git push -u origin feat/focas-fixedtree-equipment-injection; PR base is fix/focas-poll-io-serialization (stacked) or master (will show both features' commits). gitea repo: lmxopcua.
- Merge locally into fix/focas-poll-io-serialization (folds both features onto one branch/PR).
- Keep waiting until after live validation (current state; note that live validation is now done, see §2).
2. Live wonder validation — ✅ DONE 2026-06-26
Validated live on wonder-app-vd03. Built a full self-contained Host overlay from this branch @
37cac5de, deployed to E:\ApiInstall\OtOpcUa (stop → backup E:\ApiInstall\OtOpcUa_bak-20260626111416
→ robocopy overlay preserving appsettings*.json + pki\ → restart). Baseline before deploy: only
parts-count/parts-required under EQ-3686c0272279. After deploy + FOCAS reconnect: the host log
recorded injected 57 discovered node(s) … under EQ-3686c0272279 / materialised … (folders=14, vars=57),
no exceptions. CLI browse showed the full FOCAS/ subtree (Identity/Axes X-Y-Z-B-C-AA+Actual/Spindle/
Program/OperationMode/Timers), idempotent across repeats, device-host folder collapsed. Sample reads all
Good: Identity/SeriesNumber=G431, CncType=31, AxisCount=7, Axes/X/AbsolutePosition=2801574 (live),
OperationMode/ModeText=TJOG; authored tags still Good (no regression). /healthz 200 Healthy throughout.
Result recorded in docs/deployments/wonder-app-vd03-makino-z-34184.md. The substantive remaining work
is now the git endgame (§1) only. Original recipe retained below for reference:
The offline e2e asserts the recording-sink contract, NOT the real OtOpcUaNodeManager seed→overwrite at
the OPC node layer. Live validation closes that gap. Recipe (mirrors the symptom-#1 deploy):
- Build the current Host self-contained: dotnet publish src/…/ZB.MOM.WW.OtOpcUa.Host…csproj -c Release -r win-x64 --self-contained true -p:PublishSingleFile=false. Must be a full self-contained publish-overlay, NOT a DLL swap: the box is self-contained (DLL swaps crashed: FileNotFound / "Could not resolve CoreCLR path"). Note: deploying the current Host already happened for symptom #1; if the box is at the symptom-#1 build, this feature's DLLs (Runtime + OpcUaServer + Commons + the new Runtime/Drivers files) must be included in the overlay, so a fresh full overlay from THIS branch is the safe path.
- Box access: servecli :2222, key ~/.ssh/servecli_wonder, user dohertj2; drive via scratchpad/wonder-ps.sh (base64 PS over cmd PTY); SFTP root C:\Users\dohertj2\Desktop\win64. Service OtOpcUaHost. Overlay onto E:\ApiInstall\OtOpcUa preserving pki\ + appsettings*.json + data\; back up first; auto-rollback if unhealthy.
- Restart OtOpcUaHost; confirm member Up w/ ADMIN+DRIVER (roles env already set), /healthz Healthy, OPC :4840 listening.
- The FOCAS driver connects → ~0–2 s later FixedTree populates → injection fires. Validate via the OtOpcUa CLI (src/Client/…Client.CLI) against opc.tcp://wonder-app-vd03.zmr.zimmer.com:4840/OtOpcUa (Security None, anonymous):
  - browse --recursive → expect a FOCAS subfolder under ns=2;s=EQ-3686c0272279 with Identity/, Axes/, etc.
  - read ns=2;s=EQ-3686c0272279/FOCAS/Identity/SeriesNumber → expect Good (a real string).
  - read ns=2;s=EQ-3686c0272279/FOCAS/Axes/X/AbsolutePosition → expect Good (value may be 0 on idle machine; assert STATUS, not magnitude).
  - The authored parts-count/parts-required should remain Good (symptom #1 fix).
- If a value reads Bad, the symptom-#1 self-healing applies (recoverable BadCommunicationError, observable in Serilog at C:\Windows\System32\logs\otopcua-<date>.log). The Akka→Serilog bridge (from symptom #1) makes DriverHost/DriverInstance/discovery logs visible.
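The "assert STATUS, not magnitude" rule in the recipe above can be sketched as a small check. This is a hedged Python illustration only: the ReadResult record and the idea of holding CLI read results in this shape are hypothetical, and how the results are actually captured from the OtOpcUa CLI is not shown.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass(frozen=True)
class ReadResult:
    """Hypothetical record for one read: node id, status text, and value."""
    node_id: str
    status: str  # assumed to be e.g. "Good" or "BadCommunicationError"
    value: Any


def failed_reads(results: Iterable[ReadResult]) -> List[str]:
    """Return node ids whose status is not Good.

    Values are deliberately ignored: an idle machine may legitimately
    report 0, so only the status decides pass or fail.
    """
    return [r.node_id for r in results if r.status != "Good"]
```

An empty list from failed_reads would mean every sampled node passed, including authored tags such as parts-count if they are included in the sample.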
3. Non-blocking follow-ups
✅ ALL FIXEDTREE FOLLOW-UPS (A–E) IMPLEMENTED 2026-06-26 — design+plan
2026-06-26-otopcua-fixedtree-followups{-design,}.md; 16 commits c2c368dc..0074f37a on this branch
(every task spec+code reviewed; offline suites green). Resolved:
- ✅ Config-unchanged rebind now re-triggers discovery (TriggerRediscovery): follow-up C.
- ✅ Multi-device-per-driver implemented via EquipmentNode.DeviceHost partition; ≥1-authored-tag requirement lifted (driver-binding resolution): follow-up E (projection-only, no migration / no artifact wire change).
- ✅ Per-(re)connect re-discovery policy-gated (ITagDiscovery.RediscoverPolicy UntilStable/Once/Never; synchronous drivers → Once): follow-up B.
- ✅ Double SetDesiredSubscriptions per redeploy de-duped (one send per driver): follow-up D.
- ✅ Per-pass DiscoverAsync timeout made injectable: follow-up A.
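To make the policy-gated, bounded re-discovery concrete, here is a minimal Python sketch of one possible control flow. It is an assumption-laden illustration, not the C# actor code: the real implementation is timer-driven inside DriverInstanceActor, whereas this version loops synchronously. The function signature, the "two identical non-empty passes means stable" rule, and returning the last result when the attempt cap is hit are all hypothetical choices.

```python
from enum import Enum
from typing import Callable, FrozenSet, Optional


class RediscoverPolicy(Enum):
    UNTIL_STABLE = "UntilStable"
    ONCE = "Once"
    NEVER = "Never"


def run_rediscovery(discover: Callable[[], FrozenSet[str]],
                    policy: RediscoverPolicy,
                    max_attempts: int,
                    generation: int,
                    current_generation: Callable[[], int]) -> Optional[FrozenSet[str]]:
    """Return the node set to publish, or None if nothing should be published."""
    if policy is RediscoverPolicy.NEVER:
        return None
    attempts = 1 if policy is RediscoverPolicy.ONCE else max_attempts
    previous: Optional[FrozenSet[str]] = None
    for _ in range(attempts):
        found = discover()
        if current_generation() != generation:
            return None  # generation guard: a newer (re)connect owns discovery now
        if policy is RediscoverPolicy.ONCE:
            return found
        if found and found == previous:
            return found  # stop-on-stable: two consecutive identical passes
        previous = found
    return previous  # attempt cap reached (assumed: publish the last result seen)
```

Under these assumptions a synchronous driver would use ONCE, since its first pass is already complete, while a driver whose tree fills in shortly after connect would use UNTIL_STABLE with a small attempt cap.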
Still open (out of scope for the FixedTree follow-ups — separate cross-cutting work):
- Cross-cutting (from symptom #1, all 3 apps): shared AddZbSerilog doesn't set the static Serilog.Log.Logger; AdminUI persists FOCAS config in formats (series-as-number, scheme-less host) the driver only now tolerates; reconcile at the AdminUI source.
Context that's easy to lose
- 3 real defects were caught + fixed by the review chain during the build: DriverDataType.ToString() ≠ OPC type string (Float64→"Double"); Server.ReportEvent under the node Lock (deadlock); ConfigureAwait(false) in the discovery handler (off-actor-context crash for async drivers like Galaxy sharing the node). All have regression tests.
- The plan's Task-3 instruction "keep ReportEvent inside lock" was itself a defect; the plan doc was corrected.
- The execution used subagent-driven-development (fresh implementer per task + spec/code reviews; high-risk tasks got Opus reviews, serial). Single-writer discipline was enforced (no concurrent dotnet builds → no obj/bin or git-index races).
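The ReportEvent-under-lock defect noted above follows a general pattern: mutate state while holding the lock, but raise notifications only after releasing it. The Python sketch below illustrates that pattern under stated assumptions; NodeStore and its methods are hypothetical stand-ins, and the exact deadlock mechanism in the real server SDK is not reproduced here.

```python
import threading
from typing import Callable, Dict, List


class NodeStore:
    """Hypothetical stand-in for a node manager guarded by a single lock."""

    def __init__(self, report_event: Callable[[List[str]], None]) -> None:
        self._lock = threading.Lock()  # non-reentrant, like many native locks
        self._nodes: Dict[str, object] = {}
        self._report_event = report_event

    def node_count(self) -> int:
        with self._lock:
            return len(self._nodes)

    def ensure_nodes(self, node_ids: List[str]) -> None:
        added: List[str] = []
        with self._lock:
            # Only state changes happen while the lock is held.
            for node_id in node_ids:
                if node_id not in self._nodes:
                    self._nodes[node_id] = object()
                    added.append(node_id)
        # Report after the lock is released: the callback may call back into
        # this object, or wait on another thread that needs the same lock.
        if added:
            self._report_event(added)
```

If the callback were invoked inside the with block, a listener that called node_count() would block forever on the lock its own caller holds; moving the call outside keeps the critical section short and free of foreign code.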