# FixedTree → Equipment injection — RESUME / work-left handoff

**Date:** 2026-06-26
**Purpose:** survive a context compaction; let a fresh session continue without re-deriving state.

---

## TL;DR

The **FixedTree-under-Equipment dynamic-injection feature is BUILT, offline-complete, AND
✅ LIVE-VALIDATED on wonder (2026-06-26)**: 11 tasks, all reviewed, full offline suite green, final
integration review = ready to merge, and the real OPC injection confirmed on `wonder-app-vd03` (57 nodes
grafted under `EQ-3686c0272279`, all reading Good live values). It lives on a **local, unpushed** branch.
The only substantive thing left is the user's decision on push/PR/merge (§1). The FixedTree follow-ups are
implemented; only cross-cutting, non-blocking items remain open (§3).

## Git state (exact)

- **Branch:** `feat/focas-fixedtree-equipment-injection` (in the main working dir `/Users/dohertj2/Desktop/OtOpcUa`, NOT a worktree).
- **Base:** branched off `fix/focas-poll-io-serialization` (the symptom-#1 data-plane fix, itself ahead of `master`, pushed to gitea with its own open PR, NOT merged). So this feature **stacks on an unmerged branch**.
- **Commits:** 14 for the feature itself, range `da55c69`..`37cac5de` (10 task commits + 4 review-fix/docs commits). All **local; nothing pushed.** The follow-ups in §3 add 16 more commits (`c2c368dc`..`0074f37a`) on the same branch.
- **User decision (2026-06-26):** finishing-a-development-branch → **"Keep as-is."** Do NOT push/merge/discard without an explicit new go-ahead. Standing rule: **commit/push only when asked.**
- **Untouched pre-existing working-tree edits** (leave alone; never stage): `CLAUDE.md`, `docker-dev/docker-compose.yml`, `pending.md`, `stillpending.md`, `docs/plans/2026-06-19-followups-batch.md.tasks.json`.
- This RESUME doc itself is currently **uncommitted** (a working artifact).

## What the feature does

Generic post-connect `ITagDiscovery` injection (NOT FOCAS-special-cased). On driver Connect:
`DriverInstanceActor` runs bounded re-discovery (Timers single-tick, generation-guarded, stop-on-stable +
attempt cap, re-kicks on reconnect) into a capturing `IAddressSpaceBuilder` → ships `DiscoveredNodesReady`
→ `DriverHostActor` resolves the equipment via authored `EquipmentTags`, maps the nodes under
`EQ-…/FOCAS/…` (read-only; single device-host folder collapsed) via `DiscoveredNodeMapper`, extends
`_nodeIdByDriverRef`, caches the plan, Tells `OpcUaPublishActor.MaterialiseDiscoveredNodes` →
`AddressSpaceApplier` → sink `EnsureFolder`/`EnsureVariable` + `RaiseNodesAddedModelChange` (NodeAdded), and
re-sends `SetDesiredSubscriptions(authored ∪ FixedTree refs)` so values flow through the existing
poll→push path. Survives redeploys (re-applied at the tail of `PushDesiredSubscriptions` from the cache)
and restarts (re-discovered on reconnect).

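
The mapping and subscription steps described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the C# in `DiscoveredNodeMapper`: the function names, the path-tuple representation, and the exact collapse rule (drop the top-level folder only when every discovered node shares it) are assumptions made for the example.

```python
from typing import Iterable, List, Sequence


def map_under_equipment(
    equipment_id: str,
    discovered_paths: Iterable[Sequence[str]],
    subtree: str = "FOCAS",
) -> List[str]:
    """Map driver-relative paths to node ids under <equipment>/<subtree>/."""
    paths = [tuple(p) for p in discovered_paths]
    # Assumed collapse rule: when every path sits below the same single
    # top-level (device-host) folder, drop that folder level.
    roots = {p[0] for p in paths if len(p) > 1}
    collapse = len(roots) == 1 and all(len(p) > 1 for p in paths)
    mapped = []
    for p in paths:
        tail = p[1:] if collapse else p
        mapped.append("/".join((equipment_id, subtree, *tail)))
    return mapped


def desired_subscriptions(
    authored: Iterable[str], fixed_tree: Iterable[str]
) -> List[str]:
    """Union of authored and discovered refs, de-duplicated, order-preserving."""
    return list(dict.fromkeys([*authored, *fixed_tree]))


# Hypothetical input: one device-host folder ("cnc-host") above the real tree.
paths = [
    ("cnc-host", "Identity", "SeriesNumber"),
    ("cnc-host", "Axes", "X", "AbsolutePosition"),
]
node_ids = map_under_equipment("EQ-3686c0272279", paths)
# node_ids[0] == "EQ-3686c0272279/FOCAS/Identity/SeriesNumber"
```

With two different top-level folders nothing is collapsed, so each device keeps its own folder under `FOCAS/`.
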
## Verification (offline) — all green as of 2026-06-26

- `dotnet build ZB.MOM.WW.OtOpcUa.slnx` → **0 errors, 0 warnings** (TreatWarningsAsErrors on).
- `dotnet test … --filter "FullyQualifiedName~Runtime.Tests"` → **312 passed**.
- `dotnet test … --filter "FullyQualifiedName~OpcUaServer.Tests"` → **304 passed**.
- `dotnet test … --filter "FullyQualifiedName~FOCAS"` → **324 passed, 10 skipped** (the skips are expected: they are live-wire integration tests that need the physical CNC).
- Final integration review: **ready to merge** (3 non-blocking Minors; see Follow-ups).
- Known env limitation (not a failure): the net48 `Driver.Historian.Wonderware.Tests` can't run its testhost on macOS, so run the **filtered** suites above, not a full-solution `dotnet test`.

## Key files / anchors

- Design: `docs/plans/2026-06-26-otopcua-fixedtree-equipment-injection-design.md` (status = Implemented; has the follow-ups).
- Plan + task journal: `docs/plans/2026-06-26-otopcua-fixedtree-equipment-injection.md` (+ `.md.tasks.json`, all tasks completed).
- Investigation plan (symptom #2 marked BUILT): `docs/plans/2026-06-25-otopcua-equipment-dataplane-investigation.md`.
- Deployment doc (FixedTree section added): `docs/deployments/wonder-app-vd03-makino-z-34184.md`.
- New code:
  - `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DiscoveredNode.cs`, `CapturingAddressSpaceBuilder.cs`, `DiscoveredNodeMapper.cs`
  - `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/DiscoveredInjection.cs` (DTOs)
  - modified: `DriverInstanceActor.cs`, `DriverHostActor.cs`, `OpcUaPublishActor.cs`, `AddressSpaceApplier.cs`, `OtOpcUaNodeManager.cs`, `IOpcUaAddressSpaceSink.cs` (+ `SdkAddressSpaceSink.cs`, `DeferredAddressSpaceSink.cs`)
  - tests: `tests/Server/…Runtime.Tests/Drivers/{CapturingAddressSpaceBuilderTests,DiscoveredNodeMapperTests,DriverInstanceActorDiscoveryTests,DriverHostActorDiscoveryTests,DiscoveryInjectionEndToEndTests}.cs`, `…OpcUaServer.Tests/NodeManagerModelChangeOnAddTests.cs`, edits to `AddressSpaceApplierTests.cs`/`OpcUaPublishActorTests.cs`.
- Memory: `…/memory/wonder-otopcua-focas-and-akka-roles.md` (RESUME-ANCHOR bullet updated to record this feature; read it for the broader wonder/FOCAS context + box-access recipe).

## WORK LEFT (prioritized)
### 1. Decide the git endgame (user-gated)

Pick one, only on explicit user go-ahead:

- **Push + PR**: `git push -u origin feat/focas-fixedtree-equipment-injection`; PR base is `fix/focas-poll-io-serialization` (stacked) or `master` (will show both features' commits). gitea repo: `lmxopcua`.
- **Merge locally** into `fix/focas-poll-io-serialization` (folds both features onto one branch/PR).
- **Keep as-is** (current state). The live validation this option originally waited on is now done (§2).

### 2. Live wonder validation — ✅ DONE 2026-06-26

**Validated live on `wonder-app-vd03`.** Built a full self-contained Host overlay from this branch @
`37cac5de`, deployed to `E:\ApiInstall\OtOpcUa` (stop → backup `E:\ApiInstall\OtOpcUa_bak-20260626111416`
→ robocopy overlay preserving `appsettings*.json` + `pki\` → restart). Baseline before deploy: only
`parts-count`/`parts-required` under `EQ-3686c0272279`. After deploy + FOCAS reconnect: the host log
recorded `injected 57 discovered node(s) … under EQ-3686c0272279` / `materialised … (folders=14, vars=57)`,
no exceptions. CLI browse showed the full `FOCAS/` subtree (Identity, Axes X-Y-Z-B-C-AA+Actual, Spindle,
Program, OperationMode, Timers), idempotent across repeats, device-host folder collapsed. Sample reads all
Good: `Identity/SeriesNumber=G431`, `CncType=31`, `AxisCount=7`, `Axes/X/AbsolutePosition=2801574` (live),
`OperationMode/ModeText=TJOG`; authored tags still Good (no regression). `/healthz` 200 Healthy throughout.
Result recorded in `docs/deployments/wonder-app-vd03-makino-z-34184.md`. **The substantive remaining work
is now the git endgame (§1) only.** Original recipe retained below for reference:

The offline e2e asserts the recording-sink contract, NOT the real `OtOpcUaNodeManager` seed→overwrite at
the OPC node layer. Live validation closes that gap. Recipe (mirrors the symptom-#1 deploy):

1. Build the current Host self-contained: `dotnet publish src/…/ZB.MOM.WW.OtOpcUa.Host…csproj -c Release -r win-x64 --self-contained true -p:PublishSingleFile=false`. **Must be a full self-contained publish-overlay, NOT a DLL swap**: the box is self-contained (DLL swaps crashed: FileNotFound / "Could not resolve CoreCLR path"). Note: deploying the current Host already happened for symptom #1; if the box is at the symptom-#1 build, this feature's DLLs (Runtime + OpcUaServer + Commons + the new Runtime/Drivers files) must be included in the overlay, so a fresh full overlay from THIS branch is the safe path.
2. Box access: servecli `:2222`, key `~/.ssh/servecli_wonder`, user `dohertj2`; drive via `scratchpad/wonder-ps.sh` (base64 PS over cmd PTY); SFTP root `C:\Users\dohertj2\Desktop\win64`. Service `OtOpcUaHost`. Overlay onto `E:\ApiInstall\OtOpcUa` **preserving `pki\` + `appsettings*.json` + `data\`**; back up first; auto-rollback if unhealthy.
3. Restart `OtOpcUaHost`; confirm member Up w/ ADMIN+DRIVER (roles env already set), `/healthz` Healthy, OPC `:4840` listening.
4. The FOCAS driver connects → ~0–2 s later FixedTree populates → injection fires. Validate via the OtOpcUa CLI (`src/Client/…Client.CLI`) against `opc.tcp://wonder-app-vd03.zmr.zimmer.com:4840/OtOpcUa` (Security None, anonymous):
   - `browse --recursive` → expect a `FOCAS` subfolder under `ns=2;s=EQ-3686c0272279` with `Identity/`, `Axes/`, etc.
   - `read ns=2;s=EQ-3686c0272279/FOCAS/Identity/SeriesNumber` → expect Good (a real string).
   - `read ns=2;s=EQ-3686c0272279/FOCAS/Axes/X/AbsolutePosition` → expect Good (value may be 0 on an idle machine, so assert STATUS, not magnitude).
   - The authored `parts-count`/`parts-required` should remain Good (symptom #1 fix).
5. If a value reads Bad, the symptom-#1 self-healing applies (recoverable `BadCommunicationError`, observable in Serilog at `C:\Windows\System32\logs\otopcua-<date>.log`). The Akka→Serilog bridge (from symptom #1) makes `DriverHost`/`DriverInstance`/discovery logs visible.

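
The "assert STATUS, not magnitude" rule from step 4 can be expressed as a small check on the raw OPC UA status code. In OPC UA the top two bits of the 32-bit code carry the severity (00 Good, 01 Uncertain, 10 Bad). The helper names and the idea of receiving a `(value, status_code)` pair are hypothetical; how the CLI prints a read result is not shown in this document, so treat this as a sketch of the check rather than a parser for real output.

```python
GOOD, UNCERTAIN, BAD = "Good", "Uncertain", "Bad"

# Bad_CommunicationError as defined by the OPC UA specification.
BAD_COMMUNICATION_ERROR = 0x80050000


def severity(status_code: int) -> str:
    """Classify a 32-bit OPC UA status code by its two severity bits."""
    top = (status_code >> 30) & 0b11
    # 0b10 is Bad; 0b11 is reserved and is treated as Bad here.
    return {0b00: GOOD, 0b01: UNCERTAIN}.get(top, BAD)


def read_is_healthy(value, status_code: int) -> bool:
    """Pass on status alone: a value of 0 is fine on an idle machine."""
    return severity(status_code) == GOOD


# An idle axis reading 0 with a Good status still passes.
idle_axis_ok = read_is_healthy(0, 0x00000000)
# A plausible-looking value with a Bad status does not.
stale_value_ok = read_is_healthy(2801574, BAD_COMMUNICATION_ERROR)
```

Keeping the value out of the decision avoids false failures when the machine is parked at zero.
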
### 3. Non-blocking follow-ups

**✅ ALL FIXEDTREE FOLLOW-UPS (A–E) IMPLEMENTED 2026-06-26**: design+plan
`2026-06-26-otopcua-fixedtree-followups{-design,}.md`; 16 commits `c2c368dc`..`0074f37a` on this branch
(every task spec+code reviewed; offline suites green). Resolved:

- ✅ Config-unchanged rebind now re-triggers discovery (`TriggerRediscovery`) (follow-up C).
- ✅ Multi-device-per-driver implemented via `EquipmentNode.DeviceHost` partition; ≥1-authored-tag requirement lifted (driver-binding resolution) (follow-up E; projection-only, no migration / no artifact wire change).
- ✅ Per-(re)connect re-discovery policy-gated (`ITagDiscovery.RediscoverPolicy` UntilStable/Once/Never; synchronous drivers → Once) (follow-up B).
- ✅ Double `SetDesiredSubscriptions` per redeploy de-duped (one send per driver) (follow-up D).
- ✅ Per-pass `DiscoverAsync` timeout made injectable (follow-up A).

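
One way to picture the policy-gated, bounded re-discovery (follow-up B, together with the stop-on-stable and attempt-cap behaviour) is the loop below. It is a Python illustration, not the C# implementation: only the policy names `UntilStable`/`Once`/`Never` come from this document. The function name, the `max_attempts` default, and modelling a discovery pass as a callable that returns a set of node refs are assumptions. The timer scheduling, generation guard, and per-pass timeout are left out.

```python
from enum import Enum
from typing import Callable, FrozenSet, Optional


class RediscoverPolicy(Enum):
    UNTIL_STABLE = "UntilStable"
    ONCE = "Once"
    NEVER = "Never"


def run_discovery(
    discover: Callable[[], FrozenSet[str]],
    policy: RediscoverPolicy,
    max_attempts: int = 5,  # hypothetical cap, not taken from the source
) -> Optional[FrozenSet[str]]:
    """Return the last discovered node set, or None if discovery is disabled."""
    if policy is RediscoverPolicy.NEVER:
        return None
    previous = discover()
    if policy is RediscoverPolicy.ONCE:
        return previous
    # UntilStable: repeat until two consecutive passes agree or the cap is hit.
    for _ in range(max_attempts - 1):
        current = discover()
        if current == previous:
            return current
        previous = current
    return previous
```

Note that this sketch treats any two identical passes as stable, including two empty ones; a real implementation may want a non-empty result before declaring stability, since the FixedTree populates shortly after connect.
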
**Still open (out of scope for the FixedTree follow-ups; separate cross-cutting work):**

- Cross-cutting (from symptom #1, all 3 apps): shared `AddZbSerilog` doesn't set the static `Serilog.Log.Logger`; AdminUI persists FOCAS config in formats (series-as-number, scheme-less host) that the driver only now tolerates, which should be reconciled at the AdminUI source.

## Context that's easy to lose

- 3 real defects were caught + fixed by the review chain during the build: `DriverDataType.ToString()` ≠ OPC type string (`Float64`→`"Double"`); `Server.ReportEvent` under the node `Lock` (deadlock); `ConfigureAwait(false)` in the discovery handler (off-actor-context crash for async drivers like Galaxy sharing the node). All have regression tests.
- The plan's Task-3 instruction "keep ReportEvent inside lock" was itself a defect; the plan doc was corrected.
- The execution used subagent-driven-development (fresh implementer per task + spec/code reviews; high-risk tasks got Opus reviews, serial). Single-writer discipline was enforced (no concurrent `dotnet` builds → no obj/bin or git-index races).

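
The `ReportEvent`-under-lock defect follows a general pattern: raising a notification while still holding a lock risks a deadlock as soon as a listener needs that lock. A minimal Python sketch of the safe shape (mutate under the lock, report after releasing it) is below. `NodeStore`, `add_nodes`, and the listener are hypothetical stand-ins, not the `OtOpcUaNodeManager` API. Python's `threading.Lock` is non-reentrant, which makes the hazard easy to show in one thread; the real deadlock involved the node `Lock` and is not reproduced here.

```python
import threading
from typing import Callable, Iterable, List, Set


class NodeStore:
    def __init__(self, on_nodes_added: Callable[[List[str]], None]) -> None:
        self._lock = threading.Lock()
        self._nodes: Set[str] = set()
        self._on_nodes_added = on_nodes_added

    def contains(self, node_id: str) -> bool:
        with self._lock:
            return node_id in self._nodes

    def add_nodes(self, node_ids: Iterable[str]) -> List[str]:
        # Mutate state under the lock and only collect what actually changed.
        added: List[str] = []
        with self._lock:
            for node_id in node_ids:
                if node_id not in self._nodes:
                    self._nodes.add(node_id)
                    added.append(node_id)
        # Report after the lock is released, so listeners may call back in.
        if added:
            self._on_nodes_added(added)
        return added
```

If the callback were invoked inside the `with self._lock:` block, a listener calling `contains` would block forever on the same lock. Repeating `add_nodes` with already-known ids adds nothing and raises no notification, which mirrors the idempotent behaviour seen in the live browse.
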