# FOCAS PDU v3 — implementation plan (finish real 30i/31i-B support)
**Date:** 2026-06-25
**Companion to:** [`2026-06-25-focas-pdu-v3-30i-b-support.md`](2026-06-25-focas-pdu-v3-30i-b-support.md) (finding + live captures + per-command validation) and [`../deployments/wonder-app-vd03-makino-z-34184.md`](../deployments/wonder-app-vd03-makino-z-34184.md) (the deployment this unblocks).
**Goal:** make the managed `WireFocasClient` fully interoperate with a real FANUC 30i/31i-B (FOCAS Ethernet **PDU v3**), then light up the live OPC UA data on `wonder-app-vd03` for the Makino Pro 5 (`Z-34184`, `10.201.31.5`).
## ⏳ Time-boxed asset — capture/validate live FIRST
`10.201.31.5:8193` (real 31i-B) is reachable from the dev box and the wonder host **right now**. Every
per-command v3 framing fix needs a live capture + live re-validation. **Do all captures in Phase 1 while
access lasts**; the parser fixes + unit tests can be finished offline against the captured bytes later.
Tools already in place: `scripts/focas/capture-initiate.py <host>` (initiate only — extend it) and the
throwaway status harness pattern (see the finding doc).
## Current status — DONE (2026-06-25), live-validated on the real 31i-B (`10.201.31.5`)
The Phase-1 capture corrected the diagnosis (4 of 6 "framing failures" were not framing problems).
All tractable phases are implemented + unit-tested (FOCAS suite **234 green**, full solution builds 0
errors) + re-validated live. The full corrected diagnosis + per-surface evidence is in the companion
finding doc's **Resolution** section.
| Phase | Item | State |
|---|---|---|
| 0 | inbound PDU-version gate `{1,3}` | DONE — macros + all status reads work on v3 |
| 1 | capture every v3 data PDU | DONE — 20 fixtures under `tests/.../Fixtures/v3/` |
| 2 | servo "hang" → CT-bound reads | DONE — `ReadExactlyAsync` dispose-on-cancel; servo answers in 0 ms (no real wire hang) |
| 3 | request-version policy | DONE — keep emitting v1 (CNC accepts it); no command needed v3 requests |
| 4 | servo + alarms framing | DONE — servo 8-byte stride + names from 0x0089; alarms already correct (read `#3080` live) |
| 5 | timer v3 struct | DONE — `ParseTimer` little-endian {minute, msec} |
| 6 | dynamic axis iteration | DONE — driver poll already 1-based; added a `ReadDynamicAsync` contract guard |
| 7 | PMC framing | DONE — `end = start + width - 1`; **parameter framing BLOCKED** (EW_FUNC across 14 variants — see finding) |
| 8 | probe truthfulness | DONE — `FocasDriverProbe` runs a real wire session (initiate + cnc_statinfo) |
| 9 | docs + version matrix | DONE — this plan, the finding doc, `FOCAS.md`, `focas-version-matrix.md` |
| 10 | deploy to wonder + e2e | DONE (binary live, driver speaks v3) — but live-tag verify BLOCKED by a separate OtOpcUa data-plane issue (see below) |
| 11 | commit + push | DONE — `feat/focas-pdu-v3` @ `5f0a5286` committed + pushed to gitea |
**Only genuinely open v3 item:** `cnc_rdparam` (EW_FUNC on every framing — needs a reference FWLIB
trace or is restricted on this control). Deferred; the deployed config uses macros, not parameters.
## Phase 10 outcome — v3 binary LIVE, but a separate OtOpcUa data-plane issue blocks tag values
Deployed the Release driver DLL to `E:\ApiInstall\OtOpcUa\` (backup `_focasbak-pre-v3-20260625T164909.dll`),
restarted `OtOpcUaHost` (clean), and re-deployed in the AdminUI (deployment `12e0d528`, Sealed/In-sync).
**`DRIVER STATUS: HEALTHY` now reflects a real FOCAS v3 session** (the rewritten probe does initiate +
`cnc_statinfo`) — i.e. the deployed binary genuinely speaks v3 to the Makino, which was impossible before.
**However**, the live OPC UA equipment tags (`parts-count`/`parts-required` = `MACRO:3901/3902`) still read
`Bad_WaitingForInitialData` via `read` and a 30 s `subscribe`, and a recursive browse shows ONLY the two
UNS-projected macro tags — **no FixedTree (Identity/Axes/Timers/…) nodes** — identical to before the v3 fix,
and unchanged by host-restart / re-deploy / driver-`Restart`. A box-side watch saw no 250 ms-cadence
connection to the CNC (only the periodic probe), so the driver's **data poll loop isn't running** while its
probe loop is. This is a **separate, pre-existing OtOpcUa data-plane / Equipment-projection issue**, not a
FOCAS-protocol problem (the wire client is proven by the healthy real-session probe + exhaustive dev-box
reads). Follow-on: investigate why the driver's DiscoverAsync FixedTree build + equipment-tag value source
don't run/surface on this single fused admin+driver node (poll-group engine / monitored-item sampling /
whether the Equipment projection exposes driver FixedTree auto-nodes at all).
---
## Phase 1 — Capture every v3 data-PDU response (live, do first)
- Extend `scripts/focas/capture-initiate.py` (or add a C# capture mode to `Driver.FOCAS.Cli`) to: run the
two-socket initiate, then send each `0x21` data request (command IDs in
`docs/v2/implementation/focas-wire-protocol.md`) and dump the raw v3 response: `cnc_rdtimer`,
`cnc_rdsvmeter`, `pmc_rdpmcrng` (R100), `cnc_rdparam` (e.g. 1320), `cnc_rdalmmsg2`, `cnc_rddynamic2`
(axis 1 — a known-good — as the v3 reference layout).
- Save raw bytes as fixtures under `tests/Drivers/.../Fixtures/v3/` for offline unit tests.
- **Acceptance:** raw v3 response bytes captured + checked in for all six commands.
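
The fixture step above could be sketched as follows. This is a minimal illustration only: `save_fixture`, `load_fixture`, and the one-`.bin`-file-per-command naming are hypothetical, not the layout the repo actually uses.

```python
from pathlib import Path


def save_fixture(fixture_dir: Path, command: str, raw: bytes) -> Path:
    """Write one raw response body to <fixture_dir>/<command>.bin (hypothetical layout)."""
    fixture_dir.mkdir(parents=True, exist_ok=True)
    path = fixture_dir / f"{command}.bin"
    path.write_bytes(raw)  # bytes are stored untouched so parsers can be tested offline
    return path


def load_fixture(fixture_dir: Path, command: str) -> bytes:
    """Read a previously captured response back for an offline parser test."""
    return (fixture_dir / f"{command}.bin").read_bytes()
```

Storing the bytes unmodified keeps the capture independent of any parser assumptions, so a later struct fix can be re-run against the same file.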
## Phase 2 — Safety: `cnc_rdsvmeter` must never hang
- Root cause: the read blocks waiting for a body length the v3 response never satisfies, and the wait
doesn't observe the `CancellationToken`. A hang here can wedge the FixedTree poll loop.
- Make the wire read honor the per-operation timeout/CT **regardless of framing** (the socket read path in
`FocasWireClient` must be CT-bound), so a bad parse fails fast as `BadCommunicationError`/timeout.
- **Acceptance:** `GetServoLoadsAsync` returns or fails within the timeout on the live 31i-B; a unit test
proves a truncated/oversized body length cancels rather than blocks. **Gating:** FixedTree must not be
enabled on a v3 control until this lands (capability probe could otherwise hang at init).
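
As a rough Python analogue of a timeout-bound exact read (the driver itself is C#; this is not its code), the idea is that the deadline wraps the read itself, so a body length that never arrives fails fast instead of blocking. The function name and error types here are assumptions chosen for the sketch.

```python
import asyncio


async def read_exactly(reader: asyncio.StreamReader, n: int, timeout_s: float) -> bytes:
    """Read exactly n bytes, or fail within timeout_s regardless of framing."""
    try:
        # wait_for cancels the pending read when the deadline passes
        return await asyncio.wait_for(reader.readexactly(n), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        raise TimeoutError(f"read of {n} bytes exceeded {timeout_s}s") from exc
    except asyncio.IncompleteReadError as exc:
        # peer closed the socket before the announced length arrived
        raise ConnectionError(f"stream ended after {len(exc.partial)} of {n} bytes") from exc
```

A truncated body then surfaces as `TimeoutError`, and a closed socket as `ConnectionError`, both within the caller's budget.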
## Phase 3 — Decide request-version policy
- We currently emit v1 requests and accept v1/v3 responses; macro + most status reads work that way.
- If any Phase 5-7 command turns out to need v3-framed *requests*, thread the version negotiated from the
initiate response onto `FocasWireClient` and have `BuildPdu` emit it. Otherwise keep emitting v1.
- **Acceptance:** documented decision; negotiated version plumbed only if a command requires it.
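
The policy above (accept v1 or v3 inbound, emit v1 unless a command demands otherwise) can be sketched in a few lines. The function names and the `needs_v3_request` flag are hypothetical; where the version sits in the PDU header is deliberately left out.

```python
ACCEPTED_RESPONSE_VERSIONS = frozenset({1, 3})  # inbound gate from Phase 0
DEFAULT_REQUEST_VERSION = 1                     # what the client emits by default


def check_response_version(version: int) -> int:
    """Reject any inbound PDU whose version is outside the accepted set."""
    if version not in ACCEPTED_RESPONSE_VERSIONS:
        raise ValueError(f"unsupported PDU version {version}")
    return version


def request_version(negotiated: int, needs_v3_request: bool = False) -> int:
    """Emit v1 unless a specific command is known to need the negotiated version."""
    return negotiated if needs_v3_request else DEFAULT_REQUEST_VERSION
```

Keeping the two decisions in separate functions means the inbound gate can stay permissive while the outbound default remains the already validated v1.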
## Phase 4 — Servo load + alarms v3 framing
- Diff captured `cnc_rdsvmeter` / `cnc_rdalmmsg2` bytes vs the v1 struct assumptions in `FocasWireModels.cs`
+ the `ParseServoLoad` / alarm parsers; fix offsets/strides for v3.
- **Acceptance:** servo-load % values are plausible; `ReadAlarmsAsync` returns the real active-alarm set;
unit tests over the Phase-1 fixtures; live re-validation.
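
A stride fix of this kind can be illustrated without committing to field meanings. The sketch below only cuts a response body into fixed-size records; the 8-byte default matches the servo stride recorded in the status table, while the helper name and the strictness about trailing bytes are assumptions.

```python
def split_records(body: bytes, stride: int = 8) -> list[bytes]:
    """Cut a response body into fixed-stride records without interpreting the fields."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    if len(body) % stride != 0:
        # a partial trailing record usually means the stride assumption is wrong
        raise ValueError(f"body length {len(body)} is not a multiple of {stride}")
    return [body[i:i + stride] for i in range(0, len(body), stride)]
```

Failing loudly on a leftover partial record is a cheap way to notice when a v1 stride is applied to a v3 body.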
## Phase 5 — Timer v3 struct
- Diff captured `cnc_rdtimer` bytes; fix the timer struct parse (running machine must show non-zero
PowerOn/Operating; Cutting sane).
- **Acceptance:** all four timers plausible on the live machine; fixture unit test; matches the
FixedTree `Timers/*` node expectations.
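
For orientation, a little-endian {minute, msec} parse might look like the sketch below. It assumes two 32-bit signed integers at a caller-supplied offset; the field widths and the offset are assumptions to confirm against the captured `cnc_rdtimer` fixture, and the function names are hypothetical.

```python
import struct


def parse_timer(body: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode (minute, msec) as two little-endian int32 values (assumed widths)."""
    minute, msec = struct.unpack_from("<ii", body, offset)
    return minute, msec


def timer_seconds(minute: int, msec: int) -> float:
    """Collapse the two fields into total seconds for plausibility checks."""
    return minute * 60 + msec / 1000
```

A plausibility check then reduces to asserting that PowerOn and Operating come out non-zero on a running machine.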
## Phase 6 — Dynamic axis iteration (1-based)
- FixedTree currently probes axis 0 → `EW_4`. Iterate `1..AxesCount` (from `cnc_sysinfo`); never request 0.
- **Acceptance:** every configured axis (per sysinfo `AxesCount`) yields a `FocasDynamicSnapshot`; no `EW_4`.
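
The 1-based rule is small enough to show directly. In this sketch `axis_indices` and `require_one_based` are hypothetical names; the second plays the role of a contract guard that rejects axis 0 before anything reaches the wire.

```python
def axis_indices(axes_count: int) -> range:
    """Axis numbers to poll: 1..axes_count inclusive, never 0."""
    if axes_count < 0:
        raise ValueError("axes_count must not be negative")
    return range(1, axes_count + 1)


def require_one_based(axis: int) -> int:
    """Guard for a dynamic read: refuse axis numbers below 1."""
    if axis < 1:
        raise ValueError(f"axis must be >= 1, got {axis}")
    return axis
```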
## Phase 7 — PMC + Parameter v3 framing
- Diff captured `pmc_rdpmcrng` (R100) + `cnc_rdparam` (1320) bytes vs the v1 `IODBPMC0` / `IODBPSD` shapes;
fix v3 parsing. Confirm whether the failures are framing or genuine CNC restriction (PMC path / param
presence) — macro working proves the envelope is fine, so suspect struct offsets first.
- **Acceptance:** `R100` reads a plausible value (or a *correct* status if genuinely restricted); a known
parameter reads its value; fixture unit tests; live re-validation.
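
The PMC range arithmetic from the status table (`end = start + width - 1`) is an inclusive-end computation, which is easy to get off by one. A minimal sketch, with a hypothetical helper name:

```python
def pmc_range(start: int, width: int) -> tuple[int, int]:
    """Return the inclusive (start, end) addresses for a read of `width` bytes."""
    if width < 1:
        raise ValueError("width must be at least 1")
    return start, start + width - 1
```

For example, a one-byte read at R100 gives `(100, 100)` and a four-byte read gives `(100, 103)`; using `start + width` as the end would request one byte too many.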
## Phase 8 — Probe truthfulness
- `FocasDriverProbe` Phase-2 degrades to `Ok=true` ("TCP only") when FWLIB is absent → HEALTHY off a bare
socket. Replace with a wire-client probe: open `WireFocasClient` + one sample read (e.g. sysinfo). Keep
the TCP preflight for fast rejection.
- **Acceptance:** probe reports unhealthy when the CNC TCP-accepts but FOCAS reads fail; HEALTHY only on a
real session + read.
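
The intended probe behaviour can be sketched with the two steps injected as callables, so the decision logic is testable without a CNC. The names and status strings are hypothetical; the real probe lives in `FocasDriverProbe`.

```python
from typing import Callable


def probe(tcp_preflight: Callable[[], bool],
          session_read: Callable[[], object]) -> tuple[bool, str]:
    """Healthy only if the TCP preflight passes AND a real session read succeeds."""
    if not tcp_preflight():
        return False, "tcp-unreachable"  # fast rejection, no session attempted
    try:
        session_read()  # e.g. initiate plus one sample read
    except Exception as exc:
        return False, f"focas-read-failed: {exc}"
    return True, "session+read ok"
```

The key property is that a bare TCP accept can no longer produce a healthy result on its own.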
## Phase 9 — Docs + version matrix
- Add a real-hardware row to `docs/v2/focas-version-matrix.md`: 30i/31i-B → PDU v3; record which command
families are validated. Update `docs/drivers/FOCAS.md` + this plan's status as phases land.
## Phase 10 — Deploy to wonder + end-to-end verify
- Optional: set the device series to `ThirtyOne_i` (sysinfo says CncType 31; capability ranges identical to
`Thirty_i`, so cosmetic).
- Rebuild a self-contained win-x64 publish of `ZB.MOM.WW.OtOpcUa.Host` (or swap just
`ZB.MOM.WW.OtOpcUa.Driver.FOCAS.dll`) into `E:\ApiInstall\OtOpcUa\` on `wonder-app-vd03`, preserving
`appsettings*.json` + `data\`; restart `OtOpcUaHost`. (Access: servecli `:2222`, key
`~/.ssh/servecli_wonder` — see the deployment doc + memory.)
- **Re-run a deployment** in the AdminUI afterward — FixedTree nodes are emitted at `DiscoverAsync`, so the
address space must be rebuilt to surface them.
- **Acceptance (via the OtOpcUa CLI client → `opc.tcp://wonder-app-vd03.zmr.zimmer.com:4840/OtOpcUa`):**
`ns=2;s=EQ-3686c0272279/parts-count` + `/parts-required` read **Good**; FixedTree Identity/Axes/Program
nodes present with live values (timers/servo-load good once Phases 4-5 land).
## Phase 11 — Commit + push
- Commit source + tests + docs on a branch `feat/focas-pdu-v3` (keep it separate from the unrelated
pre-existing local edits in the tree). Push to gitea per the repo's flow. The Akka-roles host fix is a
separate concern (see deployment doc) — note it but it's a box config change, not repo code.
---
## Test strategy
- **Offline (CI-safe):** unit tests over the Phase-1 captured v3 byte fixtures for every parser
(`FocasWireProtocolTests` + new `FocasWireModels`/parse tests). Keep the docker mock (v1) green.
- **Live (env-gated):** the `Driver.FOCAS.Cli` (`probe`/`read`) + the status harness, against
`10.201.31.5`. Gate behind an env var / `[Trait]` so CI without a CNC skips.
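
Env-gating can be shown with Python's `unittest`; the C# suite would use its own `[Trait]` or skip mechanism. The variable name `FOCAS_LIVE_HOST` is a hypothetical choice for this sketch.

```python
import os
import unittest
from typing import Optional

LIVE_ENV_VAR = "FOCAS_LIVE_HOST"  # hypothetical variable name


def live_host() -> Optional[str]:
    """Return the configured live CNC host, or None when the variable is unset or blank."""
    value = os.environ.get(LIVE_ENV_VAR, "").strip()
    return value or None


@unittest.skipUnless(live_host(), f"set {LIVE_ENV_VAR} to run live CNC tests")
class LiveProbeTests(unittest.TestCase):
    def test_host_is_configured(self):
        # placeholder: a real test would open a session against live_host()
        self.assertTrue(live_host())
```

With the variable unset, the whole class is reported as skipped, so CI without a CNC stays green.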
## Sequencing notes
- Phase 1 (capture) unblocks 4/5/7. Phase 2 (servo-load safety) gates enabling FixedTree on v3. Phases 4-7
are independent and parallelizable once captures exist. Phase 10 depends on whichever surfaces you want
live (macro tags already work after Phase 0, so a minimal deploy could happen now; full FixedTree wants
Phases 2/5/6).
- **Keep emitting v1 requests** unless Phase 3 proves otherwise — it's validated and minimal.
## File map
- `src/Drivers/.../Wire/FocasWireProtocol.cs` — version gate (done), request-version policy (Phase 3).
- `src/Drivers/.../Wire/FocasWireClient.cs` — CT-bound reads (Phase 2), per-command requests.
- `src/Drivers/.../Wire/FocasWireModels.cs` + parse helpers — per-command v3 struct fixes (Phases 4-7).
- `src/Drivers/.../FocasDriver.cs` — FixedTree axis iteration (Phase 6), FixedTree enable gating (Phase 2).
- `src/Drivers/.../FocasDriverProbe.cs` — wire-client probe (Phase 8).
- `scripts/focas/capture-initiate.py` — extend to data PDUs (Phase 1).
- `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` + `Fixtures/v3/` — fixtures + parser tests.
- `docs/v2/focas-version-matrix.md`, `docs/drivers/FOCAS.md` — docs (Phase 9).