feat(focas): real FANUC 30i/31i-B PDU-v3 support (live-validated on a 31i-B)

First real FOCAS hardware contact (Makino Pro 5 / 31i-B @ 10.201.31.5). A full
v3 data-PDU capture corrected the initial diagnosis: the v3 block envelope is
identical to v1, so only specific payload structs / request math / one client
robustness gap were wrong — not "framing rewrites".

Fixes (all re-validated live through the fixed driver):
- version gate: accept inbound PDU {1,3}, keep emitting v1 (FocasWireProtocol).
- cnc_rdtimer: 8-byte {minute,msec} payload is little-endian (ParseTimer) — the
  only decode with an in-range msec field.
- pmc_rdpmcrng: request range widened to the data-type byte width
  (end = start + width - 1) so a Word/Long isn't truncated to 0 values
  (was spurious BadOutOfRange); decode extracted to ParsePmcRange.
- cnc_rdsvmeter: per-axis LOADELM is 8 bytes (not 12) and names come from the
  0x0089 block — ParseServoMeters fixes the misaligned 655360 garbage. Also the
  "hang" was NetworkStream.ReadAsync not aborting a stalled socket: ReadExactlyAsync
  now disposes the stream on cancellation so a stalled peer can't wedge a poll loop.
- cnc_rddynamic2: contract guard rejecting axis < 1 (driver poll already 1-based).
- FocasDriverProbe: run a real wire session (initiate + cnc_statinfo) instead of
  degrading to Ok=true "TCP reachability only" when FWLIB is absent — a bare TCP
  listener no longer reports HEALTHY.

cnc_rdparam (0x000e) is unsupported on this control — EW_FUNC across 14
request-framing variants x 4 known-present params; needs a reference FWLIB trace
or is restricted. Deferred (deployed config uses macros, not parameters).

Tests: FOCAS suite 234 green (+16), full solution builds 0 errors. Raw v3
captures checked in under tests/.../Fixtures/v3/. Capture tools under scripts/focas/.

Docs: docs/plans/2026-06-25-focas-pdu-v3-{30i-b-support,implementation-plan}.md,
docs/drivers/FOCAS.md, docs/v2/focas-version-matrix.md,
docs/deployments/wonder-app-vd03-makino-z-34184.md.
Joseph Doherty
2026-06-25 16:41:42 -04:00
parent fd01448ac4
commit 5f0a52864c
36 changed files with 1567 additions and 177 deletions
@@ -0,0 +1,139 @@
# FOCAS PDU v3 — implementation plan (finish real 30i/31i-B support)
**Date:** 2026-06-25
**Companion to:** [`2026-06-25-focas-pdu-v3-30i-b-support.md`](2026-06-25-focas-pdu-v3-30i-b-support.md) (finding + live captures + per-command validation) and [`../deployments/wonder-app-vd03-makino-z-34184.md`](../deployments/wonder-app-vd03-makino-z-34184.md) (the deployment this unblocks).
**Goal:** make the managed `WireFocasClient` fully interoperate with a real FANUC 30i/31i-B (FOCAS Ethernet **PDU v3**), then light up the live OPC UA data on `wonder-app-vd03` for the Makino Pro 5 (`Z-34184`, `10.201.31.5`).
## ⏳ Time-boxed asset — capture/validate live FIRST
`10.201.31.5:8193` (real 31i-B) is reachable from the dev box and the wonder host **right now**. Every
per-command v3 framing fix needs a live capture + live re-validation. **Do all captures in Phase 1 while
access lasts**; the parser fixes + unit tests can be finished offline against the captured bytes later.
Tools already in place: `scripts/focas/capture-initiate.py <host>` (initiate only — extend it) and the
throwaway status harness pattern (see the finding doc).
## Current status — DONE (2026-06-25), live-validated on the real 31i-B (`10.201.31.5`)
The Phase-1 capture corrected the diagnosis (4 of 6 "framing failures" were not framing problems).
All tractable phases are implemented + unit-tested (FOCAS suite **234 green**, full solution builds 0
errors) + re-validated live. The full corrected diagnosis + per-surface evidence is in the companion
finding doc's **Resolution** section.
| Phase | Item | State |
|---|---|---|
| 0 | inbound PDU-version gate `{1,3}` | DONE — macros + all status reads work on v3 |
| 1 | capture every v3 data PDU | DONE — 20 fixtures under `tests/.../Fixtures/v3/` |
| 2 | servo "hang" → CT-bound reads | DONE — `ReadExactlyAsync` dispose-on-cancel; servo answers in 0 ms (no real wire hang) |
| 3 | request-version policy | DONE — keep emitting v1 (CNC accepts it); no command needed v3 requests |
| 4 | servo + alarms framing | DONE — servo 8-byte stride + names from 0x0089; alarms already correct (read `#3080` live) |
| 5 | timer v3 struct | DONE — `ParseTimer` little-endian {minute, msec} |
| 6 | dynamic axis iteration | DONE — driver poll already 1-based; added a `ReadDynamicAsync` contract guard |
| 7 | PMC framing | DONE — `end = start + width - 1`; **parameter framing BLOCKED** (EW_FUNC across 14 variants — see finding) |
| 8 | probe truthfulness | DONE — `FocasDriverProbe` runs a real wire session (initiate + cnc_statinfo) |
| 9 | docs + version matrix | DONE — this plan, the finding doc, `FOCAS.md`, `focas-version-matrix.md` |
| 10 | deploy to wonder + e2e | PENDING — awaiting go-ahead (production box) |
| 11 | commit + push | commit DONE on `feat/focas-pdu-v3`; push PENDING go-ahead |
**Only genuinely open v3 item:** `cnc_rdparam` (EW_FUNC on every framing — needs a reference FWLIB
trace or is restricted on this control). Deferred; the deployed config uses macros, not parameters.
---
## Phase 1 — Capture every v3 data-PDU response (live, do first)
- Extend `scripts/focas/capture-initiate.py` (or add a C# capture mode to `Driver.FOCAS.Cli`) to: run the
two-socket initiate, then send each `0x21` data request (command IDs in
`docs/v2/implementation/focas-wire-protocol.md`) and dump the raw v3 response: `cnc_rdtimer`,
`cnc_rdsvmeter`, `pmc_rdpmcrng` (R100), `cnc_rdparam` (e.g. 1320), `cnc_rdalmmsg2`, `cnc_rddynamic2`
(axis 1 — a known-good — as the v3 reference layout).
- Save raw bytes as fixtures under `tests/Drivers/.../Fixtures/v3/` for offline unit tests.
- **Acceptance:** raw v3 response bytes captured + checked in for all six commands.
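
The fixture step above could be sketched roughly as follows. This is not the actual capture tool: `save_fixture`, the `.bin`/`.hex` naming, and the directory layout are illustrative assumptions. The only point is that each raw response is stored byte-for-byte alongside a reviewable hex dump.

```python
from pathlib import Path


def save_fixture(fixture_dir: Path, command: str, raw: bytes) -> Path:
    """Store one raw response as <command>.bin plus a <command>.hex dump.

    Hypothetical helper: the file naming is an assumption, not the repo's.
    """
    fixture_dir.mkdir(parents=True, exist_ok=True)
    bin_path = fixture_dir / f"{command}.bin"
    bin_path.write_bytes(raw)  # exact bytes, for offline parser tests

    # 16 bytes per line, offset first, so diffs against v1 layouts are readable.
    lines = [
        f"{offset:04x}  {raw[offset:offset + 16].hex(' ')}"
        for offset in range(0, len(raw), 16)
    ]
    (fixture_dir / f"{command}.hex").write_text("\n".join(lines) + "\n")
    return bin_path
```

Keeping the binary file as the source of truth and the hex dump as a convenience means a parser test never depends on re-parsing text.
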
## Phase 2 — Safety: `cnc_rdsvmeter` must never hang
- Root cause: the read blocks waiting for a body length the v3 response never satisfies, and the wait
doesn't observe the `CancellationToken`. A hang here can wedge the FixedTree poll loop.
- Make the wire read honor the per-operation timeout/CT **regardless of framing** (the socket read path in
`FocasWireClient` must be CT-bound), so a bad parse fails fast as `BadCommunicationError`/timeout.
- **Acceptance:** `GetServoLoadsAsync` returns or fails within the timeout on the live 31i-B; a unit test
proves a truncated/oversized body length cancels rather than blocks. **Gating:** FixedTree must not be
enabled on a v3 control until this lands (capability probe could otherwise hang at init).
## Phase 3 — Decide request-version policy
- We currently emit v1 requests and accept v1/v3 responses; macro + most status reads work that way.
- If any Phase 5–7 command turns out to need v3-framed *requests*, thread the version negotiated from the
initiate response onto `FocasWireClient` and have `BuildPdu` emit it. Otherwise keep emitting v1.
- **Acceptance:** documented decision; negotiated version plumbed only if a command requires it.
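
The policy described above can be written down in a few lines. This is a sketch of the decision only: the function names are hypothetical, and where the version field sits in the PDU header is deliberately not modelled.

```python
ACCEPTED_INBOUND_VERSIONS = frozenset({1, 3})  # responses may be v1 or v3
DEFAULT_REQUEST_VERSION = 1                    # requests stay v1 unless proven otherwise


def check_inbound_version(version: int) -> int:
    """Reject any response whose PDU version is outside the accepted set."""
    if version not in ACCEPTED_INBOUND_VERSIONS:
        raise ValueError(f"unsupported PDU version {version}")
    return version


def request_version(negotiated: int, needs_v3_request: bool = False) -> int:
    """Emit v1 by default; use the negotiated version only if a command needs it."""
    return negotiated if needs_v3_request else DEFAULT_REQUEST_VERSION
```

Keeping the inbound gate and the outbound choice as separate functions makes it cheap to change one without touching the other.
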
## Phase 4 — Servo load + alarms v3 framing
- Diff captured `cnc_rdsvmeter` / `cnc_rdalmmsg2` bytes vs the v1 struct assumptions in `FocasWireModels.cs`
+ the `ParseServoLoad` / alarm parsers; fix offsets/strides for v3.
- **Acceptance:** servo-load % values are plausible; `ReadAlarmsAsync` returns the real active-alarm set;
unit tests over the Phase-1 fixtures; live re-validation.
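
To illustrate why a wrong record stride produces misaligned values, here is a small sketch that only splits a payload into fixed-size records and pairs them with axis names supplied separately. The field layout inside each record is intentionally left out because it is not stated here; `split_records` and `pair_axes` are hypothetical helpers.

```python
def split_records(payload: bytes, stride: int) -> list[bytes]:
    """Cut a payload into fixed-size records; refuse a length that does not fit."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    if len(payload) % stride:
        raise ValueError(f"{len(payload)} bytes is not a multiple of stride {stride}")
    return [payload[i:i + stride] for i in range(0, len(payload), stride)]


def pair_axes(names: list[str], payload: bytes, stride: int = 8) -> dict[str, bytes]:
    """Pair each axis name (read from a separate block) with its raw record."""
    records = split_records(payload, stride)
    if len(records) != len(names):
        raise ValueError(f"{len(names)} axis names but {len(records)} records")
    return dict(zip(names, records))
```

With three axes and a 24-byte payload, a stride of 8 yields three records, while a stride of 12 yields only two, so the name/record count check fails loudly instead of decoding shifted bytes.
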
## Phase 5 — Timer v3 struct
- Diff captured `cnc_rdtimer` bytes; fix the timer struct parse (running machine must show non-zero
PowerOn/Operating; Cutting sane).
- **Acceptance:** all four timers plausible on the live machine; fixture unit test; matches the
FixedTree `Timers/*` node expectations.
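
A hedged sketch of the timer decode, assuming the 8-byte payload is two 4-byte signed integers `{minute, msec}` and that a valid `msec` lies in 0..59999. Both assumptions are mine, and the bytes in the usage note are synthetic, not taken from the captures.

```python
import struct


def parse_timer(payload: bytes) -> tuple[int, int]:
    """Decode an 8-byte {minute, msec} payload as little-endian.

    Assumes two 4-byte signed fields and msec in [0, 60000).
    """
    if len(payload) != 8:
        raise ValueError(f"expected 8 bytes, got {len(payload)}")
    minute, msec = struct.unpack("<ii", payload)
    if not 0 <= msec < 60_000:
        raise ValueError(f"msec {msec} out of range; wrong byte order?")
    return minute, msec
```

For example, `parse_timer(struct.pack("<ii", 123456, 42000))` returns `(123456, 42000)`, while the same two values packed big-endian fail the range check. A range check of this kind is one way to tell the two byte orders apart.
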
## Phase 6 — Dynamic axis iteration (1-based)
- FixedTree currently probes axis 0 → `EW_4`. Iterate `1..AxesCount` (from `cnc_sysinfo`); never request 0.
- **Acceptance:** every configured axis (per sysinfo `AxesCount`) yields a `FocasDynamicSnapshot`; no `EW_4`.
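
The 1-based iteration and the matching guard are small enough to sketch directly. The function names are illustrative, not the driver's.

```python
def axis_numbers(axes_count: int) -> range:
    """Axis numbers to poll: 1..axes_count inclusive, never 0."""
    return range(1, axes_count + 1)


def require_valid_axis(axis: int) -> int:
    """Contract guard: reject axis numbers below 1 before any request is built."""
    if axis < 1:
        raise ValueError(f"axis must be >= 1, got {axis}")
    return axis
```

Putting the guard at the read entry point means a caller that regresses to 0-based indexing fails locally instead of surfacing as `EW_4` from the CNC.
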
## Phase 7 — PMC + Parameter v3 framing
- Diff captured `pmc_rdpmcrng` (R100) + `cnc_rdparam` (1320) bytes vs the v1 `IODBPMC0` / `IODBPSD` shapes;
fix v3 parsing. Confirm whether the failures are framing or genuine CNC restriction (PMC path / param
presence) — macro working proves the envelope is fine, so suspect struct offsets first.
- **Acceptance:** `R100` reads a plausible value (or a *correct* status if genuinely restricted); a known
parameter reads its value; fixture unit tests; live re-validation.
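
The PMC range arithmetic (`end = start + width - 1`) can be checked with a short sketch. The type names and the `values_in_range` helper are illustrative assumptions; only the widths (1, 2, 4 bytes) and the formula are taken from the fix described here.

```python
PMC_WIDTH_BYTES = {"byte": 1, "word": 2, "long": 4}


def pmc_request_range(start: int, data_type: str) -> tuple[int, int]:
    """Inclusive address range covering one value of the given width."""
    width = PMC_WIDTH_BYTES[data_type]
    return start, start + width - 1


def values_in_range(start: int, end: int, data_type: str) -> int:
    """How many whole values of this width fit in the inclusive range."""
    return (end - start + 1) // PMC_WIDTH_BYTES[data_type]
```

For a Word at R100, `pmc_request_range(100, "word")` gives `(100, 101)`, which holds one value. A range of `(100, 100)` holds `values_in_range(100, 100, "word") == 0` values, which is the truncation that a too-narrow request produces.
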
## Phase 8 — Probe truthfulness
- `FocasDriverProbe` Phase-2 degrades to `Ok=true` ("TCP only") when FWLIB is absent → HEALTHY off a bare
socket. Replace with a wire-client probe: open `WireFocasClient` + one sample read (e.g. sysinfo). Keep
the TCP preflight for fast rejection.
- **Acceptance:** probe reports unhealthy when the CNC TCP-accepts but FOCAS reads fail; HEALTHY only on a
real session + read.
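
A sketch of the probe ordering described above: a TCP preflight for fast rejection, then a real session with one sample read. The two callables stand in for the actual connect and read operations; `ProbeResult` and `probe` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    ok: bool
    detail: str


def probe(tcp_connect: Callable[[], None],
          session_read: Callable[[], object]) -> ProbeResult:
    """Healthy only if the TCP preflight AND a real session read both succeed."""
    try:
        tcp_connect()
    except OSError as exc:
        return ProbeResult(False, f"TCP preflight failed: {exc}")
    try:
        session_read()  # e.g. initiate plus one status read
    except Exception as exc:
        return ProbeResult(False, f"FOCAS session/read failed: {exc}")
    return ProbeResult(True, "session + read OK")
```

The key property is that a listener which merely accepts the TCP connection ends up in the second `except` branch and is reported unhealthy.
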
## Phase 9 — Docs + version matrix
- Add a real-hardware row to `docs/v2/focas-version-matrix.md`: 30i/31i-B → PDU v3; record which command
families are validated. Update `docs/drivers/FOCAS.md` + this plan's status as phases land.
## Phase 10 — Deploy to wonder + end-to-end verify
- Optional: set the device series to `ThirtyOne_i` (sysinfo says CncType 31; capability ranges identical to
`Thirty_i`, so cosmetic).
- Rebuild a self-contained win-x64 publish of `ZB.MOM.WW.OtOpcUa.Host` (or swap just
`ZB.MOM.WW.OtOpcUa.Driver.FOCAS.dll`) into `E:\ApiInstall\OtOpcUa\` on `wonder-app-vd03`, preserving
`appsettings*.json` + `data\`; restart `OtOpcUaHost`. (Access: servecli `:2222`, key
`~/.ssh/servecli_wonder` — see the deployment doc + memory.)
- **Re-run a deployment** in the AdminUI afterward — FixedTree nodes are emitted at `DiscoverAsync`, so the
address space must be rebuilt to surface them.
- **Acceptance (via the OtOpcUa CLI client → `opc.tcp://wonder-app-vd03.zmr.zimmer.com:4840/OtOpcUa`):**
`ns=2;s=EQ-3686c0272279/parts-count` + `/parts-required` read **Good**; FixedTree Identity/Axes/Program
nodes present with live values; (timers/servo-load good once Phases 4–5 land).
## Phase 11 — Commit + push
- Commit source + tests + docs on a branch `feat/focas-pdu-v3` (keep it separate from the unrelated
pre-existing local edits in the tree). Push to gitea per the repo's flow. The Akka-roles host fix is a
separate concern (see deployment doc) — note it but it's a box config change, not repo code.
---
## Test strategy
- **Offline (CI-safe):** unit tests over the Phase-1 captured v3 byte fixtures for every parser
(`FocasWireProtocolTests` + new `FocasWireModels`/parse tests). Keep the docker mock (v1) green.
- **Live (env-gated):** the `Driver.FOCAS.Cli` (`probe`/`read`) + the status harness, against
`10.201.31.5`. Gate behind an env var / `[Trait]` so CI without a CNC skips.
## Sequencing notes
- Phase 1 (capture) unblocks 4/5/7. Phase 2 (servo-load safety) gates enabling FixedTree on v3. Phases 4–7
are independent and parallelizable once captures exist. Phase 10 depends on whichever surfaces you want
live (macro tags already work after Phase 0, so a minimal deploy could happen now; full FixedTree wants
Phases 2/5/6).
- **Keep emitting v1 requests** unless Phase 3 proves otherwise — it's validated and minimal.
## File map
- `src/Drivers/.../Wire/FocasWireProtocol.cs` — version gate (done), request-version policy (Phase 3).
- `src/Drivers/.../Wire/FocasWireClient.cs` — CT-bound reads (Phase 2), per-command requests.
- `src/Drivers/.../Wire/FocasWireModels.cs` + parse helpers — per-command v3 struct fixes (Phases 4–7).
- `src/Drivers/.../FocasDriver.cs` — FixedTree axis iteration (Phase 6), FixedTree enable gating (Phase 2).
- `src/Drivers/.../FocasDriverProbe.cs` — wire-client probe (Phase 8).
- `scripts/focas/capture-initiate.py` — extend to data PDUs (Phase 1).
- `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/` + `Fixtures/v3/` — fixtures + parser tests.
- `docs/v2/focas-version-matrix.md`, `docs/drivers/FOCAS.md` — docs (Phase 9).