lmxopcua/docs/plans/2026-06-25-focas-pdu-v3-implementation-plan.md
Joseph Doherty 5f0a52864c feat(focas): real FANUC 30i/31i-B PDU-v3 support (live-validated on a 31i-B)
First real FOCAS hardware contact (Makino Pro 5 / 31i-B @ 10.201.31.5). A full
v3 data-PDU capture corrected the initial diagnosis: the v3 block envelope is
identical to v1, so only specific payload structs / request math / one client
robustness gap were wrong — not "framing rewrites".

Fixes (all re-validated live through the fixed driver):
- version gate: accept inbound PDU {1,3}, keep emitting v1 (FocasWireProtocol).
- cnc_rdtimer: 8-byte {minute,msec} payload is little-endian (ParseTimer) — the
  only decode with an in-range msec field.
- pmc_rdpmcrng: request range widened to the data-type byte width
  (end = start + width - 1) so a Word/Long isn't truncated to 0 values
  (was spurious BadOutOfRange); decode extracted to ParsePmcRange.
- cnc_rdsvmeter: per-axis LOADELM is 8 bytes (not 12) and names come from the
  0x0089 block — ParseServoMeters fixes the misaligned 655360 garbage. Also the
  "hang" was NetworkStream.ReadAsync not aborting a stalled socket: ReadExactlyAsync
  now disposes the stream on cancellation so a stalled peer can't wedge a poll loop.
- cnc_rddynamic2: contract guard rejecting axis < 1 (driver poll already 1-based).
- FocasDriverProbe: run a real wire session (initiate + cnc_statinfo) instead of
  degrading to Ok=true "TCP reachability only" when FWLIB is absent — a bare TCP
  listener no longer reports HEALTHY.

cnc_rdparam (0x000e) is unsupported on this control — EW_FUNC across 14
request-framing variants x 4 known-present params; needs a reference FWLIB trace
or is restricted. Deferred (deployed config uses macros, not parameters).

Tests: FOCAS suite 234 green (+16), full solution builds 0 errors. Raw v3
captures checked in under tests/.../Fixtures/v3/. Capture tools under scripts/focas/.

Docs: docs/plans/2026-06-25-focas-pdu-v3-{30i-b-support,implementation-plan}.md,
docs/drivers/FOCAS.md, docs/v2/focas-version-matrix.md,
docs/deployments/wonder-app-vd03-makino-z-34184.md.
2026-06-25 16:41:42 -04:00


FOCAS PDU v3 — implementation plan (finish real 30i/31i-B support)

Date: 2026-06-25
Companion to: 2026-06-25-focas-pdu-v3-30i-b-support.md (finding + live captures + per-command validation) and ../deployments/wonder-app-vd03-makino-z-34184.md (the deployment this unblocks).
Goal: make the managed WireFocasClient fully interoperate with a real FANUC 30i/31i-B (FOCAS Ethernet PDU v3), then light up the live OPC UA data on wonder-app-vd03 for the Makino Pro 5 (Z-34184, 10.201.31.5).

Time-boxed asset — capture/validate live FIRST

10.201.31.5:8193 (real 31i-B) is reachable from the dev box and the wonder host right now. Every per-command v3 framing fix needs a live capture + live re-validation. Do all captures in Phase 1 while access lasts; the parser fixes + unit tests can be finished offline against the captured bytes later. Tools already in place: scripts/focas/capture-initiate.py <host> (initiate only — extend it) and the throwaway status harness pattern (see the finding doc).

Current status — DONE (2026-06-25), live-validated on the real 31i-B (10.201.31.5)

The Phase-1 capture corrected the diagnosis (4 of 6 "framing failures" were not framing problems). All tractable phases are implemented + unit-tested (FOCAS suite 234 green, full solution builds 0 errors) + re-validated live. The full corrected diagnosis + per-surface evidence is in the companion finding doc's Resolution section.

| Phase | Item | State |
|---|---|---|
| 0 | inbound PDU-version gate {1,3} | DONE: macros + all status reads work on v3 |
| 1 | capture every v3 data PDU | DONE: 20 fixtures under tests/.../Fixtures/v3/ |
| 2 | servo "hang" → CT-bound reads | DONE: ReadExactlyAsync dispose-on-cancel; servo answers in 0 ms (no real wire hang) |
| 3 | request-version policy | DONE: keep emitting v1 (CNC accepts it); no command needed v3 requests |
| 4 | servo + alarms framing | DONE: servo 8-byte stride + names from 0x0089; alarms already correct (read #3080 live) |
| 5 | timer v3 struct | DONE: ParseTimer little-endian {minute, msec} |
| 6 | dynamic axis iteration | DONE: driver poll already 1-based; added a ReadDynamicAsync contract guard |
| 7 | PMC framing | DONE: end = start + width - 1; parameter framing BLOCKED (EW_FUNC across 14 variants; see finding) |
| 8 | probe truthfulness | DONE: FocasDriverProbe runs a real wire session (initiate + cnc_statinfo) |
| 9 | docs + version matrix | DONE: this plan, the finding doc, FOCAS.md, focas-version-matrix.md |
| 10 | deploy to wonder + e2e | PENDING: awaiting go-ahead (production box) |
| 11 | commit + push | commit DONE on feat/focas-pdu-v3; push PENDING go-ahead |

Only genuinely open v3 item: cnc_rdparam (EW_FUNC on every framing — needs a reference FWLIB trace or is restricted on this control). Deferred; the deployed config uses macros, not parameters.


Phase 1 — Capture every v3 data-PDU response (live, do first)

  • Extend scripts/focas/capture-initiate.py (or add a C# capture mode to Driver.FOCAS.Cli) to: run the two-socket initiate, then send each 0x21 data request (command IDs in docs/v2/implementation/focas-wire-protocol.md) and dump the raw v3 response: cnc_rdtimer, cnc_rdsvmeter, pmc_rdpmcrng (R100), cnc_rdparam (e.g. 1320), cnc_rdalmmsg2, cnc_rddynamic2 (axis 1 — a known-good — as the v3 reference layout).
  • Save raw bytes as fixtures under tests/Drivers/.../Fixtures/v3/ for offline unit tests.
  • Acceptance: raw v3 response bytes captured + checked in for all six commands.
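
The fixture step and its acceptance check could look roughly like the sketch below. It is only an illustration: the `<command>.bin` naming, the helper names, and the flat directory layout are assumptions, not the layout actually used under Fixtures/v3/.

```python
from pathlib import Path

# The six commands named in the Phase 1 capture list.
PHASE1_COMMANDS = [
    "cnc_rdtimer", "cnc_rdsvmeter", "pmc_rdpmcrng",
    "cnc_rdparam", "cnc_rdalmmsg2", "cnc_rddynamic2",
]

def save_fixture(fixture_dir: Path, command: str, raw: bytes) -> Path:
    """Write one raw response to <fixture_dir>/<command>.bin (hypothetical naming)."""
    fixture_dir.mkdir(parents=True, exist_ok=True)
    path = fixture_dir / f"{command}.bin"
    path.write_bytes(raw)
    return path

def load_fixture(fixture_dir: Path, command: str) -> bytes:
    """Read a captured response back, byte for byte, for an offline parser test."""
    return (fixture_dir / f"{command}.bin").read_bytes()

def missing_captures(fixture_dir: Path, commands=PHASE1_COMMANDS) -> list:
    """Return the commands that have no fixture yet (empty list = acceptance met)."""
    return [c for c in commands if not (fixture_dir / f"{c}.bin").is_file()]
```

Storing the bytes unmodified keeps the offline parser tests honest: they decode exactly what the control sent.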

Phase 2 — Safety: cnc_rdsvmeter must never hang

  • Root cause: the read blocks waiting for a body length the v3 response never satisfies, and the wait doesn't observe the CancellationToken. A hang here can wedge the FixedTree poll loop.
  • Make the wire read honor the per-operation timeout/CT regardless of framing (the socket read path in FocasWireClient must be CT-bound), so a bad parse fails fast as BadCommunicationError/timeout.
  • Acceptance: GetServoLoadsAsync returns or fails within the timeout on the live 31i-B; a unit test proves a truncated/oversized body length cancels rather than blocks. Gating: FixedTree must not be enabled on a v3 control until this lands (capability probe could otherwise hang at init).
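
As a rough illustration of a bounded exact-length read, here is a Python asyncio analogue. It is not the driver's C# code; `read_exactly_bounded` and `demo_stalled_peer` are hypothetical names, and closing the writer stands in for disposing the stream.

```python
import asyncio
from typing import Optional

async def read_exactly_bounded(reader: asyncio.StreamReader,
                               writer: Optional[asyncio.StreamWriter],
                               n: int, timeout: float) -> bytes:
    """Read exactly n bytes or fail within `timeout` seconds.

    On timeout or early EOF the connection is closed so that a stalled
    peer cannot keep a later read (or a poll loop) waiting on the same socket.
    """
    try:
        return await asyncio.wait_for(reader.readexactly(n), timeout)
    except (asyncio.TimeoutError, asyncio.IncompleteReadError):
        if writer is not None:
            writer.close()
        raise

async def demo_stalled_peer() -> str:
    """Simulate a peer that sends 3 of 8 expected bytes and then goes silent."""
    reader = asyncio.StreamReader()
    reader.feed_data(b"\x00\x01\x02")  # no EOF: the peer just stalls
    try:
        await read_exactly_bounded(reader, None, 8, timeout=0.05)
    except asyncio.TimeoutError:
        return "timed out"
    return "read"
```

The point of the shape is that the timeout wraps the read itself, so a wrong body length surfaces as a timeout instead of an indefinite block.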

Phase 3 — Decide request-version policy

  • We currently emit v1 requests and accept v1/v3 responses; macro + most status reads work that way.
  • If any Phase 5-7 command turns out to need v3-framed requests, thread the version negotiated from the initiate response onto FocasWireClient and have BuildPdu emit it. Otherwise keep emitting v1.
  • Acceptance: documented decision; negotiated version plumbed only if a command requires it.
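
The policy above can be stated compactly as a sketch. The constant and function names are hypothetical, and where the version field sits in the PDU header is deliberately left out.

```python
# Versions the client accepts on inbound PDUs, and the one it emits.
ACCEPTED_INBOUND_VERSIONS = frozenset({1, 3})
DEFAULT_REQUEST_VERSION = 1

def check_inbound_version(version: int) -> int:
    """Accept v1 and v3 responses; reject anything else."""
    if version not in ACCEPTED_INBOUND_VERSIONS:
        raise ValueError(f"unsupported inbound PDU version {version}")
    return version

def request_version(negotiated: int, needs_v3_request: bool = False) -> int:
    """Emit v1 unless a command is shown to require the negotiated version."""
    return negotiated if needs_v3_request else DEFAULT_REQUEST_VERSION
```

Keeping the override as an explicit per-command flag means the negotiated version only needs plumbing if a command actually demands it.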

Phase 4 — Servo load + alarms v3 framing

  • Diff captured cnc_rdsvmeter / cnc_rdalmmsg2 bytes vs the v1 struct assumptions in FocasWireModels.cs + the ParseServoLoad / alarm parsers; fix offsets/strides for v3.
  • Acceptance: servo-load % values are plausible; ReadAlarmsAsync returns the real active-alarm set; unit tests over the Phase-1 fixtures; live re-validation.
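
To show why a wrong stride yields garbage rather than an error, the sketch below splits a payload into fixed-size records and treats each record as opaque bytes. The field layout inside a record is not assumed, and `split_records` is a hypothetical helper.

```python
def split_records(payload: bytes, stride: int) -> list:
    """Split a payload into consecutive fixed-size records.

    The length check catches only some wrong strides: a 24-byte payload
    divides evenly by both 8 and 12, so a wrong stride can still "succeed"
    and return records that straddle the real boundaries.
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    if len(payload) % stride != 0:
        raise ValueError("payload length is not a multiple of the stride")
    return [payload[i:i + stride] for i in range(0, len(payload), stride)]

# Three records of 8 bytes each, filled with their own byte offsets.
payload = bytes(range(24))
aligned = split_records(payload, 8)      # 3 records, each on a real boundary
misaligned = split_records(payload, 12)  # 2 records, second starts mid-record
```

Here `misaligned[1]` begins at offset 12, halfway through the second real record, which is how plausible-looking but meaningless values can appear.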

Phase 5 — Timer v3 struct

  • Diff captured cnc_rdtimer bytes; fix the timer struct parse (running machine must show non-zero PowerOn/Operating; Cutting sane).
  • Acceptance: all four timers plausible on the live machine; fixture unit test; matches the FixedTree Timers/* node expectations.
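
One way to settle the byte order from a capture is to decode both ways and keep whichever leaves the msec field in range. The sketch assumes two signed 32-bit fields and a msec remainder below 60000; both are assumptions, and the function names are hypothetical.

```python
import struct

MSEC_PER_MINUTE = 60_000  # assumption: msec is the sub-minute remainder

def plausible_byte_orders(payload: bytes) -> list:
    """Return the byte orders whose msec field lands in 0..59999."""
    if len(payload) != 8:
        raise ValueError("timer payload must be 8 bytes")
    orders = []
    for name, fmt in (("little", "<ii"), ("big", ">ii")):
        _minute, msec = struct.unpack(fmt, payload)
        if 0 <= msec < MSEC_PER_MINUTE:
            orders.append(name)
    return orders

def parse_timer(payload: bytes) -> tuple:
    """Decode {minute, msec} as little-endian and range-check msec."""
    if len(payload) != 8:
        raise ValueError("timer payload must be 8 bytes")
    minute, msec = struct.unpack("<ii", payload)
    if not 0 <= msec < MSEC_PER_MINUTE:
        raise ValueError(f"msec out of range: {msec}")
    return minute, msec
```

A payload whose msec is a small non-zero number reads as a huge value under the wrong byte order, so the range check discriminates cleanly.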

Phase 6 — Dynamic axis iteration (1-based)

  • FixedTree currently probes axis 0 → EW_4. Iterate 1..AxesCount (from cnc_sysinfo); never request 0.
  • Acceptance: every configured axis (per sysinfo AxesCount) yields a FocasDynamicSnapshot; no EW_4.
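
The iteration rule and the contract guard amount to the following sketch; `axes_to_poll` and `require_valid_axis` are hypothetical names for illustration.

```python
def axes_to_poll(axes_count: int) -> range:
    """Axis numbers to request: 1..axes_count inclusive, never 0."""
    if axes_count < 0:
        raise ValueError("axes_count must be non-negative")
    return range(1, axes_count + 1)

def require_valid_axis(axis: int) -> int:
    """Contract guard: reject axis < 1 before a request goes on the wire."""
    if axis < 1:
        raise ValueError(f"axis must be >= 1, got {axis}")
    return axis
```

For a three-axis machine, `list(axes_to_poll(3))` yields `[1, 2, 3]`, and a stray request for axis 0 fails locally instead of coming back from the control as EW_4.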

Phase 7 — PMC + Parameter v3 framing

  • Diff captured pmc_rdpmcrng (R100) + cnc_rdparam (1320) bytes vs the v1 IODBPMC0 / IODBPSD shapes; fix v3 parsing. Confirm whether the failures are framing or genuine CNC restriction (PMC path / param presence) — macro working proves the envelope is fine, so suspect struct offsets first.
  • Acceptance: R100 reads a plausible value (or a correct status if genuinely restricted); a known parameter reads its value; fixture unit tests; live re-validation.
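
For PMC reads, the request range has to cover the full byte width of the data type, i.e. end = start + width - 1. A small worked sketch, assuming the usual widths of 1, 2, and 4 bytes for Byte, Word, and Long (the table and function names are hypothetical):

```python
# Assumed byte widths per PMC data type.
PMC_WIDTH_BYTES = {"Byte": 1, "Word": 2, "Long": 4}

def pmc_request_range(start: int, data_type: str) -> tuple:
    """Return (start, end) so the range spans the whole value."""
    width = PMC_WIDTH_BYTES[data_type]
    return start, start + width - 1

# R100 as a Word needs R100..R101; as a Long, R100..R103.
word_range = pmc_request_range(100, "Word")
long_range = pmc_request_range(100, "Long")
```

Requesting R100..R100 for a Word covers only one of its two bytes, which is consistent with a multi-byte read coming back with zero values.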

Phase 8 — Probe truthfulness

  • FocasDriverProbe Phase-2 degrades to Ok=true ("TCP only") when FWLIB is absent → HEALTHY off a bare socket. Replace with a wire-client probe: open WireFocasClient + one sample read (e.g. sysinfo). Keep the TCP preflight for fast rejection.
  • Acceptance: probe reports unhealthy when the CNC TCP-accepts but FOCAS reads fail; HEALTHY only on a real session + read.
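
The probe logic can be sketched with the two checks injected as callables, so it can be exercised without a CNC. `ProbeResult`, `probe`, and the callable parameters are hypothetical and do not reflect the FocasDriverProbe API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProbeResult:
    healthy: bool
    detail: str

def probe(tcp_connects: Callable[[], bool],
          session_read: Callable[[], object]) -> ProbeResult:
    """Healthy only when the TCP preflight AND a real session read both succeed."""
    # Fast rejection: nothing is listening, so skip the session attempt.
    if not tcp_connects():
        return ProbeResult(False, "TCP preflight failed")
    try:
        session_read()  # e.g. initiate + one sample read
    except Exception as exc:
        # A bare TCP listener ends up here instead of reporting healthy.
        return ProbeResult(False, f"TCP accepted but FOCAS read failed: {exc}")
    return ProbeResult(True, "session + sample read succeeded")
```

The key property is that there is no path to `healthy=True` that bypasses the session read.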

Phase 9 — Docs + version matrix

  • Add a real-hardware row to docs/v2/focas-version-matrix.md: 30i/31i-B → PDU v3; record which command families are validated. Update docs/drivers/FOCAS.md + this plan's status as phases land.

Phase 10 — Deploy to wonder + end-to-end verify

  • Optional: set the device series to ThirtyOne_i (sysinfo says CncType 31; capability ranges identical to Thirty_i, so cosmetic).
  • Rebuild a self-contained win-x64 publish of ZB.MOM.WW.OtOpcUa.Host (or swap just ZB.MOM.WW.OtOpcUa.Driver.FOCAS.dll) into E:\ApiInstall\OtOpcUa\ on wonder-app-vd03, preserving appsettings*.json + data\; restart OtOpcUaHost. (Access: servecli :2222, key ~/.ssh/servecli_wonder — see the deployment doc + memory.)
  • Re-run a deployment in the AdminUI afterward — FixedTree nodes are emitted at DiscoverAsync, so the address space must be rebuilt to surface them.
  • Acceptance (via the OtOpcUa CLI client → opc.tcp://wonder-app-vd03.zmr.zimmer.com:4840/OtOpcUa): ns=2;s=EQ-3686c0272279/parts-count + /parts-required read Good; FixedTree Identity/Axes/Program nodes present with live values; (timers/servo-load good once Phases 4-5 land).

Phase 11 — Commit + push

  • Commit source + tests + docs on a branch feat/focas-pdu-v3 (keep it separate from the unrelated pre-existing local edits in the tree). Push to gitea per the repo's flow. The Akka-roles host fix is a separate concern (see deployment doc) — note it but it's a box config change, not repo code.

Test strategy

  • Offline (CI-safe): unit tests over the Phase-1 captured v3 byte fixtures for every parser (FocasWireProtocolTests + new FocasWireModels/parse tests). Keep the docker mock (v1) green.
  • Live (env-gated): the Driver.FOCAS.Cli (probe/read) + the status harness, against 10.201.31.5. Gate behind an env var / [Trait] so CI without a CNC skips.
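
The env-gating idea, shown here in Python's unittest rather than the repo's C# test framework, might look like the sketch below. The variable name `FOCAS_LIVE_HOST` and the test class are hypothetical; only port 8193 comes from this plan.

```python
import os
import socket
import unittest

LIVE_ENV_VAR = "FOCAS_LIVE_HOST"  # hypothetical variable name
FOCAS_PORT = 8193

def live_host():
    """Return the configured live CNC host, or None when not set."""
    return os.environ.get(LIVE_ENV_VAR) or None

class LiveReachabilityTest(unittest.TestCase):
    def setUp(self):
        # Skip at run time (not import time) so CI without a CNC stays green.
        if live_host() is None:
            self.skipTest(f"{LIVE_ENV_VAR} not set; live CNC tests skipped")

    def test_tcp_port_open(self):
        with socket.create_connection((live_host(), FOCAS_PORT), timeout=5):
            pass
```

Skipping inside `setUp` means the decision follows the environment at execution time, so the same test module serves both CI and a bench session.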

Sequencing notes

  • Phase 1 (capture) unblocks 4/5/7. Phase 2 (servo-load safety) gates enabling FixedTree on v3. Phases 4-7 are independent and parallelizable once captures exist. Phase 10 depends on whichever surfaces you want live (macro tags already work after Phase 0, so a minimal deploy could happen now; full FixedTree wants Phases 2/5/6).
  • Keep emitting v1 requests unless Phase 3 proves otherwise — it's validated and minimal.

File map

  • src/Drivers/.../Wire/FocasWireProtocol.cs — version gate (done), request-version policy (Phase 3).
  • src/Drivers/.../Wire/FocasWireClient.cs — CT-bound reads (Phase 2), per-command requests.
  • src/Drivers/.../Wire/FocasWireModels.cs + parse helpers — per-command v3 struct fixes (Phases 4-7).
  • src/Drivers/.../FocasDriver.cs — FixedTree axis iteration (Phase 6), FixedTree enable gating (Phase 2).
  • src/Drivers/.../FocasDriverProbe.cs — wire-client probe (Phase 8).
  • scripts/focas/capture-initiate.py — extend to data PDUs (Phase 1).
  • tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/ + Fixtures/v3/ — fixtures + parser tests.
  • docs/v2/focas-version-matrix.md, docs/drivers/FOCAS.md — docs (Phase 9).