feat(focas): real FANUC 30i/31i-B PDU-v3 support (live-validated on a 31i-B)

First contact with real FOCAS hardware (Makino Pro 5 / 31i-B @ 10.201.31.5). A full
v3 data-PDU capture corrected the initial diagnosis: the v3 block envelope is
identical to v1, so no framing rewrite was needed. The actual defects were specific
payload structs, request-range math, and one client robustness gap.

Fixes (all re-validated live through the fixed driver):
- version gate: accept inbound PDU {1,3}, keep emitting v1 (FocasWireProtocol).
- cnc_rdtimer: 8-byte {minute,msec} payload is little-endian (ParseTimer); that is
  the only byte order that decodes to an in-range msec field.
- pmc_rdpmcrng: request range widened to the data-type byte width
  (end = start + width - 1) so a Word/Long isn't truncated to 0 values
  (was spurious BadOutOfRange); decode extracted to ParsePmcRange.
- cnc_rdsvmeter: per-axis LOADELM is 8 bytes (not 12) and names come from the
  0x0089 block; ParseServoMeters fixes the misaligned decode that produced garbage
  values such as 655360. The reported "hang" was a separate bug: NetworkStream.ReadAsync
  did not abort on a stalled socket. ReadExactlyAsync now disposes the stream on
  cancellation so a stalled peer can't wedge a poll loop.
- cnc_rddynamic2: contract guard rejecting axis < 1 (driver poll already 1-based).
- FocasDriverProbe: run a real wire session (initiate + cnc_statinfo) instead of
  degrading to Ok=true "TCP reachability only" when FWLIB is absent — a bare TCP
  listener no longer reports HEALTHY.
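
As a rough illustration of the cnc_rdtimer and pmc_rdpmcrng fixes listed above, the
sketch below shows the range math and a little-endian decode. The class and method
names are hypothetical (they are not the driver's ParseTimer / ParsePmcRange), and
treating the 8-byte timer payload as two 32-bit integers is an assumption based on
its size.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helpers for illustration only; not the driver's actual code.
static class FocasSketch
{
    // Widen a PMC read so the request covers every byte of one value:
    // end = start + width - 1 (assumed widths: 1 = Byte, 2 = Word, 4 = Long).
    public static int PmcRangeEnd(int start, int widthBytes)
    {
        if (widthBytes is not (1 or 2 or 4))
            throw new ArgumentOutOfRangeException(nameof(widthBytes));
        return start + widthBytes - 1;
    }

    // Decode an 8-byte {minute, msec} payload as two little-endian int32 fields
    // (assumed layout: minute at offset 0, msec at offset 4).
    public static (int Minute, int Msec) ParseTimerLittleEndian(ReadOnlySpan<byte> payload)
    {
        if (payload.Length < 8)
            throw new ArgumentException("timer payload must be at least 8 bytes", nameof(payload));
        int minute = BinaryPrimitives.ReadInt32LittleEndian(payload);
        int msec = BinaryPrimitives.ReadInt32LittleEndian(payload.Slice(4));
        return (minute, msec);
    }
}
```

With these helpers, a Word at address 100 requests the range 100..101, and the bytes
0A 00 00 00 DC 05 00 00 decode to minute 10, msec 1500. Read big-endian, the same
msec bytes become a negative number, which is the kind of out-of-range result that
rules that byte order out.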

cnc_rdparam (0x000e) is unsupported on this control: it returned EW_FUNC across 14
request-framing variants x 4 known-present params. Either the function is restricted
on this control or the correct framing needs a reference FWLIB trace. Deferred
(deployed config uses macros, not parameters).

Tests: FOCAS suite 234 green (+16), full solution builds 0 errors. Raw v3
captures checked in under tests/.../Fixtures/v3/. Capture tools under scripts/focas/.

Docs: docs/plans/2026-06-25-focas-pdu-v3-{30i-b-support,implementation-plan}.md,
docs/drivers/FOCAS.md, docs/v2/focas-version-matrix.md,
docs/deployments/wonder-app-vd03-makino-z-34184.md.
Joseph Doherty
2026-06-25 16:41:42 -04:00
parent fd01448ac4
commit 5f0a52864c
36 changed files with 1567 additions and 177 deletions
@@ -1,9 +1,9 @@
 using System.Diagnostics;
 using System.Net.Sockets;
-using System.Runtime.InteropServices;
 using System.Text.Json;
 using System.Text.Json.Serialization;
 using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
+using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Wire;
 namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
@@ -11,35 +11,29 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
 /// Two-phase Test-Connect probe for the <see cref="FocasDriverOptions"/>-shaped driver config.
 /// Phase 1: bare TCP connect to the first device's FOCAS Ethernet address + port to quickly
 /// reject unreachable targets (preserves the original "Connect failed" / "timed out"
-/// messages). Phase 2: attempts the FANUC FWLIB handle handshake — allocates a CNC handle via
-/// <c>cnc_allclibhndl3(host, port, timeoutSec, out handle)</c> and immediately frees it with
-/// <c>cnc_freelibhndl</c>. A handle that allocates (<c>EW_OK</c>) confirms the remote endpoint
-/// is a real FOCAS CNC, not just a TCP listener.
+/// messages). Phase 2: a real FOCAS session via the managed <see cref="FocasWireClient"/> — the
+/// two-socket initiate handshake plus one sample read (<c>cnc_statinfo</c>). A handshake +
+/// read that succeeds confirms the remote endpoint is a real FOCAS CNC, not just a TCP
+/// listener.
 /// <para>
-/// The P/Invoke is issued directly (it does NOT route through
-/// <see cref="UnimplementedFocasClientFactory"/>, whose <c>EnsureUsable()</c> throws by
-/// design) so the handshake works on a real Windows+FWLIB host and degrades everywhere else.
-/// The synchronous native call can block, so it runs on a worker bounded by a linked CTS
-/// (<c>ct</c> + <c>CancelAfter(timeout)</c>) — the probe always returns within the timeout
-/// budget even if FWLIB hangs.
+/// <b>Why a wire-client probe (not FWLIB).</b> The pure-managed wire client is the driver's
+/// only read backend (the FWLIB / out-of-process paths were retired in the Wire migration), so
+/// the probe must exercise the same path the driver actually uses. The previous probe issued
+/// the <c>cnc_allclibhndl3</c> FWLIB P/Invoke and, on any host without the native library (the
+/// normal case — macOS dev boxes, Linux CI, and the Windows hosts that run the managed client),
+/// degraded to <c>Ok=true</c> "TCP reachability only". That made every bare TCP listener look
+/// HEALTHY — exactly how a Makino 31i-B looked "healthy" while no FOCAS data flowed. The wire
+/// probe reports HEALTHY only on a genuine FOCAS session + read. See
+/// <c>docs/plans/2026-06-25-focas-pdu-v3-30i-b-support.md</c> (Phase 8).
 /// </para>
 /// <para>
-/// <b>Degrade guard (the crux).</b> On a host without the FWLIB native library — this dev box
-/// (macOS) and the Linux CI containers — the <c>cnc_allclibhndl3</c> P/Invoke fails to bind
-/// and throws <see cref="DllNotFoundException"/> (or a related load failure:
-/// <see cref="TypeInitializationException"/>, <see cref="NotSupportedException"/>,
-/// <see cref="BadImageFormatException"/>, <see cref="EntryPointNotFoundException"/>). Those are
-/// caught and the probe falls back to <c>Ok=true</c> with a "FWLIB absent — TCP reachability
-/// only" note, so the probe is never worse than the original TCP-only behaviour on FWLIB-less
-/// hosts. The happy path and the FWLIB-present CNC-error path are live-verify deferred (no CNC
-/// and no FWLIB on the rig).
+/// The wire client honours the linked CTS (<c>ct</c> + <c>CancelAfter(timeout)</c>) and its
+/// reads are abort-bounded (see <see cref="FocasWireProtocol"/>), so the probe always returns
+/// within the timeout budget even against a host that accepts TCP then stalls.
 /// </para>
 /// </summary>
 public sealed class FocasDriverProbe : IDriverProbe
 {
-    /// <summary>FANUC FWLIB return code for success (<c>EW_OK</c>).</summary>
-    private const short EwOk = 0;
     private static readonly JsonSerializerOptions _opts = new()
     {
         PropertyNameCaseInsensitive = true,
@@ -83,75 +77,32 @@ public sealed class FocasDriverProbe : IDriverProbe
             return new(false, ex.Message, null);
         }
-        // Phase 2: FOCAS handle handshake via cnc_allclibhndl3. The native call is synchronous and
-        // can block, so run it on a worker bounded by a linked CTS = ct + CancelAfter(timeout).
-        using var handshakeCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
+        // Phase 2: real FOCAS session via the managed wire client — initiate handshake + one
+        // sample read. Bounded by a linked CTS = ct + CancelAfter(budget); the wire reads are
+        // abort-bounded so a TCP-accept-then-stall host can't hold the probe past the budget.
+        using var sessionCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
         var budget = timeout > TimeSpan.Zero ? timeout : TimeSpan.FromSeconds(1);
-        handshakeCts.CancelAfter(budget);
+        sessionCts.CancelAfter(budget);
         try
         {
-            var (degraded, rc) = await Task.Run(
-                () => TryAllocateAndFreeHandle(host, port, budget),
-                handshakeCts.Token);
+            await using var wire = new FocasWireClient();
+            await wire.ConnectAsync(host, port, budget, sessionCts.Token).ConfigureAwait(false);
+            var status = await wire.ReadStatusAsync(sessionCts.Token, budget).ConfigureAwait(false);
             sw.Stop();
-            if (degraded)
-            {
-                // FWLIB absent / cannot load — never worse than the original TCP-only probe.
-                return new(
-                    true,
-                    $"Reachable at {host}:{port} (FOCAS handshake unavailable on this host — " +
-                    "FWLIB absent, TCP reachability only)",
-                    sw.Elapsed);
-            }
-            if (rc == EwOk)
-                return new(true, "FOCAS handle OK", sw.Elapsed);
-            // FWLIB present but the remote returned an error — reachable TCP but not a CNC.
-            return new(false, $"Reachable at {host}:{port} but FOCAS handshake failed: focas_rc={rc}", null);
+            return status.IsOk
+                ? new(true, $"FOCAS session OK at {host}:{port} (cnc_statinfo)", sw.Elapsed)
+                : new(false, $"Reachable at {host}:{port} but FOCAS read failed: EW_{status.Rc}", null);
         }
         catch (OperationCanceledException)
         {
-            // The caller cancelled, or the Task.Run was cancelled before the native call started.
-            // (A native cnc_allclibhndl3 that is already running is bounded by the timeoutSeconds
-            // argument passed into it, not by handshakeCts — see TryAllocateAndFreeHandle.)
             return new(false, $"Probe timed out after {timeout.TotalSeconds:F0}s.", null);
         }
-    }
-    /// <summary>
-    /// Attempts the FWLIB handle handshake against <paramref name="host"/>/<paramref name="port"/>.
-    /// On success the handle is freed immediately. Returns <c>degraded=true</c> when the native
-    /// library cannot be loaded (FWLIB absent — the dev/CI reality); otherwise
-    /// <c>degraded=false</c> with the FWLIB return code (<c>EW_OK</c> = handle allocated).
-    /// </summary>
-    private static (bool degraded, short rc) TryAllocateAndFreeHandle(string host, int port, TimeSpan timeout)
-    {
-        var timeoutSeconds = (int)Math.Ceiling(timeout.TotalSeconds);
-        if (timeoutSeconds <= 0) timeoutSeconds = 1;
-        ushort handle = 0;
-        try
+        catch (FocasWireException ex)
         {
-            var rc = NativeFwlib.cnc_allclibhndl3(host, (ushort)port, timeoutSeconds, out handle);
-            return (degraded: false, rc);
-        }
-        catch (DllNotFoundException) { return (degraded: true, rc: default); }
-        catch (TypeInitializationException) { return (degraded: true, rc: default); }
-        catch (NotSupportedException) { return (degraded: true, rc: default); }
-        catch (BadImageFormatException) { return (degraded: true, rc: default); }
-        catch (EntryPointNotFoundException) { return (degraded: true, rc: default); }
-        finally
-        {
-            // Best-effort free if a handle was actually allocated (incl. after a timeout race).
-            if (handle != 0)
-            {
-                try { NativeFwlib.cnc_freelibhndl(handle); }
-                catch { /* best-effort — never let teardown hide the probe result */ }
-            }
+            // TCP-reachable but the FOCAS initiate/read failed — a listener that is not a CNC.
+            return new(false, $"Reachable at {host}:{port} but FOCAS session failed: {ex.Message}", null);
         }
     }
@@ -166,28 +117,4 @@ public sealed class FocasDriverProbe : IDriverProbe
         return (parsed.Host, parsed.Port);
     }
-    /// <summary>
-    /// Minimal P/Invoke surface for the two FANUC FWLIB entry points the probe needs:
-    /// <c>cnc_allclibhndl3</c> to allocate a CNC handle against a host/port, and
-    /// <c>cnc_freelibhndl</c> to release it. The native library (<c>fwlib32.dll</c> /
-    /// <c>fwlib64.dll</c> on Windows, <c>libfwlib32.so</c> on Linux) is only present on a host
-    /// with the FANUC FWLIB redistributable installed. On every other host the JIT fails to
-    /// bind these entry points and throws <see cref="DllNotFoundException"/> — caught by the
-    /// probe's degrade guard.
-    /// </summary>
-    private static class NativeFwlib
-    {
-        private const string Library = "fwlib32";
-        [DllImport(Library, CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
-        internal static extern short cnc_allclibhndl3(
-            [MarshalAs(UnmanagedType.LPStr)] string ipaddr,
-            ushort port,
-            int timeout,
-            out ushort handle);
-        [DllImport(Library, CallingConvention = CallingConvention.Cdecl)]
-        internal static extern short cnc_freelibhndl(ushort handle);
-    }
 }
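
The probe's doc comment and the commit message both rely on reads that abort when a
peer stalls. A minimal sketch of that idea follows, assuming .NET 7 or later for
`Stream.ReadExactlyAsync`. The class and method names are hypothetical; the driver's
real implementation lives in FocasWireProtocol and may differ.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not the driver's actual code.
static class StreamReadSketch
{
    // Fill the buffer completely, or fail once the token is cancelled.
    // Disposing the stream from the cancellation callback forces a pending
    // read to fault even if the transport does not observe the token itself.
    public static async Task ReadExactlyOrAbortAsync(
        Stream stream, Memory<byte> buffer, CancellationToken ct)
    {
        using var registration = ct.Register(static s => ((Stream)s!).Dispose(), stream);
        try
        {
            await stream.ReadExactlyAsync(buffer, ct).ConfigureAwait(false);
        }
        catch (Exception ex) when (ct.IsCancellationRequested && ex is not OperationCanceledException)
        {
            // Surface the dispose-induced failure as a cancellation so callers
            // can handle timeouts through a single catch path.
            throw new OperationCanceledException("Read aborted by cancellation.", ex, ct);
        }
    }
}
```

A caller would typically pass the token of a linked `CancellationTokenSource` with
`CancelAfter(budget)`, as the probe does, so that one budget bounds the whole session.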