feat(focas): real FANUC 30i/31i-B PDU-v3 support (live-validated on a 31i-B)

First real FOCAS hardware contact (Makino Pro 5 / 31i-B @ 10.201.31.5). A full
v3 data-PDU capture corrected the initial diagnosis: the v3 block envelope is
identical to v1, so no framing rewrite was needed. The actual defects were a few
payload structs, one request-range calculation, and one client robustness gap.

Fixes (all re-validated live through the fixed driver):
- version gate: accept inbound PDU {1,3}, keep emitting v1 (FocasWireProtocol).
- cnc_rdtimer: 8-byte {minute,msec} payload is little-endian (ParseTimer) — the
  only decode with an in-range msec field.
- pmc_rdpmcrng: request range widened to the data-type byte width
  (end = start + width - 1) so a Word/Long isn't truncated to 0 values
  (was spurious BadOutOfRange); decode extracted to ParsePmcRange.
- cnc_rdsvmeter: per-axis LOADELM is 8 bytes (not 12) and names come from the
  0x0089 block; ParseServoMeters fixes the misaligned 655360 garbage. The
  reported "hang" was a separate bug: NetworkStream.ReadAsync did not abort a
  read on a stalled socket. ReadExactlyAsync now disposes the stream on
  cancellation so a stalled peer can't wedge a poll loop.
- cnc_rddynamic2: contract guard rejecting axis < 1 (driver poll already 1-based).
- FocasDriverProbe: run a real wire session (initiate + cnc_statinfo) instead of
  degrading to Ok=true "TCP reachability only" when FWLIB is absent — a bare TCP
  listener no longer reports HEALTHY.
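
A minimal sketch of the cnc_rdtimer and pmc_rdpmcrng arithmetic described above,
using the msec bytes from the captured payload. The class and method names here
(FocasDecodeSketch, ReadTimerField, ByteWidth, RangeEnd) are hypothetical and only
illustrate the math; the driver's own code is ParseTimer, PmcByteWidth and
ReadPmcAsync in the diff.

```csharp
using System;
using System.Buffers.Binary;

// msec field of the captured cutting-time payload (bytes 4..7).
byte[] msec = { 0x90, 0xa3, 0x00, 0x00 };

// Little-endian gives an in-range value; big-endian does not.
Console.WriteLine(FocasDecodeSketch.ReadTimerField(msec));      // 41872
Console.WriteLine(BinaryPrimitives.ReadUInt32BigEndian(msec));  // 2426601472

// A single Word at address 100 must request bytes 100..101, not 100..100.
Console.WriteLine(FocasDecodeSketch.RangeEnd(100, 1));          // 101

// Hypothetical helpers, not the driver's API.
static class FocasDecodeSketch
{
    // cnc_rdtimer: each 32-bit field of the 8-byte payload is little-endian.
    public static int ReadTimerField(ReadOnlySpan<byte> field) =>
        BinaryPrimitives.ReadInt32LittleEndian(field);

    // Byte width per FOCAS data-type code, mirroring the PmcByteWidth switch
    // in this commit (any other code is treated as a single byte).
    public static int ByteWidth(short dataType) => dataType switch
    {
        1 => 2,
        2 or 4 => 4,
        5 => 8,
        _ => 1,
    };

    // pmc_rdpmcrng returns (end - start + 1) bytes, so the inclusive end
    // address for one value is start + width - 1.
    public static ushort RangeEnd(ushort start, short dataType) =>
        (ushort)(start + ByteWidth(dataType) - 1);
}
```

With end = start, a Word read returns a single byte, the slot parser finds no
complete 2-byte slot, and the caller sees zero values. That is the spurious
BadOutOfRange described in the pmc_rdpmcrng item.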

cnc_rdparam (0x000e) is unsupported on this control: it returned EW_FUNC across 14
request-framing variants x 4 known-present params. Either the request needs a
reference FWLIB trace to decode, or the function is restricted on this control.
Deferred (deployed config uses macros, not parameters).

Tests: FOCAS suite 234 green (+16), full solution builds 0 errors. Raw v3
captures checked in under tests/.../Fixtures/v3/. Capture tools under scripts/focas/.

Docs: docs/plans/2026-06-25-focas-pdu-v3-{30i-b-support,implementation-plan}.md,
docs/drivers/FOCAS.md, docs/v2/focas-version-matrix.md,
docs/deployments/wonder-app-vd03-makino-z-34184.md.
Joseph Doherty
2026-06-25 16:41:42 -04:00
parent fd01448ac4
commit 5f0a52864c
36 changed files with 1567 additions and 177 deletions
@@ -1,9 +1,9 @@
 using System.Diagnostics;
 using System.Net.Sockets;
-using System.Runtime.InteropServices;
 using System.Text.Json;
 using System.Text.Json.Serialization;
 using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
+using ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Wire;
 namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
@@ -11,35 +11,29 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
 /// Two-phase Test-Connect probe for the <see cref="FocasDriverOptions"/>-shaped driver config.
 /// Phase 1: bare TCP connect to the first device's FOCAS Ethernet address + port to quickly
 /// reject unreachable targets (preserves the original "Connect failed" / "timed out"
-/// messages). Phase 2: attempts the FANUC FWLIB handle handshake — allocates a CNC handle via
-/// <c>cnc_allclibhndl3(host, port, timeoutSec, out handle)</c> and immediately frees it with
-/// <c>cnc_freelibhndl</c>. A handle that allocates (<c>EW_OK</c>) confirms the remote endpoint
-/// is a real FOCAS CNC, not just a TCP listener.
+/// messages). Phase 2: a real FOCAS session via the managed <see cref="FocasWireClient"/> — the
+/// two-socket initiate handshake plus one sample read (<c>cnc_statinfo</c>). A handshake +
+/// read that succeeds confirms the remote endpoint is a real FOCAS CNC, not just a TCP
+/// listener.
 /// <para>
-/// The P/Invoke is issued directly (it does NOT route through
-/// <see cref="UnimplementedFocasClientFactory"/>, whose <c>EnsureUsable()</c> throws by
-/// design) so the handshake works on a real Windows+FWLIB host and degrades everywhere else.
-/// The synchronous native call can block, so it runs on a worker bounded by a linked CTS
-/// (<c>ct</c> + <c>CancelAfter(timeout)</c>) — the probe always returns within the timeout
-/// budget even if FWLIB hangs.
+/// <b>Why a wire-client probe (not FWLIB).</b> The pure-managed wire client is the driver's
+/// only read backend (the FWLIB / out-of-process paths were retired in the Wire migration), so
+/// the probe must exercise the same path the driver actually uses. The previous probe issued
+/// the <c>cnc_allclibhndl3</c> FWLIB P/Invoke and, on any host without the native library (the
+/// normal case — macOS dev boxes, Linux CI, and the Windows hosts that run the managed client),
+/// degraded to <c>Ok=true</c> "TCP reachability only". That made every bare TCP listener look
+/// HEALTHY — exactly how a Makino 31i-B looked "healthy" while no FOCAS data flowed. The wire
+/// probe reports HEALTHY only on a genuine FOCAS session + read. See
+/// <c>docs/plans/2026-06-25-focas-pdu-v3-30i-b-support.md</c> (Phase 8).
 /// </para>
 /// <para>
-/// <b>Degrade guard (the crux).</b> On a host without the FWLIB native library — this dev box
-/// (macOS) and the Linux CI containers — the <c>cnc_allclibhndl3</c> P/Invoke fails to bind
-/// and throws <see cref="DllNotFoundException"/> (or a related load failure:
-/// <see cref="TypeInitializationException"/>, <see cref="NotSupportedException"/>,
-/// <see cref="BadImageFormatException"/>, <see cref="EntryPointNotFoundException"/>). Those are
-/// caught and the probe falls back to <c>Ok=true</c> with a "FWLIB absent — TCP reachability
-/// only" note, so the probe is never worse than the original TCP-only behaviour on FWLIB-less
-/// hosts. The happy path and the FWLIB-present CNC-error path are live-verify deferred (no CNC
-/// and no FWLIB on the rig).
+/// The wire client honours the linked CTS (<c>ct</c> + <c>CancelAfter(timeout)</c>) and its
+/// reads are abort-bounded (see <see cref="FocasWireProtocol"/>), so the probe always returns
+/// within the timeout budget even against a host that accepts TCP then stalls.
 /// </para>
 /// </summary>
 public sealed class FocasDriverProbe : IDriverProbe
 {
-/// <summary>FANUC FWLIB return code for success (<c>EW_OK</c>).</summary>
-private const short EwOk = 0;
 private static readonly JsonSerializerOptions _opts = new()
 {
 PropertyNameCaseInsensitive = true,
@@ -83,75 +77,32 @@ public sealed class FocasDriverProbe : IDriverProbe
 return new(false, ex.Message, null);
 }
-// Phase 2: FOCAS handle handshake via cnc_allclibhndl3. The native call is synchronous and
-// can block, so run it on a worker bounded by a linked CTS = ct + CancelAfter(timeout).
-using var handshakeCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
+// Phase 2: real FOCAS session via the managed wire client — initiate handshake + one
+// sample read. Bounded by a linked CTS = ct + CancelAfter(budget); the wire reads are
+// abort-bounded so a TCP-accept-then-stall host can't hold the probe past the budget.
+using var sessionCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
 var budget = timeout > TimeSpan.Zero ? timeout : TimeSpan.FromSeconds(1);
-handshakeCts.CancelAfter(budget);
+sessionCts.CancelAfter(budget);
 try
 {
-var (degraded, rc) = await Task.Run(
-() => TryAllocateAndFreeHandle(host, port, budget),
-handshakeCts.Token);
+await using var wire = new FocasWireClient();
+await wire.ConnectAsync(host, port, budget, sessionCts.Token).ConfigureAwait(false);
+var status = await wire.ReadStatusAsync(sessionCts.Token, budget).ConfigureAwait(false);
 sw.Stop();
-if (degraded)
-{
-// FWLIB absent / cannot load — never worse than the original TCP-only probe.
-return new(
-true,
-$"Reachable at {host}:{port} (FOCAS handshake unavailable on this host — " +
-"FWLIB absent, TCP reachability only)",
-sw.Elapsed);
-}
-if (rc == EwOk)
-return new(true, "FOCAS handle OK", sw.Elapsed);
-// FWLIB present but the remote returned an error — reachable TCP but not a CNC.
-return new(false, $"Reachable at {host}:{port} but FOCAS handshake failed: focas_rc={rc}", null);
+return status.IsOk
+? new(true, $"FOCAS session OK at {host}:{port} (cnc_statinfo)", sw.Elapsed)
+: new(false, $"Reachable at {host}:{port} but FOCAS read failed: EW_{status.Rc}", null);
 }
 catch (OperationCanceledException)
 {
-// The caller cancelled, or the Task.Run was cancelled before the native call started.
-// (A native cnc_allclibhndl3 that is already running is bounded by the timeoutSeconds
-// argument passed into it, not by handshakeCts — see TryAllocateAndFreeHandle.)
 return new(false, $"Probe timed out after {timeout.TotalSeconds:F0}s.", null);
 }
-}
-/// <summary>
-/// Attempts the FWLIB handle handshake against <paramref name="host"/>/<paramref name="port"/>.
-/// On success the handle is freed immediately. Returns <c>degraded=true</c> when the native
-/// library cannot be loaded (FWLIB absent — the dev/CI reality); otherwise
-/// <c>degraded=false</c> with the FWLIB return code (<c>EW_OK</c> = handle allocated).
-/// </summary>
-private static (bool degraded, short rc) TryAllocateAndFreeHandle(string host, int port, TimeSpan timeout)
-{
-var timeoutSeconds = (int)Math.Ceiling(timeout.TotalSeconds);
-if (timeoutSeconds <= 0) timeoutSeconds = 1;
-ushort handle = 0;
-try
+catch (FocasWireException ex)
 {
-var rc = NativeFwlib.cnc_allclibhndl3(host, (ushort)port, timeoutSeconds, out handle);
-return (degraded: false, rc);
-}
-catch (DllNotFoundException) { return (degraded: true, rc: default); }
-catch (TypeInitializationException) { return (degraded: true, rc: default); }
-catch (NotSupportedException) { return (degraded: true, rc: default); }
-catch (BadImageFormatException) { return (degraded: true, rc: default); }
-catch (EntryPointNotFoundException) { return (degraded: true, rc: default); }
-finally
-{
-// Best-effort free if a handle was actually allocated (incl. after a timeout race).
-if (handle != 0)
-{
-try { NativeFwlib.cnc_freelibhndl(handle); }
-catch { /* best-effort — never let teardown hide the probe result */ }
-}
+// TCP-reachable but the FOCAS initiate/read failed — a listener that is not a CNC.
+return new(false, $"Reachable at {host}:{port} but FOCAS session failed: {ex.Message}", null);
 }
 }
@@ -166,28 +117,4 @@ public sealed class FocasDriverProbe : IDriverProbe
 return (parsed.Host, parsed.Port);
 }
-/// <summary>
-/// Minimal P/Invoke surface for the two FANUC FWLIB entry points the probe needs:
-/// <c>cnc_allclibhndl3</c> to allocate a CNC handle against a host/port, and
-/// <c>cnc_freelibhndl</c> to release it. The native library (<c>fwlib32.dll</c> /
-/// <c>fwlib64.dll</c> on Windows, <c>libfwlib32.so</c> on Linux) is only present on a host
-/// with the FANUC FWLIB redistributable installed. On every other host the JIT fails to
-/// bind these entry points and throws <see cref="DllNotFoundException"/> — caught by the
-/// probe's degrade guard.
-/// </summary>
-private static class NativeFwlib
-{
-private const string Library = "fwlib32";
-[DllImport(Library, CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
-internal static extern short cnc_allclibhndl3(
-[MarshalAs(UnmanagedType.LPStr)] string ipaddr,
-ushort port,
-int timeout,
-out ushort handle);
-[DllImport(Library, CallingConvention = CallingConvention.Cdecl)]
-internal static extern short cnc_freelibhndl(ushort handle);
-}
 }
@@ -369,19 +369,7 @@ public sealed class FocasWireClient : IAsyncDisposable, IDisposable
 var rc = AggregateRc(blocks);
 if (rc != 0) return new FocasResult<IReadOnlyList<WireServoMeter>>(rc, null);
-var payload = FindPayload(blocks, 0x0056);
-var result = new List<WireServoMeter>();
-for (var offset = 0; offset + 12 <= payload.Length && result.Count < maxCount; offset += 12)
-{
-var name = FocasWireProtocol.ReadNameRecord(payload.AsSpan(offset + 8, 4));
-result.Add(new WireServoMeter(
-(short)(result.Count + 1),
-name,
-ReadInt32(payload, offset),
-ReadInt16(payload, offset + 4),
-ReadInt16(payload, offset + 6)));
-}
+var result = ParseServoMeters(FindPayload(blocks, 0x0056), FindPayload(blocks, 0x0089), maxCount);
 return new FocasResult<IReadOnlyList<WireServoMeter>>(rc, result);
 }
@@ -602,31 +590,7 @@ public sealed class FocasWireClient : IAsyncDisposable, IDisposable
 callTimeout.Token,
 new RequestBlock(0x8001, start, end, area, dataType, RequestClass: 2, PathId: EffectivePathId(pathId))).ConfigureAwait(false);
-return ToResult(block, payload =>
-{
-var width = dataType switch
-{
-1 => 2,
-2 or 4 => 4,
-5 => 8,
-_ => 1,
-};
-var values = new List<long>();
-for (var offset = 0; offset + width <= payload.Length; offset += width)
-{
-values.Add(width switch
-{
-1 => payload[offset],
-2 => ReadInt16(payload, offset),
-4 => ReadInt32(payload, offset),
-8 => BinaryPrimitives.ReadInt64BigEndian(payload.AsSpan(offset, 8)),
-_ => 0,
-});
-}
-return new WirePmcRange(area, dataType, start, end, values);
-});
+return ToResult(block, payload => ParsePmcRange(area, dataType, start, end, payload));
 }
 /// <summary>Typed overload for <see cref="ReadPmcRangeAsync(short, short, ushort, ushort, CancellationToken, TimeSpan?, ushort?)"/>.</summary>
@@ -740,7 +704,7 @@ public sealed class FocasWireClient : IAsyncDisposable, IDisposable
 ushort? pathId = null)
 => ReadSingleWithTimeoutAsync(
 0x0120,
-payload => new WireTimer(type, payload.Length >= 4 ? ReadInt32(payload, 0) : 0, payload.Length >= 8 ? ReadInt32(payload, 4) : 0),
+payload => ParseTimer(type, payload),
 cancellationToken, timeout, EffectivePathId(pathId), type);
 // ---- internal plumbing ------------------------------------------------------------
@@ -922,6 +886,88 @@ public sealed class FocasWireClient : IAsyncDisposable, IDisposable
 private static short AggregateRc(IReadOnlyList<ResponseBlock> blocks)
 => blocks.FirstOrDefault(block => block.Rc != 0)?.Rc ?? 0;
+/// <summary>
+/// Decode a <c>cnc_rdtimer</c> (0x0120) payload into <see cref="WireTimer"/>. The 8-byte
+/// data block is two 32-bit fields {minute, msec}, and they are <b>little-endian</b> on the
+/// wire — unlike the big-endian block envelope and every other payload (sysinfo / dynamic /
+/// macro). Validated against a live FANUC 31i-B (2026-06-25): the msec field only falls in
+/// valid range (0..59999) under little-endian; big-endian decoded it as ~2.4e9. Captured
+/// cutting-time payload <c>ac f2 10 00 90 a3 00 00</c> → minute=1110188, msec=41872. See
+/// <c>docs/plans/2026-06-25-focas-pdu-v3-30i-b-support.md</c>.
+/// </summary>
+internal static WireTimer ParseTimer(short type, byte[] payload) => new(
+type,
+payload.Length >= 4 ? BinaryPrimitives.ReadInt32LittleEndian(payload.AsSpan(0, 4)) : 0,
+payload.Length >= 8 ? BinaryPrimitives.ReadInt32LittleEndian(payload.AsSpan(4, 4)) : 0);
+/// <summary>
+/// Decode a <c>cnc_rdsvmeter</c> response into <see cref="WireServoMeter"/> records. On the
+/// 31i-B (v3) each per-axis LOADELM is <b>8 bytes</b> — {int32 data; int16 dec; int16 unit}
+/// — NOT the 12-byte shape the original parser assumed. The 12-byte stride misaligned after
+/// the first record (it read the dec/unit shorts of the prior record as the next record's
+/// data → wild values like 655360). Axis NAMES are not in the svmeter payload; they come
+/// from the <c>cnc_rdaxisname</c> (0x0089) block requested alongside it and are correlated
+/// by index. Validated against a live 31i-B 2026-06-25.
+/// <para><b>Scaling caveat:</b> downstream applies LoadPercent = data / 10^dec; on the 31i-B
+/// <c>dec</c> read as 10, which makes idle loads vanishingly small. The data/dec/unit field
+/// semantics for servo load are inferred from the wire and not yet confirmed against the
+/// machine's servo-meter screen — confirm magnitude at commissioning. The alignment + name
+/// fix here is what removes the gross misaligned garbage.</para>
+/// </summary>
+internal static IReadOnlyList<WireServoMeter> ParseServoMeters(byte[] svPayload, byte[] axisNamePayload, int maxCount)
+{
+var result = new List<WireServoMeter>();
+for (var offset = 0; offset + 8 <= svPayload.Length && result.Count < maxCount; offset += 8)
+{
+var index = result.Count;
+var nameOffset = index * 4;
+var name = nameOffset + 4 <= axisNamePayload.Length
+? FocasWireProtocol.ReadNameRecord(axisNamePayload.AsSpan(nameOffset, 4))
+: string.Empty;
+result.Add(new WireServoMeter(
+(short)(index + 1),
+name,
+ReadInt32(svPayload, offset),
+ReadInt16(svPayload, offset + 4),
+ReadInt16(svPayload, offset + 6)));
+}
+return result;
+}
+/// <summary>Byte width of one PMC slot for a FOCAS data-type code: Byte=1, Word=2, Long/Real=4, Double=8.</summary>
+internal static int PmcByteWidth(short dataType) => dataType switch
+{
+1 => 2,
+2 or 4 => 4,
+5 => 8,
+_ => 1,
+};
+/// <summary>
+/// Decode a <c>pmc_rdpmcrng</c> payload into <see cref="WirePmcRange"/>. The CNC returns
+/// (end-start+1) BYTES; this slices them into width-sized big-endian slots. Callers must
+/// size the request range to <c>width</c> bytes per value (see
+/// <c>WireFocasClient.ReadPmcAsync</c>) or a trailing partial slot is dropped — which on the
+/// 31i-B surfaced as a spurious <c>BadOutOfRange</c> for a single Word read. 2026-06-25.
+/// </summary>
+internal static WirePmcRange ParsePmcRange(short area, short dataType, ushort start, ushort end, byte[] payload)
+{
+var width = PmcByteWidth(dataType);
+var values = new List<long>();
+for (var offset = 0; offset + width <= payload.Length; offset += width)
+{
+values.Add(width switch
+{
+1 => payload[offset],
+2 => ReadInt16(payload, offset),
+4 => ReadInt32(payload, offset),
+8 => BinaryPrimitives.ReadInt64BigEndian(payload.AsSpan(offset, 8)),
+_ => 0,
+});
+}
+return new WirePmcRange(area, dataType, start, end, values);
+}
 private static byte[] FindPayload(IReadOnlyList<ResponseBlock> blocks, ushort command)
 => blocks.FirstOrDefault(block => block.Command == command)?.Payload ?? Array.Empty<byte>();
@@ -19,7 +19,22 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Wire;
 /// </remarks>
 internal static class FocasWireProtocol
 {
+/// <summary>The PDU version this client emits in every outgoing request header.</summary>
 public const ushort Version = 1;
+/// <summary>
+/// PDU versions accepted on inbound PDUs. The 10-byte header framing is identical across
+/// these (only the version field differs), so the framing layer accepts both while we keep
+/// emitting <see cref="Version"/> (v1) on requests. The docker mock + older controls answer
+/// v1; modern controls answer v3 — FANUC 30i-B validated live 2026-06-25 (macro reads OK).
+/// See <c>docs/plans/2026-06-25-focas-pdu-v3-30i-b-support.md</c>.
+/// </summary>
+private static readonly ushort[] SupportedReadVersions = [1, 3];
+/// <summary>True when <paramref name="version"/> is a PDU version this client can frame-parse.</summary>
+internal static bool IsSupportedReadVersion(ushort version) =>
+Array.IndexOf(SupportedReadVersions, version) >= 0;
 public const byte DirectionRequest = 0x01;
 public const byte DirectionResponse = 0x02;
 public const byte TypeInitiate = 0x01;
@@ -99,7 +114,7 @@ internal static class FocasWireProtocol
 throw new FocasWireException("Invalid FOCAS PDU magic.");
 var version = BinaryPrimitives.ReadUInt16BigEndian(header.AsSpan(4, 2));
-if (version != Version)
+if (!IsSupportedReadVersion(version))
 throw new FocasWireException($"Unsupported FOCAS PDU version {version}.");
 var bodyLength = BinaryPrimitives.ReadUInt16BigEndian(header.AsSpan(8, 2));
@@ -122,7 +137,7 @@ internal static class FocasWireProtocol
 throw new FocasWireException("Invalid FOCAS PDU magic.");
 var version = BinaryPrimitives.ReadUInt16BigEndian(header.AsSpan(4, 2));
-if (version != Version)
+if (!IsSupportedReadVersion(version))
 throw new FocasWireException($"Unsupported FOCAS PDU version {version}.");
 var bodyLength = BinaryPrimitives.ReadUInt16BigEndian(header.AsSpan(8, 2));
@@ -135,13 +150,29 @@ internal static class FocasWireProtocol
 private static async Task ReadExactlyAsync(NetworkStream stream, byte[] buffer, CancellationToken cancellationToken)
 {
+// NetworkStream.ReadAsync's CancellationToken does not reliably abort a socket read that is
+// blocked waiting for bytes the peer never sends — a CNC that TCP-accepts then stalls
+// mid-PDU (the cnc_rdsvmeter "hang" the 31i-B work chased). Register a hard abort that
+// disposes the stream on cancellation so a stalled read throws instead of wedging the
+// caller's poll loop, and normalize the resulting failure to OperationCanceledException so
+// the request path tears the transport down as a transient. See
+// docs/plans/2026-06-25-focas-pdu-v3-30i-b-support.md (Phase 2).
+await using var abort = cancellationToken.Register(static s => ((IDisposable)s!).Dispose(), stream);
 var offset = 0;
-while (offset < buffer.Length)
+try
 {
-var read = await stream.ReadAsync(buffer, offset, buffer.Length - offset, cancellationToken).ConfigureAwait(false);
-if (read == 0)
-throw new EndOfStreamException("FOCAS socket closed before the expected number of bytes were read.");
-offset += read;
+while (offset < buffer.Length)
+{
+var read = await stream.ReadAsync(buffer.AsMemory(offset, buffer.Length - offset), cancellationToken).ConfigureAwait(false);
+if (read == 0)
+throw new EndOfStreamException("FOCAS socket closed before the expected number of bytes were read.");
+offset += read;
+}
 }
+catch (Exception ex) when (cancellationToken.IsCancellationRequested && ex is not OperationCanceledException)
+{
+// The stalled read was aborted by the dispose-on-cancel registration above.
+throw new OperationCanceledException(cancellationToken);
+}
 }
@@ -196,6 +196,11 @@ public sealed class WireFocasClient : IFocasClient
 /// <returns>The dynamic snapshot of the axis.</returns>
 public async Task<FocasDynamicSnapshot> ReadDynamicAsync(int axisIndex, CancellationToken cancellationToken)
 {
+// FOCAS axes are 1-based; cnc_rddynamic2 with axis 0 returns EW_4 (live-confirmed on the
+// 31i-B). The FixedTree poll already iterates 1..AxesCount, but enforce the contract so a
+// future caller can't silently request axis 0. 2026-06-25.
+if (axisIndex < 1)
+throw new ArgumentOutOfRangeException(nameof(axisIndex), axisIndex, "FOCAS axis index is 1-based.");
 RequireConnected();
 var result = await _wire.ReadDynamic2Async((short)axisIndex, cancellationToken).ConfigureAwait(false);
 ThrowIfRcNonZero(result.Rc, "cnc_rddynamic2", result.IsOk);
@@ -337,7 +342,11 @@ public sealed class WireFocasClient : IFocasClient
 if (area is null) return (null, FocasStatusMapper.BadNodeIdUnknown);
 var dataType = FocasPmcDataTypeLookup.FromFocasDataType(type);
 var start = (ushort)address.Number;
-var end = start;
+// pmc_rdpmcrng returns (end-start+1) BYTES. A multi-byte slot (Word/Long/Real/Double) needs
+// the range widened to its byte width, else the value parser gets too few bytes → 0 values
+// → spurious BadOutOfRange (live-confirmed on the 31i-B: a Word read with end=start returned
+// a single byte). Bit/Byte stay width 1 so end==start. 2026-06-25.
+var end = (ushort)(start + FocasWireClient.PmcByteWidth((short)dataType) - 1);
 try
 {