lmxopcua/docs/plans/live-hardware-validation-runbooks.md
Joseph Doherty 16a87b08f3 docs: add four planning runbooks for Phase 6.3 interop, v2 GA gates, live-hardware validation, and alarms worker wiring
Produces docs/plans/ entries for tasks #13, #15, #16, and #17-#20:
- phase-6-3-redundancy-interop-plan.md: automation boundary analysis,
  concrete test matrix (A/B/C blocks), and a step-by-step cutover
  runbook for the deferred Stream F client interop work
- v2-ga-lab-gates-plan.md: exact gate list with command, pass criterion,
  and owner for each of the nine v2 GA exit criteria
- live-hardware-validation-runbooks.md: one runbook per driver (FOCAS
  CNC smoke #54, AB CIP live-boot, TwinCAT wire-live) with preconditions,
  procedure, expected results, and recording template
- alarms-worker-wiring-plan.md: focused plan for A.2/A.3-A.4/C.1/D.1
  worker wiring in the mxaccessgw sibling repo, documenting the
  discovered AVEVA API surface, the architectural decision that blocks
  A.2, the dependency order, and what each item needs to unblock

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 04:53:36 -04:00


Live-Hardware Driver Validation Runbooks

Scope: These runbooks cover the three driver validation tasks that require physical hardware or a hardware-equivalent live environment and cannot be satisfied by the Docker-based simulator fixtures or unit tests alone.

Driver implementation is complete. The runbooks document the preconditions, step-by-step procedure, expected results, and how to record the outcome for each driver that has an open live-hardware gap.


1. FANUC FOCAS — Live CNC Smoke (task #54)

Background

The FOCAS driver (src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/) uses the pure-managed WireFocasClient that speaks FOCAS2 over TCP directly (no Fwlib64.dll, no P/Invoke). The integration test suite at tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/ runs against the focas-mock Python server (PDU-verified against fwlibe64.dll upstream) and covers all call-shapes the driver issues. What the mock cannot cover:

  • Series-specific firmware quirks (e.g. 0i-F vs 30i-B parameter range limits)
  • Real CNC Ethernet stack behaviour (TCP keep-alive, session-close edge cases)
  • Series gating: some driver nodes are conditionally emitted based on CncSeries — only a physical CNC can confirm the suppression works

Preconditions

| Item | Requirement |
|---|---|
| CNC hardware | FANUC CNC with Ethernet option enabled; TCP port 8193 reachable from the dev box or from the host running OtOpcUa |
| CNC series | Any of: 0i-D, 0i-F, 0i-MF, 0i-TF, 16i, 30i-B, 31i, 32i, Power Motion i |
| CNC state | Running state (not E-stop, not alarm) for live axis-data reads |
| Network | TCP reachability from OtOpcUa server host to CNC port 8193 |
| OtOpcUa | Server built and deployed (dotnet publish or running via dotnet run) |
| Config | DriverInstance row for FOCAS in Config DB (Type="FOCAS", Backend="wire", Devices[0].HostAddress="focas://<cnc-ip>:8193", Devices[0].Series="<series>") |
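
To make the Config row concrete, the sketch below mirrors the fields named in the table as a plain dictionary and splits the focas:// address into host and port. The dictionary layout and the `parse_focas_address` helper are hypothetical illustrations, not the Config DB schema or the driver's own parser. The address 192.0.2.10 is a documentation-range placeholder.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory mirror of the DriverInstance row. Only the field
# names (Type, Backend, Devices[0].HostAddress, Devices[0].Series) come from
# the preconditions table; the values are placeholders.
driver_instance = {
    "Type": "FOCAS",
    "Backend": "wire",
    "Devices": [
        {"HostAddress": "focas://192.0.2.10:8193", "Series": "0i-F"},
    ],
}

DEFAULT_FOCAS_PORT = 8193  # the port used throughout this runbook


def parse_focas_address(address: str) -> tuple[str, int]:
    """Split a focas://host[:port] address into (host, port)."""
    parts = urlsplit(address)
    if parts.scheme != "focas" or not parts.hostname:
        raise ValueError(f"not a focas:// address: {address!r}")
    # Fall back to 8193 when the address carries no explicit port.
    return parts.hostname, parts.port or DEFAULT_FOCAS_PORT


host, port = parse_focas_address(driver_instance["Devices"][0]["HostAddress"])
print(host, port)
```

The resulting host and port are the values the Step 1 reachability check needs.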

Procedure

Step 1 — Verify TCP reachability

Test-NetConnection -ComputerName <cnc-ip> -Port 8193

Pass: TcpTestSucceeded: True.
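
Where Test-NetConnection is not available (for example, when the OtOpcUa host is not a Windows box), the same reachability check can be sketched in Python using only the standard library. The function name `tcp_reachable` and the 3-second default timeout are assumptions chosen for illustration.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection performs the full TCP handshake, which is the
        # same condition Test-NetConnection reports as TcpTestSucceeded.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable hosts.
        return False
```

Calling `tcp_reachable("<cnc-ip>", 8193)` should return True before continuing to Step 2. A successful connect only proves the port is open, not that a FOCAS session can be established.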

Step 2 — Start OtOpcUa with FOCAS driver configured

Ensure the Config DB has the DriverInstance row. Start the server:

# sc.exe, not sc: in Windows PowerShell, sc is an alias for Set-Content
sc.exe start OtOpcUa
# or for a dev run:
dotnet run --project src/Server/ZB.MOM.WW.OtOpcUa.Server

Watch the Serilog log for:

[INF] FocasDriver initializing device focas://<cnc-ip>:8193 series=<series>
[INF] FocasDriver device <cnc-ip>:8193 Connected

If EW_SOCKET (-16) appears, the TCP endpoint is unreachable or the CNC Ethernet option is not active.

Step 3 — Browse the address space

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    browse -u opc.tcp://localhost:4840 -r -d 3

Expected: a node tree containing at minimum:

FOCAS/
  <device>/
    Identity/
      SeriesNumber
      Version
      MaxAxes
    Status/
      RunState
      Mode
      EmergencyStop
    Axes/
      <X|Y|Z>/
        AbsolutePosition
        MachinePosition

Nodes suppressed by the Series capability gate will be absent — this is correct behaviour.
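
Checking the browse output against the minimum tree by eye is error-prone, so the comparison can be sketched as a set difference. The sketch assumes the browsed node paths have already been collected into a set of strings relative to FOCAS/<device>/; how they are extracted from the CLI output is left open, and `missing_nodes` is a hypothetical helper. A reported node is not automatically a failure: it may have been legitimately suppressed by the series gate, or the machine may not have that axis.

```python
# Minimum node set from the expected tree, relative to FOCAS/<device>/.
EXPECTED_MINIMUM = {
    "Identity/SeriesNumber",
    "Identity/Version",
    "Identity/MaxAxes",
    "Status/RunState",
    "Status/Mode",
    "Status/EmergencyStop",
}
AXIS_NODES = ("AbsolutePosition", "MachinePosition")


def missing_nodes(browsed: set[str],
                  axes: tuple[str, ...] = ("X", "Y", "Z")) -> list[str]:
    """Return expected node paths that are absent from the browsed set."""
    expected = set(EXPECTED_MINIMUM)
    for axis in axes:
        for leaf in AXIS_NODES:
            expected.add(f"Axes/{axis}/{leaf}")
    # Sorted so the report is stable between runs.
    return sorted(expected - browsed)
```

Pass only the axes the machine actually has, for example `axes=("X", "Z")` on a two-axis lathe, and review every reported path against the configured series.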

Step 4 — Read identity nodes

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    read -u opc.tcp://localhost:4840 -n "ns=2;s=FOCAS/<device>/Identity/SeriesNumber"

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    read -u opc.tcp://localhost:4840 -n "ns=2;s=FOCAS/<device>/Identity/MaxAxes"

Pass: Good quality; SeriesNumber matches the string printed on the CNC control panel (e.g. "0i-F"); MaxAxes is a non-zero integer.

Step 5 — Read live status and axis data

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    read -u opc.tcp://localhost:4840 -n "ns=2;s=FOCAS/<device>/Status/RunState"

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    read -u opc.tcp://localhost:4840 -n "ns=2;s=FOCAS/<device>/Axes/X/AbsolutePosition"

Pass: both return Good quality. AbsolutePosition is a Double (e.g. -12.3456 mm). Manually compare against the machine's position display.

Step 6 — Subscribe and observe polling

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    subscribe -u opc.tcp://localhost:4840 `
    -n "ns=2;s=FOCAS/<device>/Status/RunState" -i 500

Let it run for 30 s while jogging an axis or changing mode on the CNC operator panel. Pass: at least one data-change event is received within 5 s, and events continue arriving every ~500 ms.
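
The two timing criteria in this step can be checked mechanically once the event arrival times are written down. The sketch below assumes the times are recorded as seconds elapsed since the subscribe command started; the `check_cadence` helper and the 0.25 s tolerance around the 500 ms interval are assumptions, since the runbook only says "~500 ms".

```python
from statistics import median


def check_cadence(event_times, first_within=5.0,
                  expected_interval=0.5, tolerance=0.25):
    """Return (first_ok, cadence_ok) for a list of event arrival times.

    event_times: seconds elapsed since the subscribe command started.
    """
    times = sorted(event_times)
    # Criterion 1: at least one event inside the first window.
    first_ok = bool(times) and times[0] <= first_within
    # Criterion 2: the typical gap between events is near the interval.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # With fewer than two events there is no gap to judge.
    cadence_ok = bool(gaps) and abs(median(gaps) - expected_interval) <= tolerance
    return first_ok, cadence_ok
```

The median is used rather than the mean so that a single long pause, such as the operator stopping the jog, does not fail an otherwise steady stream.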

Step 7 — 2-minute soak

Let the server run for 2 minutes with the subscription active. Pass: no EW_SOCKET, EW_HANDLE, EW_BUSY errors in the Serilog output; subscribed node continues delivering updates.
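
Scanning two minutes of log output by eye is easy to get wrong, so the soak criterion can be sketched as a small log filter. The three error names are the ones listed in the pass criterion; the helper name `find_focas_errors` is hypothetical, and the sketch assumes the error name appears verbatim somewhere on the log line.

```python
import re

# Error names called out in the soak pass criterion.
FOCAS_ERROR_PATTERN = re.compile(r"\b(EW_SOCKET|EW_HANDLE|EW_BUSY)\b")


def find_focas_errors(lines):
    """Return (line_number, error_name, text) for each matching log line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = FOCAS_ERROR_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits
```

Feed it the lines of the Serilog file written during the soak window, for example `find_focas_errors(open(log_path, encoding="utf-8"))` where `log_path` is whatever location the deployment logs to. An empty list corresponds to a clean soak.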

Step 8 — Run the FOCAS e2e script

pwsh scripts/e2e/test-focas.ps1 -ServerUrl opc.tcp://localhost:4840 `
    -DriverInstance "<device>" -Series "<series>"

Pass: script exits 0.

Expected results

| Check | Expected |
|---|---|
| TCP connect to CNC port 8193 | Success |
| FOCAS session open (cnc_allclibhndl3) | EW_OK (0) in driver log |
| Identity/SeriesNumber | Matches CNC panel, Good quality |
| Identity/MaxAxes | Non-zero integer, Good quality |
| Status/RunState | Integer 0-3, Good quality |
| Axes/X/AbsolutePosition | Double, Good quality, matches display |
| Subscribe: events delivered | >= 3 events in 5 s |
| 2-minute soak: no FOCAS errors | Clean Serilog log |

Recording the outcome

FOCAS live-CNC smoke — task #54
Date: YYYY-MM-DD
CNC: <manufacturer> <model> series=<series> firmware=<version>
IP: <cnc-ip>:8193
OtOpcUa SHA: <git sha>

TCP connect: PASS
Session open: PASS
Identity reads: PASS  SeriesNumber="<>" MaxAxes=<n>
Status read:  PASS  RunState=<n>
Axis read:    PASS  X/AbsolutePosition=<value>
Subscribe:    PASS  <n> events in 30s
2-min soak:   PASS  no errors
e2e script:   PASS

2. Allen-Bradley CIP — Live Boot (ControlLogix)

Background

The AB CIP driver (src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.AbCip/) uses libplctag 1.6.x. The Docker ab_server simulator covers connectivity and atomic type reads (7 integration tests). Live-boot validation is needed to confirm UDT shape-reading, array tag access, and the CIP packing behaviour on a real ControlLogix backplane — all gaps acknowledged in docs/drivers/AbServer-Test-Fixture.md.

AB CIP live-boot was first verified against a ControlLogix rig in PR #222. Continue running it before each release.

Preconditions

| Item | Requirement |
|---|---|
| PLC hardware | ControlLogix (preferred) or CompactLogix; firmware 20+ for request packing |
| Network | TCP port 44818 reachable from OtOpcUa server host |
| PLC state | Running; at least one DINT / REAL / BOOL / STRING controller-scoped tag defined |
| OtOpcUa | Server built and deployed |
| Config | DriverInstance row: Type="AbCip", Host="<plc-ip>", Path="1,0", PlcType="ControlLogix" |
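
The Path value is a CIP route written as port,address pairs. In the usual ControlLogix convention, "1,0" means "leave through port 1 (the backplane) to the module in slot 0", which is where the controller sits in this configuration. The sketch below only illustrates that pairing; `parse_cip_path` is a hypothetical helper, not part of libplctag or the driver, and it handles numeric hops only (routes that pass through another Ethernet module carry an IP address as the hop address and are out of scope here).

```python
def parse_cip_path(path: str) -> list[tuple[int, int]]:
    """Split a numeric CIP route such as "1,0" into (port, address) hops."""
    if not path.strip():
        # An empty path means no routing hops.
        return []
    numbers = [int(part) for part in path.split(",")]
    if len(numbers) % 2 != 0:
        raise ValueError(f"CIP path needs port,address pairs: {path!r}")
    # Even positions are ports, odd positions are addresses (slots).
    return list(zip(numbers[0::2], numbers[1::2]))
```

For the row above, `parse_cip_path("1,0")` yields a single hop: port 1, slot 0. If the controller sits in a different slot, only the second number changes.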

Procedure

Step 1 — Verify TCP reachability

Test-NetConnection -ComputerName <plc-ip> -Port 44818

Pass: TcpTestSucceeded: True.

Step 2 — Start OtOpcUa and watch driver log

# sc.exe, not sc: in Windows PowerShell, sc is an alias for Set-Content
sc.exe start OtOpcUa

Look for:

[INF] AbCipDriver device <plc-ip> Connected  path=1,0  plcType=ControlLogix

Step 3 — Browse the address space

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    browse -u opc.tcp://localhost:4840 -r -d 3

Pass: node tree shows the tags defined in the ControlLogix project (controller- and program-scoped). UDT members appear as child nodes.

Step 4 — Read atomic tags

# Read a DINT tag
dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    read -u opc.tcp://localhost:4840 -n "ns=2;s=AbCip/<device>/<TagName>"

Pass: Good quality; value type matches the PLC tag type.

Step 5 — Read a UDT member

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    read -u opc.tcp://localhost:4840 -n "ns=2;s=AbCip/<device>/<UDT>/<MemberName>"

Pass: Good quality; value matches the live PLC value.

Step 6 — Write a DINT tag (if in ReadWrite mode)

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    write -u opc.tcp://localhost:4840 `
    -n "ns=2;s=AbCip/<device>/<TagName>" -v 42 -t Int32

Verify the new value via a subsequent read or on the PLC HMI.

Pass: read back returns 42 with Good quality.

Step 7 — Subscribe to a tag that changes

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    subscribe -u opc.tcp://localhost:4840 `
    -n "ns=2;s=AbCip/<device>/<ChangingTag>" -i 500

Trigger a value change on the PLC. Pass: events received within 2 s.

Step 8 — Point the integration tests at the live PLC and confirm parity with the Docker simulator

$env:AB_SERVER_ENDPOINT = "<plc-ip>:44818"
dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests `
    --filter "AbServerFact"

Pass: all 7 integration tests pass against the live PLC.

Expected results

| Check | Expected |
|---|---|
| TCP connect | Success |
| Driver log Connected | Present, no error |
| Browse | Node tree mirrors PLC tag list |
| Atomic read | Good quality, correct type |
| UDT member read | Good quality, correct value |
| Write round-trip | Written value reads back |
| Subscribe | Events delivered on value change |
| Integration tests with live PLC | 7/7 pass |

Recording the outcome

AB CIP live-boot
Date: YYYY-MM-DD
PLC: Allen-Bradley <model> firmware=<version>
IP: <plc-ip>:44818  path=1,0
OtOpcUa SHA: <git sha>

TCP connect: PASS
Driver connected: PASS
Browse: PASS  <n> tags visible
Atomic read: PASS
UDT read: PASS
Write round-trip: PASS
Subscribe: PASS
Integration tests: 7/7 PASS

3. Beckhoff TwinCAT — Wire-Live Validation

Background

The TwinCAT driver (src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/) uses the Beckhoff TwinCAT.Ads .NET SDK v6. The integration test suite at tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/ (TwinCAT3SmokeTests.cs) covers 14 [TwinCATFact] methods + one 16-case [TwinCATTheory] (30 cases total) against a live ADS runtime. The TCBSD ESXi VM at 10.100.0.128 (AmsNetId 41.169.163.43.1.1) is the primary fixture runtime (project memory project_tcbsd_fixture.md) and bypasses the TwinCAT/Hyper-V conflict on the dev box.

Live-hardware validation extends beyond the TCBSD VM to confirm the driver works against a production PLC (not just the ESXi test VM) and that the three defects found during original integration testing do not regress on newer firmware:

  1. Notification cycle time unit (250 ms was being set to ~41 min — fixed).
  2. STRING(N) / WSTRING(N) type mapper (fixed).
  3. Bit-indexed BOOL path (fixed).
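
The "250 ms became ~41 min" figure in the first defect is consistent with a unit mix-up between milliseconds and the 100 ns units that ADS uses on the wire for notification cycle times. The arithmetic below is a sketch under that assumption; the source does not state the exact cause, and `misread_cycle_minutes` is a hypothetical name.

```python
TICKS_PER_MS = 10_000  # 1 ms = 10,000 units of 100 ns


def misread_cycle_minutes(intended_ms: int) -> float:
    """Minutes that result if a 100 ns tick count is read as milliseconds."""
    as_ticks = intended_ms * TICKS_PER_MS  # 250 ms -> 2,500,000 ticks
    misread_ms = as_ticks                  # same number, wrong unit
    return misread_ms / 1000 / 60          # ms -> s -> min


print(round(misread_cycle_minutes(250), 1))  # 41.7
```

A regression on newer firmware or SDK versions would show up the same way: the subscription is accepted but no notification arrives within the test window.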

Preconditions

TCBSD ESXi fixture (primary — no physical hardware needed)

| Item | Requirement |
|---|---|
| TCBSD VM | Running on ESXi at 10.100.0.128 |
| AMS Net ID | 41.169.163.43.1.1 |
| ADS port | 851 (TwinCAT 3 PLC runtime 1) |
| PLC project | TwinCAT project from tests/.../TwinCatProject/ loaded and in Run state |
| Network | TCP port 48898 reachable from dev box to 10.100.0.128 |
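
The table lists two different addresses for the same VM: the IP address used for the TCP connection and the six-byte AMS Net ID used for ADS routing. As the fixture values show, the Net ID need not be derived from the IP. A minimal validation sketch follows; `parse_ams_net_id` is a hypothetical helper, not part of the TwinCAT.Ads SDK.

```python
def parse_ams_net_id(text: str) -> tuple[int, ...]:
    """Parse an AMS Net ID such as "41.169.163.43.1.1" into six byte values."""
    parts = text.split(".")
    if len(parts) != 6:
        raise ValueError(f"AMS Net ID needs six fields, got {len(parts)}: {text!r}")
    octets = tuple(int(part) for part in parts)
    if any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"AMS Net ID fields must be 0-255: {text!r}")
    return octets
```

Passing the VM's IP address (four fields) raises ValueError, which catches the common mistake of putting the IP into TWINCAT_TARGET_NETID.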

Production PLC (for true wire-live validation)

| Item | Requirement |
|---|---|
| TwinCAT hardware | Beckhoff IPC or CX series, TwinCAT 3 (TC3); TC2 is a known gap per fixture doc |
| AMS route | Route configured on TwinCAT device back to the OtOpcUa host |
| PLC state | Run state |
| GVL | At least a GVL_Fixture.nCounter DINT and GVL_Fixture.rSetpoint REAL present |

Procedure — TCBSD ESXi fixture

Step 1 — Verify TCBSD VM is reachable

Test-NetConnection -ComputerName 10.100.0.128 -Port 48898

Pass: TcpTestSucceeded: True.

Step 2 — Run the integration test suite

$env:TWINCAT_TARGET_HOST  = "10.100.0.128"
$env:TWINCAT_TARGET_NETID = "41.169.163.43.1.1"

dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests `
    --logger "console;verbosity=normal"

Pass: all 30 test cases pass (14 [TwinCATFact] + 16-case [TwinCATTheory]). No [TwinCATFact] / [TwinCATTheory] skips are expected: the env vars are set, so the runtime probe should succeed.

Key tests to watch:

| Test | Validates |
|---|---|
| Driver_subscribe_receives_native_ADS_notifications_on_counter_changes | Native ADS notification path (the cycle-time-unit bug regression) |
| Driver_reads_every_primitive_type_with_correct_mapping | 16-type theory incl. STRING(N) |
| Driver_reads_bit_indexed_BOOL_from_word | Bit-indexed BOOL fix regression |
| Driver_auto_reconnects_after_underlying_client_is_disposed | Reconnect on ADS client dispose |
| Driver_routes_reads_per_device_and_isolates_unreachable_peers | Multi-device isolation |

Step 3 — OtOpcUa server browse/read via Client CLI

Start OtOpcUa with a TwinCAT DriverInstance pointing at the TCBSD VM:

# appsettings.json DriverInstance: Type=TwinCAT, AmsNetId=41.169.163.43.1.1, AmsPort=851
# sc.exe, not sc: in Windows PowerShell, sc is an alias for Set-Content
sc.exe start OtOpcUa
# or dev run
dotnet run --project src/Server/ZB.MOM.WW.OtOpcUa.Server

# once the server is up, from a second terminal:
dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    browse -u opc.tcp://localhost:4840 -r -d 4

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    read -u opc.tcp://localhost:4840 -n "ns=2;s=TwinCAT/<device>/GVL_Fixture/nCounter"

Pass: browse shows the PLC symbol tree; read returns Good quality with an integer value.

Procedure — Production PLC (optional, for full wire-live signoff)

If a Beckhoff production IPC is available in the lab:

Step 1 — Configure the AMS route on the TwinCAT device (TwinCAT System Manager → Routes → Add static route from the TwinCAT device back to the OtOpcUa server machine).

Step 2 — Set env vars and run the integration suite against the production target:

$env:TWINCAT_TARGET_HOST  = "<production-plc-ip>"
$env:TWINCAT_TARGET_NETID = "<production-ams-net-id>"
$env:TWINCAT_TARGET_PORT  = "851"

dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests

Step 3 — Subscribe to a counter tag for 30 s to confirm native notifications arrive:

dotnet run --project src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI -- `
    subscribe -u opc.tcp://localhost:4840 `
    -n "ns=2;s=TwinCAT/<device>/GVL_Fixture/nCounter" -i 100

Pass: events arrive every ~100 ms driven by the PLC's ADS notification, not by polling.

Expected results

| Check | TCBSD VM | Production PLC |
|---|---|---|
| ADS port 48898 reachable | Required | Required |
| Integration tests: all 30 pass | Required | Optional (same 30) |
| Notification cycle-time test passes | Required | Required |
| Server browse shows symbol tree | Required | Optional |
| Read Good quality | Required | Optional |
| Native ADS notifications deliver in subscribe | Required | Recommended |

Known gaps (documented — not blockers for v2 GA)

Per docs/drivers/TwinCAT-Test-Fixture.md §"What it does NOT cover":

  • Multi-hop AMS routing — single-hop only.
  • TC2 (ADS v1) compatibility — TC3 only.
  • Notification coalescing under sustained CPU load.
  • Symbol version changed (0x0702) storm handling under rapid PLC re-downloads.

These are deferred to v3 per docs/v3/twincat-backlog.md.

Recording the outcome

TwinCAT wire-live validation
Date: YYYY-MM-DD
Target: TCBSD VM 10.100.0.128 AmsNetId=41.169.163.43.1.1  (and/or production PLC details)
TwinCAT version: <version>
OtOpcUa SHA: <git sha>

ADS port reachable: PASS
Integration tests: 30/30 PASS
  notification-cycle-time test: PASS  (regression check)
  STRING(N) type test: PASS  (regression check)
  bit-indexed BOOL test: PASS  (regression check)
Server browse: PASS
Read Good quality: PASS
Native subscription delivery: PASS  <n> events in 30s