Files
lmxopcua/docs/v2/implementation/focas-simulator-plan.md
Joseph Doherty 2d07d716dc Recover stashed driver-gaps work from pre-v2-mxgw-merge working tree
Captures uncommitted work that lived in the working tree on
v2-mxgw-integration but was orthogonal to the migration. Stashed
during the v2-mxgw merge to master (2026-04-30) and replanted here on
a feature branch off master so it's git-visible rather than living in
the stash list.

Two distinct buckets:

1. Tracked fixture/config refinements (10 files, ~36 lines):
   - scripts/e2e/test-opcuaclient.ps1
   - src/ZB.MOM.WW.OtOpcUa.Admin/appsettings.json
   - 5 docker-compose.yml under tests/.../IntegrationTests/Docker/
     (AbCip, Modbus, OpcUaClient, S7)
   - 4 fixture .cs files (AbServerFixture, ModbusSimulatorFixture,
     OpcPlcFixture, Snap7ServerFixture)

2. Untracked driver-gaps queue artifacts (~8000 lines):
   - docs/plans/{abcip,ablegacy,focas,opcuaclient,s7,twincat}-plan.md
     — per-driver gap plans
   - docs/featuregaps.md — cross-cutting analysis
   - docs/v2/focas-deployment.md, docs/v2/implementation/focas-simulator-plan.md
   - followup.md — auto/driver-gaps queue follow-ups
   - scripts/queue/ — PR-queue automation tooling (12 files including
     pr-manifest.yaml at 1473 lines)

This commit is a snapshot for recoverability — review and split into
focused PRs (or discard) before merging anywhere downstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:28:01 -04:00

316 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# FOCAS Docker simulator — implementation plan
> **Status**: **IN PROGRESS** 2026-04-23. **Streams A + B shipped.** Stream C (real Fwlib64 wire compat) + Stream D (e2e + docs) still open — both require a Windows rig with licensed Fwlib64.dll + captured Wireshark traces. Stream B shipped the full architectural scaffold (Docker image, 9 per-series compose profiles, asyncio TCP server, handler dispatch, profile-driven range enforcement, local validation harness) — exercised end-to-end against both `thirtyone_i` and `powermotion_i` profiles.
## Goal
Close the one remaining FOCAS gap (`#222` follow-up — "wire-level live-boot against real hardware") with a hardware-free fixture that:
1. Runs in Docker, matches the per-driver fixture pattern (`docker compose up -d` in the test project).
2. Exposes the FOCAS TCP port (`8193` by default) to the host.
3. Speaks enough of the FOCAS wire protocol that **a Windows test rig running our unmodified `Driver.FOCAS.Host` + licensed `Fwlib64.dll` can open a session and exercise the 9 FWLIB functions the driver actually uses.**
4. Supports **version profiles** — one container per Fanuc series (0i-D, 0i-F, 30i, 31i, 32i, PowerMotion-i) — so driver-side range validation, error-code mapping, and per-series quirks get exercised against a server that actually behaves differently per series.
5. Plugs into the existing e2e infrastructure (`scripts/e2e/test-focas.ps1` loses the `FOCAS_TRUST_WIRE=1` gate when the fixture is up).
## Non-goals
- **Not a full FOCAS emulator.** Fanuc's FOCAS spec is closed; faithfully reproducing every function across every controller model would be a years-long project. We implement the narrow subset the driver uses (see §Protocol surface).
- **Not a CNC behavioural model.** We return plausible values for PMC/param/macro reads; we do NOT simulate axis motion, program execution, or alarm generation. The mock exists to exercise the driver's marshalling + IPC + status-code paths, not to prove the CNC behaves correctly.
- **Not a replacement for a bench CNC.** A physical controller still catches timing-dependent bugs (Fwlib-internal thread-pool exhaustion, handle-recycle pathologies, vendor-firmware quirks) that a mock can't reproduce. Mock covers ~80% of value; real-hardware smoke stays as a final gate.
## Constraint that shapes the design
`Fwlib64.dll` is a proprietary closed-source library that speaks FOCAS to the CNC. **Our driver never touches raw TCP** — it calls `cnc_allclibhndl3` / `pmc_rdpmcrng` / etc. and Fwlib encodes the wire frames internally.
This means the mock has two possible architectures:
| Option | Where the mock lives | Exercises Fwlib? |
|---|---|---|
| **A. IPC-layer fake** (already shipped as `FakeFocasBackend`) | Between `FwlibFrameHandler` and the FWLIB call | ❌ No — bypasses Fwlib entirely |
| **B. TCP wire mock** (this plan) | Listens on port 8193; Fwlib connects to it | ✅ Yes — Fwlib encodes real frames |
Option B is the only one that validates the driver's actual production wire path (driver → Host → `FwlibFocasClient``Fwlib64.dll` → TCP → mock).
**Prerequisite reading** the implementer needs before starting Option B:
- `strangesast/fwlib` on GitHub — reverse-engineered FOCAS2 Linux client, has frame-format notes
- `GalvinGao/opcua-server-fanuc` — another OSS FOCAS client with wire-format traces
- `jdegre/focas-python` (if it still exists) — previous Python FOCAS stub, starting point
- Our own `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs` — the 9-function surface we need to satisfy
## Protocol surface (what the mock must speak)
From `FwlibNative.cs`, our driver makes exactly 9 FWLIB calls:
| FWLIB function | What it does | Wire complexity |
|---|---|---|
| `cnc_allclibhndl3` | Open Ethernet handle (connect) | **High** — initial handshake, version negotiation, session state |
| `cnc_freelibhndl` | Close handle | Low |
| `pmc_rdpmcrng` | PMC range read (byte/word/long + optional bit) | **Medium** — 40-byte buffer with type-dependent layout |
| `pmc_wrpmcrng` | PMC range write | **Medium** — same buffer shape inverted |
| `cnc_rdparam` | Parameter read (axis-aware) | Medium — 32-byte buffer |
| `cnc_wrparam` | Parameter write | Medium |
| `cnc_rdmacro` | Macro variable read (value + decimal-point count) | Low |
| `cnc_wrmacro` | Macro variable write | Low |
| `cnc_statinfo` | Status info (for probe) | Low — fixed-shape response |
**Coverage target**: all 9 functions return plausible responses for the address ranges declared in each series profile. Out-of-range addresses return `EW_NUMBER` / `EW_PARAM`. Unknown PMC letters return `EW_DATA`. Session state (handle validity, unknown handle detection) is enforced.
## Version profiles
The driver has `FocasCncSeries` + `FocasCapabilityMatrix` already — we mirror that matrix into JSON profiles the mock loads at start:
```
fixture/
├── Dockerfile
├── requirements.txt
├── server/
│ ├── focas_server.py # asyncio TCP server + frame parser
│ ├── handlers/
│ │ ├── allclibhndl3.py
│ │ ├── pmc.py
│ │ ├── param.py
│ │ ├── macro.py
│ │ └── status.py
│ ├── state.py # in-memory "CNC" state
│ └── frames.py # FOCAS frame encode/decode
└── profiles/
├── zero_i_d.json
├── zero_i_f.json
├── zero_i_mf.json
├── zero_i_tf.json
├── sixteen_i.json
├── thirty_i.json
├── thirtyone_i.json
├── thirtytwo_i.json
└── powermotion_i.json
```
Each profile captures:
```json
{
"series": "ThirtyOne_i",
"api_version": "0x30",
"pmc_ranges": {
"X": [0, 127], "Y": [0, 127], "F": [0, 767], "G": [0, 767],
"R": [0, 1499], "D": [0, 2999], "C": [0, 199], "K": [0, 31],
"A": [0, 24], "T": [0, 79], "E": [0, 9999]
},
"param_ranges": [[1000, 9999], [10000, 15999]],
"macro_range": [100, 999],
"extended_macros": false,
"axes": 3,
"quirks": {
"crash_after_handle_cycles": null,
"edit_mode_rejects_connection": false,
"allclibhndl3_blocks_during_alarm": false,
"param_bit_index_max": 7
},
"alarm_default": false,
"emergency_default": false
}
```
**Differences that actually matter** for driver coverage:
| Series | Meaningful difference vs baseline |
|---|---|
| 0i-D / 0i-F / 0i-MF / 0i-TF | PMC range narrower; no E-relay; macro range `100-999` strict |
| 16i | Older Fwlib version; `cnc_allclibhndl3` extra-slow on first connect (artificial delay in mock) |
| 30i | Full PMC range; extended macros (`#10000+`) supported |
| 31i / 32i | 5-axis; larger parameter ranges |
| PowerMotion-i | No PMC `T` timer; motion-only controller quirks |
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ Windows test rig (net10.0 x64) │
│ │
│ FocasDriver ──► FwlibFocasClient ──► Fwlib64.dll ──► TCP ──┐ │
│ (real P/Invoke) │ │
└─────────────────────────────────────────────────────────────┼───┘
port 8193 │
┌─────────────────────────────────────────────────────────────────┐
│ Docker container: otopcua-focas-sim-{series} │
│ │
│ Python asyncio TCP server │
│ ├─ frames.py: parse + encode FOCAS frames │
│ ├─ handlers/: one module per FWLIB function │
│ ├─ state.py: per-session handle registry + simulated memory │
│ └─ profiles/{series}.json: range + quirk table loaded at │
│ boot via env var OTOPCUA_FOCAS_ │
│ PROFILE=thirtyone_i │
└─────────────────────────────────────────────────────────────────┘
```
Python choice rationale: the existing OSS FOCAS implementations are Python-first; asyncio's `StreamReader`/`StreamWriter` maps cleanly to FOCAS's length-prefixed frame model; one Dockerfile covers every profile because profile-switching is an env-var.
`docker-compose.yml` exposes one service per profile as a `--profile`:
```yaml
services:
focas-thirtyone:
profiles: ["thirtyone"]
image: otopcua-focas-sim:latest
environment: { OTOPCUA_FOCAS_PROFILE: "thirtyone_i" }
ports: ["8193:8193"]
focas-zerod:
profiles: ["zerod"]
image: otopcua-focas-sim:latest
environment: { OTOPCUA_FOCAS_PROFILE: "zero_i_d" }
ports: ["8193:8193"]
# ... one per supported series ...
```
Users pick a profile with `docker compose --profile thirtyone up -d`. Only one profile runs at a time (port collision on 8193) — matching the other driver fixtures' single-image pattern.
## Delivery plan — three streams
### Stream A — Version-aware fake backend (C#, 2-3 days) — ✅ **SHIPPED 2026-04-23**
**What landed**:
- `FakeFocasBackend` gained a second ctor `(FocasCncSeries series, FakeFocasBackendQuirks? quirks)`; default ctor preserves the pre-Stream-A permissive behaviour.
- `ValidateAddress` delegates to the existing `FocasCapabilityMatrix.Validate` so mock + driver share one source of truth. Out-of-range reads/writes/PMC-bit-writes return `BadOutOfRange` (0x803C0000 — matching what the real driver maps `EW_NUMBER`/`EW_PARAM` to).
- `FakeFocasBackendQuirks` record carries four opt-in quirks: `EditModeRejectsConnection`, `CrashAfterHandleCycles`, `SlowFirstConnectDelay`, `EmergencyAtStartup`.
- `Program.cs` reads `OTOPCUA_FOCAS_SERIES` (case-insensitive FocasCncSeries enum value) + `OTOPCUA_FOCAS_QUIRKS` (comma-separated token list: `EditMode`, `Emergency`, `SlowFirstConnect[=ms]`, `CrashAfterCycles=N`). Unknown tokens log-and-ignore. Values surface in Host log at startup.
- 19 new tests in `FakeFocasBackendSeriesTests.cs` covering: Unknown-permissive baseline, Zero_i_D macro rejection, ThirtyOne_i extended-macro acceptance, PowerMotion_i T-timer rejection, Write+PmcBitWrite parallel rejection, all four quirks, + 8 theory cases for the env-var parser.
**Deliverable shipped**:
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/Backend/FakeFocasBackend.cs` — extended
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/Program.cs``BuildFakeBackend` local fn + `ParseFakeQuirks` helper
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Tests/FakeFocasBackendSeriesTests.cs` — new, 19 tests
- 38/38 Host tests green post-Stream-A.
### Stream B — Python FOCAS TCP server (scaffold) — ✅ **SHIPPED 2026-04-23**
**What landed** under `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/`:
- `Dockerfile` — Python 3.12-slim image; stdlib-only, no external deps
- `docker-compose.yml` — 9 `--profile` entries, one per Fanuc series (`thirtyone`, `thirtytwo`, `thirty`, `sixteen`, `zerod`, `zerof`, `zeromf`, `zerotf`, `powermotion`). All share one image + one port (8193).
- `server/focas_server.py` — asyncio entry point, per-connection session loop, graceful-shutdown signal handling
- `server/frames.py` — length-prefixed frame codec (scaffold — see Stream C note below)
- `server/state.py` — per-session handle registry + in-memory PMC/param/macro dictionaries
- `server/profile.py` — JSON profile loader
- `server/handlers/` — one module per FWLIB function (9 total): open/close, PMC read/write, param read/write, macro read/write, statinfo. Profile-driven range validation; error responses use a `FLAG_ERROR` bit on the response header.
- `profiles/*.json` — 9 series profiles mirroring `FocasCapabilityMatrix`. Quirks (`slow_first_connect_ms`, `alarm_default`, `emergency_default`, `crash_after_handle_cycles`, `edit_mode_rejects_connection`) declared per profile.
- `validate_harness.py` — scaffold-protocol TCP client that opens a session, round-trips a macro, triggers range-rejection, asserts the expected error reasons surface.
- `README.md` — operator-facing usage + Stream C next-steps checklist.
**Exit criterion met**: validated end-to-end against two profiles (`thirtyone_i`, `powermotion_i`) via the local harness. Session handshake → statinfo → macro round-trip → out-of-range rejection → PMC round-trip → bad-letter rejection → clean close — all PASS. Profile-switching confirmed working: 31i API 0x0030 → PowerMotion 0x0040, macro range [0,99999]→[0,999], letter set {A,C,D,E,F,G,K,M,R,T,X,Y}→{D,R,X,Y}.
**⚠️ The wire *framing* is a scaffold — NOT Fwlib64-compatible yet.** `server/frames.py` uses a plausible length-prefixed framing (big-endian header: uint32 length, uint16 function_id, uint16 flags) that satisfies the harness but has never been validated against the real Fanuc DLL. Stream C is the iterative refinement cycle where a Windows rig drives that convergence.
**The response payload shapes inside those frames ARE authoritative** (refined 2026-04-23 after `fwlib32.h` review):
- `ODBM` (macro read) = 10 bytes: `short datano, short dummy, int32 mcr_val, short dec_val`
- `ODBST` (statinfo) = 18 bytes: 9 × `short` (dummy/tmmode/aut/run/motion/mstb/emergency/alarm/edit)
- `IODBPSD` (param read) = 36 bytes: `short datano, short type, bytes[32]` (union = 8 axes × 4 bytes)
- `IODBPMC` (PMC range read) = 48 bytes: `short type_a, short type_d, uint16 datano_s, uint16 datano_e, bytes[40]`
Validate harness asserts exact byte sizes + header field round-trip. When Stream C's Wireshark traces arrive, the payload layer should already match — only framing needs iteration.
See [`focas-wire-protocol.md`](focas-wire-protocol.md) for the authoritative-vs-guessed breakdown.
**C# integration test scaffold** also shipped (`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/`) — `FocasSimFixture` probes port 8193 + skips when the container's down; three smoke tests pass against a running container (TCP reachability, clean connect-close, profile parsing). A `Series/WireCompatGatedTests.cs` skeleton gates Fwlib64-dependent tests behind `OTOPCUA_FOCAS_SIM_WIRE_COMPAT=1`, ready for Stream C activation.
### Stream C — FWLIB compat + version profiles (2-3 weeks) — **blocked on Windows rig + Wireshark traces**
See `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/README.md` §"Stream C — what's required to reach wire compatibility" for the concrete implementer checklist.
**Goal**: real Fwlib64.dll running on a Windows test rig can open a session against the mock and round-trip the 9 FWLIB calls our driver makes.
Sub-tasks:
1. **Handshake** (`handlers/allclibhndl3.py`) — the hardest piece. FOCAS session open negotiates protocol version + controller type. Incorrect negotiation → Fwlib disconnects. Start from `strangesast/fwlib`'s handshake trace.
2. **PMC read/write** (`handlers/pmc.py`) — 40-byte buffer with type-dependent layout. Must match `FwlibNative.IODBPMC` struct layout exactly. Implement per-profile range checks.
3. **Parameter read/write** (`handlers/param.py`) — 32-byte axis-aware buffer. Similar to PMC but simpler (no sub-address bit indexing beyond `param_bit_index_max`).
4. **Macro read/write** (`handlers/macro.py`) — straightforward; value + decimal-point count as `ODBM`.
5. **Status info** (`handlers/status.py`) — fixed `ODBST` shape; profile declares defaults for `Aut` / `Run` / `Motion` / `Alarm`.
6. **State management** (`server/state.py`) — per-session handle registry, in-memory PMC/param/macro dictionaries, persistent across one session, reset on session close.
7. **Profile loader** — reads `OTOPCUA_FOCAS_PROFILE` env var, loads matching JSON, injects into handlers.
8. **Windows validation rig** — one-time setup: a Windows VM (or dev box) with licensed `Fwlib64.dll` + a tiny test driver that calls the 9 FWLIB functions + asserts round-trip. This is the first live-wire validation the plan asks for.
9. **Per-series test matrix**`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/` new project, one test class per series, each class's `[Fact]` runs against that profile's container.
**Exit criterion**: live Fwlib64.dll on a Windows rig opens a session, reads + writes across all 9 FWLIB functions, against each of the 9 profiles. Integration test suite green.
### Stream D — e2e integration + doc close-out (1-2 days)
- Update `scripts/e2e/test-focas.ps1` to accept `-ProfileName` and skip `FOCAS_TRUST_WIRE` gate when the matching container is up.
- Add the FOCAS simulator to `docs/v2/test-data-sources.md` + `docs/drivers/FOCAS-Test-Fixture.md` (flip the "hardware-gated" caveat to "fixture or hardware").
- Update `exit-gate-phase-3.md` — final FOCAS deferral closes.
## Test integration
The new project `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/` mirrors `Driver.OpcUaClient.IntegrationTests`:
```
tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/
├── Docker/
│ ├── docker-compose.yml # references the 9 series profiles
│ ├── Dockerfile # Python image
│ ├── requirements.txt
│ ├── server/
│ └── profiles/
├── FocasSimFixture.cs # probes 8193 at collection init, skips if down
├── FocasSimSeriesProfile.cs # test-side mirror of the JSON profile
└── Series/
├── ThirtyOneITests.cs
├── ZeroIDTests.cs
└── ... one file per series ...
```
The existing `FocasDocker`-less skip pattern applies: if the container isn't running, tests skip with a clear message pointing at `docker compose up -d`. Matches Modbus / S7 / OpcUaClient.
## Risks + mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| FOCAS wire protocol is more complex than the OSS traces suggest → Stream C slips weeks | **Medium** | High | Stream A delivers 70% value with zero protocol risk. If Stream C stalls, ship A + schedule C as a follow-up. |
| Fwlib64.dll version differs from what `strangesast/fwlib` reverse-engineered → handshake fails | Medium | High | Capture Wireshark trace of a real CNC session against our actual licensed Fwlib64 version before coding. One-time investment, catches drift early. |
| Profile differences that matter at the wire level aren't captured in `FocasCapabilityMatrix` | Medium | Medium | Stream C exit criterion includes validating each profile against live Fwlib — any mismatch is a profile-table bug we fix then. |
| Docker container startup time breaks PR-CI budget | Low | Low | Each profile is one Python container + profile JSON — sub-5s cold start. Matches opc-plc. |
| Windows validation rig availability blocks Stream C | Medium | High | Use the existing TCBSD-class approach: a dedicated ESXi VM with Windows + licensed Fwlib64.dll, provisioned once, shared by the team. Cost ~1 dev-day to set up; unblocks all future FOCAS work forever. |
| Fanuc licence audit surfaces our mock as an "unlicensed FOCAS implementation" | **Low** | **High** | The mock doesn't ship the Fanuc DLL or reproduce any of Fanuc's code. Reverse-engineered wire formats from OSS research are fair use; the mock is our code. Consult legal before open-sourcing, not before internal use. |
## Timeline estimate
Assuming one dev full-time:
| Stream | Duration | Dependencies |
|---|---|---|
| A — Version-aware fake backend | 2-3 days | none |
| B — TCP server scaffold | 1 week | Windows rig not required yet |
| C — FWLIB compat + profiles | 2-3 weeks | Windows rig with Fwlib64 + Wireshark trace |
| D — e2e + docs | 1-2 days | C done |
**Total**: ~4-5 weeks to full coverage. Ship A immediately (independent value), start C in parallel with Windows-rig setup.
## Exit criteria (what closes #222)
- [ ] All 9 series profiles containerized + pass startup health check
- [ ] Live Fwlib64.dll round-trips all 9 FWLIB calls against every profile (Stream C validation rig)
- [ ] Per-series integration test suite green in CI
- [ ] `test-focas.ps1` runs end-to-end against the simulator without `FOCAS_TRUST_WIRE=1`
- [ ] Docs updated: `FOCAS-Test-Fixture.md` flipped from "hardware-only" to "fixture or hardware"
- [ ] One live-CNC smoke still runs during v2 release readiness, as a belt-and-braces final check
## Open questions
1. **Licence clarity**: is reverse-engineered FOCAS2 wire-format documentation (from `strangesast/fwlib` etc.) compatible with our Fanuc FOCAS developer-kit licence? Legal check required before starting Stream C.
2. **Windows rig**: do we dedicate an existing VM (like the TCBSD box) or provision a new one? Cost difference is small; decision affects who owns maintenance.
3. **Profile source of truth**: if `FocasCapabilityMatrix.cs` and `profiles/*.json` ever disagree, which wins? Proposal: profiles win (wire behavior is authoritative), driver's matrix is regenerated from profiles as a build step.
4. **Alarm events**: the driver doesn't currently use `cnc_rdalmmsg2` / alarm subscription, so the mock doesn't need to simulate alarms beyond the `statinfo.Alarm` flag. If we add `IAlarmSource` to FOCAS later, Stream C expands.
## References
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs` — 9-function P/Invoke surface the mock must satisfy
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasCapabilityMatrix.cs` — per-series range tables (profile seed data)
- `docs/v2/focas-version-matrix.md` — human-readable version matrix the profiles mirror
- `docs/drivers/FOCAS-Test-Fixture.md` — current test-fixture doc (flips post-Stream-D)
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/Docker/` — pattern this plan mirrors for the Docker compose + fixture-skip shape
- `strangesast/fwlib` (GitHub, OSS) — primary FOCAS wire-format reverse-engineering reference