diff --git a/docs/plans/2026-06-18-focas-figure-ww-poison-abcip-gate-design.md b/docs/plans/2026-06-18-focas-figure-ww-poison-abcip-gate-design.md
new file mode 100644
index 00000000..7bad62bc
--- /dev/null
+++ b/docs/plans/2026-06-18-focas-figure-ww-poison-abcip-gate-design.md
@@ -0,0 +1,182 @@
+# FOCAS cnc_getfigure + Wonderware poison-event status + AbCip nested-UDT live-gate — Design
+
+**Date:** 2026-06-18
+**Branch:** `feat/focas-figure-ww-poison-abcip-gate` (off master `274ba2b1`)
+**Backlog items:** `stillpending.md` §A #3 (FOCAS `cnc_getfigure`), #5 (Wonderware poison-event sidecar wire), #6 (AbCip nested-struct prod live-gate)
+
+Three independent backlog items bundled into one phase. They touch disjoint projects
+(Driver.FOCAS + its python sim / Driver.Historian.Wonderware{,.Client} + Core.AlarmHistorian /
+Driver.AbCip.IntegrationTests + docs), so they are independently implementable and parallelizable.
+
+---
+
+## Standing constraints (in force)
+
+- **NO Commons wire/proto contract change**, NO Core.Abstractions / breaking interface contract change,
+  NO EF migration, NO bUnit (Razor proven only by live `/run` — N/A this phase, no Razor touched).
+- Stage by explicit path, never `git add .`; never stage the never-stage files (`sql_login.txt`,
+  `src/Server/.../Host/pki/`, `pending.md`, `current.md`, `stillpending.md`, `docker-dev/docker-compose.yml`).
+- No force-push, no `--no-verify`. Never echo/commit secrets. Finish = merge to master + push.
+- `dangerouslyDisableSandbox: true` for all build/test/rig commands.
+
+---
+
+## Component A — FOCAS `cnc_getfigure` wire command (backlog #3)
+
+### The finding that makes this safe
+
+`WireFocasClient` (the pure-managed FOCAS/2 Ethernet client) is the **production path to real CNCs**
+(not the Fwlib P/Invoke path). Its `GetPositionFiguresAsync` is a hard-coded empty-list stub
+(`Wire/WireFocasClient.cs:287-297`). The consumer wraps it in a graceful probe:
+
+```csharp
+// FocasDriver.cs:662
+state.PositionFigures = await SafeProbe(() => client.GetPositionFiguresAsync(ct), []);
+```
+
+and `AxisFactor` (`FocasDriver.cs:~853`) uses a per-axis figure **only when present and ≥ 0**, else falls
+back to the `PositionDecimalPlaces` config knob. So a wire command that errors or returns empty degrades to
+**exactly today's behavior** — implementing it is monotonic and cannot regress real hardware.
+
+The `.NET FocasWireClient` and the python `focas_mock` (`tests/.../FOCAS.IntegrationTests/Docker/focas-mock/`)
+speak a **co-designed, internally-consistent** wire protocol: the mock dispatches on command codes in
+`server.py:_wire_payload` (0x56=servo-meter, 0x26=axis position, 0x89=axis names, 0x120=timer, …). The
+protocol is validated **against the sim**, not against real Fanuc hardware — that validation is bench-CNC-gated
+for the *entire* wire backend (`docs/v2/implementation/focas-wire-protocol.md`), not unique to this command.
+
+### Approach (driver-internal + sim; NO interface change — `IFocasClient.GetPositionFiguresAsync` already exists)
+
+1. **`FocasWireClient.ReadPositionFiguresAsync(...)`** — mirror `ReadServoMeterAsync` (`FocasWireClient.cs:351-386`):
+   send the figure request **paired with the axis-name request (`0x0089`)** so figures align positionally to
+   axes (exactly how servo-meter pairs `0x0056`+`0x0089`). Pick a wire command id **currently unused by both
+   the client and the mock** (the mock uses 0x0E/0x10/0x15/0x16/0x18/0x19/0x1A/0x1C/0x1D/0x23/0x24/0x25/0x26/
+   0x35/0x40/0x56/0x57/0x89/0x8A/0x98/0xE1/0xFC/0x120/0x8001/0x8002 — choose a clearly-unused code, e.g. `0x00D3`,
+   and document it as sim-consistent / bench-CNC-unvalidated). Response payload = per-axis `short dec`
+   (decimal-place count); parse into `IReadOnlyList`. `rc != 0` → empty list (graceful).
+2. **`WireFocasClient.GetPositionFiguresAsync`** — replace the stub: call the new client method, return its list;
+   any failure path returns empty (never throws — the `SafeProbe` + per-axis fallback contract is preserved).
+   Rewrite the now-stale XML doc comment.
+3. **`focas_mock`** — add a `cnc_getfigure` admin/handler entry + a `_wire_payload` branch for the chosen code
+   returning per-axis `short dec` from a **new data-store key `position_figures`** (default **0 per axis** →
+   `10^0 = 1.0` factor → no scaling → every existing integration assertion preserved). Register the method name
+   in `constants.py:IMPLEMENTED_FOCAS_METHODS` + the server handler map + profile `exports` as needed.
+
+### Testing
+
+- **Driver consumption is already covered offline** by `FocasPositionAutoScaleTests` (FakeFocasClient returns
+  figures → asserts scaled vs. fallback). No change needed there.
+- **New end-to-end proof** = a skip-gated integration test in `Series/WireBackendCoverageTests.cs` (the
+  established pattern, `[Collection(FocasSimCollection.Name)]` + `if (_fx.SkipReason is not null) Assert.Skip(...)`):
+  `mock_patch` non-zero `position_figures`, init the wire-backed `FocasDriver`, assert the published
+  `AbsolutePosition` is the **scaled** value (raw ÷ 10^dec), not the raw integer.
+- **Live `/run` for A** = bring the focas-mock up locally on the Mac
+  (`docker compose -f tests/.../FOCAS.IntegrationTests/Docker/docker-compose.yml up -d --build`) and run the
+  FOCAS integration suite; the new test executes (does not skip) and passes.
+
+### Honest boundary (documented, not built)
+
+Real-Fanuc validation of the chosen wire code/payload stays bench-CNC-gated — same status as the whole wire
+backend. Worst case on real hardware = graceful fallback to the config knob (i.e. no regression).
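
Reviewer sketch (not part of the patch): the Component A payload and fallback rules can be illustrated in the mock's language. This is a minimal sketch under stated assumptions, not the shipped code. All function names are hypothetical, the big-endian 16-bit layout for `short dec` is an assumption, and treating the `PositionDecimalPlaces` fallback as a `10^n` divisor is inferred from the "raw ÷ 10^dec" wording above.

```python
import struct


def encode_position_figures(figures):
    """Pack per-axis decimal-place counts as signed 16-bit shorts.

    Big-endian byte order is an assumption, not taken from the design doc.
    """
    return struct.pack(f">{len(figures)}h", *figures)


def decode_position_figures(payload, rc=0):
    """Return per-axis figures; a non-zero rc or malformed payload yields []."""
    if rc != 0 or len(payload) % 2 != 0:
        return []
    return list(struct.unpack(f">{len(payload) // 2}h", payload))


def axis_factor(figures, axis_index, fallback_decimal_places):
    """Use the per-axis figure only when present and >= 0, else the config knob."""
    if axis_index < len(figures) and figures[axis_index] >= 0:
        return 10 ** figures[axis_index]
    return 10 ** fallback_decimal_places


def scale_position(raw, figures, axis_index, fallback_decimal_places):
    """Scaled position = raw / 10^dec for the chosen decimal-place count."""
    return raw / axis_factor(figures, axis_index, fallback_decimal_places)
```

With the default of 0 per axis the factor is 1, so existing assertions on unscaled values would be unaffected. An empty list (error or old stub behavior) sends every axis to the config-knob fallback, which is the monotonic property the finding relies on.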
+
+---
+
+## Component B — Wonderware poison-event per-event status (backlog #5)
+
+### The finding that shapes it
+
+The sidecar IPC reply `WriteAlarmEventsReply.PerEventOk` is a `bool[]` on **both** ends — which are **both in
+this repo** (`...Wonderware.Client/Ipc/Contracts.cs` + `...Wonderware/Ipc/Contracts.cs`, MessagePack over TCP;
+**not** a Commons proto). The client (`WonderwareHistorianClient.WriteBatchAsync:340-369`) can therefore only
+produce `Ack`/`RetryPlease`, never `PermanentFail`, so a poison event loops to the retry cap instead of
+dead-lettering immediately. The `HistorianWriteOutcome` enum **already has `PermanentFail`** and the sink
+(`SqliteStoreAndForwardSink.cs:456-465`) **already dead-letters it immediately**. The sidecar writer seam
+`IAlarmEventWriter.WriteAsync` returns only `bool[]` and its sole real impl is the **infra-gated**
+`AahClientManagedAlarmEventWriter` (AAH SDK).
+
+### Approach (additive IPC field + sidecar classifier; NO `IAlarmEventWriter` change, NO Commons)
+
+1. **Additive wire field (both Contracts.cs):** add `[Key(4)] byte[] PerEventStatus` to `WriteAlarmEventsReply`
+   (0=Ack, 1=Retry, 2=Permanent). **Keep `PerEventOk [Key(3)]`** populated for rolling-deploy back-compat
+   (new client ↔ old sidecar: empty Key(4) → fall back to PerEventOk; old client ↔ new sidecar: ignores Key(4)).
+2. **Sidecar `HistorianFrameHandler.HandleWriteAlarmEventsAsync`:** add a pure `ClassifyEvents(events)` step —
+   an event that is **structurally malformed** (empty `SourceName`, empty `AlarmType`, or `EventTimeUtcTicks <= 0`)
+   can never persist → mark **Permanent** and **exclude it from the writer batch** (mirrors the client's existing
+   corrupt-row exclusion). Remaining events go to the writer; `true`→Ack, `false`→Retry. Populate **both**
+   `PerEventOk` (Ack→true else false) and `PerEventStatus`.
+3. **Client `WriteBatchAsync`:** when `reply.PerEventStatus.Length == batch.Count`, map 0/1/2 →
+   `Ack`/`RetryPlease`/`PermanentFail`; else fall back to the existing `PerEventOk` path. Rewrite the stale
+   XML doc comment + inline "PermanentFail is never emitted" comments.
+
+### Testing (fully offline — no rig)
+
+- Sidecar: pure `ClassifyEvents` unit test (malformed → Permanent + excluded; valid → delegated; writer
+  false → Retry).
+- Client: a `FakeSidecarServer` reply with `PerEventStatus=[2]` → `WriteBatchAsync` returns `PermanentFail`;
+  a reply with empty `PerEventStatus` → falls back to `PerEventOk` (back-compat).
+- End-to-end sink: an existing `SqliteStoreAndForwardSink` test already proves `PermanentFail` → immediate
+  dead-letter; add/confirm a test that a Permanent classification dead-letters on the **first** drain (vs. the
+  retry-cap path the finding-002 regression test covers).
+
+### Honest boundary (documented)
+
+SDK-**semantic** permanent rejections (a structurally-valid event the AAH SDK rejects, e.g. unknown tag) still
+map to Retry→cap until the infra-gated `AahClientManagedAlarmEventWriter` surfaces richer per-event status — a
+noted follow-up. This phase closes the **structurally-malformed (poison)** case the finding describes.
+
+---
+
+## Component C — AbCip nested-struct live-gate (backlog #6)
+
+### Verdict (from the feasibility pass)
+
+A **local** live-gate is architecturally impossible: `ab_server` (the libplctag CIP sim used by the default
+`abserver` tier) does **not** implement the CIP Template Object service (class 0x6C) that nested-UDT discovery
+depends on. The decode + threading already **shipped** (`3d8ce4e8`/`d203f31c`; `AbCipUdtMember.NestedTemplateId`
+→ existing `@udt/{id}` fetch) and 301 offline tests pass. The honest close formalizes the gate at the existing
+**Emulate** fidelity tier (Logix Emulate / real ControlLogix).
+
+### Approach (skip-gated test + docs; NO runtime change)
+
+1. **New `AbCipEmulateNestedUdtTests`** (mirror `AbCipEmulateUdtReadTests` + `AbServerProfileGate.SkipUnless(Emulate)`):
+   drives `FocasDriver`-equivalent AbCip discovery against a nested-UDT-bearing Emulate project and asserts the
+   nested struct's atomic leaves are addressable (`Parent.Status.Code`, `Parent.Status.Running`) + the nested
+   sub-folder materializes. Skips cleanly on the default `abserver` tier (which can't serve Template Object).
+2. **`docs/drivers/AbCip.md`** — document nested-struct support as **Emulate-tier verified** (ab_server lacks
+   CIP Template Object), referencing the new test + the existing offline unit coverage.
+
+### Testing
+
+The Emulate test **compiles and skips** locally (no Logix Emulate on this Mac) — proving it is wired into the
+suite. The decode/threading risk is already pinned by the shipped offline unit tests
+(`CipTemplateObjectDecoderTests` + `AbCipDriverDiscoveryTests`).
+
+---
+
+## Component D — Reconcile + finish
+
+- `stillpending.md` (never-staged): mark #3 (FOCAS wire command shipped + sim-proven), #5 (Wonderware
+  poison-event structurally-malformed close + the SDK-semantic follow-up boundary), #6 (AbCip live-gate
+  formalized as the Emulate skip-gated test).
+- Update memory (`project_stillpending_backlog.md` + `MEMORY.md` index line).
+- Build clean + targeted tests green (FOCAS + Wonderware + AbCip) + Component A live `/run` (focas-mock) +
+  merge to master + push.
+
+---
+
+## Task slicing (independent → parallelizable)
+
+| Task | Component | Project(s) | Class | Parallel with |
+|---|---|---|---|---|
+| T1 | FOCAS wire `cnc_getfigure` (client method + stub wire-in + mock handler) | Driver.FOCAS + focas-mock (py) | standard | T2, T3 |
+| T2 | Wonderware per-event status (DTO ×2 + sidecar classifier + client consume + tests) | Wonderware{,.Client} + Core.AlarmHistorian.Tests | standard | T1, T3 |
+| T3 | AbCip Emulate nested-UDT skip-gated test + AbCip.md | AbCip.IntegrationTests + docs | small | T1, T2 |
+| T4 | FOCAS integration test (mock) + live `/run` verify | FOCAS.IntegrationTests | small | none (after T1) |
+| T5 | Reconcile stillpending #3/#5/#6 + memory + finish (build, tests, merge+push) | docs (never-staged) | small | none |
+
+Parallel implementers use **worktree isolation** (the shared-tree git-race lesson) since T1/T2/T3 touch disjoint
+projects. T4 depends on T1; T5 runs last.
+
+## Done =
+
+Build clean + `dotnet test` green (Driver.FOCAS + Wonderware client/sink + AbCip) + Component A live `/run`
+(focas-mock integration test executes & passes) + Components B/C offline-proven + merged to master + pushed.