E2E — live-boot verification for AB CIP + S7 smoke seeds #220

Closed
opened 2026-04-21 11:33:20 -04:00 by dohertj2 · 1 comment
Owner

Follow-up to #209 exit gate (PR #218). Modbus is live-verified end-to-end (4/5 stages — see the child issue for stage 4). AB CIP + S7 factories + seeds shipped in #217 but weren't live-booted in the same pass — a time-boxing choice since Modbus surfaced two seed-side fixes and each extra driver compounds debugging.

What remains

  1. Apply seed-abcip-smoke.sql + docker compose --profile controllogix up -d
  2. Boot server with Node__NodeId=abcip-smoke-node Node__ClusterId=abcip-smoke
  3. ./scripts/e2e/test-abcip.ps1 -BridgeNodeId "ns=2;s=TestDINT"
  4. Expect: 4/5 or 5/5 PASS matching the Modbus result — probe / driver-loopback / forward-bridge / (reverse-write blocked by the same authz gap as Modbus) / subscribe-sees-change

Same for S7: seed-s7-smoke.sql + --profile s7_1500 + test-s7.ps1 -BridgeNodeId "ns=2;s=DB1_DBW0" -S7Host "127.0.0.1:1102".

AB Legacy

Hardware-gated per #222 — ab_server PCCC is upstream-broken. seed-ablegacy-smoke.sql ships with a placeholder gateway; live-boot verification needs a real SLC / MicroLogix / PLC-5 / RSEmulate 500. Not blocked by anything in the server code path.

Non-goals

Do NOT wait on the reverse-write authz fix to ship this verification. That's filed separately; knowing 4/5 stages work on all three drivers is more valuable than blocking on 5/5.

Follow-up to #209 exit gate (PR #218). Modbus is live-verified end-to-end (4/5 stages — see the child issue for stage 4). AB CIP + S7 factories + seeds shipped in #217 but weren't live-booted in the same pass — a time-boxing choice since Modbus surfaced two seed-side fixes and each extra driver compounds debugging. ## What remains 1. Apply `seed-abcip-smoke.sql` + `docker compose --profile controllogix up -d` 2. Boot server with `Node__NodeId=abcip-smoke-node Node__ClusterId=abcip-smoke` 3. `./scripts/e2e/test-abcip.ps1 -BridgeNodeId "ns=2;s=TestDINT"` 4. Expect: 4/5 or 5/5 PASS matching the Modbus result — probe / driver-loopback / forward-bridge / (reverse-write blocked by the same authz gap as Modbus) / subscribe-sees-change Same for S7: `seed-s7-smoke.sql` + `--profile s7_1500` + `test-s7.ps1 -BridgeNodeId "ns=2;s=DB1_DBW0" -S7Host "127.0.0.1:1102"`. ## AB Legacy Hardware-gated per #222 — ab_server PCCC is upstream-broken. `seed-ablegacy-smoke.sql` ships with a placeholder gateway; live-boot verification needs a real SLC / MicroLogix / PLC-5 / RSEmulate 500. Not blocked by anything in the server code path. ## Non-goals Do NOT wait on the reverse-write authz fix to ship this verification. That's filed separately; knowing 4/5 stages work on all three drivers is more valuable than blocking on 5/5.
Author
Owner

AB CIP + S7 live-boot verified 5/5 each in PR #222. AB Legacy factory + seed ready; live-boot still blocked on #222 (hardware).

AB CIP + S7 live-boot verified 5/5 each in PR #222. AB Legacy factory + seed ready; live-boot still blocked on #222 (hardware).
dohertj2 referenced this issue from a commit 2026-04-30 08:21:26 -04:00
FOCAS version-matrix stabilization (PR 1 of #220 split) — ship the cheap half of the hardware-free stability gap ahead of the Tier-C out-of-process split. Without any CNC or simulator on the bench, the highest-leverage move is to catch operator config errors at init time instead of at steady-state per-read. Adds FocasCncSeries enum (Unknown/16i/0i-D/0i-F family/30i family/PowerMotion-i) + FocasCapabilityMatrix static class that encodes the per-series documented ranges for macro variables (cnc_rdmacro/wrmacro), parameters (cnc_rdparam/wrparam), and PMC letters + byte ceilings (pmc_rdpmcrng/wrpmcrng) straight from the Fanuc FOCAS Developer Kit. FocasDeviceOptions gains a Series knob (defaults Unknown = permissive so pre-matrix configs don't break on upgrade). FocasDriver.InitializeAsync now calls FocasAddress.TryParse on every tag + runs FocasCapabilityMatrix.Validate against the owning device's declared series, throwing InvalidOperationException with a reason string that names both the series and the documented limit ("Parameter #30000 is outside the documented range [0, 29999] for Thirty_i") so an operator can tell whether the mismatch is in the config or in their declared CNC model. Unknown series skips validation entirely. Ships 46 new theory cases in FocasCapabilityMatrixTests.cs — covering every boundary in the matrix (widen 16i->0i-F: macro ceiling 999->9999, param 9999->14999; widen 0i-F->30i: PMC letters +K+T; PMC-number 16i=999/0i-D=1999/0i-F=9999/30i=59999), permissive Unknown-series behavior, rejection-message content, and case-insensitive PMC-letter matching. Widening a range without updating docs/v2/focas-version-matrix.md fails a test because every InlineData cites the row it reflects. Full FOCAS test suite stays at 165/165 passing (119 existing + 46 new). Also authors docs/v2/focas-version-matrix.md as the authoritative range reference with per-function citations, CNC-series era context, error-surface shape, and the link back to the matrix code; docs/v2/implementation/focas-isolation-plan.md as the multi-PR plan for #220 Tier-C isolation (Shared contracts -> Host skeleton -> move Fwlib32 calls -> Supervisor+respawn -> MMF+ops glue, 2200-3200 LOC across 5 PRs mirroring the Galaxy Tier-C topology); and promotes docs/drivers/FOCAS-Test-Fixture.md from "version-matrix coverage = no" to explicit coverage via the new test file + cross-links to the matrix and isolation-plan docs. Leaves task #220 open since isolation itself (the expensive half) is still ahead.
dohertj2 referenced this issue from a commit 2026-04-30 08:21:27 -04:00
FOCAS Tier-C PR A — Driver.FOCAS.Shared MessagePack IPC contracts. First PR of the 5-PR #220 split (isolation plan at docs/v2/implementation/focas-isolation-plan.md). Adds a new netstandard2.0 project consumable by both the .NET 10 Proxy and the future .NET 4.8 x86 Host, carrying every wire DTO the Proxy <-> Host pair will exchange: Hello/HelloAck + Heartbeat/HeartbeatAck + ErrorResponse for session negotiation (shared-secret + protocol major/minor mirroring Galaxy.Shared); OpenSessionRequest/Response + CloseSessionRequest carrying the declared FocasCncSeries so the Host picks up the pre-flight matrix; FocasAddressDto + FocasDataTypeCode for wire-compatible serialization of parsed addresses (0=Pmc/1=Param/2=Macro matches FocasAreaKind enum order so both sides cast (int)); ReadRequest/Response + WriteRequest/Response with MessagePack-serialized boxed values tagged by FocasDataTypeCode; PmcBitWriteRequest/Response as a first-class RMW operation so the critical section stays Host-side; Subscribe/Unsubscribe/OnDataChangeNotification for poll-loop-pushes-deltas model (FOCAS has no CNC-initiated callbacks); Probe + RuntimeStatusChange + Recycle surface for Tier-C supervision. Framing is [4-byte BE length][1-byte kind][body] with 16 MiB body cap matching Galaxy; FocasMessageKind byte values align with Galaxy ranges so an operator reading a hex dump doesn't have to context-switch. FrameReader/FrameWriter ported from Galaxy.Shared with thread-safe concurrent-write serialization. 24 new unit tests: 18 per-DTO round-trip tests covering every field + 6 framing tests (single-frame round-trip, clean-EOF returns null, oversized-length rejection, mid-frame EOF throws, 20-way concurrent-write ordering preserved, MessageKind byte values locked as wire-stable). No driver changes; existing 165 FOCAS unit tests still pass unchanged. PR B (Host skeleton) goes next.
dohertj2 referenced this issue from a commit 2026-04-30 08:21:27 -04:00
FOCAS Tier-C PR B — Driver.FOCAS.Host net48 x86 skeleton + pipe server. Second PR of the 5-PR #220 split. Stands up the Windows Service entry point + named-pipe scaffolding so PR C has a place to move the Fwlib32 calls into. New net48 x86 project (Fwlib32.dll is 32-bit-only, same bitness constraint as Galaxy.Host/MXAccess); references Driver.FOCAS.Shared for framing + DTOs so the wire format Proxy<->Host stays a single type system. Ships four files: PipeAcl creates a PipeSecurity where only the configured server principal SID gets ReadWrite+Synchronize + LocalSystem/BuiltinAdministrators are explicitly denied (so a compromised service account on the same host can't escalate via the pipe); IFrameHandler abstracts the dispatch surface with HandleAsync + AttachConnection for server-push event sinks; PipeServer accepts one connection at a time, verifies the peer SID via RunAsClient, reads the first Hello frame + matches the shared-secret and protocol major version, sends HelloAck, then hands off to the handler until EOF or cancel; StubFrameHandler fully handles Heartbeat/HeartbeatAck so a future supervisor's liveness detector stays happy, and returns ErrorResponse{Code=not-implemented,Message="Kind X is stubbed - Fwlib32 lift lands in PR C"} for every data-plane request. Program.cs mirrors the Galaxy.Host startup exactly: reads OTOPCUA_FOCAS_PIPE / OTOPCUA_ALLOWED_SID / OTOPCUA_FOCAS_SECRET from the env (supervisor passes these at spawn time), builds Serilog rolling-file logger into %ProgramData%\OtOpcUa\focas-host-*.log, constructs the pipe server with StubFrameHandler, loops on RunAsync until SIGINT. Log messages mark the backend as "stub" so it's visible in logs that Fwlib32 isn't actually connected yet. Driver.FOCAS.Host.Tests (net48 x86) ships three integration tests mirroring IpcHandshakeIntegrationTests from Galaxy.Host: correct-secret handshake + heartbeat round-trip, wrong-secret rejection with UnauthorizedAccessException, and a new test that sends a ReadRequest and asserts the StubFrameHandler returns ErrorResponse{not-implemented} mentioning PR C in the message so the wiring between frame dispatch + kind → error mapping is locked. Tests follow the same is-Administrator skip guard as Galaxy because PipeAcl denies BuiltinAdministrators. No changes to existing driver code; FOCAS unit tests still at 165/165 + Shared tests at 24/24. PR C wires the real Fwlib32 backend.
dohertj2 referenced this issue from a commit 2026-04-30 08:21:27 -04:00
FOCAS Tier-C PR C — IPC path end-to-end: Proxy IpcFocasClient + Host FwlibFrameHandler + IFocasBackend abstraction. Third of 5 PRs for #220. Ships the wire path from IFocasClient calls in the .NET 10 driver, over a named-pipe (or in-memory stream) to the .NET 4.8 Host's FwlibFrameHandler, dispatched to an IFocasBackend. Keeps the existing IFocasClient DI seam intact so existing unit tests are unaffected (172/172 still pass). Proxy side adds Ipc/FocasIpcClient (owns one pipe stream + call gate so concurrent callers don't interleave frames, supports both real NamedPipeClientStream and arbitrary Stream for in-memory test loopback) and Ipc/IpcFocasClient (implements IFocasClient by forwarding every call as an IPC frame — Connect sends OpenSessionRequest and caches the SessionId; Read sends ReadRequest and decodes the typed value via FocasDataTypeCode; Write sends WriteRequest for non-bit data or PmcBitWriteRequest when it's a PMC bit so the RMW critical section stays on the Host; Probe sends ProbeRequest; Dispose best-effort sends CloseSessionRequest); plus FocasIpcException surfacing Host-side ErrorResponse frames as typed exceptions. Host side adds Backend/IFocasBackend (the Host's view of one FOCAS session — Open/Close/Read/Write/PmcBitWrite/Probe) with two implementations: FakeFocasBackend (in-memory, per-address value store, honors bit-write RMW semantics against the containing byte — used by tests and as an OTOPCUA_FOCAS_BACKEND=fake operational stub) and UnconfiguredFocasBackend (structured failure pointing at docs/v2/focas-deployment.md — the safe default when OTOPCUA_FOCAS_BACKEND is unset or hardware isn't configured). Ipc/FwlibFrameHandler replaces StubFrameHandler: deserializes each request DTO, delegates to the IFocasBackend, re-serializes into the matching response kind. Catches backend exceptions and surfaces them as ErrorResponse{backend-exception} rather than tearing down the pipe. Program.cs now picks the backend from OTOPCUA_FOCAS_BACKEND env var (fake/unconfigured/fwlib32; fwlib32 still maps to Unconfigured because the real Fwlib32 P/Invoke integration is a hardware-dependent follow-up — #220 captures it). Tests: 7 new IPC round-trip tests on the Proxy side (IpcFocasClient vs. an IpcLoopback fake server: connect happy path, connect rejection, read decode, write round-trip, PMC bit write routes to first-class RMW frame, probe, ErrorResponse surfaces as typed exception) + 6 new Host-side tests on FwlibFrameHandler (OpenSession allocates id, read-without-session fails, full open/write/read round-trip preserves value, PmcBitWrite sets the specified bit, Probe reports healthy with open session, UnconfiguredBackend returns pointed-at-docs error with ErrorCode=NoFwlibBackend). Existing 165 FOCAS unit tests + 24 Shared tests + 3 Host handshake tests all unchanged. Total post-PR: 172+24+9 = 205 FOCAS-family tests green. What's NOT in this PR: the actual Fwlib32.dll P/Invoke integration inside the Host (FwlibHostedBackend) lands as a hardware-dependent follow-up since no CNC is available for validation; supervisor + respawn + crash-loop breaker comes in PR D; MMF + NSSM install scripts in PR E.
dohertj2 referenced this issue from a commit 2026-04-30 08:21:27 -04:00
FOCAS Tier-C PR D — supervisor + backoff + crash-loop breaker + heartbeat monitor. Fourth of 5 PRs for #220. Ships the resilience harness that sits between the driver's IFocasClient requests and the Tier-C Host process, so a crashing Fwlib32.dll takes down only the Host (not the main server), gets respawned on a backoff ladder, and opens a circuit with a sticky operator alert when the crash rate is pathological. Same shape as Galaxy Tier-C so the Admin /hosts surface has a single mental model. New Supervisor/ namespace in Driver.FOCAS (.NET 10, Proxy-side): Backoff with the 5s→15s→60s default ladder + StableRunThreshold that resets the index after a 2-min clean run (so a one-off crash after hours of steady-state doesn't restart from the top); CircuitBreaker with 3-crashes-in-5-min threshold + escalating 1h→4h→manual-reset cooldown ladder + StickyAlertActive flag that persists across cooldowns until AcknowledgeAndReset is called; HeartbeatMonitor tracking ConsecutiveMisses against the 3-misses-kill threshold + LastAckUtc for telemetry; IHostProcessLauncher abstraction over "spawn Host process + produce an IFocasClient connected to it" so the supervisor stays I/O-free and fully testable with a fake launcher that can be told to throw on specific attempts (production wiring over Process.Start + FocasIpcClient.ConnectAsync is the PR E ops-glue concern); FocasHostSupervisor orchestrating them — GetOrLaunchAsync cycles through backoff until either a client is returned or the breaker opens (surfaced as InvalidOperationException so the driver maps to BadDeviceFailure), NotifyHostDeadAsync fans out the unavailable event + terminates the current launcher + records the crash without blocking (so heartbeat-loss detection can short-circuit subscriber fan-out and let the next GetOrLaunchAsync handle the respawn), AcknowledgeAndReset is the operator-clear path, OnUnavailable event for Admin /hosts wiring + ObservedCrashes + BackoffAttempt + StickyAlertActive for telemetry. 14 new unit tests across SupervisorTests.cs: Backoff (default sequence, clamping, RecordStableRun resets), CircuitBreaker (below threshold allowed, opens at threshold, escalates cooldown after second open, ManualReset clears state), HeartbeatMonitor (3 consecutive misses declares dead, ack resets counter), FocasHostSupervisor (first-launch success, retry-with-backoff after transient failure, repeated failures open breaker + surface InvalidOperationException, NotifyHostDeadAsync terminates + fan-outs + increments crash count, AcknowledgeAndReset clears sticky, Dispose terminates). Full FOCAS driver tests now 186/186 green (172 + 14 new). No changes to IFocasClient DI contract; existing FakeFocasClient-based tests unaffected. PR E wires the real Process-based IHostProcessLauncher + NSSM install scripts + MMF post-mortem + docs.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: dohertj2/lmxopcua#220