# Driver-pages Phase 10 — reconnect-transition E2E + plan close-out — Design

**Date:** 2026-06-18
**Status:** Approved
**Base:** master `08c7a2bd`
**Branch:** `feat/driver-pages-reconnect-e2e`

## Context

The `2026-05-28-adminui-driver-pages` plan is **fully shipped** (Phases 0–10), but its `.tasks.json` is comprehensively stale (it still marks Phases 6–10 "pending" while the code + tests all exist and are real). A brainstorming pass on 2026-06-18 verified, seam by seam, that:

- **Phase 6** (live `DriverStatusPanel`) is built end-to-end: `DriverHealthChanged` contract, `AkkaDriverHealthPublisher` (DI-bound in prod, invoked at `DriverInstanceActor.PublishHealthSnapshot`), `DriverStatusSignalRBridge` (spawned admin-gated at `Program.cs:196`), the shared-singleton snapshot store, the hub (`MapHub`), and the panel wired into all 9 driver pages. Page `DriverInstanceId` and actor `_driverInstanceId` key on the same EF value — no mismatch.
- **Phase 8** (Reconnect/Restart) is built: messages, `AdminOperationsActor` + `DriverHostActor` handlers, DriverOperator-gated buttons in the panel.
- **Phase 10** automated E2E tests (`DriverTestConnectE2eTests`, `DriverReconnectE2eTests`, `DriverStatusHubE2eTests`) are real, skip-gated, and honest about their scope.

Two genuine remnants remain, both flagged by the `DriverReconnectE2eTests` / `DriverStatusHubE2eTests` scope-notes:

1. **The one real coverage gap:** no test proves a *deployed* driver actually transitions **Healthy → Reconnecting → Healthy** in response to `ReconnectDriver`. The existing 10.2 test only asserts command *ingestion* (the round-trip reply), not the resulting actor health transition.
2. **A harness-fidelity gap behind it:** `TwoNodeClusterHarness.BuildNodeAsync` calls `WithOtOpcUaRuntimeActors()` (the Akka actor spawn) but **not** `AddOtOpcUaRuntime()` (the DI registration that binds `IDriverHealthPublisher → AkkaDriverHealthPublisher`).
   Consequently, in *every* current `Host.IntegrationTests` test, deployed driver actors fall back to `NullDriverHealthPublisher` and emit no health to the `driver-health` DPS topic. The harness also leaves `IDriverFactory` at `NullDriverFactory`, so deployed drivers reach `Stubbed`, never `Connected`.

The plan's stale trackers caused this item to be (wrongly) re-listed as OPEN in `stillpending.md` §A.9. This phase closes both remnants and reconciles the trackers.

## Goal

Close the genuine `ReconnectDriver` health-transition E2E gap, fix the harness-fidelity gap behind it, prove the full Phase 10 driver suite green, and reconcile the stale trackers so this fully-shipped plan stops re-triggering as backlog.

## Design

### 1. Harness fidelity fix

In `TwoNodeClusterHarness.BuildNodeAsync`:

- Add `builder.Services.AddOtOpcUaRuntime()` **before** `AddAkka` (matching production `Program.cs:87`). This binds the real `AkkaDriverHealthPublisher` so deployed drivers publish health to the `driver-health` DPS topic. It also seeds the `Null*` runtime defaults (`IHistorianDataSource`, `IAlarmHistorianSink`, `IHistoryWriter`, `IDriverFactory`, …) — all harmless no-ops that don't change existing test behavior (nothing in the current suite subscribes to driver-health, and the Null sinks are inert).
- Add an **opt-in** seam to inject a test `IDriverFactory` for tests that need a connecting driver. Default (no factory supplied) leaves the existing behavior untouched. Mechanism: a `StartAsync` parameter (e.g. `IDriverFactory? driverFactory = null`) threaded into `BuildNodeAsync`; when supplied, register it as a singleton **after** `AddOtOpcUaRuntime` so it wins over the `Null` default (last-registration-wins / replace).

This change is fidelity-improving for the whole suite: the existing `DriverStatusHubE2eTests` keeps spawning its own bridge, but real driver health now flows in tests that deploy drivers.

### 2. The reconnect-transition E2E test (the real gap)

A new test (in `DriverReconnectE2eTests.cs`, or a focused new file) that:

1. Starts the harness with a **controllable fake `IDriverFactory`** (see decision below).
2. Seeds a driver row + minimal equipment/tag using the existing `SeedDeploymentWithEquipmentTags` precedent (from `DriverHostActorLiveValueTests`), bound to `NodeANodeId`.
3. Triggers a deploy (`DispatchDeployment`, or `POST /api/deployments` with `HarnessDeployApiKey`).
4. Spawns the **real** `DriverStatusSignalRBridge` over the real DI snapshot store (the store is the observation surface; a mock `IHubContext` captures the hub push the same way the existing hub test does).
5. Waits (condition-poll, generous timeout) for the snapshot store to report the deployed instance as `Healthy`.
6. Dispatches `ReconnectDriver` via `IAdminOperationsClient` (the real cluster-singleton path the AdminUI button uses).
7. Asserts the store observes the transition **`Reconnecting`** and then returns to **`Healthy`** within a timeout — proving the full wiring: `ReconnectDriver → AdminOperationsActor → DriverHostActor.HandleReconnectDriver → DriverInstanceActor FSM (ForceReconnect → Become(Reconnecting) → Become(Connected)) → PublishHealthSnapshot → driver-health DPS topic → DriverStatusSignalRBridge → store`.

#### Decision: controllable fake driver factory (not the real Modbus sim)

**Recommended and approved:** observe the transition via a deterministic, controllable fake `IDriver` / `IDriverFactory` test double rather than a real Modbus sim connection.

Rationale:

- **Determinism, no flakiness.** A fake driver whose connect succeeds drives the actor to `Connected` immediately; `ReconnectDriver` re-enters `Reconnecting` then `Connected` deterministically. No sim timing, no skip-gate, runs everywhere.
- **Smaller blast radius.** The real-sim path additionally needs the full driver-factory bootstrap (all 9 driver factories + deps) wired into the shared harness.
- **The wiring is what matters.** This gap is about the *health-transition + command wiring*, not the Modbus protocol. The real Modbus TCP connect/reconnect is already covered by the `Modbus.IntegrationTests` and the 10.1 `TestConnect` E2E (against the sim).

The fake `IDriver` exposes a minimal controllable surface (succeed-connect, optionally signal a fault) sufficient to walk the FSM through `Connected → Reconnecting → Connected`.

### 3. Live suite run

Run the full `Host.IntegrationTests` driver E2E suite. Bring up the Modbus sim (`lmxopcua-fix up modbus standard`, endpoint `10.100.0.35:5020` / `MODBUS_SIM_ENDPOINT`) so the skip-gated 10.1 `DriverTestConnectE2eTests` actually execute green (not skipped), alongside the new deterministic reconnect test (which runs regardless of the sim).

### 4. Reconcile the stale trackers

- `docs/plans/2026-05-28-adminui-driver-pages-plan.md.tasks.json`: mark Phases 6–10 tasks `completed` with their real commits / a "shipped — reconciled 2026-06-18" note; flip `executionState`/`lastUpdated`.
- `stillpending.md` §A.9: mark Phase 6/8/10 SHIPPED (note the new reconnect-transition test; keep the deferred full-stack hub test as a documented follow-up). *(Never staged — local working file.)*
- `docs/plans/2026-05-28-adminui-driver-pages-design.md` §8.3: fix the stale `ModbusTcp` → `Modbus` reference in the smoke checklist.
- Memory: update `project_stillpending_backlog.md` + `MEMORY.md`.

### 5. Explicitly deferred (documented follow-up)

The **full-stack WebSocket + JWT `DriverStatusHub` connection test** (a real `HubConnection` to `/hubs/driverstatus` with a minted bearer token, `JoinDriver`, assert client receipt). No repo precedent (no test mints a JWT or opens a real `HubConnection`), flaky-prone, and it only re-covers transport the mock-hub test + the §8.3 manual runbook already handle.

## Out of scope

- The 10.4 manual browser smoke (driving the AdminUI on docker-dev). Foldable later; the automated reconnect test + green suite is the higher-value core.
- Any production code change. This phase is test + harness + docs only.

## Constraints

- xUnit + Shouldly. **No bUnit.**
- **No** EF migration, **no** Commons wire/proto contract change, **no** Core.Abstractions / interface contract change.
- Stage by explicit path; never `git add .`; never stage `sql_login.txt`, `src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/`, `pending.md`, `current.md`, `docker-dev/docker-compose.yml`, `stillpending.md`.
- No force-push, no `--no-verify`.

## Finish

Merge to master + push.
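
For reference when implementing the opt-in factory seam from Design §1, the last-registration-wins behavior it relies on can be sketched in isolation. This is a minimal sketch, not the harness code: the `IDriverFactory` shape, its `Name` property, `FakeDriverFactory`, `AddRuntimeDefaults` (a stand-in for `AddOtOpcUaRuntime`), and `BuildNode` are all hypothetical names introduced for illustration, and it assumes the runtime registration seeds the `Null` default through the standard `Microsoft.Extensions.DependencyInjection` container.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-ins: the real IDriverFactory contract is not shown in this design.
public interface IDriverFactory { string Name { get; } }
public sealed class NullDriverFactory : IDriverFactory { public string Name => "null"; }
public sealed class FakeDriverFactory : IDriverFactory { public string Name => "fake"; }

public static class HarnessRegistrationSketch
{
    // Stand-in for AddOtOpcUaRuntime(): assumed to seed the Null default.
    public static IServiceCollection AddRuntimeDefaults(this IServiceCollection services)
        => services.AddSingleton<IDriverFactory, NullDriverFactory>();

    // Mirrors the proposed seam: an optional factory threaded into node build-up.
    public static IServiceProvider BuildNode(IDriverFactory? driverFactory = null)
    {
        var services = new ServiceCollection();
        services.AddRuntimeDefaults();

        // Registered AFTER the defaults, so single-service resolution returns it.
        if (driverFactory is not null)
            services.AddSingleton<IDriverFactory>(driverFactory);

        return services.BuildServiceProvider();
    }

    public static void Main()
    {
        // No factory supplied: existing behavior, the Null default resolves.
        Console.WriteLine(BuildNode().GetRequiredService<IDriverFactory>().Name);

        // Factory supplied: the later registration wins.
        Console.WriteLine(
            BuildNode(new FakeDriverFactory()).GetRequiredService<IDriverFactory>().Name);
    }
}
```

Run as written, the two lines printed are `null` and `fake`. Registering after the defaults avoids removing descriptors from the collection, which keeps the default path byte-for-byte identical for tests that do not pass a factory.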