# Driver-pages Phase 10 — reconnect-transition E2E + close-out Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development to implement this plan task-by-task.
**Goal:** Add the one genuinely-missing driver-pages E2E test — a *deployed* driver
transitioning **Healthy → Reconnecting → Healthy** on `ReconnectDriver` — fix the
harness-fidelity gap behind it, prove the suite green, and reconcile the stale trackers.
**Architecture:** Extend `TwoNodeClusterHarness` to match production DI (`AddOtOpcUaRuntime`,
which binds the real `AkkaDriverHealthPublisher`) and to accept an opt-in test
`IDriverFactory`. A controllable fake driver lets a deployed driver reach `Connected`
deterministically; the real `DriverStatusSignalRBridge` + a capturing mock `IHubContext`
record the full health-transition sequence through the real cluster wiring.
**Tech Stack:** xUnit + Shouldly, Akka.NET TestKit/Hosting, Moq (for `IHubContext`), EF
InMemory. No bUnit, no EF migration, no Commons/proto/interface change.
**Design:** `docs/plans/2026-06-18-driver-pages-reconnect-e2e-design.md` (committed `482418c8`).
---
### Task 1: Harness fidelity fix + controllable fake driver factory
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/Fakes/FakeReconnectDriverFactory.cs`
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/TwoNodeClusterHarness.cs`
- Read for contract (do NOT edit): `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IDriver.cs`,
`src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IDriverFactory.cs`,
`tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorTests.cs`
(existing fake `IDriver`/`IDriverFactory` double to template from),
`src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ServiceCollectionExtensions.cs` (the
`AddOtOpcUaRuntime` method + the `resolver.GetService<IDriverHealthPublisher>()` site).
**Context (scene-setting):** Today `TwoNodeClusterHarness.BuildNodeAsync` calls
`WithOtOpcUaRuntimeActors()` (Akka actor spawn) but **not** `AddOtOpcUaRuntime()` (the DI
registration). So `IDriverHealthPublisher` resolves to `NullDriverHealthPublisher` and
`IDriverFactory` to `NullDriverFactory` → deployed drivers never publish health and reach
only `Stubbed`. Production (`Program.cs:87` + `:199`) calls both. This task brings the
harness to production fidelity and adds an opt-in fake factory so a test can drive a real
`Connected` state.
**Step 1: Build the fake driver double.** Create `FakeReconnectDriverFactory` implementing
`IDriverFactory` whose `TryCreate(driverType, instanceId, configJson)` returns a fake
`IDriver` (for `driverType == "Modbus"`; `SupportedTypes => ["Modbus"]`). Mirror the
existing fake `IDriver` in `DriverInstanceActorTests.cs` for the full member surface
(`DriverType`, connect/initialize, read/write/subscribe, dispose). The fake's
initialize/connect path must **succeed** so the `DriverInstanceActor` reaches
`InitializeSucceeded → Become(Connected)`. Keep read/subscribe as benign success/no-ops.
(No fault-injection needed: `ReconnectDriver` drives `ForceReconnect → Reconnecting →
re-initialize → Connected` on its own.)
**Step 2: Wire the harness.** In `TwoNodeClusterHarness`:
- Add `builder.Services.AddOtOpcUaRuntime();` **before** the `AddAkka(...)` call in
`BuildNodeAsync` (match production ordering — "Call this BEFORE AddAkka").
- Add an optional `IDriverFactory? driverFactory = null` parameter to `StartAsync` and
thread it into `BuildNodeAsync`. When it is non-null, register it with
`builder.Services.AddSingleton<IDriverFactory>(driverFactory);` placed **after**
`AddOtOpcUaRuntime` so that it supersedes the `Null` default. Confirm that the runtime
resolves the last registration rather than the `TryAdd` default. If the default would
win, either use `Replace` or register the override **before** `AddOtOpcUaRuntime`
(a `TryAdd` default is skipped when a registration already exists).
**Step 3: Build + regression-check existing suite.** Run
`dotnet build tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests` (clean), then
`dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests` — the existing tests
(DeployApi, DriverReconnect, DriverStatusHub, DriverTestConnect, etc.) must stay green with
the added `AddOtOpcUaRuntime` (the Null sinks are inert; nothing existing subscribes to
driver-health). Skipped fixture-gated tests staying skipped is expected.
**Step 4: Commit.**
```bash
git add tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/Fakes/FakeReconnectDriverFactory.cs \
tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/TwoNodeClusterHarness.cs
git commit -m "test(harness): production-fidelity DI (AddOtOpcUaRuntime) + opt-in fake driver factory"
```
---
### Task 2: Reconnect health-transition E2E test
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Task 1)
**Files:**
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DriverReconnectE2eTests.cs`
(add the new test method; keep the existing two)
- Read for pattern (do NOT edit):
`tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorLiveValueTests.cs`
(the `SeedDeploymentWithEquipmentTags` + `DispatchDeployment` deploy precedent),
`tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DriverStatusHubE2eTests.cs`
(the mock `IHubContext` capture + bridge-spawn pattern).
**Context:** The existing `Reconnect_RoundTrip_ReturnsOk` only asserts command ingestion.
This new test proves the *actual health transition* of a deployed driver, end-to-end through
the real cluster wiring: `ReconnectDriver → AdminOperationsActor → DriverHostActor →
DriverInstanceActor FSM → PublishHealthSnapshot → driver-health DPS topic →
DriverStatusSignalRBridge → snapshot store / hub push`.
**Step 1: Write the test** `Reconnect_DeployedDriver_TransitionsThroughReconnectingBackToHealthy`:
1. `await TwoNodeClusterHarness.StartAsync(driverFactory: new FakeReconnectDriverFactory())`.
2. Seed a deployment with one `Modbus` driver instance (+ one equipment/tag so the artifact
projects) bound to `NodeANodeId`, using the `SeedDeploymentWithEquipmentTags` approach
adapted from `DriverHostActorLiveValueTests` (seed via `CreateConfigDbContextAsync`), then
trigger the deploy (`DispatchDeployment` to the deploy coordinator, or
`POST /api/deployments` with `HarnessDeployApiKey` as in `DeployApiE2eTests`).
3. Resolve the real DI `IDriverStatusSnapshotStore`; spawn the real
`DriverStatusSignalRBridge` over it with a **capturing** mock `IHubContext<DriverStatusHub>`
that appends every pushed `DriverHealthChanged.State` to a list (reuse the
`DriverStatusHubE2eTests` mock pattern — note it records every `SendCoreAsync`, giving the
full transition sequence, not just last-write).
4. Condition-poll (generous timeout, e.g. 20 s) until the store reports the instance
`Healthy` (confirm the exact `DriverState` string the Connected state publishes by reading
the health computation; it is the same value the panel's `ChipClass` maps as `"Healthy"`).
5. Dispatch `ReconnectDriver(ClusterId, instanceId, "e2e", Guid.NewGuid())` via
`IAdminOperationsClient.AskAsync<ReconnectDriverResult>`; assert `Ok`.
6. Condition-poll the captured push list until it contains a `Reconnecting` entry **after**
the initial Healthy, followed by a return to `Healthy`. Assert the store's final state is
`Healthy`.
This test must run **without** any Docker fixture (the fake driver is in-process) — it is
NOT skip-gated.
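
The ordering condition in step 6 can be prototyped outside the test. Below is a minimal shell sketch. It assumes the captured states are dumped one per line and that the published names are exactly `Healthy` and `Reconnecting` (step 4 asks you to confirm the real strings). `has_reconnect_cycle` is a hypothetical helper name; the real assertion belongs in the C# test.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads one state name per line on stdin and succeeds only
# if the sequence contains Healthy, later Reconnecting, and later Healthy again.
# Other entries may appear between those three milestones.
has_reconnect_cycle() {
  awk '
    stage == 0 && $0 == "Healthy"      { stage = 1; next }
    stage == 1 && $0 == "Reconnecting" { stage = 2; next }
    stage == 2 && $0 == "Healthy"      { stage = 3 }
    END { exit (stage == 3 ? 0 : 1) }
  '
}
```

Because the check is a subsequence match, repeated or unrelated entries between the milestones do not break it. That is useful if the mock captures more pushes than distinct transitions.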
**Step 2: Run.** `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests --filter "FullyQualifiedName~DriverReconnectE2eTests"` — all green (the new test executes, not skips).
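
To confirm that the new test really executed rather than skipped, the console output can be scanned for its per-test line. Below is a rough sketch. It assumes the run adds `--logger "console;verbosity=normal"` and that the logger prints lines shaped like `Passed <name> [..]` or `Skipped <name>`; that shape may differ across test-platform versions. `test_status` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads `dotnet test` console output on stdin and prints the
# status word (Passed / Failed / Skipped) of the first per-test line whose name
# contains the given fragment, or NotRun if no such line exists.
# Assumes per-test lines look like "  Passed Namespace.Class.Method [1 s]".
test_status() {
  local name="$1"
  awk -v name="$name" '
    ($1 == "Passed" || $1 == "Failed" || $1 == "Skipped") && index($2, name) > 0 {
      print $1; found = 1; exit
    }
    END { if (!found) print "NotRun" }
  '
}
```

For example, pipe the Step 2 command's output into `test_status Reconnect_DeployedDriver_TransitionsThroughReconnectingBackToHealthy` and expect `Passed`. `Skipped` or `NotRun` means the test did not run.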
**Step 3: Commit.**
```bash
git add tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DriverReconnectE2eTests.cs
git commit -m "test(adminui): E2E deployed-driver Healthy→Reconnecting→Healthy transition on Reconnect"
```
---
### Task 3: Full driver E2E suite live run + verification
**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Task 2)
**Files:** none (verification only)
**Step 1: Bring up the Modbus sim** so the skip-gated 10.1 tests execute (not skip):
`lmxopcua-fix up modbus standard` (sim at `10.100.0.35:5020`). Verify reachability.
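
One way to verify reachability is a plain TCP connect with a timeout. This sketch assumes bash (for the `/dev/tcp` pseudo-device) and GNU `timeout`. `port_reachable` is a hypothetical helper, and an open port only shows that something is listening, not that it speaks Modbus.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a TCP connection to host:port can be opened
# within the given number of seconds (default 3).
# Relies on bash's /dev/tcp redirection and the coreutils `timeout` command.
port_reachable() {
  local host="$1" port="$2" secs="${3:-3}"
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For the sim above this would be `port_reachable 10.100.0.35 5020 && echo reachable || echo unreachable`.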
**Step 2: Run the full driver E2E suite:**
`dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests` — confirm the
`DriverTestConnectE2eTests` (now executing against the live sim, not skipped),
`DriverReconnectE2eTests` (incl. the new transition test), and `DriverStatusHubE2eTests`
all pass. Record pass counts + which previously-skipped tests now executed.
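
Two saved logs can be compared to find which previously-skipped tests now executed. Below is a small sketch. It assumes both runs were captured with per-test console lines such as `Skipped <name>`; that format is an assumption about the logger output. `newly_executed` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: given a log from before the sim was up and a log from
# after, prints the tests reported as Skipped in the first log that are no
# longer reported as Skipped in the second.
newly_executed() {
  awk '
    $1 == "Skipped" && phase == "before" { pending[$2] = 1 }
    $1 == "Skipped" && phase == "after"  { delete pending[$2] }
    END { for (t in pending) print t }
  ' phase=before "$1" phase=after "$2" | sort
}
```

A name printed here only means the test is no longer skipped. Confirm separately that it passed, since a test missing from the second log would also be listed.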
**Step 3:** If any sim-gated test cannot run (sim unreachable from this host), record that
honestly; the new in-process transition test must pass regardless. No commit (verification).
---
### Task 4: Reconcile stale trackers + finish
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** none (depends on Task 3)
**Files:**
- Modify: `docs/plans/2026-05-28-adminui-driver-pages-plan.md.tasks.json` (mark Phases 6-10
`completed` with real commits / a "shipped — reconciled 2026-06-18" note; bump `lastUpdated`)
- Modify: `docs/plans/2026-05-28-adminui-driver-pages-design.md` (§8.3: `ModbusTcp` → `Modbus`)
- Modify: `stillpending.md` §A.9 (mark Phase 6/8/10 SHIPPED; record the new reconnect-transition
test; keep the full-stack hub test as a documented deferred follow-up) — **NEVER STAGE this
file** (local working file)
- Modify: memory `project_stillpending_backlog.md` + `MEMORY.md`
**Step 1:** Reconcile the `.tasks.json` (Phases 6-10 → completed, with commit refs from the
brainstorming finding) and fix the §8.3 `ModbusTcp` string.
**Step 2:** Stage **only** the two `docs/plans/...` files (the tasks.json + the design md) —
by explicit path. Do NOT `git add .`. Do NOT stage `stillpending.md`.
```bash
git add docs/plans/2026-05-28-adminui-driver-pages-plan.md.tasks.json \
docs/plans/2026-05-28-adminui-driver-pages-design.md
git commit -m "docs(plans): reconcile driver-pages tasks (Phases 6-10 shipped) + fix smoke checklist"
```
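
The staging rule can be enforced with a guard run between `git add` and `git commit`. Below is a minimal sketch in which `staged_matches` is a hypothetical helper. It compares the staged paths on stdin with the exact set passed as arguments.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads staged paths (one per line) on stdin and compares
# them, order-insensitively, with the expected paths given as arguments.
# Returns 0 on an exact match; otherwise prints both sets to stderr and returns 1.
staged_matches() {
  local expected actual
  expected=$(printf '%s\n' "$@" | sort)
  actual=$(sort)
  if [ "$expected" = "$actual" ]; then
    return 0
  fi
  printf 'expected:\n%s\nactual:\n%s\n' "$expected" "$actual" >&2
  return 1
}
```

Usage would be `git diff --cached --name-only | staged_matches <tasks.json path> <design.md path>`. A staged `stillpending.md` makes the sets differ and the guard fail.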
**Step 3:** Update `stillpending.md` §A.9 (unstaged) + memory files.
**Step 4: Finish.** Use superpowers-extended-cc:finishing-a-development-branch — verify the
suite green, then merge `feat/driver-pages-reconnect-e2e` → master (ff/merge) + push. Bookkeep
this plan's `.tasks.json` (executionState COMPLETE) on master.
---
## Cross-cutting verification (before merge)
1. `dotnet build tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests` — clean.
2. `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests` — green (new test executes).
3. `git diff --stat master..` — only the expected harness/test/docs files; no surprise changes,
no never-stage files staged.
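
Check 3 can be mechanized by filtering the branch's changed paths against an allowlist of prefixes. This is a sketch under the assumption that the expected changes live under the integration-test project and `docs/plans/`. `unexpected_paths` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads changed paths on stdin and prints every path that
# does not start with one of the allowed prefixes passed as arguments.
# Returns non-zero if at least one unexpected path was printed.
unexpected_paths() {
  local path prefix ok bad=0
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    ok=0
    for prefix in "$@"; do
      case "$path" in
        "$prefix"*) ok=1; break ;;
      esac
    done
    if [ "$ok" -eq 0 ]; then
      printf '%s\n' "$path"
      bad=1
    fi
  done
  return "$bad"
}
```

For example: `git diff --name-only master.. | unexpected_paths tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/ docs/plans/`. A root-level file such as `stillpending.md` falls outside both prefixes and is reported.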