docs(plans): refine Task 2 with GetHealth/seed/bridge-ordering findings from exploration
@@ -88,11 +88,14 @@ git commit -m "test(harness): production-fidelity DI (AddOtOpcUaRuntime) + opt-i
**Files:**
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DriverReconnectE2eTests.cs`
(add the new test method; keep the existing two)
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/Fakes/FakeReconnectDriverFactory.cs`
(Step 0: controllable health + `InitializeCount` + created-driver accessor)
- Read for pattern (do NOT edit):
`tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorLiveValueTests.cs`
(the `SeedDeploymentWithEquipmentTags` + `DispatchDeployment` deploy precedent),
`tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/MultiClusterScopingTests.cs`
(the GREEN seed-`ServerCluster`/`Namespace`/`ClusterNode`/`DriverInstance` + `StartDeploymentAsync`
→ `Accepted` + per-node driver spawn precedent),
`tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DriverStatusHubE2eTests.cs`
(the mock `IHubContext` capture + bridge-spawn pattern).
(the mock `IHubContext` capture + manual bridge-spawn pattern).

**Context:** The existing `Reconnect_RoundTrip_ReturnsOk` only asserts command ingestion.
This new test proves the *actual health transition* of a deployed driver, end-to-end through
@@ -100,29 +103,55 @@ the real cluster wiring: `ReconnectDriver → AdminOperationsActor → DriverHos
DriverInstanceActor FSM → PublishHealthSnapshot → driver-health DPS topic →
DriverStatusSignalRBridge → snapshot store / hub push`.

**Step 1: Write the test** `Reconnect_DeployedDriver_TransitionsThroughReconnectingBackToHealthy`:
1. `await TwoNodeClusterHarness.StartAsync(driverFactory: new FakeReconnectDriverFactory())`.
2. Seed a deployment with one `Modbus` driver instance (+ one equipment/tag so the artifact
projects) bound to `NodeANodeId`, using the `SeedDeploymentWithEquipmentTags` approach
adapted from `DriverHostActorLiveValueTests` (seed via `CreateConfigDbContextAsync`), then
trigger the deploy (`DispatchDeployment` to the deploy coordinator, or
`POST /api/deployments` with `HarnessDeployApiKey` as in `DeployApiE2eTests`).
3. Resolve the real DI `IDriverStatusSnapshotStore`; spawn the real
`DriverStatusSignalRBridge` over it with a **capturing** mock `IHubContext<DriverStatusHub>`
that appends every pushed `DriverHealthChanged.State` to a list (reuse the
`DriverStatusHubE2eTests` mock pattern; note that it records every `SendCoreAsync`, giving the
full transition sequence, not just last-write).
4. Condition-poll (generous timeout, e.g. 20 s) until the store reports the instance
`Healthy` (confirm the exact `DriverState` string the Connected state publishes by reading
the health computation; it is the same value the panel's `ChipClass` maps as `"Healthy"`).
5. Dispatch `ReconnectDriver(ClusterId, instanceId, "e2e", Guid.NewGuid())` via
`IAdminOperationsClient.AskAsync<ReconnectDriverResult>`; assert `Ok`.
6. Condition-poll the captured push list until it contains a `Reconnecting` entry **after**
the initial Healthy, followed by a return to `Healthy`. Assert the store's final state is
`Healthy`.
**Key findings from exploration (these shape the design; do not skip):**
- **Published `State` = `_driver.GetHealth().State`** (`DriverInstanceActor.PublishHealthSnapshot`,
line 750). The actor FSM (`Become(Reconnecting)`) does NOT set the published state directly:
on `ForceReconnect` it does `DetachSubscription(); Become(Reconnecting); PublishHealthSnapshot()`,
which *polls the driver's `GetHealth()`*. So the **always-`Healthy` Task 1 fake can never surface
`Reconnecting`.** The fake must report `Reconnecting` at that poll. The realistic, deterministic
way: the fake reports `Reconnecting` (simulating a dropped connection, which is exactly what prompts an
operator to click Reconnect), the `ForceReconnect` poll publishes it, and the retry's
`InitializeAsync` clears it back to `Healthy`.
- **Validator-clean seed** (from the GREEN `MultiClusterScopingTests`): `ServerCluster` +
`Namespace` + `ClusterNode`(NodeId = `NodeANodeId`) + `DriverInstance`(Enabled, DriverType
`"Modbus"`, DriverConfig `"{}"`). **No equipment/tags**: equipment/tags trip `DraftValidator`
and the deploy is `Rejected` (this is why the stale `EquipmentNamespaceMaterializationTests`
fails; that failure is pre-existing and unrelated).
- **Bridge is NOT auto-spawned** by the harness; spawn it manually (as `DriverStatusHubE2eTests`
does). DPS is fire-and-forget (no replay), and the driver's repeat-publish is deduped, so spawn
the bridge + await its DPS subscription (~2 s) **before** deploying so it catches the initial
`Healthy`.
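
The bridge-ordering finding can be illustrated with a toy model. This is a minimal sketch, assuming an in-memory topic that never replays and a publisher that drops a snapshot equal to the previous one. `NoReplayTopic`, `DedupingHealthPublisher`, and `BridgeOrderingDemo` are hypothetical names invented for the illustration; none of this is the Akka.NET DistributedPubSub API or the real `DriverInstanceActor` code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Toy stand-in for a fire-and-forget topic: a message reaches only the
// subscribers registered at publish time and is never replayed.
public sealed class NoReplayTopic
{
    private readonly List<Action<string>> _subscribers = new();

    public void Subscribe(Action<string> handler) => _subscribers.Add(handler);

    public void Publish(string state)
    {
        foreach (var subscriber in _subscribers) subscriber(state);
    }
}

// Toy publisher that skips a snapshot identical to the last one it sent,
// mirroring the deduplication described in the findings (assumed behaviour).
public sealed class DedupingHealthPublisher
{
    private readonly NoReplayTopic _topic;
    private string? _last;

    public DedupingHealthPublisher(NoReplayTopic topic) => _topic = topic;

    public void PublishSnapshot(string state)
    {
        if (state == _last) return; // repeat publish is dropped
        _last = state;
        _topic.Publish(state);
    }
}

public static class BridgeOrderingDemo
{
    // Returns the states a capturing subscriber sees for the given ordering.
    public static List<string> Run(bool subscribeBeforeDeploy)
    {
        var topic = new NoReplayTopic();
        var publisher = new DedupingHealthPublisher(topic);
        var captured = new List<string>();

        if (subscribeBeforeDeploy) topic.Subscribe(captured.Add);
        publisher.PublishSnapshot("Healthy"); // first snapshot after deploy
        if (!subscribeBeforeDeploy) topic.Subscribe(captured.Add);
        publisher.PublishSnapshot("Healthy"); // identical repeat, never sent

        return captured;
    }
}
```

In this model `BridgeOrderingDemo.Run(true)` captures the initial `Healthy`, while `BridgeOrderingDemo.Run(false)` captures nothing, because the late subscriber missed the only message that was ever sent.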

This test must run **without** any Docker fixture (the fake driver is in-process); it is
NOT skip-gated.
**Step 0: Enhance `FakeReconnectDriver` / `FakeReconnectDriverFactory`** (same file from Task 1):
- `FakeReconnectDriver`: add controllable health backed by a `volatile` or lock-guarded flag: `GetHealth()` returns
`DriverState.Reconnecting` when a `_reconnecting` flag is set, else `Healthy`; a public
`ReportReconnecting()` sets the flag; `InitializeAsync` clears it (and bumps a public
`InitializeCount`). (`DriverHealth` ctor = `new(state, lastSuccessfulRead, lastError)`.)
- `FakeReconnectDriverFactory`: record created drivers so the test can retrieve the one for a
given `driverInstanceId` (e.g. a `ConcurrentDictionary<string, FakeReconnectDriver> Created`
or `TryGetCreated(id)`).
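
One possible shape for the Step 0 changes is sketched below. It is self-contained, so `DriverState` and `DriverHealth` are declared locally as hypothetical stand-ins for the real contracts, and the factory's `Create(string)` method is an assumed signature (the plan does not show the real driver or factory interfaces). Only the controllable flag, `ReportReconnecting()`, `InitializeCount`, `Created`, and `TryGetCreated` come from the plan itself.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins so the sketch compiles alone; the real types live in
// the product assemblies and may have more members.
public enum DriverState { Healthy, Reconnecting }

public sealed record DriverHealth(DriverState State, DateTime? LastSuccessfulRead, string? LastError);

public sealed class FakeReconnectDriver
{
    private volatile bool _reconnecting; // read by the health poll, set by the test
    private int _initializeCount;

    public int InitializeCount => Volatile.Read(ref _initializeCount);

    // Test hook: simulate the driver having lost its connection.
    public void ReportReconnecting() => _reconnecting = true;

    // Each (re)initialise bumps the counter and clears the simulated fault.
    public Task InitializeAsync(CancellationToken cancellationToken = default)
    {
        Interlocked.Increment(ref _initializeCount);
        _reconnecting = false;
        return Task.CompletedTask;
    }

    public DriverHealth GetHealth() =>
        _reconnecting
            ? new DriverHealth(DriverState.Reconnecting, null, "simulated connection loss")
            : new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
}

public sealed class FakeReconnectDriverFactory
{
    public ConcurrentDictionary<string, FakeReconnectDriver> Created { get; } = new();

    // Assumed entry point; the real factory interface is not shown in the plan.
    public FakeReconnectDriver Create(string driverInstanceId) =>
        Created.GetOrAdd(driverInstanceId, _ => new FakeReconnectDriver());

    public FakeReconnectDriver? TryGetCreated(string driverInstanceId) =>
        Created.TryGetValue(driverInstanceId, out var driver) ? driver : null;
}
```

A `volatile` flag is enough here because the flag is a single boolean with no compound invariants; a lock would be the safer choice if the fake later tracks more than one field of health state.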

**Step 1: Write the test** `Reconnect_DeployedDriver_TransitionsThroughReconnectingBackToHealthy`:
1. `var factory = new FakeReconnectDriverFactory(); await TwoNodeClusterHarness.StartAsync(driverFactory: factory)`.
2. Resolve the DI `IDriverStatusSnapshotStore`; spawn the real `DriverStatusSignalRBridge` over it
with a **capturing** mock `IHubContext<DriverStatusHub>` recording every pushed
`DriverHealthChanged` (reuse the `DriverStatusHubE2eTests` mock pattern, which records every
`SendCoreAsync`). Wait ~2 s for the DPS `SubscribeAck`.
3. Seed `ServerCluster` + `Namespace` + `ClusterNode`(`NodeANodeId`) + one `DriverInstance`
(`"Modbus"`, Enabled, `"{}"`, **no tags**) via `CreateConfigDbContextAsync` (mirror
`MultiClusterScopingTests.SeedTwoClusterConfigAsync` but a single cluster bound to `NodeANodeId`).
4. `StartDeploymentAsync(createdBy: ...)` → assert `Accepted`. Condition-poll (≤20 s) until the
store reports the instance `Healthy` (and `factory.TryGetCreated(instanceId)` is non-null).
5. `factory.Created[instanceId].ReportReconnecting()`: simulate the driver having lost its
connection (the realistic trigger for an operator Reconnect).
6. Dispatch `ReconnectDriver(clusterId, instanceId, "e2e", Guid.NewGuid())` via
`IAdminOperationsClient.AskAsync<ReconnectDriverResult>`; assert `Ok`.
7. Condition-poll the captured push list until it contains a `Reconnecting` entry followed by a
later `Healthy`. Assert: the sequence shows `Reconnecting` → `Healthy`, the store's final
state is `Healthy`, and `InitializeCount >= 2` (proves the command genuinely re-initialised
the deployed driver through the full cluster path, not just a health poke).
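
The condition-poll and the ordering check in step 7 could be factored into small helpers. The sketch below is an assumption about how that might look: `TransitionAssertions`, `ContainsReconnectThenHealthy`, and `PollUntilAsync` are hypothetical names, and states are modelled as plain strings rather than the real `DriverHealthChanged` payload. If the capturing mock appends from another thread, the caller would need to pass a snapshot of the list taken under a lock.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class TransitionAssertions
{
    // True when some "Reconnecting" entry is followed, later in the sequence,
    // by a "Healthy" entry. Earlier "Healthy" entries do not count.
    public static bool ContainsReconnectThenHealthy(IReadOnlyList<string> states)
    {
        var sawReconnecting = false;
        foreach (var state in states)
        {
            if (state == "Reconnecting") sawReconnecting = true;
            else if (sawReconnecting && state == "Healthy") return true;
        }
        return false;
    }

    // Re-evaluates the condition at a fixed interval until it holds or the
    // timeout elapses; the final check covers a condition that became true
    // during the last delay.
    public static async Task<bool> PollUntilAsync(
        Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        var stopwatch = Stopwatch.StartNew();
        while (stopwatch.Elapsed < timeout)
        {
            if (condition()) return true;
            await Task.Delay(interval);
        }
        return condition();
    }
}
```

Checking for order rather than mere presence matters because the list already holds the initial `Healthy` from the deploy; a test that only looked for both values would pass even if the driver never recovered.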

This test runs **without** any Docker fixture (the fake driver is in-process); it is NOT skip-gated.

**Step 2: Run.** `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests --filter "FullyQualifiedName~DriverReconnectE2eTests"`: all green (the new test executes, not skips).