docs(plans): residual-followups-cleanup plan (4 offline items; reconcile stale residuals)

This commit is contained in:
Joseph Doherty
2026-06-14 22:38:42 -04:00
parent cd20c3c064
commit a1d333869e
2 changed files with 131 additions and 0 deletions
@@ -0,0 +1,120 @@
# Residual Follow-ups Cleanup Implementation Plan
> **For Claude:** Execute task-by-task via superpowers-extended-cc:subagent-driven-development.
**Goal:** Close the remaining offline-doable, non-blocking residual follow-ups from `pending.md` sections 13 (write-pipeline / Phase B native alarms / Galaxy driver nits), and reconcile `pending.md` to record the ones already implemented.
**Architecture:** Pure test + doc work. **No production code changes.** Three new/extended unit tests, one doc note, one test-harness de-dup refactor. All under `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/` + `docs/`.
**Tech stack:** xUnit + Shouldly, Akka.TestKit, .NET 10.
**Branch:** `feat/residual-followups-cleanup` off master `cd20c3c0`.
## Reconciliation finding (why this plan is small)
A grounding sweep (3 read-only Explore agents + direct verification) found that **most** of the
pending.md residuals were already implemented in the harden-milestone-1b / write-outcome / `f05b5d79`
commits. Already-done (verified, no work):
- **WP (a)** stale/reconnecting writes fast-fail — `DriverHostActor.cs:661` (`Receive<RouteNodeWrite>`), `DriverInstanceActor.cs:229,316` (`Receive<WriteAttribute>`).
- **WP (b)** `ExecuteSynchronously` already dropped (`DriverHostActor.cs:630` uses `TaskContinuationOptions.None`); `driverIds` forward-map is already a `HashSet` (`DeploymentArtifact.cs:285`).
- **WP (c)** FOCAS address parse already cached via `GetOrAdd` (`FocasDriver.cs:247`); only the unavoidable first-write parse remains.
- **WP (d)** raw-protocol-blob write test already exists (`Primary_routes_write_for_raw_protocol_blob_tag`).
- **WP (e)** `[InlineData(2, false)]` future-enum trap already in `ParseComposition_maps_numeric_AccessLevel_to_Writable`.
- **Galaxy (1)** `_itemHandles` + `_supervisedHandles` already cleared on reconnect — `GatewayGalaxyDataWriter.InvalidateHandleCaches()` clears both, called from `GalaxyDriver.ReopenAsync()`.
- **Galaxy (2)** `SubscriptionEstablished` self-Tell already handled (debug-swallow) in all three `DriverInstanceActor` states (`:253,:301,:336`).
Genuinely open + offline → the four tasks below. (Live `/run` and live-gw-only integration checks are out of scope per the request.)
---
### Task 1: Phase B (a) — regression test: native alarm during Reconnecting is dropped
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 2, Task 3, Task 4 (disjoint files; but execute serially to avoid concurrent-commit races)
**Files:**
- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorNativeAlarmTests.cs`
**Context:** `DriverInstanceActor` in `Connecting`/`Reconnecting` explicitly drops `NativeAlarmRaised` with a debug log (`DriverInstanceActor.cs:262, :345`) so it never dead-letters; only `Connected` forwards it to the parent as `AttributeAlarmPublished` (`:298`). The production behaviour exists; there is no regression test guarding it. The `DetachSubscription` doc-comment (`:499`) already documents the data-change + native-alarm teardown coupling — no doc change needed.
**Steps:**
1. Read the existing test file to reuse its harness (the alarm-source stub driver, the parent `TestProbe`, and the helper that drives the actor into a given state). Note how existing tests (`Native_alarm_is_forwarded_to_parent_as_AttributeAlarmPublished`, `Reconnect_does_not_double_attach_alarm_handler`) construct the actor and force `Reconnecting`.
2. Add a test `Native_alarm_during_reconnect_is_dropped_not_forwarded`: drive the actor into `Reconnecting` (e.g. start it failing init / force-reconnect), `Tell` it a `NativeAlarmRaised`, and assert the parent probe receives **no** `AttributeAlarmPublished` within a short window (`ExpectNoMsg`), and the actor is still alive (no crash / restart).
3. Run: `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests --filter "FullyQualifiedName~DriverInstanceActorNativeAlarm"` — expect green.
4. Commit `test(alarms): guard native-alarm-during-reconnect is dropped not dead-lettered`.
**Acceptance:** new test passes; asserts no forward + actor survives. No production file touched.
---
### Task 2: Phase B (b) — test: OperatorComment flows through ForwardNativeAlarm
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, 3, 4 (serial execution)
**Files:**
- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorNativeAlarmTests.cs`
**Context:** `DriverHostActor.ForwardNativeAlarm` (`DriverHostActor.cs:519`) publishes an `AlarmTransitionEvent` to the alerts topic with `Comment: msg.Args.OperatorComment` and `User: msg.Args.OperatorComment is null ? string.Empty : "device"`. `AlarmEventArgs.OperatorComment` (`IAlarmSource.cs`) and `AlarmTransitionEvent.Comment` (`Messages/Alerts/AlarmTransitionEvent.cs`) both carry it. No existing test asserts the comment propagates.
**Steps:**
1. Read the existing test file to reuse its harness (Primary node setup, alerts-topic `TestProbe`/mediator, the `AttributeAlarmPublished` construction + alarm-node-id registration that existing tests like `Secondary_node_suppresses_alerts_publish_but_still_updates_condition` use).
2. Add a test `Native_alarm_operator_comment_flows_to_transition_event`: construct an `AttributeAlarmPublished` whose `Args.OperatorComment = "investigating"`, route it through the Primary host, and assert the published `AlarmTransitionEvent.Comment == "investigating"` and `.User == "device"`.
3. Run: `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests --filter "FullyQualifiedName~DriverHostActorNativeAlarm"` — expect green.
4. Commit `test(alarms): assert OperatorComment flows through ForwardNativeAlarm`.
**Acceptance:** new test passes asserting `Comment` + `User` propagation. No production file touched.
---
### Task 3: Phase B (c) — doc note on severity-bucket snapping
**Classification:** trivial
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 2, 4 (serial execution)
**Files:**
- Doc: `docs/ScriptedAlarms.md`
**Context:** Authored native-alarm severity (1..1000) seeds the condition at materialise time, but on the **first transition** it snaps to a 4-bucket value via `NativeAlarmProjector.MapSeverity` (Low→200, Medium→500, High→700, Critical→900; default→500). That projected value is then mapped to SDK `EventSeverity` brackets by `OtOpcUaNodeManager.MapSeverity` (`<200`=Low, `<400`=MediumLow, `<600`=Medium, `<800`=MediumHigh, `≥800`=High) for the Part 9 node's `Severity` attribute.
**Steps:**
1. In the "Native driver alarms (equipment-tag path)" section (after the `severity` field description, ~line 143), add a short **Severity mapping** note describing the snap: authored severity seeds the condition; the first transition snaps it to one of 200/500/700/900; that bucket then maps to the SDK `EventSeverity` brackets. Reference `NativeAlarmProjector.MapSeverity`.
2. Commit `docs(alarms): note native-alarm severity-bucket snapping`.
**Acceptance:** doc note is accurate to the two mapping tables; no behaviour change.
---
### Task 4: Galaxy (3) — extract shared stub-driver test harness
**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1, 2, 3 (serial execution)
**Files:**
- Create: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/StubDrivers.cs`
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorTests.cs` (remove the private nested stub classes, ~lines 325462)
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorWriteAndSubscribeTests.cs` (remove the private nested stub classes, ~lines 178312)
**Context:** Both files define their own `StubDriver : IDriver`, `WritableStubDriver : StubDriver, IWritable`, `SubscribableStubDriver : StubDriver, ISubscribable`, and nested `StubHandle : ISubscriptionHandle`. `DriverInstanceActorTests`' copy is a **superset** (adds `ReinitializeCount`, `InitConfigs`, `InitBehavior`, `UnsubscribeYields`, `LastSubscribedRefs`). The de-dup should promote the **superset** versions to shared `internal` classes so both suites compile, then delete both private copies.
**Steps:**
1. Read both test files' stub regions to capture the exact superset (the `DriverInstanceActorTests` versions).
2. Create `StubDrivers.cs` with `internal` `StubDriver`, `WritableStubDriver`, `SubscribableStubDriver`, `StubHandle` = the superset versions (keep all the doc-comments so doc-checker stays clean). Use the same namespace as the two test files.
3. Delete the private nested copies from both `DriverInstanceActorTests.cs` and `DriverInstanceActorWriteAndSubscribeTests.cs`.
4. Run the WHOLE project (both suites share the harness): `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests` — expect green, same test count as before.
5. Commit `test(drivers): extract shared stub-driver harness (de-dup)`.
**Acceptance:** both suites compile + pass against the single shared harness; no test removed; no production file touched.
---
### Task 5: Reconcile pending.md (disk-only, NOT committed)
**Classification:** trivial — performed by the controller, not a subagent.
Update `pending.md` open-item #3 to record the verified already-done follow-ups (with file:line evidence) and mark the four items above closed. `pending.md` is a disk-only working-notes file and is **not staged/committed** (standing hard rule), so this is a working-tree edit only.
@@ -0,0 +1,11 @@
{
"planPath": "docs/plans/2026-06-14-residual-followups-cleanup-plan.md",
"tasks": [
{"id": 1, "subject": "Task 1: Phase B (a) regression test — native alarm during Reconnecting is dropped", "status": "pending"},
{"id": 2, "subject": "Task 2: Phase B (b) test — OperatorComment flows through ForwardNativeAlarm", "status": "pending"},
{"id": 3, "subject": "Task 3: Phase B (c) doc note — severity-bucket snapping (ScriptedAlarms.md)", "status": "pending"},
{"id": 4, "subject": "Task 4: Galaxy (3) — extract shared stub-driver test harness (de-dup)", "status": "pending"},
{"id": 5, "subject": "Task 5: Reconcile pending.md (disk-only, not committed)", "status": "pending"}
],
"lastUpdated": "2026-06-14"
}