# Residual Follow-ups Cleanup Implementation Plan
> **For Claude:** Execute task-by-task via superpowers-extended-cc:subagent-driven-development.
**Goal:** Close the remaining offline-doable, non-blocking residual follow-ups from `pending.md` sections 1-3 (write-pipeline / Phase B native alarms / Galaxy driver nits), and reconcile `pending.md` to record the ones already implemented.
**Architecture:** Pure test + doc work. **No production code changes.** Two new unit tests, one doc note, one test-harness de-dup refactor. All under `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/` + `docs/`.
**Tech stack:** xUnit + Shouldly, Akka.TestKit, .NET 10.
**Branch:** `feat/residual-followups-cleanup` off master `cd20c3c0`.
## Reconciliation finding (why this plan is small)
A grounding sweep (3 read-only Explore agents + direct verification) found that **most** of the
`pending.md` residuals were already implemented in the harden-milestone-1b / write-outcome / `f05b5d79`
commits. Already done (verified, no work needed):
- **WP (a)** stale/reconnecting writes fast-fail — `DriverHostActor.cs:661` (`Receive<RouteNodeWrite>`), `DriverInstanceActor.cs:229,316` (`Receive<WriteAttribute>`).
- **WP (b)** `ExecuteSynchronously` already dropped (`DriverHostActor.cs:630` uses `TaskContinuationOptions.None`); `driverIds` forward-map is already a `HashSet` (`DeploymentArtifact.cs:285`).
- **WP (c)** FOCAS address parse already cached via `GetOrAdd` (`FocasDriver.cs:247`); only the unavoidable first-write parse remains.
- **WP (d)** raw-protocol-blob write test already exists (`Primary_routes_write_for_raw_protocol_blob_tag`).
- **WP (e)** `[InlineData(2, false)]` future-enum trap already in `ParseComposition_maps_numeric_AccessLevel_to_Writable`.
- **Galaxy (1)** `_itemHandles` + `_supervisedHandles` already cleared on reconnect — `GatewayGalaxyDataWriter.InvalidateHandleCaches()` clears both, called from `GalaxyDriver.ReopenAsync()`.
- **Galaxy (2)** `SubscriptionEstablished` self-Tell already handled (debug-swallow) in all three `DriverInstanceActor` states (`:253,:301,:336`).
Genuinely open + offline → the four tasks below. (Live `/run` and live-gw-only integration checks are out of scope per the request.)
---
### Task 1: Phase B (a) — regression test: native alarm during Reconnecting is dropped
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 2, Task 3, Task 4 (disjoint files; but execute serially to avoid concurrent-commit races)
**Files:**
- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorNativeAlarmTests.cs`
**Context:** `DriverInstanceActor` in `Connecting`/`Reconnecting` explicitly drops `NativeAlarmRaised` with a debug log (`DriverInstanceActor.cs:262, :345`) so it never dead-letters; only `Connected` forwards it to the parent as `AttributeAlarmPublished` (`:298`). The production behaviour exists; there is no regression test guarding it. The `DetachSubscription` doc-comment (`:499`) already documents the data-change + native-alarm teardown coupling — no doc change needed.
**Steps:**
1. Read the existing test file to reuse its harness (the alarm-source stub driver, the parent `TestProbe`, and the helper that drives the actor into a given state). Note how existing tests (`Native_alarm_is_forwarded_to_parent_as_AttributeAlarmPublished`, `Reconnect_does_not_double_attach_alarm_handler`) construct the actor and force `Reconnecting`.
2. Add a test `Native_alarm_during_reconnect_is_dropped_not_forwarded`: drive the actor into `Reconnecting` (e.g. start it failing init / force-reconnect), `Tell` it a `NativeAlarmRaised`, and assert the parent probe receives **no** `AttributeAlarmPublished` within a short window (`ExpectNoMsg`), and the actor is still alive (no crash / restart).
3. Run: `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests --filter "FullyQualifiedName~DriverInstanceActorNativeAlarm"` — expect green.
4. Commit `test(alarms): guard native-alarm-during-reconnect is dropped not dead-lettered`.
**Acceptance:** new test passes; asserts no forward + actor survives. No production file touched.
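
The steps above could take roughly the following shape. This is a minimal sketch, assuming the `Akka.TestKit.Xunit2` and `xunit` packages are referenced. The message records, `ForceReconnect`, and `ReconnectingStandInActor` are hypothetical stand-ins defined inline so the sketch is self-contained; the real test should reuse the existing harness and the production `DriverInstanceActor`. Watching the actor and expecting no message covers the "still alive" half only in the sense that a stopped actor would deliver `Terminated`; it would not detect a supervisor restart, so the real test may need an additional check for that.

```csharp
using System;
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Xunit;

// Hypothetical stand-ins: the real messages carry more data than an id.
public sealed record NativeAlarmRaised(string AlarmId);
public sealed record AttributeAlarmPublished(string AlarmId);
public sealed record ForceReconnect;

// Hypothetical stand-in for DriverInstanceActor: forwards alarms while
// connected, silently drops them while reconnecting.
public sealed class ReconnectingStandInActor : ReceiveActor
{
    private readonly IActorRef _parent;

    public ReconnectingStandInActor(IActorRef parent)
    {
        _parent = parent;
        Connected();
    }

    private void Connected()
    {
        Receive<NativeAlarmRaised>(m => _parent.Tell(new AttributeAlarmPublished(m.AlarmId)));
        Receive<ForceReconnect>(_ => Become(Reconnecting));
    }

    private void Reconnecting()
    {
        // Handled explicitly so the message never reaches dead letters.
        Receive<NativeAlarmRaised>(_ => { });
    }
}

public class NativeAlarmDropSketchTests : TestKit
{
    [Fact]
    public void Native_alarm_during_reconnect_is_dropped_not_forwarded()
    {
        var parent = CreateTestProbe();
        var actor = Sys.ActorOf(Props.Create(() => new ReconnectingStandInActor(parent.Ref)));
        Watch(actor); // a stopped actor would send Terminated to the test actor

        actor.Tell(new ForceReconnect());
        actor.Tell(new NativeAlarmRaised("A1"));

        // No forward to the parent within a short window.
        parent.ExpectNoMsg(TimeSpan.FromMilliseconds(300));
        // No Terminated either, so the actor has not stopped.
        ExpectNoMsg(TimeSpan.FromMilliseconds(100));
    }
}
```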
---
### Task 2: Phase B (b) — test: OperatorComment flows through ForwardNativeAlarm
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, 3, 4 (serial execution)
**Files:**
- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorNativeAlarmTests.cs`
**Context:** `DriverHostActor.ForwardNativeAlarm` (`DriverHostActor.cs:519`) publishes an `AlarmTransitionEvent` to the alerts topic with `Comment: msg.Args.OperatorComment` and `User: msg.Args.OperatorComment is null ? string.Empty : "device"`. `AlarmEventArgs.OperatorComment` (`IAlarmSource.cs`) and `AlarmTransitionEvent.Comment` (`Messages/Alerts/AlarmTransitionEvent.cs`) both carry it. No existing test asserts the comment propagates.
**Steps:**
1. Read the existing test file to reuse its harness (Primary node setup, alerts-topic `TestProbe`/mediator, the `AttributeAlarmPublished` construction + alarm-node-id registration that existing tests like `Secondary_node_suppresses_alerts_publish_but_still_updates_condition` use).
2. Add a test `Native_alarm_operator_comment_flows_to_transition_event`: construct an `AttributeAlarmPublished` whose `Args.OperatorComment = "investigating"`, route it through the Primary host, and assert the published `AlarmTransitionEvent.Comment == "investigating"` and `.User == "device"`.
3. Run: `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests --filter "FullyQualifiedName~DriverHostActorNativeAlarm"` — expect green.
4. Commit `test(alarms): assert OperatorComment flows through ForwardNativeAlarm`.
**Acceptance:** new test passes asserting `Comment` + `User` propagation. No production file touched.
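
The propagation rule quoted in the Context can be isolated as a small pure function, which makes the expected assertion values easy to see. This is a sketch only: `AlarmArgsSketch`, `TransitionSketch`, and `ForwardSketch.Project` are hypothetical, trimmed-down stand-ins for `AlarmEventArgs`, `AlarmTransitionEvent`, and the relevant part of `ForwardNativeAlarm`; the real method also publishes to the alerts topic, which is omitted here. With a comment of `"investigating"` the rule yields `Comment == "investigating"` and `User == "device"`; with a null comment it yields an empty `User`.

```csharp
#nullable enable

// Hypothetical stand-ins carrying only the fields this task cares about.
public sealed record AlarmArgsSketch(string? OperatorComment);
public sealed record TransitionSketch(string? Comment, string User);

public static class ForwardSketch
{
    // Mirrors the rule quoted in the Context: the comment passes through
    // unchanged, and User is "device" only when a comment is present.
    public static TransitionSketch Project(AlarmArgsSketch args) =>
        new(Comment: args.OperatorComment,
            User: args.OperatorComment is null ? string.Empty : "device");
}
```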
---
### Task 3: Phase B (c) — doc note on severity-bucket snapping
**Classification:** trivial
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 2, 4 (serial execution)
**Files:**
- Doc: `docs/ScriptedAlarms.md`
**Context:** Authored native-alarm severity (1..1000) seeds the condition at materialise time, but on the **first transition** it snaps to a 4-bucket value via `NativeAlarmProjector.MapSeverity` (Low→200, Medium→500, High→700, Critical→900; default→500). That projected value is then mapped to SDK `EventSeverity` brackets by `OtOpcUaNodeManager.MapSeverity` (`<200`=Low, `<400`=MediumLow, `<600`=Medium, `<800`=MediumHigh, `≥800`=High) for the Part 9 node's `Severity` attribute.
**Steps:**
1. In the "Native driver alarms (equipment-tag path)" section (after the `severity` field description, ~line 143), add a short **Severity mapping** note describing the snap: authored severity seeds the condition; the first transition snaps it to one of 200/500/700/900; that bucket then maps to the SDK `EventSeverity` brackets. Reference `NativeAlarmProjector.MapSeverity`.
2. Commit `docs(alarms): note native-alarm severity-bucket snapping`.
**Acceptance:** doc note is accurate to the two mapping tables; no behaviour change.
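
The two-stage mapping described in the Context can be sketched as below, which may help when checking the doc note against the tables. This is an illustration, not the production code: the `NativeSeverity` and `SdkSeverityBracket` enums and the `SeveritySketch` class are hypothetical names, and only the numeric values and thresholds are taken from this plan. If the tables are exactly as stated, the Low bucket (200) lands in the MediumLow bracket, because the lowest bracket is strictly below 200; that may be worth confirming against the real code before writing the note.

```csharp
// Hypothetical stand-ins: the real input and output types are not shown in this plan.
public enum NativeSeverity { Low, Medium, High, Critical }
public enum SdkSeverityBracket { Low, MediumLow, Medium, MediumHigh, High }

public static class SeveritySketch
{
    // Stage 1: snap to one of four bucket values on the first transition.
    // Unrecognised values fall back to 500, per the Context.
    public static int SnapToBucket(NativeSeverity severity) => severity switch
    {
        NativeSeverity.Low => 200,
        NativeSeverity.Medium => 500,
        NativeSeverity.High => 700,
        NativeSeverity.Critical => 900,
        _ => 500,
    };

    // Stage 2: map a numeric severity to a bracket using the thresholds
    // listed in the Context (<200, <400, <600, <800, otherwise High).
    public static SdkSeverityBracket ToBracket(int severity) => severity switch
    {
        < 200 => SdkSeverityBracket.Low,
        < 400 => SdkSeverityBracket.MediumLow,
        < 600 => SdkSeverityBracket.Medium,
        < 800 => SdkSeverityBracket.MediumHigh,
        _ => SdkSeverityBracket.High,
    };
}
```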
---
### Task 4: Galaxy (3) — extract shared stub-driver test harness
**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1, 2, 3 (serial execution)
**Files:**
- Create: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/StubDrivers.cs`
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorTests.cs` (remove the private nested stub classes, ~lines 325-462)
- Modify: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorWriteAndSubscribeTests.cs` (remove the private nested stub classes, ~lines 178-312)
**Context:** Both files define their own `StubDriver : IDriver`, `WritableStubDriver : StubDriver, IWritable`, `SubscribableStubDriver : StubDriver, ISubscribable`, and nested `StubHandle : ISubscriptionHandle`. `DriverInstanceActorTests`' copy is a **superset** (adds `ReinitializeCount`, `InitConfigs`, `InitBehavior`, `UnsubscribeYields`, `LastSubscribedRefs`). The de-dup should promote the **superset** versions to shared `internal` classes so both suites compile, then delete both private copies.
**Steps:**
1. Read both test files' stub regions to capture the exact superset (the `DriverInstanceActorTests` versions).
2. Create `StubDrivers.cs` with `internal` `StubDriver`, `WritableStubDriver`, `SubscribableStubDriver`, `StubHandle` = the superset versions (keep all the doc-comments so doc-checker stays clean). Use the same namespace as the two test files.
3. Delete the private nested copies from both `DriverInstanceActorTests.cs` and `DriverInstanceActorWriteAndSubscribeTests.cs`.
4. Run the WHOLE project (both suites share the harness): `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests` — expect green, same test count as before.
5. Commit `test(drivers): extract shared stub-driver harness (de-dup)`.
**Acceptance:** both suites compile + pass against the single shared harness; no test removed; no production file touched.
---
### Task 5: Reconcile pending.md (disk-only, NOT committed)
**Classification:** trivial — performed by the controller, not a subagent.
Update `pending.md` open-item #3 to record the verified already-done follow-ups (with file:line evidence) and mark the four items above closed. `pending.md` is a disk-only working-notes file and is **not staged/committed** (standing hard rule), so this is a working-tree edit only.