Files
lmxopcua/docs/plans/2026-06-14-residual-followups-cleanup-plan.md
T

9.3 KiB
Raw Blame History

Residual Follow-ups Cleanup Implementation Plan

For Claude: Execute task-by-task via superpowers-extended-cc:subagent-driven-development.

Goal: Close the remaining offline-doable, non-blocking residual follow-ups from pending.md sections 13 (write-pipeline / Phase B native alarms / Galaxy driver nits), and reconcile pending.md to record the ones already implemented.

Architecture: Pure test + doc work. No production code changes. Three new/extended unit tests, one doc note, one test-harness de-dup refactor. All under tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ + docs/.

Tech stack: xUnit + Shouldly, Akka.TestKit, .NET 10.

Branch: feat/residual-followups-cleanup off master cd20c3c0.

Reconciliation finding (why this plan is small)

A grounding sweep (3 read-only Explore agents + direct verification) found that most of the pending.md residuals were already implemented in the harden-milestone-1b / write-outcome / f05b5d79 commits. Already-done (verified, no work):

  • WP (a) stale/reconnecting writes fast-fail — DriverHostActor.cs:661 (Receive<RouteNodeWrite>), DriverInstanceActor.cs:229,316 (Receive<WriteAttribute>).
  • WP (b) ExecuteSynchronously already dropped (DriverHostActor.cs:630 uses TaskContinuationOptions.None); driverIds forward-map is already a HashSet (DeploymentArtifact.cs:285).
  • WP (c) FOCAS address parse already cached via GetOrAdd (FocasDriver.cs:247); only the unavoidable first-write parse remains.
  • WP (d) raw-protocol-blob write test already exists (Primary_routes_write_for_raw_protocol_blob_tag).
  • WP (e) [InlineData(2, false)] future-enum trap already in ParseComposition_maps_numeric_AccessLevel_to_Writable.
  • Galaxy (1) _itemHandles + _supervisedHandles already cleared on reconnect — GatewayGalaxyDataWriter.InvalidateHandleCaches() clears both, called from GalaxyDriver.ReopenAsync().
  • Galaxy (2) SubscriptionEstablished self-Tell already handled (debug-swallow) in all three DriverInstanceActor states (:253,:301,:336).

Genuinely open + offline → the four tasks below. (Live /run and live-gw-only integration checks are out of scope per the request.)


Task 1: Phase B (a) — regression test: native alarm during Reconnecting is dropped

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 2, Task 3, Task 4 (disjoint files; but execute serially to avoid concurrent-commit races)

Files:

  • Test: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorNativeAlarmTests.cs

Context: DriverInstanceActor in Connecting/Reconnecting explicitly drops NativeAlarmRaised with a debug log (DriverInstanceActor.cs:262, :345) so it never dead-letters; only Connected forwards it to the parent as AttributeAlarmPublished (:298). The production behaviour exists; there is no regression test guarding it. The DetachSubscription doc-comment (:499) already documents the data-change + native-alarm teardown coupling — no doc change needed.

Steps:

  1. Read the existing test file to reuse its harness (the alarm-source stub driver, the parent TestProbe, and the helper that drives the actor into a given state). Note how existing tests (Native_alarm_is_forwarded_to_parent_as_AttributeAlarmPublished, Reconnect_does_not_double_attach_alarm_handler) construct the actor and force Reconnecting.
  2. Add a test Native_alarm_during_reconnect_is_dropped_not_forwarded: drive the actor into Reconnecting (e.g. start it failing init / force-reconnect), Tell it a NativeAlarmRaised, and assert the parent probe receives no AttributeAlarmPublished within a short window (ExpectNoMsg), and the actor is still alive (no crash / restart).
  3. Run: dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests --filter "FullyQualifiedName~DriverInstanceActorNativeAlarm" — expect green.
  4. Commit test(alarms): guard native-alarm-during-reconnect is dropped not dead-lettered.

Acceptance: new test passes; asserts no forward + actor survives. No production file touched.


Task 2: Phase B (b) — test: OperatorComment flows through ForwardNativeAlarm

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 1, 3, 4 (serial execution)

Files:

  • Test: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverHostActorNativeAlarmTests.cs

Context: DriverHostActor.ForwardNativeAlarm (DriverHostActor.cs:519) publishes an AlarmTransitionEvent to the alerts topic with Comment: msg.Args.OperatorComment and User: msg.Args.OperatorComment is null ? string.Empty : "device". AlarmEventArgs.OperatorComment (IAlarmSource.cs) and AlarmTransitionEvent.Comment (Messages/Alerts/AlarmTransitionEvent.cs) both carry it. No existing test asserts the comment propagates.

Steps:

  1. Read the existing test file to reuse its harness (Primary node setup, alerts-topic TestProbe/mediator, the AttributeAlarmPublished construction + alarm-node-id registration that existing tests like Secondary_node_suppresses_alerts_publish_but_still_updates_condition use).
  2. Add a test Native_alarm_operator_comment_flows_to_transition_event: construct an AttributeAlarmPublished whose Args.OperatorComment = "investigating", route it through the Primary host, and assert the published AlarmTransitionEvent.Comment == "investigating" and .User == "device".
  3. Run: dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests --filter "FullyQualifiedName~DriverHostActorNativeAlarm" — expect green.
  4. Commit test(alarms): assert OperatorComment flows through ForwardNativeAlarm.

Acceptance: new test passes asserting Comment + User propagation. No production file touched.


Task 3: Phase B (c) — doc note on severity-bucket snapping

Classification: trivial Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 4 (serial execution)

Files:

  • Doc: docs/ScriptedAlarms.md

Context: Authored native-alarm severity (1..1000) seeds the condition at materialise time, but on the first transition it snaps to a 4-bucket value via NativeAlarmProjector.MapSeverity (Low→200, Medium→500, High→700, Critical→900; default→500). That projected value is then mapped to SDK EventSeverity brackets by OtOpcUaNodeManager.MapSeverity (<200=Low, <400=MediumLow, <600=Medium, <800=MediumHigh, ≥800=High) for the Part 9 node's Severity attribute.

Steps:

  1. In the "Native driver alarms (equipment-tag path)" section (after the severity field description, ~line 143), add a short Severity mapping note describing the snap: authored severity seeds the condition; the first transition snaps it to one of 200/500/700/900; that bucket then maps to the SDK EventSeverity brackets. Reference NativeAlarmProjector.MapSeverity.
  2. Commit docs(alarms): note native-alarm severity-bucket snapping.

Acceptance: doc note is accurate to the two mapping tables; no behaviour change.


Task 4: Galaxy (3) — extract shared stub-driver test harness

Classification: small Estimated implement time: ~5 min Parallelizable with: Task 1, 2, 3 (serial execution)

Files:

  • Create: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/StubDrivers.cs
  • Modify: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorTests.cs (remove the private nested stub classes, ~lines 325462)
  • Modify: tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Drivers/DriverInstanceActorWriteAndSubscribeTests.cs (remove the private nested stub classes, ~lines 178312)

Context: Both files define their own StubDriver : IDriver, WritableStubDriver : StubDriver, IWritable, SubscribableStubDriver : StubDriver, ISubscribable, and nested StubHandle : ISubscriptionHandle. DriverInstanceActorTests' copy is a superset (adds ReinitializeCount, InitConfigs, InitBehavior, UnsubscribeYields, LastSubscribedRefs). The de-dup should promote the superset versions to shared internal classes so both suites compile, then delete both private copies.

Steps:

  1. Read both test files' stub regions to capture the exact superset (the DriverInstanceActorTests versions).
  2. Create StubDrivers.cs with internal StubDriver, WritableStubDriver, SubscribableStubDriver, StubHandle = the superset versions (keep all the doc-comments so doc-checker stays clean). Use the same namespace as the two test files.
  3. Delete the private nested copies from both DriverInstanceActorTests.cs and DriverInstanceActorWriteAndSubscribeTests.cs.
  4. Run the WHOLE project (both suites share the harness): dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests — expect green, same test count as before.
  5. Commit test(drivers): extract shared stub-driver harness (de-dup).

Acceptance: both suites compile + pass against the single shared harness; no test removed; no production file touched.


Task 5: Reconcile pending.md (disk-only, NOT committed)

Classification: trivial — performed by the controller, not a subagent.

Update pending.md open-item #3 to record the verified already-done follow-ups (with file:line evidence) and mark the four items above closed. pending.md is a disk-only working-notes file and is not staged/committed (standing hard rule), so this is a working-tree edit only.