lmxopcua/docs/plans/2026-06-19-followups-batch.md
Joseph Doherty ad359c5cd3
docs(plan): design + implementation plan + tasklist for non-arch follow-ups batch (A/B/C)
2026-06-19 01:19:37 -04:00


Non-architectural follow-ups batch — Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development to execute this plan task-by-task. Each task is self-contained; honor its Classification for the review chain.

Goal: Close the actionable non-architectural follow-ups (A/B groups), and capture the operational/verify and blocked items (C) so nothing is lost.

Design: docs/plans/2026-06-19-followups-batch-design.md
Base: master f57aa8fa.
Branch (at execution): feat/followups-batch (off master).

Standing guardrails: no EF migration, no Commons/proto/wire change, no bUnit; stage by explicit path; never stage sql_login.txt, Host/pki/, docker-dev/docker-compose.yml, pending.md, current.md, or stillpending.md; no --no-verify or force-push; use dangerouslyDisableSandbox for build/test/rig. Finishing a batch means ff-merge to master, then push.

Recommended execution order / waves (disjoint files → concurrent):

  • Wave 1 (code, concurrent): T1 (OpcUaClient) ∥ T2 (Client.CLI) ∥ T3 (cert-audit AdminUI) ∥ T4 (Galaxy modal) ∥ T5 (vtag modal) — all disjoint projects/files.
  • Wave 2 (code): T6 (write-outcome, OpcUaServer/Runtime) — its own.
  • Gates (do NOT build without explicit go-ahead): T7, T8 (reconsider).
  • Operator/rig: T9, T10 (verify). Blocked: T11.
  • Each wave: per-task review by classification + a final integration review, then merge+push.

Task 1 (A1): OpcUaClient history session-capture-before-gate race

Classification: standard · Parallelizable with: T2,T3,T4,T5,T6
Files: src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient/OpcUaClientDriver.cs · tests in tests/Drivers/.../OpcUaClient*Tests
Steps:

  1. Audit every var session = RequireSession() that precedes await _gate.WaitAsync (known sites: 1134, 1299, 1413, 1618 ExecuteHistoryReadAsync, 1788). Compare each to the correct idiom at :622-628: a _ = RequireSession() fast-fail guard outside the gate, then a re-read of the session inside the gate.
  2. Write a failing regression test: acquire _gate, swap Session (simulate OnReconnectComplete), release; assert the method under test uses the NEW session, not the captured one. (Use the existing OpcUaClient test harness; if session-swap isn't fakeable, assert via the Gate internal + a seam.)
  3. Refactor each site to re-resolve inside the gate (keep the outside guard). Run the driver unit suite green; dotnet build the driver.
  4. Commit fix(opcuaclient): re-resolve session inside _gate in history/read paths (stale-session race).

Task 2 (A2): Client.CLI enable/disable command (H4 client path)

Classification: standard · Parallelizable with: T1,T3,T4,T5,T6
Files: src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI/ (Program.cs + the ack/shelve/confirm command template; explore for the actual command structure) · the client-side IOpcUaClientService (+ impl) · CLI/Client.Shared tests
Steps:

  1. Find the existing ack/shelve/confirm CLI command + the IOpcUaClientService.{Acknowledge,Shelve,Confirm}AlarmAsync they call (template). Confirm whether Enable/Disable already exist on the service (grep) — if not, add EnableAsync(nodeId)/DisableAsync(nodeId) that call the OPC UA ConditionType Enable/Disable methods (mirror the ack call shape). Client app interface only — NOT Commons/wire.
  2. Add CLI enable/disable commands mirroring ack (node-id arg, connect, call, print status).
  3. Unit-test the service/VM call + the command wiring. Build + driver/client tests green.
  4. Live (later): drive enable/disable against the rig's scripted condition node → AlarmAck-gated → engine Enable/DisableAsync (closes the deferred H4 live /run).
  5. Commit feat(cli): add enable/disable condition commands (H4 client path).

Task 3 (A3): Cert-audit minor review nits

Classification: trivial · Parallelizable with: T1,T2,T4,T5,T6
Files: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor · src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Certificates/CertificateStoreManager.cs
Steps:

  1. (a) The two unreachable ConfirmAction fallthrough arms ("cannot delete from {Kind}", "unknown action"): either add an explicit comment reading "// unreachable defensive guard — buttons only render for Trusted/Rejected + 3 literal verbs" (simplest), or route them through the manager so they audit. Pick the comment unless routing is trivial.
  2. (b) Expose a PkiRoot property on CertificateStoreManager; have Certificates.razor:130 read it instead of re-reading OpcUa:PkiStoreRoot independently.
  3. Build AdminUI (0 errors); existing AdminUI.Tests green.
  4. Commit refactor(adminui): tidy cert-audit review nits (fallthrough comment + single PkiStoreRoot read).

Task 4 (B2): AdminUI — Galaxy re-pick preserves prior alarm-field edits

Classification: small · Parallelizable with: T1,T2,T3,T5,T6
Files: the Galaxy-address-picked handler on the equipment Tag modal (explore: Components/Shared/Uns/TagModal.razor + the Galaxy picker callback OnGalaxyAddressPicked/similar) · a pure merge helper + its unit test
Steps:

  1. Reproduce: re-picking a Galaxy address resets manually-edited alarm fields. Find the picked-handler that overwrites the config.
  2. Extract/extend a pure merge that applies picked defaults WITHOUT clobbering already-edited alarm fields (preserve-existing idiom); unit-test the merge.
  3. Wire it into the handler. Build; AdminUI.Tests green. Live-verify on docker-dev (re-pick keeps edits).
  4. Commit fix(adminui): preserve edited alarm fields on Galaxy address re-pick.
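
The preserve-existing merge from step 2 might look roughly like the following sketch. It is Python rather than the AdminUI's C#, and it assumes the modal can report which fields the user has edited; merge_picked_defaults, the edited set, and the field names (address, alarmPriority, alarmMessage) are all hypothetical and only illustrate the shape of a pure, unit-testable helper.

```python
def merge_picked_defaults(current, picked, edited):
    """Return a new config: picked defaults applied, edited fields preserved.

    current -- the tag config before the re-pick (dict, not mutated)
    picked  -- defaults derived from the newly picked Galaxy address (dict)
    edited  -- names of fields the user changed by hand (assumed to be tracked
               by the modal; this tracking is an assumption of the sketch)
    """
    merged = dict(current)                    # copy so the helper stays pure
    for key, value in picked.items():
        if key in edited and key in current:
            continue                          # keep the manual edit
        merged[key] = value                   # otherwise take the picked default
    return merged


# Example: the user edited alarmPriority, then re-picked a different address.
before = {"address": "Tank1.Level", "alarmPriority": 900, "alarmMessage": "Level high"}
defaults = {"address": "Tank2.Level", "alarmPriority": 500, "alarmMessage": "Tank2 level"}
after = merge_picked_defaults(before, defaults, edited={"alarmPriority"})
```

Here after keeps alarmPriority at 900 while address and alarmMessage follow the new pick, and before is left unchanged, which is what makes the helper easy to unit-test in isolation from the modal.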

Task 5 (B3): AdminUI — inline-create-script dropdown label drift

Classification: small · Parallelizable with: T1,T2,T3,T4,T6
Files: VirtualTagModal + its inline create-script handler (explore) · test if a pure binding helper exists
Steps:

  1. Reproduce the label drift after "New script" inline-creates + binds (SC-…).
  2. Refresh the bound-script label/selection from the created id after creation. Build; tests green; live-verify.
  3. Commit fix(adminui): refresh script dropdown label after inline create.

Task 6 (B1): Write-outcome residuals (Bad-quality blip + AuditWriteUpdateEvent + sync fail-fast)

Classification: standard · Parallelizable with: T1-T5
Files: node-manager write path (OtOpcUaNodeManager OnWriteValue / the IOpcUaNodeWriteGateway outcome continuation, i.e. the write-outcome self-correction site, master 1d797c1c) · Runtime gateway · tests
Steps:

  1. Locate the failed-write revert continuation. Add behind the existing failure branch: (i) a brief Bad-quality status blip on the node before/with the revert; (ii) raise an OPC UA AuditWriteUpdateEvent; (iii) synchronous structural fail-fast for pre-dispatch-rejectable writes.
  2. TDD each sub-behaviour (protocol-driver path only — Galaxy is fire-and-forget). Use the modbus exception-injector recipe for live proof (FC06 reject).
  3. If any sub-part balloons >~300 LOC, split it out. Build; OpcUaServer + Runtime tests green.
  4. Commit feat(opcua): emit Bad blip + AuditWriteUpdateEvent + sync fail-fast on failed device write.
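
The intended ordering of the three sub-behaviours in step 1 can be sketched as below. This is a simplified, synchronous Python model, not the node manager's code: Node, write, device_write and the audit_log tuple are hypothetical, the optimistic apply before dispatch is an assumption about how the existing revert works, and the integer check merely stands in for whatever structural validation is rejectable before dispatch.

```python
from dataclasses import dataclass, field

GOOD, BAD = "Good", "Bad"


@dataclass
class Node:
    value: int
    quality: str = GOOD
    history: list = field(default_factory=list)  # (value, quality) as seen by clients

    def publish(self, value, quality):
        self.value, self.quality = value, quality
        self.history.append((value, quality))


def write(node, new_value, device_write, audit_log):
    """Return True when the device accepted the write, False otherwise."""
    # (iii) Structural fail-fast: reject before dispatch, node left untouched.
    if isinstance(new_value, bool) or not isinstance(new_value, int):
        return False

    previous = node.value
    node.publish(new_value, GOOD)            # optimistic apply (assumed existing behaviour)
    if device_write(new_value):
        return True

    # Existing failure branch, with the additions:
    node.publish(new_value, BAD)             # (i) brief Bad-quality blip
    audit_log.append(("AuditWriteUpdateEvent", previous, new_value))  # (ii) audit record
    node.publish(previous, GOOD)             # revert to the last known value
    return False
```

With a device_write that always returns False, a write of 42 onto a node holding 10 leaves the node at 10 with Good quality, after clients have observed the attempted value go Bad, and exactly one audit entry is recorded. A structurally invalid value returns False without publishing anything.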

Task 7 (B4): F10b surgical DataType/IsArray in-place writes — RECONSIDER GATE

Classification: standard · Do NOT build without an explicit fresh go-ahead (previously decided against as dirty — brief value-type mismatch, no ModelChangeEvents, rare edits). If approved: extend ISurgicalAddressSpaceSink.UpdateTagAttributes to swap DataType/ValueRank in place + emit ModelChangeEvents; widen Phase7Applier.TagDeltaIsSurgicalEligible; live-/run the rebuild=False path (the prod-inertness trap, see the F10b deferred-wrapper lesson). Until approved this stays a deferred record.

Task 8 (B5): Alarm-severity SetSeverity surgical update — RECONSIDER GATE

Classification: small · Do NOT build without an explicit fresh go-ahead (operationally invisible — the alarm engine overwrites authored severity on first eval). Recorded so the decision isn't a silent gap.

Task 9 (C1): Modbus-Int64 full live authoring — VERIFY-ONLY (operator/rig)

Classification: verify · Seed a Modbus driver on docker-dev → sim 10.100.0.35:5020, author an Int64 equipment tag, deploy, confirm the OPC UA node advertises DataTypeIds.Int64 + reads changing. No code unless a gap surfaces.

Task 10 (C2): S7 + AbCip Test-Connect probe happy-path — VERIFY-ONLY (needs Windows-VM fixtures)

Classification: verify · lmxopcua-fix up s7 s7_1500 / up abcip controllogix from the Windows VM, then run the skip-gated probe E2E green path (DriverProbeHandshakeE2eTests).

Task 11 (C3): Device-gated proofs — BLOCKED (hardware)

Classification: blocked · H6 native-ack→AVEVA, Galaxy Phase C historian T7, Phase B T9, AbLegacy/TwinCAT/FOCAS probe happy-paths — need Wonderware+AVEVA (10.100.0.48), a Galaxy native alarm, PLC5/SLC sim, ADS target, CNC+FWLIB. Captured; not executable here.