# Non-architectural follow-ups batch — Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development to execute this plan task-by-task. Each task is self-contained; honor its Classification for the review chain.

**Goal:** Close the actionable non-architectural follow-ups (A/B groups), and capture the
operational/verify and blocked items (C) so nothing is lost.

**Design:** `docs/plans/2026-06-19-followups-batch-design.md`
**Base:** master `f57aa8fa`. **Branch (at execution):** `feat/followups-batch` (off master).

**Standing guardrails:** no EF migration, no Commons/proto/wire change, no bUnit; stage by explicit
path; never stage `sql_login.txt`/`Host/pki/`/`docker-dev/docker-compose.yml`/`pending.md`/
`current.md`/`stillpending.md`; no `--no-verify`/force-push; `dangerouslyDisableSandbox` for
build/test/rig. Finish a batch = ff-merge to master + push.

**Recommended execution order / waves** (disjoint files → concurrent):
- **Wave 1 (code, concurrent):** T1 (OpcUaClient) ∥ T2 (Client.CLI) ∥ T3 (cert-audit AdminUI) ∥ T4 (Galaxy modal) ∥ T5 (vtag modal) — all disjoint projects/files.
- **Wave 2 (code):** T6 (write-outcome, OpcUaServer/Runtime), run on its own.
- **Gates (do NOT build without explicit go-ahead):** T7, T8 (reconsider).
- **Operator/rig:** T9, T10 (verify). **Blocked:** T11.
- Each wave: per-task review by classification + a final integration review, then merge+push.

---
### Task 1 (A1): OpcUaClient history session-capture-before-gate race
**Classification:** standard · **Parallelizable with:** T2,T3,T4,T5,T6
**Files:** `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient/OpcUaClientDriver.cs` · tests in `tests/Drivers/.../OpcUaClient*Tests`
**Steps:**
1. Audit every `var session = RequireSession()` that precedes `await _gate.WaitAsync` (known sites: 1134, 1299, 1413, **1618 `ExecuteHistoryReadAsync`**, 1788). Compare each to the correct idiom at `:622-628`: an outside `_ = RequireSession()` fast-fail guard, followed by a re-read of the session *inside* the gate.
2. Write a failing regression test: acquire `_gate`, start the method under test so that it queues on the gate, swap `Session` (simulate `OnReconnectComplete`), release; assert the method under test uses the NEW session, not the captured one. (Use the existing OpcUaClient test harness; if session-swap isn't fakeable, assert via the `Gate` internal + a seam.)
3. Refactor each site to re-resolve inside the gate (keep the outside guard). Run the driver unit suite green; `dotnet build` the driver.
4. Commit `fix(opcuaclient): re-resolve session inside _gate in history/read paths (stale-session race)`.
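The two shapes contrasted in Task 1 can be illustrated with a minimal, self-contained C# sketch. It does not use the real driver or the OPC UA SDK: `FakeSession`, `GatedClientSketch`, `SwapSession`, `ReadCapturedBeforeGateAsync` and `ReadResolvedInsideGateAsync` are hypothetical names, and only `_gate`, `Gate` and `RequireSession` echo identifiers from the steps above. The plan describes `Gate` as internal; it is public here only so the sketch stands alone.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an OPC UA session; only its identity matters here.
public sealed class FakeSession
{
    public FakeSession(string name) { Name = name; }
    public string Name { get; }
}

// Hypothetical miniature of the driver, reduced to the gate and the session field.
public sealed class GatedClientSketch
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private volatile FakeSession? _session;

    public GatedClientSketch(FakeSession? initial) { _session = initial; }

    // Test seam: lets a test hold the gate while it swaps the session.
    public SemaphoreSlim Gate => _gate;

    // Simulates what a reconnect handler does: replace the live session.
    public void SwapSession(FakeSession? next) { _session = next; }

    private FakeSession RequireSession()
    {
        return _session ?? throw new InvalidOperationException("Not connected.");
    }

    // Racy shape: the session is captured before the wait, so a swap made
    // while this call is queued on the gate goes unnoticed.
    public async Task<string> ReadCapturedBeforeGateAsync(CancellationToken ct = default)
    {
        var session = RequireSession();
        await _gate.WaitAsync(ct).ConfigureAwait(false);
        try { return session.Name; }
        finally { _gate.Release(); }
    }

    // Intended shape: keep the outside call as a fast-fail guard only,
    // then resolve the session again once the gate is held.
    public async Task<string> ReadResolvedInsideGateAsync(CancellationToken ct = default)
    {
        _ = RequireSession();
        await _gate.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            var session = RequireSession();
            return session.Name;
        }
        finally { _gate.Release(); }
    }
}
```

A regression test in the spirit of step 2 could hold `Gate`, start both methods, call `SwapSession`, and then release. Under these assumptions the first method still reports the old session while the second reports the new one.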
### Task 2 (A2): Client.CLI `enable`/`disable` command (H4 client path)
**Classification:** standard · **Parallelizable with:** T1,T3,T4,T5,T6
**Files:** `src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI/` (Program.cs + the ack/shelve/confirm command template — explore for the actual command structure) · the client-side `IOpcUaClientService` (+ impl) · CLI/Client.Shared tests
**Steps:**
1. Find the existing `ack`/`shelve`/`confirm` CLI command + the `IOpcUaClientService.{Acknowledge,Shelve,Confirm}AlarmAsync` they call (template). Confirm whether `Enable`/`Disable` already exist on the service (grep) — if not, add `EnableAsync(nodeId)`/`DisableAsync(nodeId)` that call the OPC UA ConditionType Enable/Disable methods (mirror the ack call shape). **Client app interface only — NOT Commons/wire.**
2. Add CLI `enable`/`disable` commands mirroring `ack` (node-id arg, connect, call, print status).
3. Unit-test the service/VM call + the command wiring. Build + driver/client tests green.
4. Live (later): drive `enable`/`disable` against the rig's scripted condition node → AlarmAck-gated → engine Enable/DisableAsync (closes the deferred H4 live `/run`).
5. Commit `feat(cli): add enable/disable condition commands (H4 client path)`.
### Task 3 (A3): Cert-audit minor review nits
**Classification:** trivial · **Parallelizable with:** T1,T2,T4,T5,T6
**Files:** `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor` · `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Certificates/CertificateStoreManager.cs`
**Steps:**
1. (a) The two unreachable `ConfirmAction` fallthrough arms (`"cannot delete from {Kind}"`, `"unknown action"`): add an explicit `// unreachable defensive guard — buttons only render for Trusted/Rejected + 3 literal verbs` comment (simplest), OR route through the manager so they audit. Pick the comment unless trivial to route.
2. (b) Expose a `PkiRoot` property on `CertificateStoreManager`; have `Certificates.razor:130` read it instead of re-reading `OpcUa:PkiStoreRoot` independently.
3. Build AdminUI (0 errors); existing AdminUI.Tests green.
4. Commit `refactor(adminui): tidy cert-audit review nits (fallthrough comment + single PkiStoreRoot read)`.
### Task 4 (B2): AdminUI — Galaxy re-pick preserves prior alarm-field edits
**Classification:** small · **Parallelizable with:** T1,T2,T3,T5,T6
**Files:** the Galaxy-address-picked handler on the equipment Tag modal (explore: `Components/Shared/Uns/TagModal.razor` + the Galaxy picker callback `OnGalaxyAddressPicked`/similar) · a pure merge helper + its unit test
**Steps:**
1. Reproduce: re-picking a Galaxy address resets manually-edited `alarm` fields. Find the picked-handler that overwrites the config.
2. Extract/extend a pure merge that applies picked defaults WITHOUT clobbering already-edited alarm fields (preserve-existing idiom); unit-test the merge.
3. Wire it into the handler. Build; AdminUI.Tests green. Live-verify on docker-dev (re-pick keeps edits).
4. Commit `fix(adminui): preserve edited alarm fields on Galaxy address re-pick`.
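The pure merge from Task 4, step 2 might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `AlarmFields` record, its three fields, and the `AlarmFieldMerge.PreserveExisting` helper are hypothetical and are not taken from `TagModal.razor`. It also assumes that a field counts as "already edited" whenever it holds a non-empty value; the real modal may track edits differently.

```csharp
#nullable enable
using System;

// Hypothetical alarm-field bag; null means "not set by the user".
public sealed record AlarmFields(int? Severity, string? Message, bool? AckRequired);

public static class AlarmFieldMerge
{
    // Applies picked defaults only where the existing value is unset,
    // so a re-pick cannot clobber a field the user has already filled in.
    public static AlarmFields PreserveExisting(AlarmFields existing, AlarmFields pickedDefaults)
    {
        return new AlarmFields(
            existing.Severity ?? pickedDefaults.Severity,
            string.IsNullOrWhiteSpace(existing.Message) ? pickedDefaults.Message : existing.Message,
            existing.AckRequired ?? pickedDefaults.AckRequired);
    }
}
```

Because the helper is a pure function over two values, it can be unit-tested without rendering the modal, and the picked-handler only needs to assign the merged result back to the config.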
### Task 5 (B3): AdminUI — inline-create-script dropdown label drift
**Classification:** small · **Parallelizable with:** T1,T2,T3,T4,T6
**Files:** `VirtualTagModal` + its inline create-script handler (explore) · test if a pure binding helper exists
**Steps:**
1. Reproduce the label drift after "New script" inline-creates + binds (`SC-…`).
2. Refresh the bound-script label/selection from the created id after creation. Build; tests green; live-verify.
3. Commit `fix(adminui): refresh script dropdown label after inline create`.
### Task 6 (B1): Write-outcome residuals (Bad-quality blip + AuditWriteUpdateEvent + sync fail-fast)
**Classification:** standard · **Parallelizable with:** T1–T5
**Files:** node-manager write path (`OtOpcUaNodeManager` `OnWriteValue` / the `IOpcUaNodeWriteGateway` outcome continuation — the write-outcome self-correction site, master `1d797c1c`) · Runtime gateway · tests
**Steps:**
1. Locate the failed-write revert continuation. Add behind the existing failure branch: (i) a brief Bad-quality status blip on the node before/with the revert; (ii) raise an OPC UA `AuditWriteUpdateEvent`; (iii) synchronous structural fail-fast for pre-dispatch-rejectable writes.
2. TDD each sub-behaviour (protocol-driver path only — Galaxy is fire-and-forget). Use the modbus exception-injector recipe for live proof (FC06 reject).
3. If any sub-part balloons >~300 LOC, split it out. Build; OpcUaServer + Runtime tests green.
4. Commit `feat(opcua): emit Bad blip + AuditWriteUpdateEvent + sync fail-fast on failed device write`.
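To make the ordering of the three sub-behaviours in Task 6 concrete, here is a toy model in plain C#. It deliberately avoids the OPC UA SDK, so every type and member (`SketchNode`, `SketchQuality`, `AuditWriteSketch`, `WriteOutcomeSketch`) is hypothetical, and the audit record is only a placeholder for a real `AuditWriteUpdateEvent`. It also assumes the node value is set optimistically and reverted on failure, which is one reading of the "revert continuation" described above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public enum SketchQuality { Good, Bad }

// Placeholder for an audit event raised on a write; not the SDK's event type.
public sealed record AuditWriteSketch(string NodeId, object? OldValue, object? NewValue, bool Succeeded);

public sealed class SketchNode
{
    public SketchNode(string nodeId, Type dataType, object? value)
    {
        NodeId = nodeId;
        DataType = dataType;
        Value = value;
    }

    public string NodeId { get; }
    public Type DataType { get; }
    public object? Value { get; set; }

    // Records each quality transition so a test can observe the blip.
    public List<SketchQuality> QualityTrace { get; } = new List<SketchQuality>();

    public void SetQuality(SketchQuality quality) { QualityTrace.Add(quality); }
}

public sealed class WriteOutcomeSketch
{
    private readonly Func<string, object, Task<bool>> _deviceWrite;

    public WriteOutcomeSketch(Func<string, object, Task<bool>> deviceWrite)
    {
        _deviceWrite = deviceWrite;
    }

    public List<AuditWriteSketch> Audit { get; } = new List<AuditWriteSketch>();

    // (iii) Structural check that needs no device round-trip.
    public static bool IsStructurallyValid(SketchNode node, object newValue)
    {
        return node.DataType.IsInstanceOfType(newValue);
    }

    public async Task<bool> WriteAsync(SketchNode node, object newValue)
    {
        // Rejected before anything is dispatched to the device.
        if (!IsStructurallyValid(node, newValue)) return false;

        var lastConfirmed = node.Value;
        node.Value = newValue; // optimistic update (assumption)

        bool ok = await _deviceWrite(node.NodeId, newValue).ConfigureAwait(false);
        if (ok) return true;

        node.SetQuality(SketchQuality.Bad);  // (i) brief Bad-quality blip
        node.Value = lastConfirmed;          // revert to the last confirmed value
        node.SetQuality(SketchQuality.Good);
        Audit.Add(new AuditWriteSketch(node.NodeId, lastConfirmed, newValue, false)); // (ii)
        return false;
    }
}
```

In this model a type-mismatched write returns without touching the device delegate or the audit list, while a device-rejected write leaves a Bad then Good quality trace, the original value, and one audit record.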
### Task 7 (B4): F10b surgical DataType/IsArray in-place writes — **RECONSIDER GATE**
**Classification:** standard · **Do NOT build without an explicit fresh go-ahead** (previously decided against as dirty — brief value-type mismatch, no ModelChangeEvents, rare edits). If approved: extend `ISurgicalAddressSpaceSink.UpdateTagAttributes` to swap DataType/ValueRank in place + emit ModelChangeEvents; widen `Phase7Applier.TagDeltaIsSurgicalEligible`; **live-`/run` the rebuild=False path** (the prod-inertness trap, see the F10b deferred-wrapper lesson). Until approved this stays a deferred record.
### Task 8 (B5): Alarm-severity `SetSeverity` surgical update — **RECONSIDER GATE**
**Classification:** small · **Do NOT build without an explicit fresh go-ahead** (operationally invisible — the alarm engine overwrites authored severity on first eval). Recorded so the decision isn't a silent gap.
### Task 9 (C1): Modbus-Int64 full live authoring — **VERIFY-ONLY (operator/rig)**
**Classification:** verify · Seed a Modbus driver on docker-dev → sim `10.100.0.35:5020`, author an Int64 equipment tag, deploy, then confirm the OPC UA node advertises `DataTypeIds.Int64` and that reads return changing values. No code unless a gap surfaces.
### Task 10 (C2): S7 + AbCip Test-Connect probe happy-path — **VERIFY-ONLY (needs Windows-VM fixtures)**
**Classification:** verify · `lmxopcua-fix up s7 s7_1500` / `up abcip controllogix` from the Windows VM, then run the skip-gated probe E2E green path (`DriverProbeHandshakeE2eTests`).
### Task 11 (C3): Device-gated proofs — **BLOCKED (hardware)**
**Classification:** blocked · H6 native-ack→AVEVA, Galaxy Phase C historian T7, Phase B T9, AbLegacy/TwinCAT/FOCAS probe happy-paths — need Wonderware+AVEVA (`10.100.0.48`), a Galaxy native alarm, PLC5/SLC sim, ADS target, CNC+FWLIB. Captured; not executable here.