# F10b vtag-node-irrelevant rebuild skip + Client.UI Galaxy operator-comment limitation — Design **Date:** 2026-06-18 **Branch:** `feat/f10b-vtag-skip-rebuild-galaxy-comment` (off master `be703a30`) **Backlog items:** `stillpending.md` §A #11 (F10b in-place vs full rebuild) + #12 (Client.UI Galaxy operator-comment fallback) Two independent, low-risk backlog items. F10b is a *bounded* address-space-rebuild optimization (the narrow, provably-safe slice — NOT the broad surgical-write rewrite). The Client.UI item is an accepted-limitation close (doc/comment + a pinning test). They touch disjoint projects (OpcUaServer / Client.Shared). --- ## Standing constraints (in force) - NO Commons/proto change · NO Core.Abstractions / breaking-interface change · NO EF migration · NO bUnit (no Razor touched this phase). - Stage by explicit path (never `git add .`); never stage `sql_login.txt` / `pki/` / `pending.md` / `current.md` / `stillpending.md` / `docker-dev/docker-compose.yml`. No force-push / no `--no-verify`. - Finish = merge to master + push. `dangerouslyDisableSandbox: true` for all build/test commands. --- ## Component A — F10b: skip the address-space rebuild for vtag-node-irrelevant edits (backlog #11) ### What "full rebuild" costs, and why H1a made it unconditional `OpcUaPublishActor.HandleRebuild` runs: `outcome = _applier.Apply(plan)` (which calls `_sink.RebuildAddressSpace()` — a full tear-down of every folder/variable/alarm — iff `needsRebuild`), then **unconditionally** re-runs the idempotent `Materialise*` passes. H1a/H1b (`1e95856b`/`ada01e1a`) made `needsRebuild` fire on **any** Changed* (not just Added/Removed) to kill a silent stale-node bug — but that means **editing a single VirtualTag's expression tears down and rebuilds the entire server address space**, dropping every client's live values + monitored-item subscriptions server-wide. That blast radius is the cost F10b targets. ### The provably-safe slice (the only zero-node-mutation case) A VirtualTag's **materialised OPC UA node** (`Phase7Applier.MaterialiseEquipmentVirtualTags`, lines 264-272) is derived **only** from `{EquipmentId, FolderPath, Name, DataType}` → `EnsureVariable(parent/Name, Name, DataType, writable:false)`. `Expression`, `DependencyRefs`, and `Historize` do **not** touch the node — `Historize` is explicitly write-side only (the code comment at lines 260-263 says so; vtags are not materialised as SDK Historizing/HistoryRead variables). The VirtualTag **engine** picks up `Expression`/`DependencyRefs`/`Historize` changes through a **completely independent path**: `DriverHostActor` (`:1062`) tells `VirtualTagHostActor` `ApplyVirtualTags(...)`, and `VirtualTagHostActor.OnApply` (lines 97-109) stop+respawns any child "whose plan changed in place (edited Expression/DependencyRefs, or a toggled flag)". This is **not** routed through `Phase7Applier`. Therefore, when a deploy's *only* changes are vtag deltas that differ *solely* in `Expression`/`DependencyRefs`/`Historize`: - the materialised node is byte-identical → `RebuildAddressSpace` would tear it down and rebuild an identical node, dropping subscriptions for nothing; - the idempotent `Materialise*` passes already short-circuit on an existing node (`OtOpcUaNodeManager.EnsureVariable:1324` early-returns `if (_variables.ContainsKey(...))`), so they no-op and preserve the live value + subscriptions; - the engine respawns regardless via `VirtualTagHostActor`. So **skipping `RebuildAddressSpace` for this case is correct and preserves the H1a invariant** (no stale node — because the node never needed to change). ### Approach (Phase7Applier-only; NO node-manager / SDK / interface change) In `Phase7Applier.Apply`, refine `needsRebuild`. A vtag delta is **node-irrelevant** iff overriding the three engine/write-side fields to the new values makes it equal to the new plan (a whitelist-of-may-differ that **defaults to rebuild** for any other/unknown field, because `EquipmentVirtualTagPlan` has a custom `Equals` comparing all 8 logical fields): ```csharp static bool VtagDeltaIsNodeIrrelevant(Phase7Plan.EquipmentVirtualTagDelta d) => (d.Previous with { Expression = d.Current.Expression, DependencyRefs = d.Current.DependencyRefs, Historize = d.Current.Historize, }).Equals(d.Current); ``` `needsRebuild` becomes true iff there is any structural change OR any vtag delta that is **not** node-irrelevant: ```csharp var needsRebuild = plan.AddedEquipment.Count > 0 || plan.RemovedEquipment.Count > 0 || plan.ChangedEquipment.Count > 0 || plan.AddedAlarms.Count > 0 || plan.RemovedAlarms.Count > 0 || plan.ChangedAlarms.Count > 0 || plan.AddedEquipmentTags.Count > 0 || plan.RemovedEquipmentTags.Count > 0 || plan.ChangedEquipmentTags.Count > 0 || plan.AddedEquipmentVirtualTags.Count > 0 || plan.RemovedEquipmentVirtualTags.Count > 0 || plan.ChangedEquipmentVirtualTags.Any(d => !VtagDeltaIsNodeIrrelevant(d)); ``` i.e. `needsRebuild` is false **only** when every effective change is a node-irrelevant vtag edit and nothing else changed. `ChangedNodes` (the outcome count) stays as-is (it still counts the vtag change as a change — it just doesn't force a rebuild). Update the lines 84-91 comment to document the new skip + the safety reasoning. The downstream `Materialise*` passes and the `VirtualTagHostActor` respawn are unchanged. ### Out of scope (deferred, documented) Surgical *in-place attribute writes* for equipment-tag property changes (`DataType`, `Writable`, `IsHistorized`) and alarm-severity-only changes (`SetSeverity`) genuinely mutate node attributes and carry the H1a stale-node risk + SDK handler/historian-lifecycle complexity. They are NOT in this slice — F10b here is strictly the zero-node-mutation vtag skip. Record them as follow-ups. ### Testing (offline, no SDK) `Phase7ApplierTests` already uses a `RecordingSink` that captures `RebuildAddressSpace` calls. Add: - vtag delta differing **only in Expression** ⇒ `outcome.RebuildCalled == false`, `RecordingSink` saw **no** `RebuildAddressSpace`, `outcome.ChangedNodes == 1`. - same for a **DependencyRefs-only** and a **Historize-only** delta ⇒ no rebuild. - vtag delta where **DataType also changed** ⇒ `RebuildCalled == true` (the safety boundary). - vtag delta where **Name (or FolderPath) changed** ⇒ `RebuildCalled == true`. - a node-irrelevant vtag delta **mixed with** any other change (e.g. a ChangedEquipmentTag, or an added equipment) ⇒ `RebuildCalled == true` (the skip requires the vtag edit to be the *sole* change). ### Live `/run` (optional, confirmatory) On docker-dev (rebuild BOTH central nodes per the `:9200` round-robin fact): subscribe a client to a vtag, edit only its expression in `/uns`, Deploy → server log shows `rebuild=False` for that deploy and the client subscription is NOT dropped (vs `rebuild=True` when a tag's DataType changes). This is confirmatory; the unit tests pin the logic. --- ## Component B — Client.UI Galaxy operator-comment fallback (backlog #12) ### Verdict: accepted limitation, not a code-fixable bug The operator comment is **genuinely unrecoverable** in the Galaxy sub-attribute fallback path (`OpcUaClientService.cs:769-820`): - the standard OPC UA event SelectClause (`CreateAlarmEventFilter`, ~13 fields) does **not** request an AckComment/Comment field — Part 9 conditions don't carry the operator comment in the event payload here; - Galaxy's sub-attribute convention (`AlarmRefBuilder.cs:15-20`) exposes `AckMsg` as **write-only** — there is no readable attribute to recover the last operator comment; - the comment **is** available via the gateway's **native** alarm feed (`GatewayGalaxyAlarmFeed` → `GalaxyAlarmTransition.OperatorComment`), which the Galaxy *driver* already uses — a different path from this OPC UA *client* fallback. So a real fix would require a server-side change (out of scope); the backlog already labels this "accepted limitation." `AlarmEventArgs.cs:82-87` partly documents it, but the client-side extraction omission is unexplained, and the behavior is untested. ### Approach (doc/comment + a pinning test; NO behavioral change) - **Inline comment** at the fallback `AlarmEvent?.Invoke(...)` site in `OpcUaClientService.cs` (~:817): explain that `OperatorComment` is intentionally left null here — Galaxy `AckMsg` is write-only and the OPC UA event SelectClause carries no comment field, so the comment is only available via the gateway's native alarm feed (the driver path), not this client fallback. - **Tighten the `AlarmEventArgs.cs:82-87` doc** to name the two concrete reasons (write-only `AckMsg` + no event-field), so it reads as an intentional limitation, not a TODO. - **Pinning test** in `OpcUaClientServiceTests` (mirror the existing `OnAlarmEvent_MissingAckedActiveButHasConditionNode_FallbackReadsAndRaisesEvent` fixture): drive the fallback path and assert the raised `AlarmEventArgs.OperatorComment` is null — converting the gap into a tested, intentional limitation. ### Testing Fully offline — the existing `OpcUaClientServiceTests` fakes (`FakeSessionAdapter`/`FakeSubscriptionAdapter`) already drive the fallback path; the new test asserts the documented behavior. --- ## Task slicing (independent → parallelizable) | Task | Component | Project | Class | Parallel with | |---|---|---|---|---| | T1 | F10b vtag-node-irrelevant rebuild skip + tests | OpcUaServer (+ .Tests) | small | T2 | | T2 | Client.UI Galaxy comment: inline comment + doc + pinning test | Client.Shared (+ .Tests) | trivial→small | T1 | | T3 | Reconcile stillpending #11/#12 + memory + finish (build, tests, merge+push) | docs (never-staged) | small | none | T1/T2 touch disjoint projects → dispatchable concurrently (worktree isolation or serial-on-shared-tree per the git-index-race lesson). T3 runs last. ## Done = Build clean + `dotnet test` green (OpcUaServer.Tests + Client.Shared.Tests) + backlog/memory reconciled + merged to master + pushed.