From 8632c098b9a363b180957b50b2ab16c14ecdcf3f Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Thu, 28 May 2026 11:33:12 -0400 Subject: [PATCH] plan: design for OPC UA tag browser popup on instance config page Per-instance address override + live ClusterClient-based browse via a new IBrowsableDataConnection capability on RealOpcUaClient. Lazy-loaded tree with manual-paste fallback; offline-safe. --- .../2026-05-28-opcua-tag-browser-design.md | 338 ++++++++++++++++++ 1 file changed, 338 insertions(+) create mode 100644 docs/plans/2026-05-28-opcua-tag-browser-design.md diff --git a/docs/plans/2026-05-28-opcua-tag-browser-design.md b/docs/plans/2026-05-28-opcua-tag-browser-design.md new file mode 100644 index 00000000..36bfbb21 --- /dev/null +++ b/docs/plans/2026-05-28-opcua-tag-browser-design.md @@ -0,0 +1,338 @@ +# OPC UA Tag Browser Popup on Instance Configure Page — Design + +**Date:** 2026-05-28 +**Status:** Approved, ready for implementation plan +**Scope:** Add a popup OPC UA address-space browser to the template-instance config page so users can pick the actual physical tag for each attribute binding. Adds a per-instance address override that beats the template's `DataSourceReference` at runtime, plus a new OPC UA browse capability in the Data Connection Layer surfaced through a new ClusterClient management command. + +## Decisions + +| # | Decision | Selected option | +|---|----------|-----------------| +| 1 | Pick location & persistence | On the instance config page — per-instance override field on `InstanceConnectionBinding`. Template still defines the default. | +| 2 | Browse source | Live, on-demand browse to the site per click. No caching. | +| 3 | Selectable node types | Variables only, with a manual-paste fallback. Objects/Methods navigable but non-selectable. | +| 4 | Offline behavior | Popup opens, shows error banner, manual-paste field stays usable. | +| 5 | Search | Lazy-loaded tree only — no search in this slice. | +| 6 | Transport | ClusterClient command/control (`BrowseOpcUaNodeCommand` / `BrowseOpcUaNodeResult`). Not gRPC. | +| 7 | DCL surface | New capability interface `IBrowsableDataConnection` implemented by `RealOpcUaClient`; `IDataConnection` unchanged. | + +## Section 1 — Architecture + +``` +[Blazor Server browser] + │ SignalR + ▼ +[CentralUI: InstanceConfigure.razor] + │ opens + ▼ +[CentralUI: ] + │ uses + ▼ +[CentralUI: IOpcUaBrowseService] ── implementation calls + │ + ▼ +[CommunicationService.SendCommandToSiteAsync(siteId, BrowseOpcUaNodeCommand)] + │ ClusterClient Ask, ManagementEnvelope { User, Command, CorrelationId } + ▼ +[Site: CentralCommunicationActor → DataConnectionManagerActor] + │ dispatches to IBrowsableDataConnection (RealOpcUaClient) + ▼ +[OPC UA server] ◄── OPC Foundation .NET SDK Browse service +``` + +Three slices, top-to-bottom: +1. **Data model.** Add a per-instance OPC UA address override (new column on `InstanceConnectionBinding`). +2. **DCL browse capability.** New `IBrowsableDataConnection` capability interface; `RealOpcUaClient.BrowseChildrenAsync`; new site-side message handler. +3. **Central UI popup + plumbing.** `` Bootstrap modal, `IOpcUaBrowseService`, "Browse…" button on each row of the Connection Bindings table. + +Cross-cutting: +- Browse runs at the site against the **currently selected** `DataConnection` for that attribute row. If no connection is selected on that row, "Browse…" is disabled. +- "Browse…" only appears when the selected connection's `Protocol == OpcUa`. For other protocols the plain text input stays as-is. +- Auth: existing `ManagementEnvelope.User` carries the authenticated user; site-side handler requires the Design role (same role that authorizes editing instance bindings today). +- Audit: edits to the override field already fall under the existing instance-edit audit flow — no new audit code path. + +## Section 2 — Data model + +**Commons entity** (`src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Instances/InstanceConnectionBinding.cs`): + +```csharp +public class InstanceConnectionBinding +{ + public int InstanceId { get; set; } + public string AttributeName { get; set; } = ""; + public int DataConnectionId { get; set; } + public string? DataSourceReferenceOverride { get; set; } // NEW — nullable +} +``` + +- `null` = use the template's `DataSourceReference` (today's behavior, unchanged). +- non-null = override at this instance for this attribute. Full node identifier the runtime hands to the OPC UA client (no partial-path / template semantics). + +**Management message** (`InstanceCommands.cs`): + +```csharp +public record ConnectionBinding( + string AttributeName, + int DataConnectionId, + string? DataSourceReferenceOverride); // NEW — additive +``` + +Additive — older sites that don't know the field deserialize it as null and behave as today. + +**EF mapping.** New nullable column `DataSourceReferenceOverride NVARCHAR(512) NULL` on the `InstanceConnectionBindings` table. EF Core migration auto-applies in dev; manual SQL script in production. No backfill — `NULL` means "use template default". + +**Runtime resolution rule** (in Site Runtime where bindings are resolved to physical tag paths): + +```csharp +effectiveAddress = binding.DataSourceReferenceOverride + ?? templateAttribute.DataSourceReference; +``` + +One line, one place — the site-side resolver that today reads `templateAttribute.DataSourceReference`. + +**Doc impact.** +- `docs/requirements/Component-TemplateEngine.md` — the "DataSourceReference is fixed at instance level" claim becomes "DataSourceReference defines the default; instances may override per attribute via `InstanceConnectionBinding.DataSourceReferenceOverride`." +- `docs/requirements/Component-CentralUI.md` — Connection Bindings section gains the override field + popup description. + +**Explicit non-goals.** No audit table — instance edits are already audited. No "reset to template default" button (clearing the field is enough). No per-attribute lock on the override. + +## Section 3 — DCL browse capability + +**Capability interface** (`src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Protocol/IBrowsableDataConnection.cs`): + +```csharp +public interface IBrowsableDataConnection +{ + Task> BrowseChildrenAsync( + string? parentNodeId, // null = browse from server's Objects folder + CancellationToken ct); +} + +public record BrowseNode( + string NodeId, // e.g. "ns=2;s=Devices.Pump1.Speed" + string DisplayName, // e.g. "Speed" + BrowseNodeClass NodeClass, // Object | Variable | Method | Other + bool HasChildren); + +public enum BrowseNodeClass { Object, Variable, Method, Other } +``` + +- `null` parent → browse from `ObjectsFolder` (`ns=0;i=85`). +- `HasChildren` populated from the OPC UA Browse response so the tree can render the chevron without a second roundtrip. + +**RealOpcUaClient implements it.** The existing client holds a connected `Opc.Ua.Client.Session`. New method calls `session.Browse(...)` with `BrowseDirection.Forward`, `nodeClassMask = Object | Variable`, `resultMask = BrowseName | DisplayName | NodeClass | TypeDefinition`. Each result row maps directly to a `BrowseNode`. Method nodes returned by the server are mapped (so the UI can grey them as non-selectable) — not hidden, but not promised to bind. + +**Connection-state gating.** Browse only valid when `Status == Connected`. If not, throw a typed `ConnectionNotConnectedException` — the site-side command handler catches it and translates to a structured `BrowseFailure`. No silent-empty. + +**Custom (non-OPC UA) protocols.** Do NOT implement `IBrowsableDataConnection`. The site command handler checks `if (connection is IBrowsableDataConnection b) ...` and returns a `NotBrowsable` failure otherwise. Central UI button is already protocol-gated; this is defense in depth. + +**Cancellation + timeout.** Browse runs under the same `CancellationToken` as the ClusterClient Ask (default 30s in `CommunicationService`). OPC UA SDK Browse call wrapped consistent with the rest of `RealOpcUaClient`. + +**Doc impact.** `docs/requirements/Component-DataConnectionLayer.md` — add a "Browsing" subsection: clean-data-pipe philosophy still holds, browse is an opt-in capability for protocols that support it, only consumed by management/UI (never by Instance Actors). + +**Explicit non-goals.** No browse caching at site. No `BrowseNext` paging in this first cut — surface a `Truncated` flag instead; users hit manual-paste if a parent has more than the server's per-call cap. No type-system mirroring (data type, value rank, access level). + +## Section 4 — Central↔site command + envelope + +**New messages** (`src/ZB.MOM.WW.ScadaBridge.Commons/Messages/Management/BrowseCommands.cs`): + +```csharp +public record BrowseOpcUaNodeCommand( + int DataConnectionId, + string? ParentNodeId); // null = server root (ObjectsFolder) + +public record BrowseOpcUaNodeResult( + IReadOnlyList Children, + bool Truncated, // true if server hit the per-call cap + BrowseFailure? Failure); // null on success + +public record BrowseFailure( + BrowseFailureKind Kind, + string Message); + +public enum BrowseFailureKind +{ + ConnectionNotFound, + ConnectionNotConnected, + NotBrowsable, + Timeout, + ServerError +} +``` + +Returning failure inside `BrowseOpcUaNodeResult` (rather than exceptions across ClusterClient) keeps the popup's error rendering clean. + +**Wire flow.** + +``` +CentralUI.OpcUaBrowseService.BrowseChildrenAsync(siteId, connId, parent) + → CommunicationService.SendCommandToSiteAsync( + siteId, + new BrowseOpcUaNodeCommand(connId, parent)) + → ManagementEnvelope { User, Command, CorrelationId } over ClusterClient + → Site: CentralCommunicationActor unwraps envelope + → Site: DataConnectionManagerActor receives BrowseOpcUaNodeCommand + - Look up IDataConnection by Id + - if not found → ConnectionNotFound + - if !(conn is IBrowsableDataConnection) → NotBrowsable + - else await conn.BrowseChildrenAsync(ParentNodeId, ct) + - Catch ConnectionNotConnectedException → ConnectionNotConnected + - Catch OperationCanceledException → Timeout + - Catch ServiceResultException → ServerError + verbatim msg + - Else success: BrowseOpcUaNodeResult(children, truncated, null) + → Reply travels back via CentralCommunicationActor → CommunicationService + → returned to CentralUI page +``` + +Handler lives in the **DCL coordinator actor** (the same actor that owns the per-connection `IDataConnection` instances) — keeps lifecycle and browse co-located so we don't race against reconnect. + +**Auth check.** Site-side handler validates `ManagementEnvelope.User` has the Design role. Otherwise reply with `ServerError` ("Not authorized to browse data connections"). Reuses existing site-side role helper. + +**Registry.** `ManagementCommandRegistry` auto-discovers `*Command` records — `BrowseOpcUaNodeCommand` shows up automatically. `BrowseOpcUaNodeResult` / `BrowseNode` ride the existing management-namespace serialization registration. + +**Doc impact.** `docs/requirements/Component-DataConnectionLayer.md` — list the new command in the actor surface section. + +**Explicit non-goals.** No streaming response. No batch-browse-many-parents. No central-side caching. + +## Section 5 — Central UI popup + page integration + +**5a. `IOpcUaBrowseService` (CentralUI service)** + +```csharp +public interface IOpcUaBrowseService +{ + Task BrowseChildrenAsync( + string siteId, + int dataConnectionId, + string? parentNodeId, + CancellationToken ct); +} +``` + +Registered scoped in `Program.cs` next to the other CentralUI services. + +**5b. `` component** + +`Components/Dialogs/OpcUaBrowserDialog.razor` — Bootstrap 5 modal (no third-party UI framework). Layout, top to bottom: + +``` +┌─ Modal header ──────────────────────────────────────────┐ +│ Browse OPC UA — [×] │ +├─ Modal body ────────────────────────────────────────────┤ +│ [Error banner — shown only on failure] │ +│ │ +│ ▶ Objects │ +│ ▶ Devices │ +│ ▼ Pump1 │ +│ • Speed (selectable — variable) │ +│ • Status (selectable — variable) │ +│ ▶ Diagnostics │ +│ ▶ Pump2 │ +│ │ +│ ───────────────────────────────────────────────────── │ +│ Manual node id: [ns=2;s=Pump1.Speed ] [Use] │ +├─ Modal footer ──────────────────────────────────────────┤ +│ Selected: ns=2;s=Pump1.Speed │ +│ [Cancel] [Select] │ +└─────────────────────────────────────────────────────────┘ +``` + +Behavior: +- **Lazy load.** Open → `BrowseChildrenAsync(null)` for the server root. Each chevron click → `BrowseChildrenAsync(thatNode.NodeId)` once and caches the children in component-local state for the modal lifetime (closing drops the cache — live next time). +- **Selectable rows.** Variables render selectable (radio-style highlight); Objects/Methods navigable but non-selectable. Double-click a Variable = select + confirm. +- **Manual paste.** Text input below the tree with a "Use" button. Non-empty validation only — bad node ids surface at subscribe time (matches existing template `DataSourceReference` behavior). +- **Error banner.** Renders on `BrowseFailure`. Text varies by `BrowseFailureKind`; mapping lives in the component. Tree empties; manual-paste stays usable. Retry button issues the last Browse again. +- **Truncated.** When `Truncated == true`, render a "Results truncated — use manual entry if your tag isn't listed" hint under that subtree. +- **Cancel** discards pending selection; **Select** invokes `EventCallback OnSelected` with the NodeId and closes. + +Parameters: +```csharp +[Parameter] public string SiteId { get; set; } = ""; +[Parameter] public int DataConnectionId { get; set; } +[Parameter] public string ConnectionName { get; set; } = ""; +[Parameter] public string? InitialNodeId { get; set; } +[Parameter] public EventCallback OnSelected { get; set; } +[Parameter] public EventCallback OnCancelled { get; set; } +``` + +**5c. Page integration — `InstanceConfigure.razor`** + +The existing Connection Bindings table gets two new cells per attribute row: + +| Attribute | Tag Path (template default) | Override | | Connection | +|---|---|---|---|---| +| Speed | `ns=2;s=Pump.Speed` | `ns=2;s=Pump1.Speed` | **[Browse…]** | `[ Site-A OPC UA ▾ ]` | +| Status | `ns=2;s=Pump.Status` | *(uses default)* | **[Browse…]** | `[ Site-A OPC UA ▾ ]` | + +- **Override** cell — plain text input bound to `binding.DataSourceReferenceOverride`. Empty = use template default (greyed placeholder shows what that default is). +- **[Browse…]** button: + - Hidden when row's `DataConnection.Protocol` is not OPC UA. + - Disabled when no connection selected on that row (tooltip: "Pick a connection first"). + - Otherwise opens `` with the row's `SiteId`, `DataConnectionId`, and `InitialNodeId = override ?? templateDefault`. +- On `OnSelected`, the value writes into the row's `DataSourceReferenceOverride`. If the user picks back exactly the template default we don't auto-clear — the user can blank the field to revert explicitly. +- Existing Save / bulk-assign flows untouched. Override travels in the same `ConnectionBinding` record the page already submits. + +**Styling.** Existing CentralUI Bootstrap conventions; same modal/table classes as elsewhere on the page. Use the `frontend-design` skill at implementation time per project convention. + +**Doc impact.** `docs/requirements/Component-CentralUI.md` — extend Connection Bindings section with override column + popup description. + +**Explicit non-goals.** No "test this binding" button (read once to verify). No multi-select in the dialog. No per-row "overridden" badge outside the table. + +## Section 6 — Error handling, testing, rollout + +**6a. Error handling** + +| Failure | Where caught | User-visible result | +|---|---|---| +| Site offline / ClusterClient timeout | `CommunicationService.SendCommandToSiteAsync` throws | Dialog renders red banner "Site unreachable — try again or enter the node id manually"; manual-paste stays active | +| `ConnectionNotFound` | Site DCL handler | Banner "Connection no longer exists at site" | +| `ConnectionNotConnected` | `RealOpcUaClient.BrowseChildrenAsync` throws → wrapped | Banner "OPC UA session not connected — retry shortly or use manual entry". Retry re-issues the Browse | +| `NotBrowsable` | Site DCL handler | Should be impossible (button is protocol-gated). Defense-in-depth banner | +| `Timeout` | Site DCL handler / Ask timeout | Banner "Browse timed out — server may be slow" | +| `ServerError` (OPC UA `Bad_*`) | Site DCL handler | Banner with verbatim message | +| Bad node id from manual-paste | Not validated client-side | Surfaces at deploy/run time (matches today's behavior for template `DataSourceReference`) | +| Save fails | Existing instance-save handler | Unchanged — override rides along in the existing round-trip | + +Two invariants: +- **Browse never blocks the page.** All calls cancellable; closing the modal cancels in-flight Asks. +- **Override edits never require a successful browse.** Manual-paste is always available — site outages can't block instance configuration. + +**6b. Testing** + +Unit, by layer: +- **Commons** — round-trip serialization of `BrowseOpcUaNodeCommand` / `BrowseOpcUaNodeResult` / `BrowseNode`. +- **DCL** — `RealOpcUaClient.BrowseChildrenAsync` against a fake `Opc.Ua.Client.Session`. Covers: root browse, child browse, `Truncated`, `ConnectionNotConnectedException`, `ServiceResultException` propagation. +- **Site command handler** — actor-spec test: unknown connection id → `ConnectionNotFound`; non-browsable → `NotBrowsable`; success path mapped correctly; canceled Ask → `Timeout`; OPC UA exception → `ServerError`. +- **Auth** — site handler rejects a `ManagementEnvelope` whose user lacks Design role. +- **CentralUI service** — `OpcUaBrowseService` unit test with a fake `CommunicationService`. + +Integration: +- Live OPC UA against `infra/`'s OPC UA test server. A browse-roundtrip integration test deploys, opens a connection, issues `BrowseOpcUaNodeCommand`, asserts non-empty children at root, expands one level, asserts a known Variable is present. + +UI: +- Manual smoke: `bash docker/deploy.sh`, sign in as `multi-role`, open an instance configure page, click Browse, walk tree, pick a variable, confirm the override saves and round-trips on reload. Repeat with the site stopped to verify the offline error banner + manual-paste fallback. + +Not in this slice (deliberate): bUnit / Blazor render tests for the dialog — the repo doesn't use bUnit elsewhere today. Manual smoke + service-layer unit tests cover the behavior. + +**6c. Rollout / migration** + +One EF migration: add `DataSourceReferenceOverride NVARCHAR(512) NULL` to `InstanceConnectionBindings`. Auto-applies in dev (`docker/deploy.sh`); manual SQL script in prod. No backfill. + +Wire compatibility: `BrowseOpcUaNodeCommand` is new — older site builds won't have a handler; central on a new build talking to a site on an older build gets a "command not handled" reply that the dialog treats as transient `ServerError`. The additive `DataSourceReferenceOverride` field on `ConnectionBinding` lets older sites deserialize it as null and behave as today — save-side stays compatible. + +Order of merge (one PR each): +1. Commons messages + `IBrowsableDataConnection` interface + `InstanceConnectionBinding.DataSourceReferenceOverride` field + EF migration. +2. `RealOpcUaClient.BrowseChildrenAsync` + site DCL command handler + auth check. +3. Runtime resolution change (`effectiveAddress = override ?? template default`) + regression tests for value subscribes. +4. CentralUI service + `` + page integration. +5. Doc updates (`Component-DataConnectionLayer`, `Component-TemplateEngine`, `Component-CentralUI`). + +Each step is independently shippable. + +**6d. Explicit non-goals (whole feature)** + +- No browse caching at central beyond local component state for the modal's lifetime. +- No address-space search. +- No paging beyond the `Truncated` flag. +- No type-information surfacing (data type, value rank, access level) in `BrowseNode`. +- No bulk override import / CSV / find-and-replace tooling.