plan: design for OPC UA tag browser popup on instance config page

Per-instance address override + live ClusterClient-based browse via a new
IBrowsableDataConnection capability on RealOpcUaClient. Lazy-loaded tree
with manual-paste fallback; offline-safe.
This commit is contained in:
Joseph Doherty
2026-05-28 11:33:12 -04:00
parent de05c65992
commit 8632c098b9
@@ -0,0 +1,338 @@
# OPC UA Tag Browser Popup on Instance Configure Page — Design
**Date:** 2026-05-28
**Status:** Approved, ready for implementation plan
**Scope:** Add a popup OPC UA address-space browser to the template-instance config page so users can pick the actual physical tag for each attribute binding. Adds a per-instance address override that beats the template's `DataSourceReference` at runtime, plus a new OPC UA browse capability in the Data Connection Layer surfaced through a new ClusterClient management command.
## Decisions
| # | Decision | Selected option |
|---|----------|-----------------|
| 1 | Pick location & persistence | On the instance config page — per-instance override field on `InstanceConnectionBinding`. Template still defines the default. |
| 2 | Browse source | Live, on-demand browse to the site per click. No caching. |
| 3 | Selectable node types | Variables only, with a manual-paste fallback. Objects/Methods navigable but non-selectable. |
| 4 | Offline behavior | Popup opens, shows error banner, manual-paste field stays usable. |
| 5 | Search | Lazy-loaded tree only — no search in this slice. |
| 6 | Transport | ClusterClient command/control (`BrowseOpcUaNodeCommand` / `BrowseOpcUaNodeResult`). Not gRPC. |
| 7 | DCL surface | New capability interface `IBrowsableDataConnection` implemented by `RealOpcUaClient`; `IDataConnection` unchanged. |
## Section 1 — Architecture
```
[Blazor Server browser]
│ SignalR
[CentralUI: InstanceConfigure.razor]
│ opens
[CentralUI: <OpcUaBrowserDialog/>]
│ uses
[CentralUI: IOpcUaBrowseService] ── implementation calls
[CommunicationService.SendCommandToSiteAsync<BrowseOpcUaNodeResult>(siteId, BrowseOpcUaNodeCommand)]
│ ClusterClient Ask, ManagementEnvelope { User, Command, CorrelationId }
[Site: CentralCommunicationActor → DataConnectionManagerActor]
│ dispatches to IBrowsableDataConnection (RealOpcUaClient)
[OPC UA server] ◄── OPC Foundation .NET SDK Browse service
```
Three slices, top-to-bottom:
1. **Data model.** Add a per-instance OPC UA address override (new column on `InstanceConnectionBinding`).
2. **DCL browse capability.** New `IBrowsableDataConnection` capability interface; `RealOpcUaClient.BrowseChildrenAsync`; new site-side message handler.
3. **Central UI popup + plumbing.** `<OpcUaBrowserDialog/>` Bootstrap modal, `IOpcUaBrowseService`, "Browse…" button on each row of the Connection Bindings table.
Cross-cutting:
- Browse runs at the site against the **currently selected** `DataConnection` for that attribute row. If no connection is selected on that row, "Browse…" is disabled.
- "Browse…" only appears when the selected connection's `Protocol == OpcUa`. For other protocols the plain text input stays as-is.
- Auth: existing `ManagementEnvelope.User` carries the authenticated user; site-side handler requires the Design role (same role that authorizes editing instance bindings today).
- Audit: edits to the override field already fall under the existing instance-edit audit flow — no new audit code path.
## Section 2 — Data model
**Commons entity** (`src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Instances/InstanceConnectionBinding.cs`):
```csharp
public class InstanceConnectionBinding
{
public int InstanceId { get; set; }
public string AttributeName { get; set; } = "";
public int DataConnectionId { get; set; }
public string? DataSourceReferenceOverride { get; set; } // NEW — nullable
}
```
- `null` = use the template's `DataSourceReference` (today's behavior, unchanged).
- non-null = override at this instance for this attribute. Full node identifier the runtime hands to the OPC UA client (no partial-path / template semantics).
**Management message** (`InstanceCommands.cs`):
```csharp
public record ConnectionBinding(
string AttributeName,
int DataConnectionId,
string? DataSourceReferenceOverride); // NEW — additive
```
Additive — older sites that don't know the field deserialize it as null and behave as today.
**EF mapping.** New nullable column `DataSourceReferenceOverride NVARCHAR(512) NULL` on the `InstanceConnectionBindings` table. EF Core migration auto-applies in dev; manual SQL script in production. No backfill — `NULL` means "use template default".
**Runtime resolution rule** (in Site Runtime where bindings are resolved to physical tag paths):
```csharp
effectiveAddress = binding.DataSourceReferenceOverride
?? templateAttribute.DataSourceReference;
```
One line, one place — the site-side resolver that today reads `templateAttribute.DataSourceReference`.
**Doc impact.**
- `docs/requirements/Component-TemplateEngine.md` — the "DataSourceReference is fixed at instance level" claim becomes "DataSourceReference defines the default; instances may override per attribute via `InstanceConnectionBinding.DataSourceReferenceOverride`."
- `docs/requirements/Component-CentralUI.md` — Connection Bindings section gains the override field + popup description.
**Explicit non-goals.** No audit table — instance edits are already audited. No "reset to template default" button (clearing the field is enough). No per-attribute lock on the override.
## Section 3 — DCL browse capability
**Capability interface** (`src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Protocol/IBrowsableDataConnection.cs`):
```csharp
public interface IBrowsableDataConnection
{
Task<IReadOnlyList<BrowseNode>> BrowseChildrenAsync(
string? parentNodeId, // null = browse from server's Objects folder
CancellationToken ct);
}
public record BrowseNode(
string NodeId, // e.g. "ns=2;s=Devices.Pump1.Speed"
string DisplayName, // e.g. "Speed"
BrowseNodeClass NodeClass, // Object | Variable | Method | Other
bool HasChildren);
public enum BrowseNodeClass { Object, Variable, Method, Other }
```
- `null` parent → browse from `ObjectsFolder` (`ns=0;i=85`).
- `HasChildren` populated from the OPC UA Browse response so the tree can render the chevron without a second roundtrip.
**RealOpcUaClient implements it.** The existing client holds a connected `Opc.Ua.Client.Session`. New method calls `session.Browse(...)` with `BrowseDirection.Forward`, `nodeClassMask = Object | Variable`, `resultMask = BrowseName | DisplayName | NodeClass | TypeDefinition`. Each result row maps directly to a `BrowseNode`. Method nodes returned by the server are mapped (so the UI can grey them as non-selectable) — not hidden, but not promised to bind.
**Connection-state gating.** Browse only valid when `Status == Connected`. If not, throw a typed `ConnectionNotConnectedException` — the site-side command handler catches it and translates to a structured `BrowseFailure`. No silent-empty.
**Custom (non-OPC UA) protocols.** Do NOT implement `IBrowsableDataConnection`. The site command handler checks `if (connection is IBrowsableDataConnection b) ...` and returns a `NotBrowsable` failure otherwise. Central UI button is already protocol-gated; this is defense in depth.
**Cancellation + timeout.** Browse runs under the same `CancellationToken` as the ClusterClient Ask (default 30s in `CommunicationService`). OPC UA SDK Browse call wrapped consistent with the rest of `RealOpcUaClient`.
**Doc impact.** `docs/requirements/Component-DataConnectionLayer.md` — add a "Browsing" subsection: clean-data-pipe philosophy still holds, browse is an opt-in capability for protocols that support it, only consumed by management/UI (never by Instance Actors).
**Explicit non-goals.** No browse caching at site. No `BrowseNext` paging in this first cut — surface a `Truncated` flag instead; users hit manual-paste if a parent has more than the server's per-call cap. No type-system mirroring (data type, value rank, access level).
## Section 4 — Central↔site command + envelope
**New messages** (`src/ZB.MOM.WW.ScadaBridge.Commons/Messages/Management/BrowseCommands.cs`):
```csharp
public record BrowseOpcUaNodeCommand(
int DataConnectionId,
string? ParentNodeId); // null = server root (ObjectsFolder)
public record BrowseOpcUaNodeResult(
IReadOnlyList<BrowseNode> Children,
bool Truncated, // true if server hit the per-call cap
BrowseFailure? Failure); // null on success
public record BrowseFailure(
BrowseFailureKind Kind,
string Message);
public enum BrowseFailureKind
{
ConnectionNotFound,
ConnectionNotConnected,
NotBrowsable,
Timeout,
ServerError
}
```
Returning failure inside `BrowseOpcUaNodeResult` (rather than exceptions across ClusterClient) keeps the popup's error rendering clean.
**Wire flow.**
```
CentralUI.OpcUaBrowseService.BrowseChildrenAsync(siteId, connId, parent)
→ CommunicationService.SendCommandToSiteAsync<BrowseOpcUaNodeResult>(
siteId,
new BrowseOpcUaNodeCommand(connId, parent))
→ ManagementEnvelope { User, Command, CorrelationId } over ClusterClient
→ Site: CentralCommunicationActor unwraps envelope
→ Site: DataConnectionManagerActor receives BrowseOpcUaNodeCommand
- Look up IDataConnection by Id
- if not found → ConnectionNotFound
- if !(conn is IBrowsableDataConnection) → NotBrowsable
- else await conn.BrowseChildrenAsync(ParentNodeId, ct)
- Catch ConnectionNotConnectedException → ConnectionNotConnected
- Catch OperationCanceledException → Timeout
- Catch ServiceResultException → ServerError + verbatim msg
- Else success: BrowseOpcUaNodeResult(children, truncated, null)
→ Reply travels back via CentralCommunicationActor → CommunicationService
→ returned to CentralUI page
```
Handler lives in the **DCL coordinator actor** (the same actor that owns the per-connection `IDataConnection` instances) — keeps lifecycle and browse co-located so we don't race against reconnect.
**Auth check.** Site-side handler validates `ManagementEnvelope.User` has the Design role. Otherwise reply with `ServerError` ("Not authorized to browse data connections"). Reuses existing site-side role helper.
**Registry.** `ManagementCommandRegistry` auto-discovers `*Command` records — `BrowseOpcUaNodeCommand` shows up automatically. `BrowseOpcUaNodeResult` / `BrowseNode` ride the existing management-namespace serialization registration.
**Doc impact.** `docs/requirements/Component-DataConnectionLayer.md` — list the new command in the actor surface section.
**Explicit non-goals.** No streaming response. No batch-browse-many-parents. No central-side caching.
## Section 5 — Central UI popup + page integration
**5a. `IOpcUaBrowseService` (CentralUI service)**
```csharp
public interface IOpcUaBrowseService
{
Task<BrowseOpcUaNodeResult> BrowseChildrenAsync(
string siteId,
int dataConnectionId,
string? parentNodeId,
CancellationToken ct);
}
```
Registered scoped in `Program.cs` next to the other CentralUI services.
**5b. `<OpcUaBrowserDialog/>` component**
`Components/Dialogs/OpcUaBrowserDialog.razor` — Bootstrap 5 modal (no third-party UI framework). Layout, top to bottom:
```
┌─ Modal header ──────────────────────────────────────────┐
│ Browse OPC UA — <Connection Name> [×] │
├─ Modal body ────────────────────────────────────────────┤
│ [Error banner — shown only on failure] │
│ │
│ ▶ Objects │
│ ▶ Devices │
│ ▼ Pump1 │
│ • Speed (selectable — variable) │
│ • Status (selectable — variable) │
│ ▶ Diagnostics │
│ ▶ Pump2 │
│ │
│ ───────────────────────────────────────────────────── │
│ Manual node id: [ns=2;s=Pump1.Speed ] [Use] │
├─ Modal footer ──────────────────────────────────────────┤
│ Selected: ns=2;s=Pump1.Speed │
│ [Cancel] [Select] │
└─────────────────────────────────────────────────────────┘
```
Behavior:
- **Lazy load.** Open → `BrowseChildrenAsync(null)` for the server root. Each chevron click → `BrowseChildrenAsync(thatNode.NodeId)` once and caches the children in component-local state for the modal lifetime (closing drops the cache — live next time).
- **Selectable rows.** Variables render selectable (radio-style highlight); Objects/Methods navigable but non-selectable. Double-click a Variable = select + confirm.
- **Manual paste.** Text input below the tree with a "Use" button. Non-empty validation only — bad node ids surface at subscribe time (matches existing template `DataSourceReference` behavior).
- **Error banner.** Renders on `BrowseFailure`. Text varies by `BrowseFailureKind`; mapping lives in the component. Tree empties; manual-paste stays usable. Retry button issues the last Browse again.
- **Truncated.** When `Truncated == true`, render a "Results truncated — use manual entry if your tag isn't listed" hint under that subtree.
- **Cancel** discards pending selection; **Select** invokes `EventCallback<string> OnSelected` with the NodeId and closes.
Parameters:
```csharp
[Parameter] public string SiteId { get; set; } = "";
[Parameter] public int DataConnectionId { get; set; }
[Parameter] public string ConnectionName { get; set; } = "";
[Parameter] public string? InitialNodeId { get; set; }
[Parameter] public EventCallback<string> OnSelected { get; set; }
[Parameter] public EventCallback OnCancelled { get; set; }
```
**5c. Page integration — `InstanceConfigure.razor`**
The existing Connection Bindings table gets two new cells per attribute row:
| Attribute | Tag Path (template default) | Override | | Connection |
|---|---|---|---|---|
| Speed | `ns=2;s=Pump.Speed` | `ns=2;s=Pump1.Speed` | **[Browse…]** | `[ Site-A OPC UA ▾ ]` |
| Status | `ns=2;s=Pump.Status` | *(uses default)* | **[Browse…]** | `[ Site-A OPC UA ▾ ]` |
- **Override** cell — plain text input bound to `binding.DataSourceReferenceOverride`. Empty = use template default (greyed placeholder shows what that default is).
- **[Browse…]** button:
- Hidden when row's `DataConnection.Protocol` is not OPC UA.
- Disabled when no connection selected on that row (tooltip: "Pick a connection first").
- Otherwise opens `<OpcUaBrowserDialog/>` with the row's `SiteId`, `DataConnectionId`, and `InitialNodeId = override ?? templateDefault`.
- On `OnSelected`, the value writes into the row's `DataSourceReferenceOverride`. If the user picks back exactly the template default we don't auto-clear — the user can blank the field to revert explicitly.
- Existing Save / bulk-assign flows untouched. Override travels in the same `ConnectionBinding` record the page already submits.
**Styling.** Existing CentralUI Bootstrap conventions; same modal/table classes as elsewhere on the page. Use the `frontend-design` skill at implementation time per project convention.
**Doc impact.** `docs/requirements/Component-CentralUI.md` — extend Connection Bindings section with override column + popup description.
**Explicit non-goals.** No "test this binding" button (read once to verify). No multi-select in the dialog. No per-row "overridden" badge outside the table.
## Section 6 — Error handling, testing, rollout
**6a. Error handling**
| Failure | Where caught | User-visible result |
|---|---|---|
| Site offline / ClusterClient timeout | `CommunicationService.SendCommandToSiteAsync` throws | Dialog renders red banner "Site unreachable — try again or enter the node id manually"; manual-paste stays active |
| `ConnectionNotFound` | Site DCL handler | Banner "Connection no longer exists at site" |
| `ConnectionNotConnected` | `RealOpcUaClient.BrowseChildrenAsync` throws → wrapped | Banner "OPC UA session not connected — retry shortly or use manual entry". Retry re-issues the Browse |
| `NotBrowsable` | Site DCL handler | Should be impossible (button is protocol-gated). Defense-in-depth banner |
| `Timeout` | Site DCL handler / Ask timeout | Banner "Browse timed out — server may be slow" |
| `ServerError` (OPC UA `Bad_*`) | Site DCL handler | Banner with verbatim message |
| Bad node id from manual-paste | Not validated client-side | Surfaces at deploy/run time (matches today's behavior for template `DataSourceReference`) |
| Save fails | Existing instance-save handler | Unchanged — override rides along in the existing round-trip |
Two invariants:
- **Browse never blocks the page.** All calls cancellable; closing the modal cancels in-flight Asks.
- **Override edits never require a successful browse.** Manual-paste is always available — site outages can't block instance configuration.
**6b. Testing**
Unit, by layer:
- **Commons** — round-trip serialization of `BrowseOpcUaNodeCommand` / `BrowseOpcUaNodeResult` / `BrowseNode`.
- **DCL** — `RealOpcUaClient.BrowseChildrenAsync` against a fake `Opc.Ua.Client.Session`. Covers: root browse, child browse, `Truncated`, `ConnectionNotConnectedException`, `ServiceResultException` propagation.
- **Site command handler** — actor-spec test: unknown connection id → `ConnectionNotFound`; non-browsable → `NotBrowsable`; success path mapped correctly; canceled Ask → `Timeout`; OPC UA exception → `ServerError`.
- **Auth** — site handler rejects a `ManagementEnvelope` whose user lacks Design role.
- **CentralUI service** — `OpcUaBrowseService` unit test with a fake `CommunicationService`.
Integration:
- Live OPC UA against `infra/`'s OPC UA test server. A browse-roundtrip integration test deploys, opens a connection, issues `BrowseOpcUaNodeCommand`, asserts non-empty children at root, expands one level, asserts a known Variable is present.
UI:
- Manual smoke: `bash docker/deploy.sh`, sign in as `multi-role`, open an instance configure page, click Browse, walk tree, pick a variable, confirm the override saves and round-trips on reload. Repeat with the site stopped to verify the offline error banner + manual-paste fallback.
Not in this slice (deliberate): bUnit / Blazor render tests for the dialog — the repo doesn't use bUnit elsewhere today. Manual smoke + service-layer unit tests cover the behavior.
**6c. Rollout / migration**
One EF migration: add `DataSourceReferenceOverride NVARCHAR(512) NULL` to `InstanceConnectionBindings`. Auto-applies in dev (`docker/deploy.sh`); manual SQL script in prod. No backfill.
Wire compatibility: `BrowseOpcUaNodeCommand` is new — older site builds won't have a handler; central on a new build talking to a site on an older build gets a "command not handled" reply that the dialog treats as transient `ServerError`. The additive `DataSourceReferenceOverride` field on `ConnectionBinding` lets older sites deserialize it as null and behave as today — save-side stays compatible.
Order of merge (one PR each):
1. Commons messages + `IBrowsableDataConnection` interface + `InstanceConnectionBinding.DataSourceReferenceOverride` field + EF migration.
2. `RealOpcUaClient.BrowseChildrenAsync` + site DCL command handler + auth check.
3. Runtime resolution change (`effectiveAddress = override ?? template default`) + regression tests for value subscribes.
4. CentralUI service + `<OpcUaBrowserDialog/>` + page integration.
5. Doc updates (`Component-DataConnectionLayer`, `Component-TemplateEngine`, `Component-CentralUI`).
Each step is independently shippable.
**6d. Explicit non-goals (whole feature)**
- No browse caching at central beyond local component state for the modal's lifetime.
- No address-space search.
- No paging beyond the `Truncated` flag.
- No type-information surfacing (data type, value rank, access level) in `BrowseNode`.
- No bulk override import / CSV / find-and-replace tooling.