Files
ScadaBridge/docs/plans/2026-05-28-opcua-tag-browser-design.md
T
Joseph Doherty 43228185b4 docs: convert standard diagrams from draw.io PNGs to inline Mermaid
Gitea renders mermaid inline, so the flow/state/hierarchy/DAG diagrams
move to text-in-markdown: auto-layout (removes the manual overlap-prone
draw.io step), diffable source, no committed binaries, and a dark-text
theme so labels stay legible. Keep draw.io PNGs only for the two complex
bespoke diagrams (logical architecture, env2 topology) where pixel
control still wins. All 24 mermaid blocks validated by rendering.
2026-06-01 00:23:00 -04:00

22 KiB
Raw Blame History

OPC UA Tag Browser Popup on Instance Configure Page — Design

Date: 2026-05-28 Status: Approved, ready for implementation plan Scope: Add a popup OPC UA address-space browser to the template-instance config page so users can pick the actual physical tag for each attribute binding. Adds a per-instance address override that beats the template's DataSourceReference at runtime, plus a new OPC UA browse capability in the Data Connection Layer surfaced through a new ClusterClient management command.

Decisions

# Decision Selected option
1 Pick location & persistence On the instance config page — per-instance override field on InstanceConnectionBinding. Template still defines the default.
2 Browse source Live, on-demand browse to the site per click. No caching.
3 Selectable node types Variables only, with a manual-paste fallback. Objects/Methods navigable but non-selectable.
4 Offline behavior Popup opens, shows error banner, manual-paste field stays usable.
5 Search Lazy-loaded tree only — no search in this slice.
6 Transport ClusterClient command/control (BrowseOpcUaNodeCommand / BrowseOpcUaNodeResult). Not gRPC.
7 DCL surface New capability interface IBrowsableDataConnection implemented by RealOpcUaClient; IDataConnection unchanged.

Section 1 — Architecture

%%{init: {'theme':'base', 'themeVariables': {'textColor':'#111111','lineColor':'#555555','edgeLabelBackground':'#ffffff','fontSize':'15px'}}}%%
flowchart TD
    N1["Blazor Server browser"]
    N2["CentralUI: InstanceConfigure.razor"]
    N3["CentralUI: OpcUaBrowserDialog component"]
    N4["CentralUI: IOpcUaBrowseService"]
    N5["CommunicationService.SendCommandToSiteAsync of BrowseOpcUaNodeResult (siteId, BrowseOpcUaNodeCommand)"]
    N6["Site: CentralCommunicationActor → DataConnectionManagerActor"]
    N7["OPC UA server"]

    N1 -->|SignalR| N2
    N2 -->|opens| N3
    N3 -->|uses| N4
    N4 -->|implementation calls| N5
    N5 -->|"ClusterClient Ask<br/>ManagementEnvelope { User, Command, CorrelationId }"| N6
    N6 -->|"dispatches to IBrowsableDataConnection (RealOpcUaClient)<br/>OPC Foundation .NET SDK Browse service"| N7

    classDef start fill:#d5e8d4,stroke:#82b366,color:#111111;
    classDef proc fill:#dae8fc,stroke:#6c8ebf,color:#111111;
    classDef warn fill:#ffe6cc,stroke:#d79b00,color:#111111;
    classDef alt fill:#e1d5e7,stroke:#9673a6,color:#111111;
    class N1,N2,N3,N4 proc
    class N5 alt
    class N6 start
    class N7 warn

Three slices, top-to-bottom:

  1. Data model. Add a per-instance OPC UA address override (new column on InstanceConnectionBinding).
  2. DCL browse capability. New IBrowsableDataConnection capability interface; RealOpcUaClient.BrowseChildrenAsync; new site-side message handler.
  3. Central UI popup + plumbing. <OpcUaBrowserDialog/> Bootstrap modal, IOpcUaBrowseService, "Browse…" button on each row of the Connection Bindings table.

Cross-cutting:

  • Browse runs at the site against the currently selected DataConnection for that attribute row. If no connection is selected on that row, "Browse…" is disabled.
  • "Browse…" only appears when the selected connection's Protocol == OpcUa. For other protocols the plain text input stays as-is.
  • Auth: existing ManagementEnvelope.User carries the authenticated user; site-side handler requires the Design role (same role that authorizes editing instance bindings today).
  • Audit: edits to the override field already fall under the existing instance-edit audit flow — no new audit code path.

Section 2 — Data model

Commons entity (src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Instances/InstanceConnectionBinding.cs):

public class InstanceConnectionBinding
{
    public int InstanceId { get; set; }
    public string AttributeName { get; set; } = "";
    public int DataConnectionId { get; set; }
    public string? DataSourceReferenceOverride { get; set; }   // NEW — nullable
}
  • null = use the template's DataSourceReference (today's behavior, unchanged).
  • non-null = override at this instance for this attribute. Full node identifier the runtime hands to the OPC UA client (no partial-path / template semantics).

Management message (InstanceCommands.cs):

public record ConnectionBinding(
    string AttributeName,
    int DataConnectionId,
    string? DataSourceReferenceOverride);   // NEW — additive

Additive — older sites that don't know the field deserialize it as null and behave as today.

EF mapping. New nullable column DataSourceReferenceOverride NVARCHAR(512) NULL on the InstanceConnectionBindings table. EF Core migration auto-applies in dev; manual SQL script in production. No backfill — NULL means "use template default".

Runtime resolution rule (in Site Runtime where bindings are resolved to physical tag paths):

effectiveAddress = binding.DataSourceReferenceOverride
                ?? templateAttribute.DataSourceReference;

One line, one place — the site-side resolver that today reads templateAttribute.DataSourceReference.

Doc impact.

  • docs/requirements/Component-TemplateEngine.md — the "DataSourceReference is fixed at instance level" claim becomes "DataSourceReference defines the default; instances may override per attribute via InstanceConnectionBinding.DataSourceReferenceOverride."
  • docs/requirements/Component-CentralUI.md — Connection Bindings section gains the override field + popup description.

Explicit non-goals. No audit table — instance edits are already audited. No "reset to template default" button (clearing the field is enough). No per-attribute lock on the override.

Section 3 — DCL browse capability

Capability interface (src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Protocol/IBrowsableDataConnection.cs):

public interface IBrowsableDataConnection
{
    Task<IReadOnlyList<BrowseNode>> BrowseChildrenAsync(
        string? parentNodeId,        // null = browse from server's Objects folder
        CancellationToken ct);
}

public record BrowseNode(
    string NodeId,                   // e.g. "ns=2;s=Devices.Pump1.Speed"
    string DisplayName,              // e.g. "Speed"
    BrowseNodeClass NodeClass,       // Object | Variable | Method | Other
    bool HasChildren);

public enum BrowseNodeClass { Object, Variable, Method, Other }
  • null parent → browse from ObjectsFolder (ns=0;i=85).
  • HasChildren populated from the OPC UA Browse response so the tree can render the chevron without a second roundtrip.

RealOpcUaClient implements it. The existing client holds a connected Opc.Ua.Client.Session. New method calls session.Browse(...) with BrowseDirection.Forward, nodeClassMask = Object | Variable, resultMask = BrowseName | DisplayName | NodeClass | TypeDefinition. Each result row maps directly to a BrowseNode. Method nodes returned by the server are mapped (so the UI can grey them as non-selectable) — not hidden, but not promised to bind.

Connection-state gating. Browse only valid when Status == Connected. If not, throw a typed ConnectionNotConnectedException — the site-side command handler catches it and translates to a structured BrowseFailure. No silent-empty.

Custom (non-OPC UA) protocols. Do NOT implement IBrowsableDataConnection. The site command handler checks if (connection is IBrowsableDataConnection b) ... and returns a NotBrowsable failure otherwise. Central UI button is already protocol-gated; this is defense in depth.

Cancellation + timeout. Browse runs under the same CancellationToken as the ClusterClient Ask (default 30s in CommunicationService). OPC UA SDK Browse call wrapped consistent with the rest of RealOpcUaClient.

Doc impact. docs/requirements/Component-DataConnectionLayer.md — add a "Browsing" subsection: clean-data-pipe philosophy still holds, browse is an opt-in capability for protocols that support it, only consumed by management/UI (never by Instance Actors).

Explicit non-goals. No browse caching at site. No BrowseNext paging in this first cut — surface a Truncated flag instead; users hit manual-paste if a parent has more than the server's per-call cap. No type-system mirroring (data type, value rank, access level).

Section 4 — Central↔site command + envelope

New messages (src/ZB.MOM.WW.ScadaBridge.Commons/Messages/Management/BrowseCommands.cs):

public record BrowseOpcUaNodeCommand(
    int DataConnectionId,
    string? ParentNodeId);                  // null = server root (ObjectsFolder)

public record BrowseOpcUaNodeResult(
    IReadOnlyList<BrowseNode> Children,
    bool Truncated,                         // true if server hit the per-call cap
    BrowseFailure? Failure);                // null on success

public record BrowseFailure(
    BrowseFailureKind Kind,
    string Message);

public enum BrowseFailureKind
{
    ConnectionNotFound,
    ConnectionNotConnected,
    NotBrowsable,
    Timeout,
    ServerError
}

Returning failure inside BrowseOpcUaNodeResult (rather than exceptions across ClusterClient) keeps the popup's error rendering clean.

Wire flow.

%%{init: {'theme':'base', 'themeVariables': {'textColor':'#111111','lineColor':'#555555','edgeLabelBackground':'#ffffff','fontSize':'15px'}}}%%
flowchart TD
    S1["CentralUI.OpcUaBrowseService.BrowseChildrenAsync(siteId, connId, parent)"]
    S2["CommunicationService.SendCommandToSiteAsync of BrowseOpcUaNodeResult (siteId, new BrowseOpcUaNodeCommand(connId, parent))"]
    S3["ManagementEnvelope { User, Command, CorrelationId }<br/>over ClusterClient"]
    S4["Site: CentralCommunicationActor unwraps envelope"]
    S5["Site: DataConnectionManagerActor receives BrowseOpcUaNodeCommand<br/>(DCL coordinator actor — owns the per-connection IDataConnection instances)"]

    S1 --> S2 --> S3 --> S4 --> S5

    subgraph HANDLER["Handler logic"]
        direction TB
        HL["Look up IDataConnection by Id"]
        HNF["if not found → ConnectionNotFound"]
        HNB["if not (conn is IBrowsableDataConnection) → NotBrowsable"]
        HAW["else await conn.BrowseChildrenAsync(ParentNodeId, ct)"]
        HNC["Catch ConnectionNotConnectedException → ConnectionNotConnected"]
        HCN["Catch OperationCanceledException → Timeout"]
        HSVC["Catch ServiceResultException → ServerError + verbatim msg"]
        HSUC["Else success: BrowseOpcUaNodeResult(children, truncated, null)"]
        HL --- HNF --- HNB --- HAW --- HNC --- HCN --- HSVC --- HSUC
    end

    S5 -->|processes| HANDLER

    R1["Reply travels back via<br/>CentralCommunicationActor → CommunicationService"]
    R2["returned to CentralUI page"]
    HANDLER -->|result / failure| R1
    R1 --> R2

    classDef start fill:#d5e8d4,stroke:#82b366,color:#111111;
    classDef proc fill:#dae8fc,stroke:#6c8ebf,color:#111111;
    classDef dec fill:#fff2cc,stroke:#d6b656,color:#111111;
    classDef bad fill:#f8cecc,stroke:#b85450,color:#111111;
    classDef alt fill:#e1d5e7,stroke:#9673a6,color:#111111;
    class S1,R1,R2 proc
    class S2,S3 alt
    class S4,S5,HSUC start
    class HANDLER,HL,HAW dec
    class HNF,HNB,HNC,HCN,HSVC bad

Handler lives in the DCL coordinator actor (the same actor that owns the per-connection IDataConnection instances) — keeps lifecycle and browse co-located so we don't race against reconnect.

Auth check. Site-side handler validates ManagementEnvelope.User has the Design role. Otherwise reply with ServerError ("Not authorized to browse data connections"). Reuses existing site-side role helper.

Registry. ManagementCommandRegistry auto-discovers *Command records — BrowseOpcUaNodeCommand shows up automatically. BrowseOpcUaNodeResult / BrowseNode ride the existing management-namespace serialization registration.

Doc impact. docs/requirements/Component-DataConnectionLayer.md — list the new command in the actor surface section.

Explicit non-goals. No streaming response. No batch-browse-many-parents. No central-side caching.

Section 5 — Central UI popup + page integration

5a. IOpcUaBrowseService (CentralUI service)

public interface IOpcUaBrowseService
{
    Task<BrowseOpcUaNodeResult> BrowseChildrenAsync(
        string siteId,
        int dataConnectionId,
        string? parentNodeId,
        CancellationToken ct);
}

Registered scoped in Program.cs next to the other CentralUI services.

5b. <OpcUaBrowserDialog/> component

Components/Dialogs/OpcUaBrowserDialog.razor — Bootstrap 5 modal (no third-party UI framework). Layout, top to bottom:

┌─ Modal header ──────────────────────────────────────────┐
│  Browse OPC UA — <Connection Name>            [×]       │
├─ Modal body ────────────────────────────────────────────┤
│  [Error banner — shown only on failure]                 │
│                                                         │
│  ▶ Objects                                              │
│    ▶ Devices                                            │
│      ▼ Pump1                                            │
│        • Speed         (selectable — variable)          │
│        • Status        (selectable — variable)          │
│        ▶ Diagnostics                                    │
│      ▶ Pump2                                            │
│                                                         │
│  ─────────────────────────────────────────────────────  │
│  Manual node id:  [ns=2;s=Pump1.Speed         ] [Use]   │
├─ Modal footer ──────────────────────────────────────────┤
│  Selected: ns=2;s=Pump1.Speed                           │
│                              [Cancel]   [Select]        │
└─────────────────────────────────────────────────────────┘

Behavior:

  • Lazy load. Open → BrowseChildrenAsync(null) for the server root. Each chevron click → BrowseChildrenAsync(thatNode.NodeId) once and caches the children in component-local state for the modal lifetime (closing drops the cache — live next time).
  • Selectable rows. Variables render selectable (radio-style highlight); Objects/Methods navigable but non-selectable. Double-click a Variable = select + confirm.
  • Manual paste. Text input below the tree with a "Use" button. Non-empty validation only — bad node ids surface at subscribe time (matches existing template DataSourceReference behavior).
  • Error banner. Renders on BrowseFailure. Text varies by BrowseFailureKind; mapping lives in the component. Tree empties; manual-paste stays usable. Retry button issues the last Browse again.
  • Truncated. When Truncated == true, render a "Results truncated — use manual entry if your tag isn't listed" hint under that subtree.
  • Cancel discards pending selection; Select invokes EventCallback<string> OnSelected with the NodeId and closes.

Parameters:

[Parameter] public string SiteId { get; set; } = "";
[Parameter] public int DataConnectionId { get; set; }
[Parameter] public string ConnectionName { get; set; } = "";
[Parameter] public string? InitialNodeId { get; set; }
[Parameter] public EventCallback<string> OnSelected { get; set; }
[Parameter] public EventCallback OnCancelled { get; set; }

5c. Page integration — InstanceConfigure.razor

The existing Connection Bindings table gets two new cells per attribute row:

Attribute Tag Path (template default) Override Connection
Speed ns=2;s=Pump.Speed ns=2;s=Pump1.Speed [Browse…] [ Site-A OPC UA ▾ ]
Status ns=2;s=Pump.Status (uses default) [Browse…] [ Site-A OPC UA ▾ ]
  • Override cell — plain text input bound to binding.DataSourceReferenceOverride. Empty = use template default (greyed placeholder shows what that default is).
  • [Browse…] button:
    • Hidden when row's DataConnection.Protocol is not OPC UA.
    • Disabled when no connection selected on that row (tooltip: "Pick a connection first").
    • Otherwise opens <OpcUaBrowserDialog/> with the row's SiteId, DataConnectionId, and InitialNodeId = override ?? templateDefault.
  • On OnSelected, the value writes into the row's DataSourceReferenceOverride. If the user picks back exactly the template default we don't auto-clear — the user can blank the field to revert explicitly.
  • Existing Save / bulk-assign flows untouched. Override travels in the same ConnectionBinding record the page already submits.

Styling. Existing CentralUI Bootstrap conventions; same modal/table classes as elsewhere on the page. Use the frontend-design skill at implementation time per project convention.

Doc impact. docs/requirements/Component-CentralUI.md — extend Connection Bindings section with override column + popup description.

Explicit non-goals. No "test this binding" button (read once to verify). No multi-select in the dialog. No per-row "overridden" badge outside the table.

Section 6 — Error handling, testing, rollout

6a. Error handling

Failure Where caught User-visible result
Site offline / ClusterClient timeout CommunicationService.SendCommandToSiteAsync throws Dialog renders red banner "Site unreachable — try again or enter the node id manually"; manual-paste stays active
ConnectionNotFound Site DCL handler Banner "Connection no longer exists at site"
ConnectionNotConnected RealOpcUaClient.BrowseChildrenAsync throws → wrapped Banner "OPC UA session not connected — retry shortly or use manual entry". Retry re-issues the Browse
NotBrowsable Site DCL handler Should be impossible (button is protocol-gated). Defense-in-depth banner
Timeout Site DCL handler / Ask timeout Banner "Browse timed out — server may be slow"
ServerError (OPC UA Bad_*) Site DCL handler Banner with verbatim message
Bad node id from manual-paste Not validated client-side Surfaces at deploy/run time (matches today's behavior for template DataSourceReference)
Save fails Existing instance-save handler Unchanged — override rides along in the existing round-trip

Two invariants:

  • Browse never blocks the page. All calls cancellable; closing the modal cancels in-flight Asks.
  • Override edits never require a successful browse. Manual-paste is always available — site outages can't block instance configuration.

6b. Testing

Unit, by layer:

  • Commons — round-trip serialization of BrowseOpcUaNodeCommand / BrowseOpcUaNodeResult / BrowseNode.
  • DCLRealOpcUaClient.BrowseChildrenAsync against a fake Opc.Ua.Client.Session. Covers: root browse, child browse, Truncated, ConnectionNotConnectedException, ServiceResultException propagation.
  • Site command handler — actor-spec test: unknown connection id → ConnectionNotFound; non-browsable → NotBrowsable; success path mapped correctly; canceled Ask → Timeout; OPC UA exception → ServerError.
  • Auth — site handler rejects a ManagementEnvelope whose user lacks Design role.
  • CentralUI serviceOpcUaBrowseService unit test with a fake CommunicationService.

Integration:

  • Live OPC UA against infra/'s OPC UA test server. A browse-roundtrip integration test deploys, opens a connection, issues BrowseOpcUaNodeCommand, asserts non-empty children at root, expands one level, asserts a known Variable is present.

UI:

  • Manual smoke: bash docker/deploy.sh, sign in as multi-role, open an instance configure page, click Browse, walk tree, pick a variable, confirm the override saves and round-trips on reload. Repeat with the site stopped to verify the offline error banner + manual-paste fallback.

Not in this slice (deliberate): bUnit / Blazor render tests for the dialog — the repo doesn't use bUnit elsewhere today. Manual smoke + service-layer unit tests cover the behavior.

6c. Rollout / migration

One EF migration: add DataSourceReferenceOverride NVARCHAR(512) NULL to InstanceConnectionBindings. Auto-applies in dev (docker/deploy.sh); manual SQL script in prod. No backfill.

Wire compatibility: BrowseOpcUaNodeCommand is new — older site builds won't have a handler; central on a new build talking to a site on an older build gets a "command not handled" reply that the dialog treats as transient ServerError. The additive DataSourceReferenceOverride field on ConnectionBinding lets older sites deserialize it as null and behave as today — save-side stays compatible.

Order of merge (one PR each):

  1. Commons messages + IBrowsableDataConnection interface + InstanceConnectionBinding.DataSourceReferenceOverride field + EF migration.
  2. RealOpcUaClient.BrowseChildrenAsync + site DCL command handler + auth check.
  3. Runtime resolution change (effectiveAddress = override ?? template default) + regression tests for value subscribes.
  4. CentralUI service + <OpcUaBrowserDialog/> + page integration.
  5. Doc updates (Component-DataConnectionLayer, Component-TemplateEngine, Component-CentralUI).

Each step is independently shippable.

6d. Explicit non-goals (whole feature)

  • No browse caching at central beyond local component state for the modal's lifetime.
  • No address-space search.
  • No paging beyond the Truncated flag.
  • No type-information surfacing (data type, value rank, access level) in BrowseNode.
  • No bulk override import / CSV / find-and-replace tooling.