fix(dcl+centralui): MxGateway tag browse — lazy attributes, frame-size cap, wider scrollable picker

Expanding a Galaxy object in the tag picker hung on "Loading…": the browse
reply inlined every child's full attribute set (~152 KB), exceeding Akka's
128 KB remote frame, and remoting silently discarded the oversized reply.

Browse path (DataConnectionLayer):
- RealMxGatewayClient: navigation now uses BrowseChildren(include_attributes=
  false) — child objects only — and an object's own attributes load lazily via
  DiscoverHierarchy(root, max_depth=0) when it's expanded. Payload drops from
  ~152 KB/level to a few KB. Seam contract unchanged.
- DataConnectionActor.CapBrowseChildren: protocol-agnostic byte-budget cap
  (~100 KB) on every BrowseNodeResult before it crosses the site→central
  frame, OR-ing the adapter's own Truncated flag. Byte budget, not a count —
  the only bound that holds regardless of NodeId/attribute-name length.
- RealOpcUaClient: requestedMaxReferencesPerNode 1000 → 500 to narrow the
  window before the byte budget applies.
- Unimplemented handling: RealMxGatewayClient turns gRPC Unimplemented (older
  gateway builds lacking BrowseChildren) into NotSupportedException, which
  DataConnectionActor maps to BrowseFailureKind.NotBrowsable with an
  actionable message.

Picker UI (CentralUI):
- NodeBrowserDialog: modal-lg → modal-xl; new scoped .razor.css caps the tree
  at 55vh with its own scrollbar so manual entry + Select/Cancel stay visible.
- Protocol-agnostic failure messages (was hardcoded "OPC UA …"); renamed the
  leftover opcua-browser-tree class to node-browser-tree.

Tests: new frame-budget cap test + NotSupported=>NotBrowsable mapping test;
DCL suite 88/88. Doc: Component-DataConnectionLayer.md records the lazy
attribute-light browse and the frame-size guard.
Joseph Doherty
2026-05-29 09:53:19 -04:00
parent 0434fcee00
commit 4b6ff49822
7 changed files with 236 additions and 25 deletions
@@ -62,7 +62,7 @@ All protocols produce the same value tuple consumed by Instance Actors. Before t
- Connects to the **MxAccess Gateway** (AVEVA/Wonderware MXAccess-backed Galaxy) over gRPC using the `ZB.MOM.WW.MxGateway.Client` NuGet package (from the Gitea feed); `ZB.MOM.WW.MxGateway.Contracts` is pulled in transitively.
- Session-based: `OpenSession` + `Register` on connect; `AddItem` + `Advise` per subscription; value changes arrive on the gateway's server-streaming event feed (`StreamEvents`), resumable via `worker_sequence`.
- Read/Write via `ReadBulk` / `WriteBulk`; writes carry a configurable `WriteUserId`. Quality maps the OPC-style quality byte (≥192 Good, ≥64 Uncertain, else Bad), with a failing MXAccess status proxy treated as Bad.
- Galaxy hierarchy browse via the separate `GalaxyRepositoryClient` (`BrowseChildren`) — objects are navigable nodes (keyed by Galaxy gobject id), attributes are selectable leaves (keyed by full tag reference).
- Galaxy hierarchy browse via the separate `GalaxyRepositoryClient` — objects are navigable nodes (keyed by Galaxy gobject id), attributes are selectable leaves (keyed by full tag reference). Browse is **lazy and attribute-light**: navigation uses `BrowseChildren` with `include_attributes=false` (child objects only), and an object's own attributes are fetched only when it is expanded, via `DiscoverHierarchy(root=<object>, max_depth=0)` scoped to that single object. This keeps each browse level's reply small; inlining every child's full attribute set could exceed the Akka remote frame and silently drop the reply.
- Disconnect detection: a fault on the event stream raises `IDataConnection.Disconnected`, driving the same reconnection state machine as OPC UA.
- Implemented as `MxGatewayDataConnection` over an `IMxGatewayClient` seam; the seam is decoupled from the generated gRPC types (only `RealMxGatewayClient` references them), so the adapter is fully unit-testable with a fake.
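
The quality thresholds listed in the bullets above can be illustrated with a minimal sketch. The type and member names here (`QualitySketch`, `QualityMappingSketch.Map`, the `statusOk` flag) are hypothetical and only model the rule as described; they are not the adapter's actual types.

```csharp
// Hypothetical names throughout; only the thresholds come from the text above.
public enum QualitySketch { Good, Uncertain, Bad }

public static class QualityMappingSketch
{
    // quality:  the OPC-style quality byte reported for the value.
    // statusOk: assumed stand-in for "the MXAccess status proxy did not fail".
    public static QualitySketch Map(byte quality, bool statusOk)
    {
        // A failing status overrides whatever the quality byte says.
        if (!statusOk) return QualitySketch.Bad;

        if (quality >= 192) return QualitySketch.Good;
        if (quality >= 64) return QualitySketch.Uncertain;
        return QualitySketch.Bad;
    }
}
```

Under these assumptions, `Map(192, true)` yields `Good`, `Map(191, true)` yields `Uncertain`, and any value with `statusOk == false` yields `Bad`.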
@@ -171,6 +171,7 @@ DCL is a clean data pipe on the hot path. Browse is an **opt-in capability** for
- `DataConnectionManagerActor` handles `BrowseNodeCommand` (fields: `ConnectionName`, `ParentNodeId`) and replies with `BrowseNodeResult` (children + `Truncated` + structured `BrowseFailure?`). The Central UI facade is `IBrowseService`/`BrowseService`, backing the `NodeBrowserDialog` tag picker.
- Node ids are opaque protocol-specific strings: OPC UA uses NodeIds; MxGateway uses Galaxy gobject ids for navigable objects and full tag references for selectable attribute leaves.
- Browse runs against the live session; no caching at DCL.
- **Frame-size guard**: the reply crosses the site→central Akka frame (default 128 KB) on a temp Ask actor; an oversized reply is silently discarded by remoting, hanging the picker. The child handler caps each `BrowseNodeResult` to a byte budget (~100 KB) before replying, OR-ing the adapter's own truncation signal into `Truncated`. This is protocol-agnostic (every adapter's reply funnels through it). Per-protocol upstream caps narrow the window first: OPC UA requests at most 500 references per node (continuation point → `Truncated`); MxGateway relies on the gateway's `BrowseChildren` page cap. A `Truncated` level prompts manual node-id entry in the picker rather than auto-paging.
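
As a rough worked example of the byte budget, the sketch below computes how many identically shaped nodes fit under ~100 KB. It assumes the per-node estimate used by this change (64 bytes of overhead plus the lengths of the node id and display name). The helper names are hypothetical, and the UTF-8 variant is an illustration of how non-ASCII names would weigh more than one byte per character, not something the actor is known to do.

```csharp
using System.Text;

public static class BrowseBudgetSketch
{
    // Values taken from this change: 100 KB budget, 64 bytes assumed per-node overhead.
    public const int ByteBudget = 100 * 1024;
    public const int PerNodeOverhead = 64;

    // Hypothetical helper: number of identical nodes whose cumulative
    // estimate stays at or below the budget.
    public static int NodesThatFit(string nodeId, string displayName)
    {
        var perNode = PerNodeOverhead + nodeId.Length + displayName.Length;
        return ByteBudget / perNode;
    }

    // Hypothetical variant: counts UTF-8 bytes rather than UTF-16 chars,
    // so names with non-ASCII characters produce a larger estimate.
    public static int Utf8Estimate(string nodeId, string displayName) =>
        PerNodeOverhead
        + Encoding.UTF8.GetByteCount(nodeId)
        + Encoding.UTF8.GetByteCount(displayName);
}
```

For a node shaped like `ns=2;s=Item00000` with display name `Item00000`, the estimate is 64 + 16 + 9 = 89 bytes, so `NodesThatFit` returns 1150 (102400 / 89, rounded down). A level of 3000 such nodes would therefore be clipped to well under half and reported as `Truncated`.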
## Value Update Message Format
@@ -6,7 +6,7 @@
@if (_isVisible)
{
<div class="modal show d-block" tabindex="-1" role="dialog" style="background-color: rgba(0,0,0,0.5);">
<div class="modal-dialog modal-lg" role="document">
<div class="modal-dialog modal-xl" role="document">
<div class="modal-content">
<div class="modal-header">
<h5 class="modal-title">Browse — @ConnectionName</h5>
@@ -21,7 +21,7 @@
</div>
}
<div class="opcua-browser-tree">
<div class="node-browser-tree">
@if (_rootNodes.Count == 0 && _failure is null)
{
<em class="text-muted">Loading…</em>
@@ -167,20 +167,25 @@
_manualNodeId = node.NodeId;
}
// Task 17: map each BrowseFailureKind to a friendly UI message. The raw
// failure.Message is surfaced verbatim only for ServerError (which carries
// the OPC UA SDK's own Bad_* text) and as the default fallback for any
// future failure kind added without a UI mapping.
// Task 17: map each BrowseFailureKind to a friendly UI message. Messages are
// protocol-agnostic (the dialog serves every browsable protocol — OPC UA,
// MxGateway, …). The raw failure.Message is surfaced verbatim for ServerError
// (which carries the underlying protocol SDK's own error text), for
// NotBrowsable when the adapter supplied a reason (e.g. a gateway build that
// lacks the browse RPC), and as the default fallback for any future failure
// kind added without a UI mapping.
private void SetFailure(BrowseFailure failure)
{
_failure = failure;
_failureMessage = failure.Kind switch
{
BrowseFailureKind.ConnectionNotFound => "Connection no longer exists at the site.",
BrowseFailureKind.ConnectionNotConnected => "OPC UA session not connected — retry shortly or use manual entry.",
BrowseFailureKind.NotBrowsable => "This connection does not support browsing.",
BrowseFailureKind.ConnectionNotConnected => "Connection not connected — retry shortly or use manual entry.",
BrowseFailureKind.NotBrowsable => string.IsNullOrWhiteSpace(failure.Message)
? "This connection does not support browsing."
: failure.Message,
BrowseFailureKind.Timeout => "Browse timed out — the server may be slow. Try again or enter the node id manually.",
BrowseFailureKind.ServerError => $"OPC UA server error: {failure.Message}",
BrowseFailureKind.ServerError => $"Server error: {failure.Message}",
_ => failure.Message
};
StateHasChanged();
@@ -0,0 +1,14 @@
/* Scoped styles for the protocol-agnostic tag browse dialog. */
/* Cap the tree's height and let it scroll independently so deep hierarchies
(e.g. a Galaxy with many objects/attributes) don't push the manual-entry
field and Select/Cancel buttons off-screen. Both axes scroll: vertical for
long sibling lists, horizontal for deeply-indented nested nodes. */
.node-browser-tree {
max-height: 55vh;
overflow: auto;
border: 1px solid var(--bs-border-color, #dee2e6);
border-radius: 0.375rem;
padding: 0.5rem 0.75rem;
background-color: var(--bs-body-bg, #fff);
}
@@ -989,7 +989,7 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
///
/// Failure mapping:
/// <list type="bullet">
/// <item><see cref="BrowseFailureKind.NotBrowsable"/> — adapter is not <see cref="IBrowsableDataConnection"/>.</item>
/// <item><see cref="BrowseFailureKind.NotBrowsable"/> — adapter is not <see cref="IBrowsableDataConnection"/>, or it threw <see cref="NotSupportedException"/> (browsable adapter, but the server/protocol cannot browse — e.g. a gateway build predating the browse RPC); message carried verbatim in the latter case.</item>
/// <item><see cref="BrowseFailureKind.ConnectionNotConnected"/> — adapter threw <see cref="ConnectionNotConnectedException"/>.</item>
/// <item><see cref="BrowseFailureKind.Timeout"/> — adapter threw <see cref="OperationCanceledException"/>.</item>
/// <item><see cref="BrowseFailureKind.ServerError"/> — any other exception, message carried verbatim.</item>
@@ -1015,13 +1015,16 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
return;
}
_log.Debug("[{0}] Browsing OPC UA children of {1}", _connectionName, command.ParentNodeId ?? "(root)");
_log.Debug("[{0}] Browsing children of {1}", _connectionName, command.ParentNodeId ?? "(root)");
browsable.BrowseChildrenAsync(command.ParentNodeId).ContinueWith(t =>
{
if (t.IsCompletedSuccessfully)
{
return new BrowseNodeResult(t.Result.Children, t.Result.Truncated, Failure: null);
// Bound the reply to stay under Akka's remote frame size before it
// crosses the site→central boundary (see CapBrowseChildren).
var (children, truncated) = CapBrowseChildren(t.Result.Children, t.Result.Truncated);
return new BrowseNodeResult(children, truncated, Failure: null);
}
var baseEx = t.Exception?.GetBaseException();
@@ -1035,6 +1038,13 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
Array.Empty<BrowseNode>(),
Truncated: false,
new BrowseFailure(BrowseFailureKind.Timeout, "Browse cancelled.")),
// Adapter reachable but the protocol/server cannot browse (e.g. an
// MxGateway build that predates the BrowseChildren RPC). Carry the
// adapter's explanatory message through as NotBrowsable.
NotSupportedException notSupported => new BrowseNodeResult(
Array.Empty<BrowseNode>(),
Truncated: false,
new BrowseFailure(BrowseFailureKind.NotBrowsable, notSupported.Message)),
_ => new BrowseNodeResult(
Array.Empty<BrowseNode>(),
Truncated: false,
@@ -1045,6 +1055,47 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
}).PipeTo(sender);
}
/// <summary>
/// Estimated-byte ceiling for a single <see cref="BrowseNodeResult"/>, kept
/// comfortably below Akka's default 128 KB remote frame size. A browse reply
/// crosses the site→central frame on a temp Ask actor; an oversized reply is
/// silently discarded by remoting (the picker then hangs on "loading…"). The
/// limit is a byte budget rather than a child count because the only thing
/// that actually consumes frame space is serialized size — OPC UA NodeIds and
/// MxGateway tag references vary widely in length, so a fixed count is not a
/// safe proxy.
/// </summary>
private const int BrowseResultByteBudget = 100 * 1024;
/// <summary>
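/// Truncates a browse child list to <see cref="BrowseResultByteBudget"/> using
/// an approximate per-node size estimate: 64 bytes of assumed JSON structural
/// overhead plus the character counts of the two variable-length strings
/// (NodeId and DisplayName, ≈ 1 byte/char when ASCII; non-ASCII names can
/// serialize larger and eat into the headroom below the frame limit). Returns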
/// the kept prefix and a <c>Truncated</c> flag OR-ed with the adapter's own
/// truncation signal, so the picker shows its "use manual entry" hint when the
/// level is clipped. Protocol-agnostic: every adapter's reply funnels through
/// here regardless of how it paginates upstream.
/// </summary>
private static (IReadOnlyList<BrowseNode> Children, bool Truncated) CapBrowseChildren(
IReadOnlyList<BrowseNode> children, bool truncated)
{
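var estimatedBytes = 0; // running estimate of bytes consumed so far
var kept = new List<BrowseNode>(children.Count);
foreach (var node in children)
{
// 64 bytes of assumed per-node structural overhead plus both string lengths.
estimatedBytes += 64 + (node.NodeId?.Length ?? 0) + (node.DisplayName?.Length ?? 0);
if (estimatedBytes > BrowseResultByteBudget)
{
truncated = true;
break;
}
kept.Add(node);
}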
return (kept, truncated);
}
// ── Test Bindings (one-shot live read of bound tags) ──
/// <summary>
@@ -138,39 +138,83 @@ public sealed class RealMxGatewayClient : IMxGatewayClient
/// <inheritdoc />
public async Task<(IReadOnlyList<MxBrowseChild> Children, bool Truncated)> BrowseChildrenAsync(string? parentNodeId, CancellationToken ct = default)
{
var request = new BrowseChildrenRequest { IncludeAttributes = true };
// Navigation browse returns child OBJECTS only (IncludeAttributes = false).
// An object's own attributes are loaded lazily — via DiscoverHierarchy below,
// scoped to that single object — only when the object is expanded. This keeps
// each browse level's payload to a few KB. Inlining every child's full
// attribute set (the previous behaviour) produced replies that exceeded
// Akka's remote frame size (e.g. ~152 KB for one attribute-heavy area), and
// remoting silently discarded the oversized reply, hanging the picker.
var request = new BrowseChildrenRequest { IncludeAttributes = false };
// Object NodeIds are the Galaxy gobject id (encoded as a string); attribute
// NodeIds are FullTagReference leaves and never arrive here as a parent.
if (!string.IsNullOrEmpty(parentNodeId)
&& int.TryParse(parentNodeId, NumberStyles.Integer, CultureInfo.InvariantCulture, out var gobjectId))
var parentGobjectId = 0;
var haveParentObject = !string.IsNullOrEmpty(parentNodeId)
&& int.TryParse(parentNodeId, NumberStyles.Integer, CultureInfo.InvariantCulture, out parentGobjectId);
if (haveParentObject)
{
request.ParentGobjectId = gobjectId;
request.ParentGobjectId = parentGobjectId;
}
BrowseChildrenReply reply;
GalaxyObject? parentObject = null;
try
{
reply = await _galaxy!.BrowseChildrenRawAsync(request, ct).ConfigureAwait(false);
// When expanding a concrete object, fetch just that object (MaxDepth = 0)
// with its attributes so they can be listed as selectable leaves. The root
// level has no parent object, hence no attributes of its own.
if (haveParentObject)
{
var attrRequest = new DiscoverHierarchyRequest
{
RootGobjectId = parentGobjectId,
MaxDepth = 0,
IncludeAttributes = true,
};
var attrReply = await _galaxy!.DiscoverHierarchyRawAsync(attrRequest, ct).ConfigureAwait(false);
parentObject = attrReply.Objects.Count > 0 ? attrReply.Objects[0] : null;
}
}
catch (RpcException ex) when (ex.StatusCode == StatusCode.Unavailable)
{
throw new ConnectionNotConnectedException($"MxGateway repository unavailable: {ex.Status.Detail}");
}
catch (RpcException ex) when (ex.StatusCode == StatusCode.Unimplemented)
{
// The data pipe (read/subscribe/write) works against every gateway
// build, but Galaxy hierarchy browsing (the BrowseChildren RPC) was
// added later. An older gateway answers Unimplemented. Surface a
// clear, actionable reason instead of a raw gRPC fault — the actor
// maps NotSupportedException to BrowseFailureKind.NotBrowsable and
// carries this message through to the picker.
throw new NotSupportedException(
"The connected MxGateway build does not support hierarchy browsing. "
+ "Update the gateway to a build that implements BrowseChildren, "
+ "or enter the tag reference manually.");
}
var children = new List<MxBrowseChild>();
for (var i = 0; i < reply.Children.Count; i++)
// Navigable child objects, keyed by gobject id. Always marked expandable —
// every Galaxy object carries attributes (and may host sub-objects), both
// resolved on demand when the node is expanded.
foreach (var obj in reply.Children)
{
var obj = reply.Children[i];
var hasChildren = i < reply.ChildHasChildren.Count && reply.ChildHasChildren[i];
// Navigable container node, keyed by gobject id.
children.Add(new MxBrowseChild(
obj.GobjectId.ToString(CultureInfo.InvariantCulture),
string.IsNullOrEmpty(obj.TagName) ? obj.ContainedName : obj.TagName,
BrowseNodeClass.Object,
hasChildren || obj.Attributes.Count > 0));
true));
}
// Selectable attribute leaves, keyed by their full tag reference.
foreach (var attr in obj.Attributes)
// The expanded object's own attributes, as selectable leaves keyed by their
// full tag reference.
if (parentObject is not null)
{
foreach (var attr in parentObject.Attributes)
{
children.Add(new MxBrowseChild(
attr.FullTagReference,
@@ -353,11 +353,16 @@ public class RealOpcUaClient : IOpcUaClient
// Variables (selectable), Methods (display-only).
var nodeClassMask = (uint)(NodeClass.Object | NodeClass.Variable | NodeClass.Method);
// requestedMaxReferencesPerNode: cap the server's per-call references so a
// huge flat folder cannot return an unbounded set. 500 leaves headroom for
// the downstream frame-size budget (DataConnectionActor.CapBrowseChildren)
// even with long string NodeIds; a non-empty continuation point surfaces as
// Truncated, prompting manual entry rather than auto-paging.
var (_, continuationPoint, references) = await session.BrowseAsync(
null,
null,
nodeToBrowse,
1000u,
500u,
BrowseDirection.Forward,
ReferenceTypeIds.HierarchicalReferences,
true,
@@ -162,4 +162,95 @@ public class DataConnectionManagerBrowseHandlerTests : TestKit
Assert.Equal(BrowseFailureKind.ConnectionNotConnected, reply.Failure!.Kind);
Assert.Empty(reply.Children);
}
[Fact]
public void NotSupportedException_maps_to_NotBrowsable_carrying_message()
{
// A browsable adapter that is connected but whose server/protocol cannot
// browse (e.g. an MxGateway build predating the BrowseChildren RPC, which
// answers gRPC Unimplemented) throws NotSupportedException. The actor must
// surface this as NotBrowsable with the adapter's actionable message
// carried verbatim — distinct from the capability-check NotBrowsable
// (non-browsable adapter), which has no message.
const string reason =
"The connected MxGateway build does not support hierarchy browsing. "
+ "Update the gateway to a build that implements BrowseChildren, "
+ "or enter the tag reference manually.";
var adapter = Substitute.For<IDataConnection, IBrowsableDataConnection>();
((IDataConnection)adapter).ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
.Returns(Task.CompletedTask);
((IDataConnection)adapter).Status.Returns(ConnectionHealth.Connected);
((IBrowsableDataConnection)adapter)
.BrowseChildrenAsync(Arg.Any<string?>(), Arg.Any<CancellationToken>())
.Returns(Task.FromException<BrowseChildrenResult>(new NotSupportedException(reason)));
_factory.Create("MxGateway", Arg.Any<IDictionary<string, string>>())
.Returns((IDataConnection)adapter);
var manager = Sys.ActorOf(Props.Create(() =>
new DataConnectionManagerActor(_factory, _options, _healthCollector, null)));
manager.Tell(new CreateConnectionCommand(
"conn-old-gw", "MxGateway", new Dictionary<string, string>(), null, 3));
AwaitCondition(
() => _factory.ReceivedCalls().Any(c => c.GetMethodInfo().Name == "Create"),
TimeSpan.FromSeconds(2));
manager.Tell(new BrowseNodeCommand("conn-old-gw", ParentNodeId: null));
var reply = ExpectMsg<BrowseNodeResult>(TimeSpan.FromSeconds(3));
Assert.NotNull(reply.Failure);
Assert.Equal(BrowseFailureKind.NotBrowsable, reply.Failure!.Kind);
Assert.Equal(reason, reply.Failure.Message);
Assert.Empty(reply.Children);
}
[Fact]
public void Oversized_child_list_is_capped_to_frame_budget_and_marked_truncated()
{
// A level large enough to exceed Akka's 128 KB remote frame must be clipped
// by the actor BEFORE it crosses the site→central boundary — otherwise the
// reply is silently discarded and the picker hangs. This guards every
// protocol's reply, regardless of how the adapter paginates upstream. The
// adapter here reports Truncated=false; the actor must still truncate purely
// on serialized size.
const int byteBudget = 100 * 1024; // mirrors DataConnectionActor.BrowseResultByteBudget
var bigList = Enumerable.Range(0, 3000)
.Select(i => new BrowseNode($"ns=2;s=Item{i:D5}", $"Item{i:D5}", BrowseNodeClass.Variable, HasChildren: false))
.ToArray();
var adapter = Substitute.For<IDataConnection, IBrowsableDataConnection>();
((IDataConnection)adapter).ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
.Returns(Task.CompletedTask);
((IDataConnection)adapter).Status.Returns(ConnectionHealth.Connected);
((IBrowsableDataConnection)adapter)
.BrowseChildrenAsync(null, Arg.Any<CancellationToken>())
.Returns(new BrowseChildrenResult(bigList, Truncated: false));
_factory.Create("OpcUa", Arg.Any<IDictionary<string, string>>())
.Returns((IDataConnection)adapter);
var manager = Sys.ActorOf(Props.Create(() =>
new DataConnectionManagerActor(_factory, _options, _healthCollector, null)));
manager.Tell(new CreateConnectionCommand(
"conn-big", "OpcUa", new Dictionary<string, string>(), null, 3));
AwaitCondition(
() => _factory.ReceivedCalls().Any(c => c.GetMethodInfo().Name == "Create"),
TimeSpan.FromSeconds(2));
manager.Tell(new BrowseNodeCommand("conn-big", ParentNodeId: null));
var reply = ExpectMsg<BrowseNodeResult>(TimeSpan.FromSeconds(3));
Assert.Null(reply.Failure);
Assert.True(reply.Truncated, "an oversized level must be reported as truncated");
Assert.True(reply.Children.Count > 0, "the cap must still return a usable prefix");
Assert.True(reply.Children.Count < bigList.Length, "the level must actually be clipped");
// The kept prefix's estimated serialized size must respect the budget.
var keptBytes = reply.Children.Sum(n => 64 + n.NodeId.Length + n.DisplayName.Length);
Assert.True(keptBytes <= byteBudget, $"kept estimate {keptBytes} exceeds budget {byteBudget}");
}
}