From edb812d8597f6098e682468b1487e66ff5c62eda Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Thu, 28 May 2026 12:34:37 -0400
Subject: [PATCH] docs: design for lazy-browse BrowseChildren RPC

OPC UA-style level-at-a-time browse across gRPC, dashboard, and the shared
cache projector. Server still loads the full Galaxy hierarchy; laziness is
wire-side and UI-side only.
---
 docs/plans/2026-05-28-lazy-browse-design.md | 269 ++++++++++++++++++++
 1 file changed, 269 insertions(+)
 create mode 100644 docs/plans/2026-05-28-lazy-browse-design.md

diff --git a/docs/plans/2026-05-28-lazy-browse-design.md b/docs/plans/2026-05-28-lazy-browse-design.md
new file mode 100644
index 0000000..3582b9f
--- /dev/null
+++ b/docs/plans/2026-05-28-lazy-browse-design.md
@@ -0,0 +1,269 @@
+# Lazy Browse for Galaxy Repository
+
+Date: 2026-05-28
+Status: approved, ready for implementation plan
+
+## Problem
+
+`GalaxyRepository.DiscoverHierarchy` returns the whole hierarchy (paged, but
+clients still walk every object). The dashboard `BrowsePage` then materializes
+the full tree into a Blazor component graph on every render. For large
+deployments this means oversized gRPC replies for external clients and a
+heavy DOM for the dashboard, even when an operator only wants to expand a
+single Area.
+
+External integrations (OPC UA bridges in particular) need OPC UA-style "browse
+one level on demand" semantics instead of bulk download.
+
+## Scope
+
+Lazy browse lands at three layers:
+
+1. **gRPC**: a new `BrowseChildren` RPC that returns the direct children of a
+   parent.
+2. **Dashboard**: `BrowsePage` loads roots only and fetches each level on
+   expand, in-process (no gRPC self-hop).
+3. **Server projection**: a parent→children index added to the existing
+   `IGalaxyHierarchyCache` entry; reused by both the RPC and the dashboard
+   service.
+
+**Out of scope:** changing how the gateway loads Galaxy SQL.
+The hierarchy cache still pulls the full deployment on each deploy change;
+persistence and the dashboard summary are unchanged. The lazy story is
+wire-side and UI-side only.
+
+## Architecture
+
+```text
+gRPC client                       Dashboard (Blazor circuit)
+     |                                    |
+     v                                    v
+GalaxyRepositoryGrpcService       IDashboardBrowseService (in-process)
+  .BrowseChildren                   .GetRoots / .GetChildren
+           \                          /
+            \                        /
+             v                      v
+          GalaxyBrowseProjector.ProjectChildren
+                       |
+                       v
+   IGalaxyHierarchyCache.Current (GalaxyHierarchyCacheEntry)
+                       |
+                       +-- ObjectViews       (existing)
+                       +-- ChildrenByParent  (NEW, built once per refresh)
+```
+
+`GalaxyBrowseProjector` is the single source of truth for "direct children of
+a parent, filtered, ordered, paged." Both the gRPC handler and the dashboard
+service call it.
+
+## Proto contract
+
+Added to `src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto`,
+additive only per the existing wire-compatibility policy.
+
+```proto
+service GalaxyRepository {
+  // ... existing RPCs ...
+  rpc BrowseChildren(BrowseChildrenRequest) returns (BrowseChildrenReply);
+}
+
+message BrowseChildrenRequest {
+  oneof parent {
+    int32 parent_gobject_id = 1;
+    string parent_tag_name = 2;
+    string parent_contained_path = 3;
+  }
+  int32 page_size = 4;    // default 500, max 5000
+  string page_token = 5;  // opaque
+
+  // Filter parity with DiscoverHierarchy. AND-combined.
+  repeated int32 category_ids = 6;
+  repeated string template_chain_contains = 7;
+  string tag_name_glob = 8;
+  optional bool include_attributes = 9;  // default true
+  bool alarm_bearing_only = 10;
+  bool historized_only = 11;
+}
+
+message BrowseChildrenReply {
+  repeated GalaxyObject children = 1;
+  string next_page_token = 2;
+  int32 total_child_count = 3;           // matching direct children (post-filter)
+  repeated bool child_has_children = 4;  // parallel-indexed with `children`
+  uint64 cache_sequence = 5;
+}
+```
+
+`GalaxyObject` and `GalaxyAttribute` are unchanged. Root browse = empty
+`parent` oneof.
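
As an illustration of the contract above, here is a minimal Python sketch of how a client might page through one level and pair each child with its `child_has_children` flag. Everything in it is hypothetical: `FakeBrowseStub`, the dataclass-shaped reply, and the integer-offset page token are in-memory stand-ins for the generated gRPC stub and the opaque token, which this design does not spell out at that level.

```python
from dataclasses import dataclass, field


@dataclass
class BrowseChildrenReply:
    """Stand-in for the BrowseChildrenReply proto message (hypothetical shape)."""
    children: list = field(default_factory=list)
    next_page_token: str = ""
    total_child_count: int = 0
    child_has_children: list = field(default_factory=list)
    cache_sequence: int = 0


class FakeBrowseStub:
    """In-memory stand-in for a generated gRPC stub; not the real client API."""

    def __init__(self, children_by_parent, cache_sequence=1):
        self._children_by_parent = children_by_parent
        self._cache_sequence = cache_sequence

    def browse_children(self, parent_tag_name="", page_size=0, page_token=""):
        # Mirrors the proto comments: default 500, max 5000.
        page_size = min(page_size or 500, 5000)
        # An empty parent selector means "browse roots" (key "" in this fake).
        siblings = self._children_by_parent.get(parent_tag_name, [])
        # Assumption: the token is a plain offset here; the real one is opaque.
        offset = int(page_token) if page_token else 0
        page = siblings[offset:offset + page_size]
        end = offset + len(page)
        return BrowseChildrenReply(
            children=page,
            next_page_token=str(end) if end < len(siblings) else "",
            total_child_count=len(siblings),
            child_has_children=[bool(self._children_by_parent.get(c)) for c in page],
            cache_sequence=self._cache_sequence,
        )


def browse_level(stub, parent_tag_name="", page_size=500):
    """Collect every page of one level as (child, has_children) pairs."""
    nodes, token = [], ""
    while True:
        reply = stub.browse_children(parent_tag_name, page_size, token)
        # child_has_children is parallel-indexed with children.
        nodes.extend(zip(reply.children, reply.child_has_children))
        token = reply.next_page_token
        if not token:
            return nodes, reply.total_child_count


stub = FakeBrowseStub({"": ["Area1", "Area2"], "Area1": ["Pump1", "Pump2", "Valve1"]})
roots, root_total = browse_level(stub)
area1, area1_total = browse_level(stub, "Area1", page_size=2)
```

An expand-triangle UI would draw the triangle from the second element of each pair and call `browse_level` again only when the operator expands that node.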
+
+### Error mapping
+
+| Condition | Status |
+|---|---|
+| Unknown `parent_gobject_id` / `parent_tag_name` / `parent_contained_path` | `NotFound` |
+| Stale `page_token` (cache deployed forward) | `FailedPrecondition`; current `cache_sequence` in trailers |
+| Filter set differs between pages of the same token | `InvalidArgument` |
+| First load not complete within 5s | `Unavailable` |
+| API key missing `metadata:read` scope | `PermissionDenied` |
+| No API key | `Unauthenticated` |
+
+`browse_subtrees` API-key constraints intersect with the request as today.
+
+## Server projection
+
+### New index on `GalaxyHierarchyCacheEntry`
+
+Built once per refresh, in `GalaxyHierarchyIndex`:
+
+```csharp
+// gobject_id -> direct children, sorted areas-first then by display name.
+// Roots (parent_gobject_id == 0 or absent) live under sentinel key 0.
+IReadOnlyDictionary<int, IReadOnlyList<int>> ChildrenByParent;
+```
+
+Cost: one O(n) pass over `ObjectViews` during refresh. Memory: O(n), i.e. one int
+plus a list reference per object, negligible next to the existing object list.
+
+### `GalaxyBrowseProjector.ProjectChildren`
+
+1. Resolve `parent` selector to `parent_gobject_id` (root sentinel for empty).
+   Unknown parent → `NotFound`.
+2. Look up direct children from `ChildrenByParent`.
+3. Apply the same filter pipeline as `DiscoverHierarchy`. Extract the filter
+   logic from `GalaxyHierarchyProjector.ApplyFilters` into a shared helper
+   used by both projectors so filter semantics cannot drift.
+4. Intersect with API-key `browse_subtrees` globs.
+5. Memoize the filtered, ordered list keyed by
+   `(parent_id, filter_signature)` in a
+   `ConditionalWeakTable`, mirroring the
+   existing `GalaxyHierarchyProjector.FilteredViewCache` pattern; GC'd when
+   the entry is.
+6. Compute `child_has_children` for each returned child against the filtered
+   descendant set (recurse down `ChildrenByParent`, short-circuit on first
+   match; memoize the per-child boolean alongside the filtered list).
+7. Slice page; encode `next_page_token` =
+   `{cache_sequence, parent_id, filter_signature, offset}` base64 protobuf.
+
+Sort order: areas first, then `OrdinalIgnoreCase` by display name,
+identical to `DashboardBrowseTreeBuilder.CompareNodes` so the dashboard and
+external clients see the same ordering.
+
+## gRPC handler
+
+`GalaxyRepositoryGrpcService.BrowseChildren`:
+
+1. Wait up to 5s for first cache load (same `WaitForFirstLoadAsync` pattern
+   as `DiscoverHierarchy`).
+2. Read `Current` entry.
+3. Call `GalaxyBrowseProjector.ProjectChildren`.
+4. Map `GalaxyObjectView` → `GalaxyObject` proto via existing
+   `GalaxyProtoMapper.MapObject`. Skeleton branch (`include_attributes=false`)
+   omits the attribute list, which the mapper already supports for
+   `DiscoverHierarchy`.
+
+Authorization: `GatewayGrpcScopeResolver.BrowseChildren` requires
+`metadata:read`, the same scope as `DiscoverHierarchy`.
+
+## Dashboard
+
+New in-process service:
+
+```csharp
+public interface IDashboardBrowseService
+{
+    BrowseLevelResult GetRoots(BrowseFilterArgs filter);
+    BrowseLevelResult GetChildren(int parentGobjectId, BrowseFilterArgs filter);
+    GalaxyObject? FindByContainedPath(string path); // for deep-link expansion
+}
+
+public sealed record BrowseLevelResult(
+    IReadOnlyList<DashboardBrowseNode> Nodes,
+    int TotalCount,
+    ulong CacheSequence);
+```
+
+Backed by the same `GalaxyBrowseProjector` the gRPC service uses, so dashboard
+and external clients render identical results.
+
+`BrowsePage.razor` changes:
+
+- Initial render fetches roots only.
+- Each `DashboardBrowseNode` tracks `LoadState`
+  (`NotLoaded` / `Loading` / `Loaded` / `Error`).
+- `BrowseTreeNodeView.razor` shows the expand triangle from
+  `HasChildrenHint` (the projector's `child_has_children`) regardless of
+  whether children have been loaded yet. First expand triggers an async load
+  and re-render.
+- Loaded children are kept for the lifetime of the Blazor circuit OR until
+  the cache sequence advances (subscribed via the existing
+  `IGalaxyDeployNotifier`). On invalidation: collapse all expansions, show a
+  one-line "Galaxy redeployed, click to re-expand" hint.
+- Attributes render inline under their object (unchanged) since
+  `include_attributes=true` is the default.
+- Search box: when non-empty, fall back to the existing bulk
+  `DiscoverHierarchy` path + `DashboardBrowseTreeBuilder`. Lazy browse is for
+  unfiltered exploration; building a streaming search is out of scope.
+
+`DashboardBrowseTreeBuilder.Build` stays, used by the search fallback and
+the existing tests.
+
+## Testing
+
+Unit tests (no live MXAccess / Galaxy required):
+
+- `GalaxyHierarchyIndexTests`: `ChildrenByParent` correctness (roots under
+  sentinel, deep nesting, self-parented objects, duplicate ids).
+- `GalaxyBrowseProjectorTests`:
+  - Direct-children selection by `gobject_id`, `tag_name`,
+    `contained_path`.
+  - Filter parity with `DiscoverHierarchy`: same fixtures, same filter
+    permutations, identical filtered-set output.
+  - `child_has_children` correctness under filters that eliminate
+    descendants.
+  - Ordering matches `DashboardBrowseTreeBuilder` byte-for-byte.
+  - Sibling pagination across multiple pages.
+  - Page-token round trip (serialize → deserialize → same offset).
+  - Stale `page_token` → `FailedPrecondition`.
+  - Unknown parent → `NotFound`.
+  - Filter change between pages of the same token → `InvalidArgument`.
+- `GalaxyRepositoryGrpcServiceTests`: new `BrowseChildren` happy path,
+  first-load wait, `browse_subtrees` intersection, `metadata:read` scope
+  enforcement.
+- `DashboardBrowseServiceTests`: in-process service matches projector
+  output, deploy-sequence invalidation, error states.
+- `ProtobufContractRoundTripTests`: extended for
+  `BrowseChildrenRequest` / `BrowseChildrenReply`.
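
The index and `child_has_children` behaviour these tests target can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the C# implementation: objects are hypothetical `(gobject_id, parent_gobject_id, display_name, is_area)` tuples, the filter is a plain predicate rather than the real filter pipeline, the hierarchy is assumed acyclic (self-parented objects are left out), and a child counts as having children whenever any descendant passes the filter.

```python
from collections import defaultdict

ROOT = 0  # sentinel key for roots, as in the design

# (gobject_id, parent_gobject_id, display_name, is_area): hypothetical fixture.
OBJECTS = [
    (1, 0, "PlantB", True),
    (2, 0, "plantA", True),
    (3, 1, "Pump1", False),
    (4, 1, "LineA", True),
    (5, 4, "Valve1", False),
    (6, 2, "Tank1", False),
]


def build_children_by_parent(objects):
    """One O(n) grouping pass, then sort siblings: areas first, then name."""
    by_id = {obj[0]: obj for obj in objects}
    index = defaultdict(list)
    for gobject_id, parent_id, _name, _is_area in objects:
        index[parent_id].append(gobject_id)
    for siblings in index.values():
        # casefold() approximates OrdinalIgnoreCase for this ASCII fixture.
        siblings.sort(key=lambda i: (not by_id[i][3], by_id[i][2].casefold()))
    return dict(index), by_id


def has_matching_descendant(index, by_id, gobject_id, matches):
    """True if any descendant passes the filter; any() short-circuits."""
    return any(
        matches(by_id[child]) or has_matching_descendant(index, by_id, child, matches)
        for child in index.get(gobject_id, [])
    )


def project_children(index, by_id, parent_id, matches):
    """Filtered direct children plus the parallel has-children flags."""
    children = [c for c in index.get(parent_id, []) if matches(by_id[c])]
    flags = [has_matching_descendant(index, by_id, c, matches) for c in children]
    return children, flags


def everything(obj):
    return True


def areas_only(obj):
    return obj[3]


index, by_id = build_children_by_parent(OBJECTS)
roots_all = project_children(index, by_id, ROOT, everything)
roots_areas = project_children(index, by_id, ROOT, areas_only)
```

With the areas-only predicate, `plantA` (id 2) loses its expand flag because its only descendant is filtered out, while `PlantB` (id 1) keeps it through `LineA`.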
+
+Manual verification on a live MXAccess host: expand a deep Area in the
+dashboard, force a redeploy mid-expand, observe invalidation banner.
+
+No new opt-in live-only tests: the projector is pure over the cache, and
+the cache itself is already exercised by existing live tests.
+
+## Documentation updates (same change)
+
+Per the repo rule that doc updates ride with the code change:
+
+- `docs/GalaxyRepository.md`: add `BrowseChildren` to the RPC Surface
+  table; new subsection covering semantics, filter parity, page-token
+  contract, and error mapping. Update the architecture diagram to show the
+  shared projector.
+- `gateway.md`: one-line entry for `BrowseChildren` in the Galaxy RPC
+  table.
+- `clients/{dotnet,python,rust,go,java}/README.md`: short "Browsing
+  lazily" snippet showing a `BrowseChildren` walk with `child_has_children`
+  driving expand-triangle UI.
+- `docs/DesignDecisions.md`: entry noting the explicit choice to keep the
+  full hierarchy in the server cache; lazy is wire-only.
+
+## Non-goals
+
+- SQL-side incremental loading. The gateway still pulls the full Galaxy
+  hierarchy on each deploy change.
+- Multi-snapshot retention. A deploy invalidates outstanding page tokens;
+  clients re-walk.
+- Streaming search. Search keeps the bulk `DiscoverHierarchy` + tree-build
+  path.
+- Reconciling a partially-expanded dashboard tree across a deploy: the
+  invalidation banner forces a re-expand.