docs: design for lazy-browse BrowseChildren RPC

OPC UA-style level-at-a-time browse across gRPC, dashboard, and the
shared cache projector. Server still loads the full Galaxy hierarchy;
laziness is wire-side and UI-side only.
Joseph Doherty
2026-05-28 12:34:37 -04:00
parent 795eee72e3
commit edb812d859
@@ -0,0 +1,269 @@
# Lazy Browse for Galaxy Repository
Date: 2026-05-28
Status: approved, ready for implementation plan
## Problem
`GalaxyRepository.DiscoverHierarchy` returns the whole hierarchy (paged, but
clients still walk every object). The dashboard `BrowsePage` then materializes
the full tree into a Blazor component graph on every render. For large
deployments this means oversized gRPC replies for external clients and a
heavy DOM for the dashboard, even when an operator only wants to expand a
single Area.
External integrations (OPC UA bridges in particular) need OPC UA-style "browse
one level on demand" semantics instead of bulk download.
## Scope
Lazy browse lands at three layers:
1. **gRPC** — a new `BrowseChildren` RPC that returns the direct children of a
parent.
2. **Dashboard** — `BrowsePage` loads roots only and fetches each level on
expand, in-process (no gRPC self-hop).
3. **Server projection** — a parent→children index added to the existing
`IGalaxyHierarchyCache` entry; reused by both the RPC and the dashboard
service.
**Out of scope:** changing how the gateway loads Galaxy SQL. The hierarchy
cache still pulls the full deployment on each deploy change; persistence and
the dashboard summary are unchanged. The lazy story is wire-side and
UI-side only.
## Architecture
```text
gRPC client                      Dashboard (Blazor circuit)
     |                                    |
     v                                    v
GalaxyRepositoryGrpcService      IDashboardBrowseService (in-process)
  .BrowseChildren                  .GetRoots / .GetChildren
          \                              /
           \                            /
            v                          v
          GalaxyBrowseProjector.ProjectChildren
                        |
                        v
      IGalaxyHierarchyCache.Current (GalaxyHierarchyCacheEntry)
                        |
                        +-- ObjectViews       (existing)
                        +-- ChildrenByParent  (NEW, built once per refresh)
```
`GalaxyBrowseProjector` is the single source of truth for "direct children of
a parent, filtered, ordered, paged." Both the gRPC handler and the dashboard
service call it.
## Proto contract
Added to `src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto`,
additive only per the existing wire-compatibility policy.
```proto
service GalaxyRepository {
  // ... existing RPCs ...
  rpc BrowseChildren(BrowseChildrenRequest) returns (BrowseChildrenReply);
}

message BrowseChildrenRequest {
  oneof parent {
    int32 parent_gobject_id = 1;
    string parent_tag_name = 2;
    string parent_contained_path = 3;
  }
  int32 page_size = 4;    // default 500, max 5000
  string page_token = 5;  // opaque

  // Filter parity with DiscoverHierarchy. AND-combined.
  repeated int32 category_ids = 6;
  repeated string template_chain_contains = 7;
  string tag_name_glob = 8;
  optional bool include_attributes = 9;  // default true
  bool alarm_bearing_only = 10;
  bool historized_only = 11;
}

message BrowseChildrenReply {
  repeated GalaxyObject children = 1;
  string next_page_token = 2;
  int32 total_child_count = 3;           // matching direct children (post-filter)
  repeated bool child_has_children = 4;  // parallel-indexed with `children`
  uint64 cache_sequence = 5;
}
```
`GalaxyObject` and `GalaxyAttribute` are unchanged. Root browse = empty
`parent` oneof.
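
As a rough illustration of how a caller might consume this contract, the sketch below drains one level by following `next_page_token`. `ChildPage` and the `fetchPage` callback are hypothetical stand-ins for the generated reply type and the real `BrowseChildren` call. The sketch also assumes that an empty `next_page_token` marks the last page, which is a common convention and not something this design states.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for BrowseChildrenReply; only the fields used here.
public sealed record ChildPage(
    IReadOnlyList<string> ChildTagNames,
    IReadOnlyList<bool> ChildHasChildren,
    string NextPageToken);

public static class LevelWalker
{
    // Reads every page of one level. `fetchPage` is an assumed callback that
    // wraps the real RPC: it takes a page token ("" for the first page) and
    // returns one page of direct children.
    public static List<(string TagName, bool HasChildren)> ReadLevel(
        Func<string, ChildPage> fetchPage)
    {
        var level = new List<(string TagName, bool HasChildren)>();
        var token = "";
        do
        {
            var page = fetchPage(token);
            for (var i = 0; i < page.ChildTagNames.Count; i++)
            {
                // child_has_children is parallel-indexed with children.
                level.Add((page.ChildTagNames[i], page.ChildHasChildren[i]));
            }
            token = page.NextPageToken;
        } while (!string.IsNullOrEmpty(token)); // assumption: empty means done
        return level;
    }
}
```

A UI would typically use the `HasChildren` flag to decide whether to draw an expand control, and only call `ReadLevel` for a child when the operator expands it.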
### Error mapping
| Condition | Status |
|---|---|
| Unknown `parent_gobject_id` / `parent_tag_name` / `parent_contained_path` | `NotFound` |
| Stale `page_token` (cache sequence advanced by a redeploy) | `FailedPrecondition`; current `cache_sequence` returned in trailers |
| Filter set differs from the one the `page_token` was issued for | `InvalidArgument` |
| First load not complete within 5s | `Unavailable` |
| API key missing `metadata:read` scope | `PermissionDenied` |
| No API key | `Unauthenticated` |
`browse_subtrees` API-key constraints intersect with the request as today.
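
The two page-token rows in the table can be sketched as a small validation step. This is a minimal sketch under stated assumptions: `TokenCheck` is a local stand-in for the gRPC status codes, `BinaryWriter` replaces the protobuf payload the design calls for (only to keep the example dependency-free), and checking staleness before the filter signature is an ordering chosen here, not one the design prescribes.

```csharp
using System;
using System.IO;

// Local stand-in for the relevant gRPC status codes.
public enum TokenCheck { Ok, FailedPrecondition, InvalidArgument }

public sealed record PageToken(
    ulong CacheSequence, int ParentId, string FilterSignature, int Offset)
{
    // NOTE: the design specifies base64 protobuf; BinaryWriter is a swap-in.
    public string Encode()
    {
        var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(CacheSequence);
            writer.Write(ParentId);
            writer.Write(FilterSignature);
            writer.Write(Offset);
        }
        // ToArray remains valid after the writer has closed the stream.
        return Convert.ToBase64String(stream.ToArray());
    }

    public static PageToken Decode(string token)
    {
        using var reader = new BinaryReader(
            new MemoryStream(Convert.FromBase64String(token)));
        // Arguments are evaluated left to right, matching the write order.
        return new PageToken(
            reader.ReadUInt64(), reader.ReadInt32(),
            reader.ReadString(), reader.ReadInt32());
    }
}

public static class PageTokenValidator
{
    public static TokenCheck Validate(
        PageToken token, ulong currentSequence, string requestFilterSignature)
    {
        // Token was minted against an older cache snapshot.
        if (token.CacheSequence != currentSequence)
            return TokenCheck.FailedPrecondition;
        // Caller changed filters between pages.
        if (token.FilterSignature != requestFilterSignature)
            return TokenCheck.InvalidArgument;
        return TokenCheck.Ok;
    }
}
```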
## Server projection
### New index on `GalaxyHierarchyCacheEntry`
Built once per refresh, in `GalaxyHierarchyIndex`:
```csharp
// gobject_id -> direct children, sorted areas-first then by display name.
// Roots (parent_gobject_id == 0 or absent) live under sentinel key 0.
IReadOnlyDictionary<int, IReadOnlyList<GalaxyObjectView>> ChildrenByParent;
```
Cost: one O(n) pass over `ObjectViews` during refresh. Memory: O(n) — one int
plus a list reference per object, negligible next to the existing object list.
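
One way the index build could look is sketched below. `ObjectView` is a hypothetical, trimmed-down stand-in for `GalaxyObjectView`, and the sketch assumes a root always carries parent id 0 (the "absent" case is not modelled). The grouping loop is the single O(n) pass; the per-bucket sort that produces the areas-first ordering is additional work on top of it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down stand-in for GalaxyObjectView.
public sealed record ObjectView(
    int GobjectId, int ParentGobjectId, string DisplayName, bool IsArea);

public static class ChildrenIndexBuilder
{
    // Sentinel key under which roots are stored.
    public const int RootKey = 0;

    public static IReadOnlyDictionary<int, IReadOnlyList<ObjectView>> Build(
        IEnumerable<ObjectView> views)
    {
        // Single grouping pass: bucket each object under its parent id.
        // Assumption: roots carry parent id 0, which equals RootKey.
        var buckets = new Dictionary<int, List<ObjectView>>();
        foreach (var view in views)
        {
            if (!buckets.TryGetValue(view.ParentGobjectId, out var siblings))
            {
                siblings = new List<ObjectView>();
                buckets[view.ParentGobjectId] = siblings;
            }
            siblings.Add(view);
        }

        // Sort each bucket once so lookups return children already ordered.
        var index = new Dictionary<int, IReadOnlyList<ObjectView>>();
        foreach (var pair in buckets)
        {
            pair.Value.Sort(CompareSiblings);
            index[pair.Key] = pair.Value;
        }
        return index;
    }

    // Areas first, then case-insensitive ordinal by display name.
    private static int CompareSiblings(ObjectView a, ObjectView b)
    {
        if (a.IsArea != b.IsArea) return a.IsArea ? -1 : 1;
        return StringComparer.OrdinalIgnoreCase.Compare(
            a.DisplayName, b.DisplayName);
    }
}
```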
### `GalaxyBrowseProjector.ProjectChildren`
1. Resolve `parent` selector to `parent_gobject_id` (root sentinel for empty).
Unknown parent → `NotFound`.
2. Look up direct children from `ChildrenByParent`.
3. Apply the same filter pipeline as `DiscoverHierarchy`. Extract the filter
logic from `GalaxyHierarchyProjector.ApplyFilters` into a shared helper
used by both projectors so filter semantics cannot drift.
4. Intersect with API-key `browse_subtrees` globs.
5. Memoize the filtered, ordered list keyed by
`(parent_id, filter_signature)` in a
`ConditionalWeakTable<GalaxyHierarchyCacheEntry, ...>` — mirrors the
existing `GalaxyHierarchyProjector.FilteredViewCache` pattern; GC'd when
the entry is.
6. Compute `child_has_children` for each returned child against the filtered
descendant set (recurse down `ChildrenByParent`, short-circuit on first
match; memoize the per-child boolean alongside the filtered list).
7. Slice the page; encode `next_page_token` as
`{cache_sequence, parent_id, filter_signature, offset}`, a base64-encoded protobuf.
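
Step 6 is the least obvious part of the pipeline, so here is a hedged sketch of the short-circuiting descendant probe. It works on bare ids with a hypothetical `passesFilter` predicate, and it reads "filtered descendant set" as "any descendant at any depth passes the filter", which is one possible interpretation. The provisional `false` written into the memo before recursing is an added guard against self-parented or cyclic data, not something the design specifies.

```csharp
using System;
using System.Collections.Generic;

public static class HasChildrenProbe
{
    // Returns true when `nodeId` has at least one descendant accepted by
    // `passesFilter`. Stops at the first match and memoizes per node.
    public static bool HasVisibleDescendant(
        int nodeId,
        IReadOnlyDictionary<int, IReadOnlyList<int>> childrenByParent,
        Func<int, bool> passesFilter,
        Dictionary<int, bool> memo)
    {
        if (memo.TryGetValue(nodeId, out var cached)) return cached;

        // Provisional entry so a cycle back to this node terminates.
        memo[nodeId] = false;

        var found = false;
        if (childrenByParent.TryGetValue(nodeId, out var children))
        {
            foreach (var child in children)
            {
                if (passesFilter(child) ||
                    HasVisibleDescendant(child, childrenByParent, passesFilter, memo))
                {
                    found = true;
                    break; // short-circuit on first match
                }
            }
        }

        memo[nodeId] = found;
        return found;
    }
}
```

Because the memo dictionary is passed in, a caller could keep it alongside the filtered list for a given `(parent_id, filter_signature)` so repeated pages do not re-walk the same subtrees.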
Sort order: areas first, then `OrdinalIgnoreCase` by display name —
identical to `DashboardBrowseTreeBuilder.CompareNodes` so the dashboard and
external clients see the same ordering.
## gRPC handler
`GalaxyRepositoryGrpcService.BrowseChildren`:
1. Wait up to 5s for first cache load (same `WaitForFirstLoadAsync` pattern
as `DiscoverHierarchy`).
2. Read `Current` entry.
3. Call `GalaxyBrowseProjector.ProjectChildren`.
4. Map `GalaxyObjectView` → `GalaxyObject` proto via existing
`GalaxyProtoMapper.MapObject`. Skeleton branch (`include_attributes=false`)
omits the attribute list — already supported by the mapper for
`DiscoverHierarchy`.
Authorization: `GatewayGrpcScopeResolver.BrowseChildren` requires
`metadata:read` — same scope as `DiscoverHierarchy`.
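
The bounded first-load wait in step 1 can be approximated as follows. This is only a sketch: the actual `WaitForFirstLoadAsync` signature is not shown in this document, so `FirstLoadGate` and its parameters are hypothetical. A handler would map a `false` result to `Unavailable`.

```csharp
using System;
using System.Threading.Tasks;

public static class FirstLoadGate
{
    // Returns true if `firstLoad` finished within `timeout`, false if the
    // timeout elapsed first. A faulted load rethrows its exception.
    public static async Task<bool> WaitAsync(Task firstLoad, TimeSpan timeout)
    {
        var finished = await Task.WhenAny(firstLoad, Task.Delay(timeout));
        if (finished != firstLoad)
        {
            return false; // caller maps this to Unavailable
        }

        await firstLoad; // surface a load failure instead of hiding it
        return true;
    }
}
```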
## Dashboard
New in-process service:
```csharp
public interface IDashboardBrowseService
{
    BrowseLevelResult GetRoots(BrowseFilterArgs filter);
    BrowseLevelResult GetChildren(int parentGobjectId, BrowseFilterArgs filter);
    GalaxyObject? FindByContainedPath(string path); // for deep-link expansion
}

public sealed record BrowseLevelResult(
    IReadOnlyList<DashboardBrowseNode> Nodes,
    int TotalCount,
    ulong CacheSequence);
```
Backed by the same `GalaxyBrowseProjector` the gRPC service uses — dashboard
and external clients render identical results.
`BrowsePage.razor` changes:
- Initial render fetches roots only.
- Each `DashboardBrowseNode` tracks `LoadState`
(`NotLoaded` / `Loading` / `Loaded` / `Error`).
- `BrowseTreeNodeView.razor` shows the expand triangle from
`HasChildrenHint` (the projector's `child_has_children`) regardless of
whether children have been loaded yet. First expand triggers an async load
and re-render.
- Loaded children are kept for the lifetime of the Blazor circuit OR until
the cache sequence advances (subscribed via the existing
`IGalaxyDeployNotifier`). On invalidation: collapse all expansions, show a
one-line "Galaxy redeployed, click to re-expand" hint.
- Attributes render inline under their object (unchanged) since
`include_attributes=true` is the default.
- Search box: when non-empty, fall back to the existing bulk
`DiscoverHierarchy` path + `DashboardBrowseTreeBuilder`. Lazy browse is for
unfiltered exploration; building a streaming search is out of scope.
`DashboardBrowseTreeBuilder.Build` stays, used by the search fallback and
the existing tests.
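
The per-node `LoadState` behaviour described in the list above might be modelled roughly like this. `BrowseNode` is a hypothetical, trimmed-down stand-in for `DashboardBrowseNode`, and `loadChildren` is an assumed callback that would wrap `IDashboardBrowseService.GetChildren`. Allowing a retry from the `Error` state is a choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public enum LoadState { NotLoaded, Loading, Loaded, Error }

// Hypothetical stand-in; the real DashboardBrowseNode carries more data.
public sealed class BrowseNode
{
    public int GobjectId { get; init; }
    public bool HasChildrenHint { get; init; }
    public LoadState State { get; private set; } = LoadState.NotLoaded;
    public IReadOnlyList<BrowseNode> Children { get; private set; } =
        Array.Empty<BrowseNode>();

    // First expand fetches children; later expands reuse the loaded list.
    public async Task ExpandAsync(
        Func<int, Task<IReadOnlyList<BrowseNode>>> loadChildren)
    {
        if (State is LoadState.Loaded or LoadState.Loading) return;

        State = LoadState.Loading;
        try
        {
            Children = await loadChildren(GobjectId);
            State = LoadState.Loaded;
        }
        catch (Exception)
        {
            State = LoadState.Error; // a later expand may retry
        }
    }

    // Called when the cache sequence advances: drop loaded children so the
    // next expand refetches against the new snapshot.
    public void Invalidate()
    {
        Children = Array.Empty<BrowseNode>();
        State = LoadState.NotLoaded;
    }
}
```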
## Testing
Unit tests (no live MXAccess / Galaxy required):
- `GalaxyHierarchyIndexTests` — `ChildrenByParent` correctness: roots under
sentinel, deep nesting, self-parented objects, duplicate ids.
- `GalaxyBrowseProjectorTests`:
  - Direct-children selection by `gobject_id`, `tag_name`,
    `contained_path`.
  - Filter parity with `DiscoverHierarchy`: same fixtures, same filter
    permutations, identical filtered-set output.
  - `child_has_children` correctness under filters that eliminate
    descendants.
  - Ordering matches `DashboardBrowseTreeBuilder` byte-for-byte.
  - Sibling pagination across multiple pages.
  - Page-token round trip (serialize → deserialize → same offset).
  - Stale `page_token` → `FailedPrecondition`.
  - Unknown parent → `NotFound`.
  - Filter change between pages of the same token → `InvalidArgument`.
- `GalaxyRepositoryGrpcServiceTests` — new `BrowseChildren` happy path,
first-load wait, `browse_subtrees` intersection, `metadata:read` scope
enforcement.
- `DashboardBrowseServiceTests` — in-process service matches projector
output, deploy-sequence invalidation, error states.
- `ProtobufContractRoundTripTests` — extended for
`BrowseChildrenRequest` / `BrowseChildrenReply`.
Manual verification on a live MXAccess host: expand a deep Area in the
dashboard, force a redeploy mid-expand, and observe the invalidation banner.
No new opt-in live-only tests — the projector is pure over the cache, and
the cache itself is already exercised by existing live tests.
## Documentation updates (same change)
Per the repo rule that doc updates ride with the code change:
- `docs/GalaxyRepository.md` — add `BrowseChildren` to the RPC Surface
table; new subsection covering semantics, filter parity, page-token
contract, and error mapping. Update the architecture diagram to show the
shared projector.
- `gateway.md` — one-line entry for `BrowseChildren` in the Galaxy RPC
table.
- `clients/{dotnet,python,rust,go,java}/README.md` — short "Browsing
lazily" snippet showing a `BrowseChildren` walk with `child_has_children`
driving expand-triangle UI.
- `docs/DesignDecisions.md` — entry noting the explicit choice to keep the
full hierarchy in the server cache; laziness is wire-side and UI-side only.
## Non-goals
- SQL-side incremental loading. The gateway still pulls the full Galaxy
hierarchy on each deploy change.
- Multi-snapshot retention. A deploy invalidates outstanding page tokens;
clients re-walk.
- Streaming search. Search keeps the bulk `DiscoverHierarchy` + tree-build
path.
- Reconciling a partially-expanded dashboard tree across a deploy — the
invalidation banner forces a re-expand.