docs(plan): Galaxy writer/subscription-registry item-handle sharing design
This commit is contained in:
@@ -0,0 +1,138 @@
|
|||||||
|
# Galaxy writer ⇄ subscription-registry item-handle sharing — Design
|
||||||
|
|
||||||
|
> Brainstormed 2026-06-18. Backlog item: `stillpending.md` §2.4 — "Galaxy — writer
|
||||||
|
> item-handle cache not shared with the subscription registry"
|
||||||
|
> (`Driver.Galaxy/Runtime/GatewayGalaxyDataWriter.cs`). Off master `70e1bde9`.
|
||||||
|
|
||||||
|
## Problem (verified in code)
|
||||||
|
|
||||||
|
The Galaxy data writer's item-handle cache is **already shipped and reconnect-wired**:
|
||||||
|
`GatewayGalaxyDataWriter` holds `_itemHandles` (fullRef → MXAccess hItem) plus
|
||||||
|
`_supervisedHandles`, and `GalaxyDriver.ReopenAsync` invalidates them after a session
|
||||||
|
recreate (`GalaxyDriver.cs:298`, commits `f05b5d79` / `f77488ee`). So "add a writer cache"
|
||||||
|
is **not** the open work.
|
||||||
|
|
||||||
|
What is open is exactly what the backlog says: the writer cache is **isolated from the
|
||||||
|
`SubscriptionRegistry`**. The registry already records `TagBinding(FullReference, ItemHandle)`
|
||||||
|
for every live subscription, and — the load-bearing fact — both subscribe-handles (from
|
||||||
|
`SubscribeBulk`) and write-handles (from the writer's `AddItem`) are issued against the **same
|
||||||
|
MXAccess server registration** (`GalaxyMxSession.ServerHandle`). Within one registration an
|
||||||
|
hItem is shared across the AddItem / Advise / Write call paths, so a handle the gateway returned
|
||||||
|
for a subscription is reusable by a `Write`.
|
||||||
|
|
||||||
|
Today the writer never consults the registry, so the **first write to an already-subscribed tag
|
||||||
|
pays a redundant `AddItem` round-trip** (subsequent writes hit the writer's own `_itemHandles`).
|
||||||
|
This is an `_Optimization._` — a first-write-only latency saving, plus unification of handle
|
||||||
|
provenance (one fewer source of truth for "which hItem is this tag").
|
||||||
|
|
||||||
|
## Load-bearing premise (proven live, not assumed)
|
||||||
|
|
||||||
|
> *A subscription's item handle is usable for a no-login supervisory `Write` that commits.*
|
||||||
|
|
||||||
|
If this were false, borrowing the subscribed handle would **regress writes** for subscribed
|
||||||
|
tags. The MXAccess Toolkit semantics say it holds (one hItem per item under a registration,
|
||||||
|
shared across Advise/Write), but this phase does **not** assume it — the live-gw test is the
|
||||||
|
**merge gate**, not decoration.
|
||||||
|
|
||||||
|
## Approach
|
||||||
|
|
||||||
|
Three options were considered:
|
||||||
|
|
||||||
|
1. **Delegate seam (chosen).** The writer ctor gains an optional
|
||||||
|
`Func<string, int?>? subscribedHandleSource = null`; `GalaxyDriver` wires it to the
|
||||||
|
registry's new `TryResolveItemHandle`. This is symmetric with the writer's existing
|
||||||
|
`securityResolver` delegate, preserves the writer's "independently testable" property
|
||||||
|
(`null` ⇒ today's exact behavior), and is the smallest surface.
|
||||||
|
2. **Direct `SubscriptionRegistry` reference** in the writer. Same assembly, registry is
|
||||||
|
test-constructible — but couples the writer to a concrete collaborator with no upside over (1).
|
||||||
|
3. **Unified shared cache object** both sides read/write. Matches the literal "wire bindings
|
||||||
|
into `_itemHandles`" phrasing but is the most invasive (changes the writer's ownership
|
||||||
|
model). YAGNI.
|
||||||
|
|
||||||
|
→ **Approach 1.** AdminUI untouched. No Commons/proto/EF/migration change. No bUnit.
|
||||||
|
|
||||||
|
## Resolution rule (self-healing — no stale-handle regression)
|
||||||
|
|
||||||
|
`GatewayGalaxyDataWriter.EnsureItemHandleAsync(fullRef)` becomes:
|
||||||
|
|
||||||
|
1. `_itemHandles` hit (the writer `AddItem`'d this tag itself before) → use it.
|
||||||
|
2. else `subscribedHandleSource?.Invoke(fullRef)` returns a **live** handle → use it, but
|
||||||
|
**do NOT store it in `_itemHandles`**. The registry owns the borrowed handle's lifecycle
|
||||||
|
(including reconnect `Rebind`), so after a reconnect the writer always re-borrows the *fresh*
|
||||||
|
handle on the next write — there is no stale-cache window introduced by the borrow.
|
||||||
|
3. else `AddItem` + store in `_itemHandles` (today's path), incrementing an `AddItemCallCount`
|
||||||
|
test seam.
|
||||||
|
|
||||||
|
`AdviseSupervisory` is unchanged: the writer still supervisory-advises the (possibly borrowed)
|
||||||
|
hItem once per handle via `_supervisedHandles`. Borrowing only skips the `AddItem` round-trip —
|
||||||
|
which is the entire point of the optimization. The borrowed item already carries the
|
||||||
|
subscriber's data-change advise; supervisory advise is an additional mode on the same item.
|
||||||
|
|
||||||
|
The decision in steps 1–2 is extracted into a synchronous internal seam
|
||||||
|
`TryResolveCachedOrBorrowed(fullRef) -> int?` so it is unit-testable **without a live session**
|
||||||
|
(the SDK `MxGatewaySession` is sealed with an internal ctor and cannot be faked — see Testing).
|
||||||
|
`EnsureItemHandleAsync` calls the seam first and only `AddItem`s on a null result.
|
||||||
|
|
||||||
|
## Registry change
|
||||||
|
|
||||||
|
`SubscriptionRegistry` gains a forward lookup:
|
||||||
|
|
||||||
|
- A `ConcurrentDictionary<string, int> _itemHandleByFullRef` (`StringComparer.OrdinalIgnoreCase`,
|
||||||
|
matching the writer's cache), maintained incrementally in `Register` / `Rebind` (add the
|
||||||
|
`fullRef → handle` for each `binding.ItemHandle > 0`) and best-effort dropped in `Remove` /
|
||||||
|
`Rebind`.
|
||||||
|
- `public int? TryResolveItemHandle(string fullRef)` that returns the mapped handle **only if
|
||||||
|
`_subscribersByItemHandle` still contains it** — a liveness guard. This means even a lingering
|
||||||
|
forward-map entry can never hand out a dead handle, because `_subscribersByItemHandle` is the
|
||||||
|
already-authoritative live-handle set. The guard de-risks the removal bookkeeping: the forward
|
||||||
|
map is an index, not the source of truth.
|
||||||
|
|
||||||
|
The event-dispatch hot path (`ResolveSubscribers`) is **untouched** — the forward map is consulted
|
||||||
|
only on writes (rare relative to events) and mutated only on subscribe/unsubscribe/rebind (which
|
||||||
|
already do O(bindings) work).
|
||||||
|
|
||||||
|
`GalaxyDriver` passes `_subscriptions.TryResolveItemHandle` as the writer's
|
||||||
|
`subscribedHandleSource` when it constructs the production writer.
|
||||||
|
|
||||||
|
## Error handling
|
||||||
|
|
||||||
|
Unchanged. A borrowed handle that fails a write surfaces its own Bad status via the existing
|
||||||
|
`TranslateReply` path; the writer does not cache borrowed handles, so the next write simply
|
||||||
|
re-resolves (re-borrow if still live, else `AddItem`). `TryResolveItemHandle` returns `null`
|
||||||
|
cleanly for unknown / dead handles.
|
||||||
|
|
||||||
|
## Testing
|
||||||
|
|
||||||
|
**Unit (xUnit + Shouldly, `Driver.Galaxy.Tests`) — no live session required:**
|
||||||
|
|
||||||
|
- `GatewayGalaxyDataWriter`: seed `subscribedHandleSource` → `TryResolveCachedOrBorrowed`
|
||||||
|
returns the borrowed handle and leaves `CachedItemHandleCount == 0`; a `_itemHandles` hit wins
|
||||||
|
over the source; a `null` source ⇒ no borrow (returns null). (Mirrors the existing
|
||||||
|
`SeedHandleCachesForTest` / count-seam pattern, since the gw session cannot be faked.)
|
||||||
|
- `SubscriptionRegistry`: `TryResolveItemHandle` returns the handle after `Register`; returns
|
||||||
|
`null` after `Remove`; returns the **fresh** handle after `Rebind`; the liveness guard returns
|
||||||
|
`null` when the handle is absent from `_subscribersByItemHandle`.
|
||||||
|
|
||||||
|
**Live gw (the merge gate)** — extend `GatewayGalaxyLiveReopenAndWriteTests` (skip-gated; runs
|
||||||
|
against `MXGW_ENDPOINT=http://10.100.0.48:5120` with the key sourced via the durable
|
||||||
|
`docker exec … printenv GALAXY_MXGW_API_KEY` recipe, never echoed). Add an internal
|
||||||
|
`AddItemCallCount` seam on the writer, then:
|
||||||
|
|
||||||
|
- Subscribe a real tag (registry now holds its hItem) → write it through the registry-wired
|
||||||
|
writer → assert the write **commits** (Good status + value persists) **and `AddItemCallCount == 0`**.
|
||||||
|
- Control: write a **non-subscribed** tag → assert `AddItemCallCount == 1`.
|
||||||
|
|
||||||
|
This proves both halves: the borrowed handle is write-usable (premise holds) **and** the redundant
|
||||||
|
`AddItem` is actually skipped.
|
||||||
|
|
||||||
|
## Deferred / out of scope
|
||||||
|
|
||||||
|
- Reverse direction (subscriber borrowing the writer's `AddItem` handles) — subscriber handles
|
||||||
|
come from `SubscribeBulk`; no need.
|
||||||
|
- Unifying the two caches into one shared object (Approach 3).
|
||||||
|
- Any AdminUI / Commons / proto / EF change.
|
||||||
|
|
||||||
|
## Done =
|
||||||
|
|
||||||
|
Build clean + `dotnet test` (Driver.Galaxy) green + the live-gw test proves a subscribed-tag
|
||||||
|
write commits with zero `AddItem` → merge to master + push.
|
||||||
Reference in New Issue
Block a user