# Galaxy writer ⇄ subscription-registry item-handle sharing — Design > Brainstormed 2026-06-18. Backlog item: `stillpending.md` §2.4 — "Galaxy — writer > item-handle cache not shared with the subscription registry" > (`Driver.Galaxy/Runtime/GatewayGalaxyDataWriter.cs`). Off master `70e1bde9`. ## Problem (verified in code) The Galaxy data writer's item-handle cache is **already shipped and reconnect-wired**: `GatewayGalaxyDataWriter` holds `_itemHandles` (fullRef → MXAccess hItem) plus `_supervisedHandles`, and `GalaxyDriver.ReopenAsync` invalidates them after a session recreate (`GalaxyDriver.cs:298`, commits `f05b5d79` / `f77488ee`). So "add a writer cache" is **not** the open work. What is open is exactly what the backlog says: the writer cache is **isolated from the `SubscriptionRegistry`**. The registry already records `TagBinding(FullReference, ItemHandle)` for every live subscription, and — the load-bearing fact — both subscribe-handles (from `SubscribeBulk`) and write-handles (from the writer's `AddItem`) are issued against the **same MXAccess server registration** (`GalaxyMxSession.ServerHandle`). Within one registration an hItem is shared across the AddItem / Advise / Write call paths, so a handle the gateway returned for a subscription is reusable by a `Write`. Today the writer never consults the registry, so the **first write to an already-subscribed tag pays a redundant `AddItem` round-trip** (subsequent writes hit the writer's own `_itemHandles`). This is an `_Optimization._` — a first-write-only latency saving, plus unification of handle provenance (one fewer source of truth for "which hItem is this tag"). ## Load-bearing premise (proven live, not assumed) > *A subscription's item handle is usable for a no-login supervisory `Write` that commits.* If this were false, borrowing the subscribed handle would **regress writes** for subscribed tags. The MXAccess Toolkit semantics say it holds (one hItem per item under a registration, shared across Advise/Write), but this phase does **not** assume it — the live-gw test is the **merge gate**, not decoration. ## Approach Three options were considered: 1. **Delegate seam (chosen).** The writer ctor gains an optional `Func? subscribedHandleSource = null`; `GalaxyDriver` wires it to the registry's new `TryResolveItemHandle`. This is symmetric with the writer's existing `securityResolver` delegate, preserves the writer's "independently testable" property (`null` ⇒ today's exact behavior), and is the smallest surface. 2. **Direct `SubscriptionRegistry` reference** in the writer. Same assembly, registry is test-constructible — but couples the writer to a concrete collaborator with no upside over (1). 3. **Unified shared cache object** both sides read/write. Matches the literal "wire bindings into `_itemHandles`" phrasing but is the most invasive (changes the writer's ownership model). YAGNI. → **Approach 1.** AdminUI untouched. No Commons/proto/EF/migration change. No bUnit. ## Resolution rule (self-healing — no stale-handle regression) `GatewayGalaxyDataWriter.EnsureItemHandleAsync(fullRef)` becomes: 1. `_itemHandles` hit (the writer `AddItem`'d this tag itself before) → use it. 2. else `subscribedHandleSource?.Invoke(fullRef)` returns a **live** handle → use it, but **do NOT store it in `_itemHandles`**. The registry owns the borrowed handle's lifecycle (including reconnect `Rebind`), so after a reconnect the writer always re-borrows the *fresh* handle on the next write — there is no stale-cache window introduced by the borrow. 3. else `AddItem` + store in `_itemHandles` (today's path), incrementing an `AddItemCallCount` test seam. `AdviseSupervisory` is unchanged: the writer still supervisory-advises the (possibly borrowed) hItem once per handle via `_supervisedHandles`. Borrowing only skips the `AddItem` round-trip — which is the entire point of the optimization. The borrowed item already carries the subscriber's data-change advise; supervisory advise is an additional mode on the same item. The decision in steps 1–2 is extracted into a synchronous internal seam `TryResolveCachedOrBorrowed(fullRef) -> int?` so it is unit-testable **without a live session** (the SDK `MxGatewaySession` is sealed with an internal ctor and cannot be faked — see Testing). `EnsureItemHandleAsync` calls the seam first and only `AddItem`s on a null result. ## Registry change `SubscriptionRegistry` gains a forward lookup: - A `ConcurrentDictionary _itemHandleByFullRef` (`StringComparer.OrdinalIgnoreCase`, matching the writer's cache), maintained incrementally in `Register` / `Rebind` (add the `fullRef → handle` for each `binding.ItemHandle > 0`) and best-effort dropped in `Remove` / `Rebind`. - `public int? TryResolveItemHandle(string fullRef)` that returns the mapped handle **only if `_subscribersByItemHandle` still contains it** — a liveness guard. This means even a lingering forward-map entry can never hand out a dead handle, because `_subscribersByItemHandle` is the already-authoritative live-handle set. The guard de-risks the removal bookkeeping: the forward map is an index, not the source of truth. The event-dispatch hot path (`ResolveSubscribers`) is **untouched** — the forward map is consulted only on writes (rare relative to events) and mutated only on subscribe/unsubscribe/rebind (which already do O(bindings) work). `GalaxyDriver` passes `_subscriptions.TryResolveItemHandle` as the writer's `subscribedHandleSource` when it constructs the production writer. ## Error handling Unchanged. A borrowed handle that fails a write surfaces its own Bad status via the existing `TranslateReply` path; the writer does not cache borrowed handles, so the next write simply re-resolves (re-borrow if still live, else `AddItem`). `TryResolveItemHandle` returns `null` cleanly for unknown / dead handles. ## Testing **Unit (xUnit + Shouldly, `Driver.Galaxy.Tests`) — no live session required:** - `GatewayGalaxyDataWriter`: seed `subscribedHandleSource` → `TryResolveCachedOrBorrowed` returns the borrowed handle and leaves `CachedItemHandleCount == 0`; a `_itemHandles` hit wins over the source; a `null` source ⇒ no borrow (returns null). (Mirrors the existing `SeedHandleCachesForTest` / count-seam pattern, since the gw session cannot be faked.) - `SubscriptionRegistry`: `TryResolveItemHandle` returns the handle after `Register`; returns `null` after `Remove`; returns the **fresh** handle after `Rebind`; the liveness guard returns `null` when the handle is absent from `_subscribersByItemHandle`. **Live gw (the merge gate)** — extend `GatewayGalaxyLiveReopenAndWriteTests` (skip-gated; runs against `MXGW_ENDPOINT=http://10.100.0.48:5120` with the key sourced via the durable `docker exec … printenv GALAXY_MXGW_API_KEY` recipe, never echoed). Add an internal `AddItemCallCount` seam on the writer, then: - Subscribe a real tag (registry now holds its hItem) → write it through the registry-wired writer → assert the write **commits** (Good status + value persists) **and `AddItemCallCount == 0`**. - Control: write a **non-subscribed** tag → assert `AddItemCallCount == 1`. This proves both halves: the borrowed handle is write-usable (premise holds) **and** the redundant `AddItem` is actually skipped. ## Deferred / out of scope - Reverse direction (subscriber borrowing the writer's `AddItem` handles) — subscriber handles come from `SubscribeBulk`; no need. - Unifying the two caches into one shared object (Approach 3). - Any AdminUI / Commons / proto / EF change. ## Done = Build clean + `dotnet test` (Driver.Galaxy) green + the live-gw test proves a subscribed-tag write commits with zero `AddItem` → merge to master + push.