docs(plan): Galaxy writer/subscription-registry item-handle sharing design
@@ -0,0 +1,138 @@
# Galaxy writer ⇄ subscription-registry item-handle sharing — Design
> Brainstormed 2026-06-18. Backlog item: `stillpending.md` §2.4 — "Galaxy — writer
> item-handle cache not shared with the subscription registry"
> (`Driver.Galaxy/Runtime/GatewayGalaxyDataWriter.cs`). Off master `70e1bde9`.
## Problem (verified in code)

The Galaxy data writer's item-handle cache is **already shipped and reconnect-wired**:
`GatewayGalaxyDataWriter` holds `_itemHandles` (fullRef → MXAccess hItem) plus
`_supervisedHandles`, and `GalaxyDriver.ReopenAsync` invalidates them after a session
recreate (`GalaxyDriver.cs:298`, commits `f05b5d79` / `f77488ee`). So "add a writer cache"
is **not** the open work.

What is open is exactly what the backlog says: the writer cache is **isolated from the
`SubscriptionRegistry`**. The registry already records `TagBinding(FullReference, ItemHandle)`
for every live subscription, and (the load-bearing fact) both subscribe-handles (from
`SubscribeBulk`) and write-handles (from the writer's `AddItem`) are issued against the **same
MXAccess server registration** (`GalaxyMxSession.ServerHandle`). Within one registration an
hItem is shared across the AddItem / Advise / Write call paths, so a handle the gateway returned
for a subscription is reusable by a `Write`.

Today the writer never consults the registry, so the **first write to an already-subscribed tag
pays a redundant `AddItem` round-trip** (subsequent writes hit the writer's own `_itemHandles`).
This is an *optimization*: a first-write-only latency saving, plus unification of handle
provenance (one fewer source of truth for "which hItem is this tag").
## Load-bearing premise (proven live, not assumed)

> *A subscription's item handle is usable for a no-login supervisory `Write` that commits.*

If this were false, borrowing the subscribed handle would **regress writes** for subscribed
tags. The MXAccess Toolkit semantics say it holds (one hItem per item under a registration,
shared across Advise/Write), but this phase does **not** assume it: the live-gw test is the
**merge gate**, not decoration.
## Approach

Three options were considered:

1. **Delegate seam (chosen).** The writer ctor gains an optional
   `Func<string, int?>? subscribedHandleSource = null`; `GalaxyDriver` wires it to the
   registry's new `TryResolveItemHandle`. This is symmetric with the writer's existing
   `securityResolver` delegate, preserves the writer's "independently testable" property
   (`null` ⇒ today's exact behavior), and is the smallest surface.
2. **Direct `SubscriptionRegistry` reference** in the writer. Same assembly, registry is
   test-constructible, but this couples the writer to a concrete collaborator with no upside over (1).
3. **Unified shared cache object** both sides read/write. Matches the literal "wire bindings
   into `_itemHandles`" phrasing but is the most invasive (changes the writer's ownership
   model). YAGNI.

→ **Approach 1.** AdminUI untouched. No Commons/proto/EF/migration change. No bUnit.
## Resolution rule (self-healing: no stale-handle regression)

`GatewayGalaxyDataWriter.EnsureItemHandleAsync(fullRef)` becomes:

1. `_itemHandles` hit (the writer `AddItem`'d this tag itself before) → use it.
2. else `subscribedHandleSource?.Invoke(fullRef)` returns a **live** handle → use it, but
   **do NOT store it in `_itemHandles`**. The registry owns the borrowed handle's lifecycle
   (including reconnect `Rebind`), so after a reconnect the writer always re-borrows the *fresh*
   handle on the next write; the borrow introduces no stale-cache window.
3. else `AddItem` + store in `_itemHandles` (today's path), incrementing an `AddItemCallCount`
   test seam.

`AdviseSupervisory` is unchanged: the writer still supervisory-advises the (possibly borrowed)
hItem once per handle via `_supervisedHandles`. Borrowing only skips the `AddItem` round-trip,
which is the entire point of the optimization. The borrowed item already carries the
subscriber's data-change advise; supervisory advise is an additional mode on the same item.

The decision in steps 1–2 is extracted into a synchronous internal seam
`TryResolveCachedOrBorrowed(fullRef) -> int?` so it is unit-testable **without a live session**
(the SDK `MxGatewaySession` is sealed with an internal ctor and cannot be faked; see Testing).
`EnsureItemHandleAsync` calls the seam first and only `AddItem`s on a null result.
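The resolution rule above can be sketched roughly as follows. This is a minimal illustration, not the shipped writer: the class name `WriterHandleResolverSketch` and the `addItem` delegate are hypothetical stand-ins (the real writer calls `AddItem` on the gateway session, which is not modeled here), and `AdviseSupervisory` is omitted entirely.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for GatewayGalaxyDataWriter; only handle resolution is modeled.
public sealed class WriterHandleResolverSketch
{
    // Handles the writer obtained itself via AddItem (case-insensitive, as in the design).
    private readonly ConcurrentDictionary<string, int> _itemHandles =
        new(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, int?>? _subscribedHandleSource;
    // Assumption: a delegate standing in for the gateway AddItem round-trip.
    private readonly Func<string, Task<int>> _addItem;
    private int _addItemCallCount;

    public WriterHandleResolverSketch(
        Func<string, Task<int>> addItem,
        Func<string, int?>? subscribedHandleSource = null) // null => no borrowing
    {
        _addItem = addItem;
        _subscribedHandleSource = subscribedHandleSource;
    }

    public int AddItemCallCount => _addItemCallCount;
    public int CachedItemHandleCount => _itemHandles.Count;

    // Steps 1-2: synchronous, so it needs no live session to test.
    internal int? TryResolveCachedOrBorrowed(string fullRef)
    {
        if (_itemHandles.TryGetValue(fullRef, out var own)) return own; // step 1: own cache wins
        return _subscribedHandleSource?.Invoke(fullRef);                // step 2: borrowed, never stored
    }

    public async Task<int> EnsureItemHandleAsync(string fullRef)
    {
        if (TryResolveCachedOrBorrowed(fullRef) is int resolved) return resolved;

        // Step 3: fall back to AddItem and cache the writer-owned handle.
        Interlocked.Increment(ref _addItemCallCount);
        var handle = await _addItem(fullRef).ConfigureAwait(false);
        _itemHandles[fullRef] = handle;
        return handle;
    }
}
```

Because a borrowed handle is returned without being stored, every write to a subscribed tag re-asks the source, which is what keeps the writer free of stale borrowed handles after a reconnect.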
## Registry change

`SubscriptionRegistry` gains a forward lookup:

- A `ConcurrentDictionary<string, int> _itemHandleByFullRef` (`StringComparer.OrdinalIgnoreCase`,
  matching the writer's cache), maintained incrementally in `Register` / `Rebind` (add the
  `fullRef → handle` entry for each `binding.ItemHandle > 0`) and best-effort dropped in `Remove` /
  `Rebind`.
- `public int? TryResolveItemHandle(string fullRef)` that returns the mapped handle **only if
  `_subscribersByItemHandle` still contains it**, a liveness guard. This means even a lingering
  forward-map entry can never hand out a dead handle, because `_subscribersByItemHandle` is the
  already-authoritative live-handle set. The guard de-risks the removal bookkeeping: the forward
  map is an index, not the source of truth.

The event-dispatch hot path (`ResolveSubscribers`) is **untouched**: the forward map is consulted
only on writes (rare relative to events) and mutated only on subscribe/unsubscribe/rebind (which
already do O(bindings) work).

`GalaxyDriver` passes `_subscriptions.TryResolveItemHandle` as the writer's
`subscribedHandleSource` when it constructs the production writer.
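A rough sketch of the forward map and its liveness guard, under heavy simplification. The class name `RegistryForwardLookupSketch`, the per-tag `Register` / `Remove` / `Rebind` signatures, and `DropHandleForTest` are assumptions made for illustration; the real registry works on subscriptions and `TagBinding` lists, and its subscriber bookkeeping is reduced here to a plain set of live handles.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for SubscriptionRegistry; only the forward lookup is modeled.
public sealed class RegistryForwardLookupSketch
{
    // Authoritative live-handle set (the value is unused; the dictionary acts as a set).
    private readonly ConcurrentDictionary<int, byte> _subscribersByItemHandle = new();
    // Forward index: fullRef -> handle. An index only, never the source of truth.
    private readonly ConcurrentDictionary<string, int> _itemHandleByFullRef =
        new(StringComparer.OrdinalIgnoreCase);

    public void Register(string fullRef, int itemHandle)
    {
        if (itemHandle <= 0) return; // only positive handles are indexed
        _subscribersByItemHandle[itemHandle] = 0;
        _itemHandleByFullRef[fullRef] = itemHandle;
    }

    public void Remove(string fullRef)
    {
        if (_itemHandleByFullRef.TryRemove(fullRef, out var handle))
            _subscribersByItemHandle.TryRemove(handle, out _);
    }

    // After a reconnect the tag gets a fresh handle; the old one is dropped.
    public void Rebind(string fullRef, int freshHandle)
    {
        Remove(fullRef);
        Register(fullRef, freshHandle);
    }

    // Test-only helper (assumption): kills a handle while leaving the forward entry lingering.
    internal void DropHandleForTest(int itemHandle) =>
        _subscribersByItemHandle.TryRemove(itemHandle, out _);

    // Liveness guard: a mapped handle is returned only while the live set still contains it.
    public int? TryResolveItemHandle(string fullRef) =>
        _itemHandleByFullRef.TryGetValue(fullRef, out var handle)
        && _subscribersByItemHandle.ContainsKey(handle)
            ? handle
            : (int?)null;
}
```

A method group such as `registry.TryResolveItemHandle` converts directly to `Func<string, int?>`, which is why the driver can hand it to the writer without an adapter.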
## Error handling

Unchanged. A borrowed handle that fails a write surfaces its own Bad status via the existing
`TranslateReply` path; the writer does not cache borrowed handles, so the next write simply
re-resolves (re-borrow if still live, else `AddItem`). `TryResolveItemHandle` returns `null`
cleanly for unknown / dead handles.
## Testing

**Unit (xUnit + Shouldly, `Driver.Galaxy.Tests`), no live session required:**

- `GatewayGalaxyDataWriter`: seed `subscribedHandleSource` → `TryResolveCachedOrBorrowed`
  returns the borrowed handle and leaves `CachedItemHandleCount == 0`; a `_itemHandles` hit wins
  over the source; a `null` source ⇒ no borrow (returns null). (Mirrors the existing
  `SeedHandleCachesForTest` / count-seam pattern, since the gw session cannot be faked.)
- `SubscriptionRegistry`: `TryResolveItemHandle` returns the handle after `Register`; returns
  `null` after `Remove`; returns the **fresh** handle after `Rebind`; the liveness guard returns
  `null` when the handle is absent from `_subscribersByItemHandle`.

**Live gw (the merge gate):** extend `GatewayGalaxyLiveReopenAndWriteTests` (skip-gated; runs
against `MXGW_ENDPOINT=http://10.100.0.48:5120` with the key sourced via the durable
`docker exec … printenv GALAXY_MXGW_API_KEY` recipe, never echoed). Add an internal
`AddItemCallCount` seam on the writer, then:

- Subscribe a real tag (registry now holds its hItem) → write it through the registry-wired
  writer → assert the write **commits** (Good status + value persists) **and `AddItemCallCount == 0`**.
- Control: write a **non-subscribed** tag → assert `AddItemCallCount == 1`.

This proves both halves: the borrowed handle is write-usable (premise holds) **and** the redundant
`AddItem` is actually skipped.
## Deferred / out of scope

- Reverse direction (subscriber borrowing the writer's `AddItem` handles): subscriber handles
  come from `SubscribeBulk`; no need.
- Unifying the two caches into one shared object (Approach 3).
- Any AdminUI / Commons / proto / EF change.
## Done =

Build clean + `dotnet test` (Driver.Galaxy) green + the live-gw test proves a subscribed-tag
write commits with zero `AddItem` → merge to master + push.