Galaxy writer ⇄ subscription-registry item-handle sharing — Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development to implement this plan task-by-task.
Goal: Let GatewayGalaxyDataWriter borrow a live MXAccess item handle from the
SubscriptionRegistry so the first write to an already-subscribed tag skips a redundant AddItem
round-trip, without introducing any stale-handle regression.
Architecture: Delegate seam (symmetric with the writer's existing securityResolver). The
registry exposes a liveness-guarded int? TryResolveItemHandle(fullRef); the writer takes an
optional Func<string,int?>? subscribedHandleSource and consults it (never caching borrowed
handles); GalaxyDriver wires the two together. Proven by unit tests for the resolution logic and
a live-gw smoke that asserts a subscribed-tag write commits with zero AddItem.
Tech Stack: C# / .NET 10, xUnit + Shouldly, MXAccess gateway (ZB.MOM.WW.MxGateway.*).
Design: docs/plans/2026-06-18-galaxy-writer-handle-sharing-design.md (committed c85c4e5c).
Branch: feat/galaxy-writer-handle-sharing (off master 70e1bde9).
Hard rules: stage by explicit path (never git add .); never stage sql_login.txt,
src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/, pending.md, stillpending.md,
docker-dev/docker-compose.yml; never echo/commit the gateway API key or any secret; no
force-push; no --no-verify; NO EF migration / Commons / proto change; NO bUnit. Use
dangerouslyDisableSandbox: true for all build/test/rig commands. Finish = merge to master + push.
Dependency graph: {T1 ∥ T2} → T3 → T4. T1 (SubscriptionRegistry.cs) and T2
(GatewayGalaxyDataWriter.cs) touch disjoint files — dispatch their implementers concurrently.
Task 1: SubscriptionRegistry forward fullRef → handle lookup
Classification: small | Estimated implement time: ~4 min | Parallelizable with: Task 2
Files:
- Modify: src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/SubscriptionRegistry.cs
- Test: tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/SubscriptionRegistryHandleResolveTests.cs (create)
What:
- Add a forward index field next to the existing dictionaries:
    private readonly ConcurrentDictionary<string, int> _itemHandleByFullRef =
        new(StringComparer.OrdinalIgnoreCase);
- Maintain it incrementally (only for binding.ItemHandle > 0):
  - In Register and in the add loop of Rebind:
      _itemHandleByFullRef[binding.FullReference] = binding.ItemHandle;
  - In Remove and in the drop loop of Rebind: best-effort drop. When the handle's reverse-map set became empty (the existing remaining.IsEmpty branch), also call _itemHandleByFullRef.TryRemove(binding.FullReference, out _), but only if the current mapped handle still equals this binding's handle (TryGetValue + equality before remove), so a concurrent re-add for the same ref isn't clobbered.
- Add the public lookup with the liveness guard (the guard is what makes the best-effort removal safe: a lingering entry can never resolve to a dead handle):
    /// <summary>
    /// Resolve the live MXAccess item handle a current subscription holds for <paramref name="fullRef"/>,
    /// or null when no live subscription covers it. The writer borrows this handle to skip a
    /// redundant AddItem. Guarded by the authoritative live-handle set (<c>_subscribersByItemHandle</c>)
    /// so a stale forward-map entry can never hand out a dead handle.
    /// </summary>
    public int? TryResolveItemHandle(string fullRef)
    {
        if (fullRef is null) return null;
        if (_itemHandleByFullRef.TryGetValue(fullRef, out var handle)
            && _subscribersByItemHandle.ContainsKey(handle))
            return handle;
        return null;
    }
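The guarded drop described above (check the mapped handle, then remove) can also be expressed as a single compare-and-remove. The following is a minimal sketch, not the real SubscriptionRegistry: ForwardIndexSketch, Add, Drop, TryResolve and _liveHandles are hypothetical stand-ins, and the reverse map is reduced to a set of live handles. It relies on the ConcurrentDictionary.TryRemove(KeyValuePair<TKey, TValue>) overload available since .NET 5, which removes an entry only when both key and value match.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for the forward index only; NOT the real SubscriptionRegistry.
public sealed class ForwardIndexSketch
{
    private readonly ConcurrentDictionary<string, int> _itemHandleByFullRef =
        new(StringComparer.OrdinalIgnoreCase);

    // Assumption: the authoritative reverse map is reduced here to a set of live handles.
    private readonly ConcurrentDictionary<int, byte> _liveHandles = new();

    public void Add(string fullRef, int itemHandle)
    {
        if (itemHandle <= 0) return; // failed subscribes are never indexed
        _liveHandles[itemHandle] = 0;
        _itemHandleByFullRef[fullRef] = itemHandle;
    }

    public void Drop(string fullRef, int itemHandle)
    {
        _liveHandles.TryRemove(itemHandle, out _);
        // Compare-and-remove: the entry goes away only while the ref still maps to
        // this exact handle, so a concurrent re-add with a newer handle survives.
        _itemHandleByFullRef.TryRemove(new KeyValuePair<string, int>(fullRef, itemHandle));
    }

    public int? TryResolve(string fullRef) =>
        fullRef is not null
        && _itemHandleByFullRef.TryGetValue(fullRef, out var handle)
        && _liveHandles.ContainsKey(handle) // liveness guard
            ? handle
            : null;
}
```

Whether the plan's TryGetValue + equality form or the single-call overload fits better depends on the existing Remove/Rebind code; either way the liveness guard in the lookup remains the safety net for any entry that lingers.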
Tests (xUnit + Shouldly; the registry needs no gateway: construct it and Register fake TagBindings):
- Register then TryResolveItemHandle("Tag.A") → the registered handle; case-insensitive ("tag.a" resolves).
- A fullRef never registered → null.
- Remove the subscription → TryResolveItemHandle → null.
- Rebind with a new handle for the same ref → resolves the new handle (not the old).
- Bindings with ItemHandle <= 0 are not resolvable (null).
- Liveness guard: a ref whose handle is no longer in the reverse map resolves null (reachable via Remove of the only subscriber).
Steps: write the failing tests → run dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests --filter FullyQualifiedName~SubscriptionRegistryHandleResolve (FAIL) → implement → re-run (PASS) → git add the two paths + commit feat(galaxy): SubscriptionRegistry.TryResolveItemHandle forward lookup.
Task 2: Writer borrow seam + AddItemCallCount
Classification: small | Estimated implement time: ~4 min | Parallelizable with: Task 1
Files:
- Modify: src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/GatewayGalaxyDataWriter.cs
- Test: tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyDataWriterTests.cs (extend)
What:
- Add an optional last ctor param (keeps every existing call site compiling; current calls are new GatewayGalaxyDataWriter(session, writeUserId, logger?)):
    public GatewayGalaxyDataWriter(
        GalaxyMxSession session,
        int writeUserId,
        ILogger? logger = null,
        Func<string, int?>? subscribedHandleSource = null)
  Store it in a private readonly Func<string, int?>? _subscribedHandleSource; field.
- Add an AddItemCallCount seam (proves "AddItem was skipped" live):
    private int _addItemCallCount;
    internal int AddItemCallCount => Volatile.Read(ref _addItemCallCount);
- Extract the cached-or-borrowed decision into a synchronous internal seam (unit-testable without a live session, since the SDK session is sealed/unfakeable):
    /// <summary>
    /// Resolve an item handle WITHOUT touching the gateway: a prior writer-AddItem'd handle
    /// (_itemHandles), else a live subscription handle borrowed from the registry. Returns null
    /// when neither is available (caller must AddItem). A borrowed handle is intentionally NOT
    /// cached in _itemHandles — the registry owns its lifecycle (incl. reconnect Rebind), so the
    /// next write re-borrows the fresh handle and no stale-cache window is introduced.
    /// </summary>
    internal int? TryResolveCachedOrBorrowed(string fullRef)
    {
        if (_itemHandles.TryGetValue(fullRef, out var existing)) return existing;
        if (_subscribedHandleSource?.Invoke(fullRef) is int borrowed && borrowed > 0) return borrowed;
        return null;
    }
- Rewire EnsureItemHandleAsync to use the seam, AddItem only on a null result, and count it:
    private async Task<int> EnsureItemHandleAsync(
        MxGatewaySession session, int serverHandle, string fullRef, CancellationToken ct)
    {
        if (TryResolveCachedOrBorrowed(fullRef) is int resolved) return resolved;
        var handle = await session.AddItemAsync(serverHandle, fullRef, ct).ConfigureAwait(false);
        Interlocked.Increment(ref _addItemCallCount);
        _itemHandles[fullRef] = handle;
        return handle;
    }
  (Keep the class-summary remark honest: note the cache may now be augmented by borrowed subscription handles that are consulted but not stored.)
Tests (extend GatewayGalaxyDataWriterTests, same no-live-session pattern):
- subscribedHandleSource returns a handle for a ref → TryResolveCachedOrBorrowed returns it, and CachedItemHandleCount stays 0 (borrow is not cached) and AddItemCallCount stays 0.
- A _itemHandles hit (via SeedHandleCachesForTest) wins over the source.
- null source → TryResolveCachedOrBorrowed returns null (today's behavior; would AddItem).
- Source returns 0/negative (failed subscribe) → treated as no-borrow (null).
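The resolution order these tests pin down (writer cache first, then a positive borrowed handle, otherwise null) can be exercised in isolation. The sketch below is a hypothetical stand-in, not GatewayGalaxyDataWriter: BorrowSeamSketch and SeedForTest are illustrative names, and the default string comparer on the cache is an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in mirroring only the cached-or-borrowed resolution order.
public sealed class BorrowSeamSketch
{
    // Assumption: default comparer; the real writer's cache may differ.
    private readonly ConcurrentDictionary<string, int> _itemHandles = new();
    private readonly Func<string, int?>? _subscribedHandleSource;

    public BorrowSeamSketch(Func<string, int?>? subscribedHandleSource = null) =>
        _subscribedHandleSource = subscribedHandleSource;

    public int CachedItemHandleCount => _itemHandles.Count;

    // Illustrative seeding hook, standing in for SeedHandleCachesForTest.
    public void SeedForTest(string fullRef, int handle) => _itemHandles[fullRef] = handle;

    public int? TryResolveCachedOrBorrowed(string fullRef)
    {
        // 1. A handle the writer itself added earlier always wins.
        if (_itemHandles.TryGetValue(fullRef, out var existing)) return existing;
        // 2. Otherwise borrow, but only a positive handle; the borrow is never stored.
        if (_subscribedHandleSource?.Invoke(fullRef) is int borrowed && borrowed > 0) return borrowed;
        // 3. Nothing available: the caller would have to AddItem.
        return null;
    }
}
```

Because step 2 never writes to the cache, CachedItemHandleCount stays at 0 after a borrow, which is the property the first test case asserts.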
Steps: failing tests → run filter FullyQualifiedName~GatewayGalaxyDataWriter (FAIL) → implement → re-run (PASS) → git add the two paths + commit feat(galaxy): writer borrows live subscription item handles (skip redundant AddItem).
Task 3: Wire the registry into the production writer in GalaxyDriver
Classification: small | Estimated implement time: ~2 min | Parallelizable with: none (blocked by Task 1 + Task 2)
Files:
- Modify: src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/GalaxyDriver.cs (the writer construction at ~255-256)
What: pass the registry resolver as the new 4th writer arg:
_dataWriter = new TracedGalaxyDataWriter(
new GatewayGalaxyDataWriter(
_ownedMxSession, _options.MxAccess.WriteUserId, _logger,
_subscriptions.TryResolveItemHandle),
...);
_subscriptions (line 72) is already a live field at this point. No other change. (The reconnect
path already invalidates the writer cache at GalaxyDriver.cs:298; borrowed handles aren't cached,
and the registry is Rebind'd in ReplayAsync, so the writer naturally re-borrows fresh handles
post-reconnect — nothing to add here.)
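Why no reconnect handling is needed in the writer can be seen from how the delegate behaves: a method-group conversion binds to the registry instance, so each call reads the registry's current state. The sketch below illustrates this under stated assumptions; RegistryStandIn and WiringSketch are hypothetical, and Rebind here simply overwrites a mapping rather than reproducing the real registry's logic.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the registry: a mutable ref -> handle map.
public sealed class RegistryStandIn
{
    private readonly Dictionary<string, int> _handles = new(StringComparer.OrdinalIgnoreCase);

    // Simplified: the real Rebind does more than overwrite one mapping.
    public void Rebind(string fullRef, int newHandle) => _handles[fullRef] = newHandle;

    public int? TryResolveItemHandle(string fullRef) =>
        _handles.TryGetValue(fullRef, out var h) ? h : null;
}

public static class WiringSketch
{
    // Returns the handle seen before and after a simulated reconnect Rebind.
    public static (int? Before, int? After) BorrowAcrossRebind()
    {
        var registry = new RegistryStandIn();
        registry.Rebind("Tag.A", 11);

        // Method-group conversion, the same shape as passing
        // _subscriptions.TryResolveItemHandle as the writer's 4th argument.
        Func<string, int?> source = registry.TryResolveItemHandle;

        var before = source("Tag.A");
        registry.Rebind("Tag.A", 23); // simulated post-reconnect handle
        var after = source("Tag.A");  // same delegate, fresh handle
        return (before, after);
    }
}
```

Since the writer never stores what the delegate returns, the second call is all it takes to pick up the post-reconnect handle.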
Steps: edit → dotnet build the Galaxy driver project (0 errors) → run the full
Driver.Galaxy.Tests suite (all green, confirms T1+T2+T3 integrate) → git add the path + commit
feat(galaxy): wire SubscriptionRegistry handle resolver into the production writer.
Task 4: Live-gw borrow smoke + verification + finish
Classification: small | Estimated implement time: ~5 min (+ live run) | Parallelizable with: none (blocked by Task 3)
Files:
- Modify: tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyLiveReopenAndWriteTests.cs
- Modify: docs/Galaxy.Performance.md or the relevant Galaxy doc: a short note that the writer borrows subscription handles (only if a natural home exists; otherwise skip docs, no overclaim).
What: add one skip-gated [Fact] reusing the harness's RequireLiveGatewayOrSkip / BuildClientOptions and the dedicated writable WriteRef = "TestMachine_002.TestFloat":
- Open a session + GatewayGalaxySubscriber; SubscribeBulkAsync([WriteRef], …) and capture the real item handle the gateway returned for WriteRef (from the SubscribeResult; check how GalaxyDriver maps subscribe results into TagBindings to extract the handle).
- Build a SubscriptionRegistry, Register a binding (WriteRef, thatHandle).
- Construct the writer with registry.TryResolveItemHandle as the source; write WriteRef. Assert: status Good (0u) and writer.AddItemCallCount == 0 (the borrow skipped AddItem).
- Control (same safe tag): a writer with the same session but an empty registry (or null source) writes WriteRef → Good and AddItemCallCount == 1 (no borrow ⇒ AddItem happened). Both write only TestMachine_002.TestFloat.
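The counter assertions in the smoke (0 for the borrow, 1 for the control) can be rehearsed offline before touching the gateway. This is a sketch under assumptions, not the live test: CountingWriterSketch is hypothetical, and the fakeAddItem delegate stands in for the real session.AddItemAsync call, which the sealed SDK session does not allow faking.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in: a delegate fakes the gateway AddItem round-trip.
public sealed class CountingWriterSketch
{
    private readonly ConcurrentDictionary<string, int> _itemHandles = new();
    private readonly Func<string, CancellationToken, Task<int>> _fakeAddItem;
    private readonly Func<string, int?>? _subscribedHandleSource;
    private int _addItemCallCount;

    public CountingWriterSketch(
        Func<string, CancellationToken, Task<int>> fakeAddItem,
        Func<string, int?>? subscribedHandleSource = null)
    {
        _fakeAddItem = fakeAddItem;
        _subscribedHandleSource = subscribedHandleSource;
    }

    public int AddItemCallCount => Volatile.Read(ref _addItemCallCount);

    public async Task<int> EnsureItemHandleAsync(string fullRef, CancellationToken ct = default)
    {
        // Cached or borrowed handles return without any (fake) gateway call.
        if (_itemHandles.TryGetValue(fullRef, out var cached)) return cached;
        if (_subscribedHandleSource?.Invoke(fullRef) is int borrowed && borrowed > 0) return borrowed;

        // Only this path costs a round-trip, so only this path is counted and cached.
        var handle = await _fakeAddItem(fullRef, ct).ConfigureAwait(false);
        Interlocked.Increment(ref _addItemCallCount);
        _itemHandles[fullRef] = handle;
        return handle;
    }
}
```

A writer built with a source that returns a positive handle should report AddItemCallCount == 0 after resolving; one built without a source should report 1 after the first resolve and stay at 1 on repeats, because the writer-added handle is cached.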
Verify:
- dotnet build ZB.MOM.WW.OtOpcUa.slnx: 0 errors.
- dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests: all green (live test skips without env vars).
- Live run (the gate): source the key without echoing and run only the new + existing live smokes:
    KEY=$(docker exec otopcua-dev-central-1-1 printenv GALAXY_MXGW_API_KEY)
    MXGW_ENDPOINT=http://10.100.0.48:5120 GALAXY_MXGW_API_KEY="$KEY" \
      dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests \
      --filter FullyQualifiedName~GatewayGalaxyLiveReopenAndWrite
  Expect: the existing 2 + the new borrow smoke pass (borrow write Good with AddItemCallCount==0, control with ==1). If the borrowed handle does NOT commit, the premise is false: stop, do not merge, report.
Finish: commit the test (+ optional doc) by explicit path; then
superpowers-extended-cc:finishing-a-development-branch → merge to master + push; delete the branch;
update project_stillpending_backlog.md + MEMORY.md (mark §2.4 shipped).
Steps: add the live [Fact] → build → unit suite green → live run PASS → commit → merge + push.