From 490c6b749840f9cd29740ba0ee1cf8b369a0e9db Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Thu, 18 Jun 2026 04:13:51 -0400
Subject: [PATCH] docs(plan): Galaxy writer handle-sharing implementation plan + tasks

---
 ...2026-06-18-galaxy-writer-handle-sharing.md | 224 ++++++++++++++++++
 ...galaxy-writer-handle-sharing.md.tasks.json |  17 ++
 2 files changed, 241 insertions(+)
 create mode 100644 docs/plans/2026-06-18-galaxy-writer-handle-sharing.md
 create mode 100644 docs/plans/2026-06-18-galaxy-writer-handle-sharing.md.tasks.json

diff --git a/docs/plans/2026-06-18-galaxy-writer-handle-sharing.md b/docs/plans/2026-06-18-galaxy-writer-handle-sharing.md
new file mode 100644
index 00000000..56b72fc6
--- /dev/null
+++ b/docs/plans/2026-06-18-galaxy-writer-handle-sharing.md
@@ -0,0 +1,224 @@
+# Galaxy writer ⇄ subscription-registry item-handle sharing — Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development to implement this plan task-by-task.
+
+**Goal:** Let `GatewayGalaxyDataWriter` borrow a live MXAccess item handle from the
+`SubscriptionRegistry` so the first write to an already-subscribed tag skips a redundant `AddItem`
+round-trip, without introducing any stale-handle regression.
+
+**Architecture:** Delegate seam (symmetric with the writer's existing `securityResolver`). The
+registry exposes a liveness-guarded `int? TryResolveItemHandle(fullRef)`; the writer takes an
+optional `Func<string, int?>? subscribedHandleSource` and consults it (never caching borrowed
+handles); `GalaxyDriver` wires the two together. Proven by unit tests for the resolution logic and
+a live-gw smoke that asserts a subscribed-tag write commits with **zero** `AddItem`.
+
+**Tech Stack:** C# / .NET 10, xUnit + Shouldly, MXAccess gateway (`ZB.MOM.WW.MxGateway.*`).
+
+**Design:** `docs/plans/2026-06-18-galaxy-writer-handle-sharing-design.md` (committed `c85c4e5c`).
+**Branch:** `feat/galaxy-writer-handle-sharing` (off master `70e1bde9`).
+
+**Hard rules:** stage by explicit path (never `git add .`); never stage `sql_login.txt`,
+`src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/`, `pending.md`, `stillpending.md`,
+`docker-dev/docker-compose.yml`; never echo/commit the gateway API key or any secret; no
+force-push; no `--no-verify`; **NO EF migration / Commons / proto change; NO bUnit.** Use
+`dangerouslyDisableSandbox: true` for all build/test/rig commands. Finish = merge to master + push.
+
+**Dependency graph:** `{T1 ∥ T2} → T3 → T4`. T1 (`SubscriptionRegistry.cs`) and T2
+(`GatewayGalaxyDataWriter.cs`) touch disjoint files — dispatch their implementers concurrently.
+
+---
+
+### Task 1: SubscriptionRegistry forward `fullRef → handle` lookup
+
+**Classification:** small
+**Estimated implement time:** ~4 min
+**Parallelizable with:** Task 2
+
+**Files:**
+- Modify: `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/SubscriptionRegistry.cs`
+- Test: `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/SubscriptionRegistryHandleResolveTests.cs` (create)
+
+**What:**
+1. Add a forward index field next to the existing dictionaries:
+   ```csharp
+   private readonly ConcurrentDictionary<string, int> _itemHandleByFullRef =
+       new(StringComparer.OrdinalIgnoreCase);
+   ```
+2. Maintain it incrementally (only for `binding.ItemHandle > 0`):
+   - In `Register` and in the **add** loop of `Rebind`: `_itemHandleByFullRef[binding.FullReference] = binding.ItemHandle;`
+   - In `Remove` and in the **drop** loop of `Rebind`: best-effort drop — when the handle's
+     reverse-map set became empty (the existing `remaining.IsEmpty` branch), also
+     `_itemHandleByFullRef.TryRemove(binding.FullReference, out _)` **only if** the current mapped
+     handle still equals this binding's handle (`TryGetValue` + equality before remove), so a
+     concurrent re-add for the same ref isn't clobbered.
+3. Add the public lookup with the liveness guard (the guard is what makes the best-effort removal
+   safe — a lingering entry can never resolve to a dead handle):
+   ```csharp
+   /// <summary>
+   /// Resolve the live MXAccess item handle a current subscription holds for <paramref name="fullRef"/>,
+   /// or null when no live subscription covers it. The writer borrows this handle to skip a
+   /// redundant AddItem. Guarded by the authoritative live-handle set (_subscribersByItemHandle)
+   /// so a stale forward-map entry can never hand out a dead handle.
+   /// </summary>
+   public int? TryResolveItemHandle(string fullRef)
+   {
+       if (fullRef is null) return null;
+       if (_itemHandleByFullRef.TryGetValue(fullRef, out var handle)
+           && _subscribersByItemHandle.ContainsKey(handle))
+           return handle;
+       return null;
+   }
+   ```
+
+**Tests** (xUnit + Shouldly; the registry needs no gateway — construct it and `Register` fake `TagBinding`s):
+- `Register` then `TryResolveItemHandle("Tag.A")` → the registered handle; case-insensitive (`"tag.a"` resolves).
+- A fullRef never registered → `null`.
+- `Remove` the subscription → `TryResolveItemHandle` → `null`.
+- `Rebind` with a new handle for the same ref → resolves the **new** handle (not the old).
+- Bindings with `ItemHandle <= 0` are not resolvable (`null`).
+- Liveness guard: a ref whose handle is no longer in the reverse map resolves `null` (reachable
+  via `Remove` of the only subscriber).
+
+**Steps:** write the failing tests → run `dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests --filter FullyQualifiedName~SubscriptionRegistryHandleResolve` (FAIL) → implement → re-run (PASS) → `git add` the two paths + commit `feat(galaxy): SubscriptionRegistry.TryResolveItemHandle forward lookup`.
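The forward index and liveness guard described in Task 1 can be sketched in isolation. This is a minimal illustration under stated assumptions, not the real `SubscriptionRegistry`: `SketchRegistry`, its `Register(fullRef, handle)` / `Remove(fullRef, handle)` signatures, the `DropLiveHandleOnly` helper, and the one-subscriber-per-handle `_liveHandles` set are hypothetical stand-ins for the `TagBinding`-based API and `_subscribersByItemHandle`.

```csharp
using System;
using System.Collections.Concurrent;

var registry = new SketchRegistry();

registry.Register("Tag.A", 7);
Console.WriteLine(registry.TryResolveItemHandle("tag.a") == 7);      // case-insensitive hit

registry.Register("Tag.Bad", 0);                                     // failed subscribe: not indexed
Console.WriteLine(registry.TryResolveItemHandle("Tag.Bad") is null);

registry.Register("Tag.A", 9);   // re-add under a new handle, as a rebind would
registry.Remove("Tag.A", 7);     // late drop of the old handle must not clobber 9
Console.WriteLine(registry.TryResolveItemHandle("Tag.A") == 9);

registry.DropLiveHandleOnly(9);  // forward entry lingers, handle is dead
Console.WriteLine(registry.TryResolveItemHandle("Tag.A") is null);   // guard refuses it

// Hypothetical stand-in (assumption): one subscriber per handle, no TagBinding, no Rebind.
sealed class SketchRegistry
{
    private readonly ConcurrentDictionary<string, int> _itemHandleByFullRef =
        new(StringComparer.OrdinalIgnoreCase);
    private readonly ConcurrentDictionary<int, byte> _liveHandles = new();

    public void Register(string fullRef, int itemHandle)
    {
        if (itemHandle <= 0) return;               // only positive handles are indexed
        _liveHandles[itemHandle] = 0;
        _itemHandleByFullRef[fullRef] = itemHandle;
    }

    public void Remove(string fullRef, int itemHandle)
    {
        _liveHandles.TryRemove(itemHandle, out _);
        // Best-effort drop: remove only if the mapping still points at this handle.
        if (_itemHandleByFullRef.TryGetValue(fullRef, out var mapped) && mapped == itemHandle)
            _itemHandleByFullRef.TryRemove(fullRef, out _);
    }

    // Simulates a lingering forward-map entry by dropping only the live handle.
    public void DropLiveHandleOnly(int itemHandle) => _liveHandles.TryRemove(itemHandle, out _);

    public int? TryResolveItemHandle(string fullRef)
    {
        if (fullRef is null) return null;
        if (_itemHandleByFullRef.TryGetValue(fullRef, out var handle)
            && _liveHandles.ContainsKey(handle))
            return handle;
        return null;
    }
}
```

Each `Console.WriteLine` prints `True`. The last case is the one the plan leans on: because the lookup checks the live-handle set, the forward map can be cleaned up lazily without ever handing out a dead handle.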
+
+---
+
+### Task 2: Writer borrow seam + `AddItemCallCount`
+
+**Classification:** small
+**Estimated implement time:** ~4 min
+**Parallelizable with:** Task 1
+
+**Files:**
+- Modify: `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/GatewayGalaxyDataWriter.cs`
+- Test: `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyDataWriterTests.cs` (extend)
+
+**What:**
+1. Add an optional **last** ctor param (keeps every existing call site compiling — current calls
+   are `new GatewayGalaxyDataWriter(session, writeUserId, logger?)`):
+   ```csharp
+   public GatewayGalaxyDataWriter(
+       GalaxyMxSession session, int writeUserId, ILogger? logger = null,
+       Func<string, int?>? subscribedHandleSource = null)
+   ```
+   Store it in a `private readonly Func<string, int?>? _subscribedHandleSource;` field.
+2. Add an `AddItemCallCount` seam (proves "AddItem was skipped" live):
+   ```csharp
+   private int _addItemCallCount;
+   internal int AddItemCallCount => Volatile.Read(ref _addItemCallCount);
+   ```
+3. Extract the cached-or-borrowed decision into a synchronous internal seam (unit-testable
+   **without a live session** — the SDK session is sealed/unfakeable):
+   ```csharp
+   /// <summary>
+   /// Resolve an item handle WITHOUT touching the gateway: a prior writer-AddItem'd handle
+   /// (_itemHandles), else a live subscription handle borrowed from the registry. Returns null
+   /// when neither is available (caller must AddItem). A borrowed handle is intentionally NOT
+   /// cached in _itemHandles — the registry owns its lifecycle (incl. reconnect Rebind), so the
+   /// next write re-borrows the fresh handle and no stale-cache window is introduced.
+   /// </summary>
+   internal int? TryResolveCachedOrBorrowed(string fullRef)
+   {
+       if (_itemHandles.TryGetValue(fullRef, out var existing)) return existing;
+       if (_subscribedHandleSource?.Invoke(fullRef) is int borrowed && borrowed > 0) return borrowed;
+       return null;
+   }
+   ```
+4. Rewire `EnsureItemHandleAsync` to use the seam, AddItem only on a null result, and count it:
+   ```csharp
+   private async Task<int> EnsureItemHandleAsync(
+       MxGatewaySession session, int serverHandle, string fullRef, CancellationToken ct)
+   {
+       if (TryResolveCachedOrBorrowed(fullRef) is int resolved) return resolved;
+       var handle = await session.AddItemAsync(serverHandle, fullRef, ct).ConfigureAwait(false);
+       Interlocked.Increment(ref _addItemCallCount);
+       _itemHandles[fullRef] = handle;
+       return handle;
+   }
+   ```
+   (Keep the class-summary remark honest: note the cache may now be augmented by borrowed
+   subscription handles that are consulted but not stored.)
+
+**Tests** (extend `GatewayGalaxyDataWriterTests`, same no-live-session pattern):
+- `subscribedHandleSource` returns a handle for a ref → `TryResolveCachedOrBorrowed` returns it,
+  and `CachedItemHandleCount` stays `0` (borrow is not cached) and `AddItemCallCount` stays `0`.
+- A `_itemHandles` hit (via `SeedHandleCachesForTest`) wins over the source.
+- `null` source → `TryResolveCachedOrBorrowed` returns `null` (today's behavior; would AddItem).
+- Source returns `0`/negative (failed subscribe) → treated as no-borrow (`null`).
+
+**Steps:** failing tests → run filter `FullyQualifiedName~GatewayGalaxyDataWriter` (FAIL) → implement → re-run (PASS) → `git add` the two paths + commit `feat(galaxy): writer borrows live subscription item handles (skip redundant AddItem)`.
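The resolution order in Task 2 (own cache first, then a borrowed subscription handle, then `AddItem`) can be exercised without a gateway. The sketch below is a hedged illustration only: `SketchWriter` is a hypothetical stand-in for `GatewayGalaxyDataWriter`, its synchronous `EnsureItemHandle` replaces the real async `session.AddItemAsync` call with a fake counter-based handle, and the handle values are made up.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical source: only "Tag.Sub" has a live subscription (handle 42).
var writer = new SketchWriter(fullRef => fullRef == "Tag.Sub" ? (int?)42 : null);

Console.WriteLine(writer.EnsureItemHandle("Tag.Sub") == 42);       // borrowed
Console.WriteLine(writer.AddItemCallCount == 0);                   // AddItem skipped
Console.WriteLine(writer.CachedItemHandleCount == 0);              // borrow not cached

int added = writer.EnsureItemHandle("Tag.Other");                  // no subscription: AddItem
Console.WriteLine(writer.AddItemCallCount == 1);
Console.WriteLine(writer.EnsureItemHandle("Tag.Other") == added);  // served from the cache
Console.WriteLine(writer.AddItemCallCount == 1);

// Hypothetical stand-in (assumption): no session, no async, fake handles from a counter.
sealed class SketchWriter
{
    private readonly Dictionary<string, int> _itemHandles = new();
    private readonly Func<string, int?>? _subscribedHandleSource;
    private int _nextFakeHandle = 1000;

    public SketchWriter(Func<string, int?>? subscribedHandleSource = null) =>
        _subscribedHandleSource = subscribedHandleSource;

    public int AddItemCallCount { get; private set; }
    public int CachedItemHandleCount => _itemHandles.Count;

    // Same decision as the plan's seam: cached handle, else a positive borrowed handle, else null.
    public int? TryResolveCachedOrBorrowed(string fullRef)
    {
        if (_itemHandles.TryGetValue(fullRef, out var existing)) return existing;
        if (_subscribedHandleSource?.Invoke(fullRef) is int borrowed && borrowed > 0) return borrowed;
        return null;
    }

    public int EnsureItemHandle(string fullRef)
    {
        if (TryResolveCachedOrBorrowed(fullRef) is int resolved) return resolved;
        var handle = ++_nextFakeHandle;   // stands in for the gateway AddItem round-trip
        AddItemCallCount++;
        _itemHandles[fullRef] = handle;   // only AddItem'd handles are cached
        return handle;
    }
}
```

Every line prints `True`. Leaving the borrowed handle out of `_itemHandles` means each write asks the source again, so a handle replaced during a reconnect is picked up on the next write.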
+
+---
+
+### Task 3: Wire the registry into the production writer in `GalaxyDriver`
+
+**Classification:** small
+**Estimated implement time:** ~2 min
+**Parallelizable with:** none (blocked by Task 1 + Task 2)
+
+**Files:**
+- Modify: `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/GalaxyDriver.cs` (the writer construction at ~`255-256`)
+
+**What:** pass the registry resolver as the new 4th writer arg:
+```csharp
+_dataWriter = new TracedGalaxyDataWriter(
+    new GatewayGalaxyDataWriter(
+        _ownedMxSession, _options.MxAccess.WriteUserId, _logger,
+        _subscriptions.TryResolveItemHandle),
+    ...);
+```
+`_subscriptions` (line 72) is already a live field at this point. No other change. (The reconnect
+path already invalidates the writer cache at `GalaxyDriver.cs:298`; borrowed handles aren't cached,
+and the registry is `Rebind`'d in `ReplayAsync`, so the writer naturally re-borrows fresh handles
+post-reconnect — nothing to add here.)
+
+**Steps:** edit → `dotnet build` the Galaxy driver project (0 errors) → run the full
+`Driver.Galaxy.Tests` suite (all green, confirms T1+T2+T3 integrate) → `git add` the path + commit
+`feat(galaxy): wire SubscriptionRegistry handle resolver into the production writer`.
+
+---
+
+### Task 4: Live-gw borrow smoke + verification + finish
+
+**Classification:** small
+**Estimated implement time:** ~5 min (+ live run)
+**Parallelizable with:** none (blocked by Task 3)
+
+**Files:**
+- Modify: `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyLiveReopenAndWriteTests.cs`
+- Modify: `docs/Galaxy.Performance.md` **or** the relevant Galaxy doc — a short note that the writer
+  borrows subscription handles (only if a natural home exists; otherwise skip docs — no overclaim).
+
+**What — add one skip-gated `[Fact]`** reusing the harness's `RequireLiveGatewayOrSkip` /
+`BuildClientOptions` and the dedicated writable `WriteRef = "TestMachine_002.TestFloat"`:
+
+1. Open a session + `GatewayGalaxySubscriber`; `SubscribeBulkAsync([WriteRef], …)` and capture the
+   **real** item handle the gateway returned for `WriteRef` (from the `SubscribeResult` — check how
+   `GalaxyDriver` maps subscribe results into `TagBinding`s to extract the handle).
+2. Build a `SubscriptionRegistry`, `Register` a binding `(WriteRef, thatHandle)`.
+3. Construct the writer **with** `registry.TryResolveItemHandle` as the source; write `WriteRef`.
+   Assert: status Good (`0u`) **and `writer.AddItemCallCount == 0`** (the borrow skipped AddItem).
+4. **Control** (same safe tag): a writer with the **same session but an empty registry** (or `null`
+   source) writes `WriteRef` → Good **and `AddItemCallCount == 1`** (no borrow ⇒ AddItem happened).
+   Both write only `TestMachine_002.TestFloat`.
+
+**Verify:**
+- `dotnet build ZB.MOM.WW.OtOpcUa.slnx` — 0 errors.
+- `dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests` — all green (live test skips
+  without env vars).
+- **Live run (the gate)** — source the key without echoing and run only the new + existing live smokes:
+  ```bash
+  KEY=$(docker exec otopcua-dev-central-1-1 printenv GALAXY_MXGW_API_KEY)
+  MXGW_ENDPOINT=http://10.100.0.48:5120 GALAXY_MXGW_API_KEY="$KEY" \
+    dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests \
+    --filter FullyQualifiedName~GatewayGalaxyLiveReopenAndWrite
+  ```
+  Expect: the existing 2 + the new borrow smoke pass (borrow write Good with `AddItemCallCount==0`,
+  control with `==1`). If the borrowed handle does NOT commit, the premise is false — **stop, do not
+  merge**, report.
+
+**Finish:** commit the test (+ optional doc) by explicit path; then
+superpowers-extended-cc:finishing-a-development-branch → merge to master + push; delete the branch;
+update `project_stillpending_backlog.md` + `MEMORY.md` (mark §2.4 shipped).
+
+**Steps:** add the live `[Fact]` → build → unit suite green → live run PASS → commit → merge + push.
diff --git a/docs/plans/2026-06-18-galaxy-writer-handle-sharing.md.tasks.json b/docs/plans/2026-06-18-galaxy-writer-handle-sharing.md.tasks.json
new file mode 100644
index 00000000..405ce9ca
--- /dev/null
+++ b/docs/plans/2026-06-18-galaxy-writer-handle-sharing.md.tasks.json
@@ -0,0 +1,17 @@
+{
+  "planPath": "docs/plans/2026-06-18-galaxy-writer-handle-sharing.md",
+  "designPath": "docs/plans/2026-06-18-galaxy-writer-handle-sharing-design.md",
+  "designCommit": "c85c4e5c",
+  "baseMaster": "70e1bde9",
+  "branch": "feat/galaxy-writer-handle-sharing",
+  "scope": "Galaxy writer borrows live MXAccess item handles from the SubscriptionRegistry so the first write to an already-subscribed tag skips a redundant AddItem round-trip. T1: SubscriptionRegistry.TryResolveItemHandle forward fullRef->handle lookup with a liveness guard. T2: writer optional Func<string, int?> subscribedHandleSource + TryResolveCachedOrBorrowed seam (borrowed handles NOT cached) + AddItemCallCount seam. T3: GalaxyDriver wires _subscriptions.TryResolveItemHandle into the production writer ctor. T4: skip-gated live-gw smoke proving a subscribed-tag write commits with AddItemCallCount==0 (control: ==1). NO Commons/proto/EF/migration change; NO bUnit; AdminUI untouched.",
+  "dependencyGraph": "{T1 ∥ T2} -> T3 -> T4",
+  "tasks": [
+    {"id": 1, "subject": "SubscriptionRegistry.TryResolveItemHandle forward lookup + tests", "classification": "small", "status": "pending", "parallelizableWith": [2], "files": ["src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/SubscriptionRegistry.cs", "tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/SubscriptionRegistryHandleResolveTests.cs"]},
+    {"id": 2, "subject": "Writer borrow seam + AddItemCallCount + tests", "classification": "small", "status": "pending", "parallelizableWith": [1], "files": ["src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/GatewayGalaxyDataWriter.cs", "tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyDataWriterTests.cs"]},
+    {"id": 3, "subject": "Wire registry resolver into the production writer in GalaxyDriver", "classification": "small", "status": "pending", "blockedBy": [1, 2], "files": ["src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/GalaxyDriver.cs"]},
+    {"id": 4, "subject": "Live-gw borrow smoke + build/test + live run + finish", "classification": "small", "status": "pending", "blockedBy": [3], "files": ["tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyLiveReopenAndWriteTests.cs"]}
+  ],
+  "executionState": "PENDING — subagent-driven execution, this session.",
+  "lastUpdated": "2026-06-18"
+}