# Galaxy writer ⇄ subscription-registry item-handle sharing — Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development to implement this plan task-by-task.

**Goal:** Let `GatewayGalaxyDataWriter` borrow a live MXAccess item handle from the
`SubscriptionRegistry` so the first write to an already-subscribed tag skips a redundant `AddItem`
round-trip, without introducing any stale-handle regression.

**Architecture:** Delegate seam (symmetric with the writer's existing `securityResolver`). The
registry exposes a liveness-guarded `int? TryResolveItemHandle(fullRef)`; the writer takes an
optional `Func<string,int?>? subscribedHandleSource` and consults it (never caching borrowed
handles); `GalaxyDriver` wires the two together. Proven by unit tests for the resolution logic and
a live-gw smoke that asserts a subscribed-tag write commits with **zero** `AddItem`.

**Tech Stack:** C# / .NET 10, xUnit + Shouldly, MXAccess gateway (`ZB.MOM.WW.MxGateway.*`).

**Design:** `docs/plans/2026-06-18-galaxy-writer-handle-sharing-design.md` (committed `c85c4e5c`).
**Branch:** `feat/galaxy-writer-handle-sharing` (off master `70e1bde9`).

**Hard rules:** stage by explicit path (never `git add .`); never stage `sql_login.txt`,
`src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/`, `pending.md`, `stillpending.md`,
`docker-dev/docker-compose.yml`; never echo/commit the gateway API key or any secret; no
force-push; no `--no-verify`; **NO EF migration / Commons / proto change; NO bUnit.** Use
`dangerouslyDisableSandbox: true` for all build/test/rig commands. Finish = merge to master + push.

**Dependency graph:** `{T1 ∥ T2} → T3 → T4`. T1 (`SubscriptionRegistry.cs`) and T2
(`GatewayGalaxyDataWriter.cs`) touch disjoint files — dispatch their implementers concurrently.

---

### Task 1: SubscriptionRegistry forward `fullRef → handle` lookup

**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 2

**Files:**
- Modify: `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/SubscriptionRegistry.cs`
- Test: `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/SubscriptionRegistryHandleResolveTests.cs` (create)

**What:**
1. Add a forward index field next to the existing dictionaries:
   ```csharp
   private readonly ConcurrentDictionary<string, int> _itemHandleByFullRef =
       new(StringComparer.OrdinalIgnoreCase);
   ```
2. Maintain it incrementally (only for `binding.ItemHandle > 0`):
   - In `Register` and in the **add** loop of `Rebind`: `_itemHandleByFullRef[binding.FullReference] = binding.ItemHandle;`
   - In `Remove` and in the **drop** loop of `Rebind`: best-effort drop — when the handle's
     reverse-map set became empty (the existing `remaining.IsEmpty` branch), also
     `_itemHandleByFullRef.TryRemove(binding.FullReference, out _)` **only if** the current mapped
     handle still equals this binding's handle (`TryGetValue` + equality before remove), so a
     concurrent re-add for the same ref isn't clobbered.
3. Add the public lookup with the liveness guard (the guard is what makes the best-effort removal
   safe — a lingering entry can never resolve to a dead handle):
   ```csharp
   /// <summary>
   /// Resolve the live MXAccess item handle a current subscription holds for <paramref name="fullRef"/>,
   /// or null when no live subscription covers it. The writer borrows this handle to skip a
   /// redundant AddItem. Guarded by the authoritative live-handle set (<c>_subscribersByItemHandle</c>)
   /// so a stale forward-map entry can never hand out a dead handle.
   /// </summary>
   public int? TryResolveItemHandle(string fullRef)
   {
       if (fullRef is null) return null;
       if (_itemHandleByFullRef.TryGetValue(fullRef, out var handle)
           && _subscribersByItemHandle.ContainsKey(handle))
           return handle;
       return null;
   }
   ```
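The forward-index maintenance and the liveness guard described above can be sketched in isolation. This is a minimal stand-in under stated assumptions, not the real `SubscriptionRegistry`: `MiniRegistry` is a hypothetical name, a plain subscriber count replaces the real per-handle subscriber set, and the guarded removal uses `ConcurrentDictionary.TryRemove(KeyValuePair)` (available since .NET 5) as an atomic alternative to the `TryGetValue` + equality check the plan spells out.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for SubscriptionRegistry: only the two maps relevant to the lookup.
public sealed class MiniRegistry
{
    // Forward index: full reference -> item handle (case-insensitive, as in the plan).
    private readonly ConcurrentDictionary<string, int> _itemHandleByFullRef =
        new(StringComparer.OrdinalIgnoreCase);

    // Assumption: a subscriber count stands in for the real per-handle subscriber set.
    private readonly ConcurrentDictionary<int, int> _subscribersByItemHandle = new();

    public void Register(string fullRef, int itemHandle)
    {
        if (itemHandle <= 0) return; // failed subscribes are never indexed
        _subscribersByItemHandle.AddOrUpdate(itemHandle, 1, (_, n) => n + 1);
        _itemHandleByFullRef[fullRef] = itemHandle;
    }

    public void Remove(string fullRef, int itemHandle)
    {
        if (!_subscribersByItemHandle.TryGetValue(itemHandle, out var count)) return;
        if (count > 1)
        {
            // Not atomic; acceptable for a single-threaded sketch only.
            _subscribersByItemHandle[itemHandle] = count - 1;
            return;
        }
        _subscribersByItemHandle.TryRemove(itemHandle, out _);
        // Removes the forward entry only if it still maps to this exact handle,
        // so a newer handle registered for the same ref is left untouched.
        _itemHandleByFullRef.TryRemove(new KeyValuePair<string, int>(fullRef, itemHandle));
    }

    public int? TryResolveItemHandle(string fullRef)
    {
        if (fullRef is null) return null;
        if (_itemHandleByFullRef.TryGetValue(fullRef, out var handle)
            && _subscribersByItemHandle.ContainsKey(handle))
            return handle;
        return null;
    }
}
```

In this sketch, registering `Tag.A` with handle 7, re-registering it with handle 9, and then removing the handle-7 binding leaves `Tag.A` resolving to 9, which is the "concurrent re-add isn't clobbered" property the guarded drop is meant to preserve.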

**Tests** (xUnit + Shouldly; the registry needs no gateway — construct it and `Register` fake `TagBinding`s):
- `Register` then `TryResolveItemHandle("Tag.A")` → the registered handle; case-insensitive (`"tag.a"` resolves).
- A fullRef never registered → `null`.
- `Remove` the subscription → `TryResolveItemHandle` → `null`.
- `Rebind` with a new handle for the same ref → resolves the **new** handle (not the old).
- Bindings with `ItemHandle <= 0` are not resolvable (`null`).
- Liveness guard: a ref whose handle is no longer in the reverse map resolves `null` (reachable
  via `Remove` of the only subscriber).

**Steps:** write the failing tests → run `dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests --filter FullyQualifiedName~SubscriptionRegistryHandleResolve` (FAIL) → implement → re-run (PASS) → `git add` the two paths + commit `feat(galaxy): SubscriptionRegistry.TryResolveItemHandle forward lookup`.

---

### Task 2: Writer borrow seam + `AddItemCallCount`

**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1

**Files:**
- Modify: `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Runtime/GatewayGalaxyDataWriter.cs`
- Test: `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyDataWriterTests.cs` (extend)

**What:**
1. Add an optional **last** ctor param (keeps every existing call site compiling — current calls
   are `new GatewayGalaxyDataWriter(session, writeUserId, logger?)`):
   ```csharp
   public GatewayGalaxyDataWriter(
       GalaxyMxSession session, int writeUserId, ILogger? logger = null,
       Func<string, int?>? subscribedHandleSource = null)
   ```
   Store it in a `private readonly Func<string, int?>? _subscribedHandleSource;` field.
2. Add an `AddItemCallCount` seam (proves "AddItem was skipped" live):
   ```csharp
   private int _addItemCallCount;
   internal int AddItemCallCount => Volatile.Read(ref _addItemCallCount);
   ```
3. Extract the cached-or-borrowed decision into a synchronous internal seam (unit-testable
   **without a live session** — the SDK session is sealed/unfakeable):
   ```csharp
   /// <summary>
   /// Resolve an item handle WITHOUT touching the gateway: a prior writer-AddItem'd handle
   /// (_itemHandles), else a live subscription handle borrowed from the registry. Returns null
   /// when neither is available (caller must AddItem). A borrowed handle is intentionally NOT
   /// cached in _itemHandles — the registry owns its lifecycle (incl. reconnect Rebind), so the
   /// next write re-borrows the fresh handle and no stale-cache window is introduced.
   /// </summary>
   internal int? TryResolveCachedOrBorrowed(string fullRef)
   {
       if (_itemHandles.TryGetValue(fullRef, out var existing)) return existing;
       if (_subscribedHandleSource?.Invoke(fullRef) is int borrowed && borrowed > 0) return borrowed;
       return null;
   }
   ```
4. Rewire `EnsureItemHandleAsync` to use the seam, AddItem only on a null result, and count it:
   ```csharp
   private async Task<int> EnsureItemHandleAsync(
       MxGatewaySession session, int serverHandle, string fullRef, CancellationToken ct)
   {
       if (TryResolveCachedOrBorrowed(fullRef) is int resolved) return resolved;
       var handle = await session.AddItemAsync(serverHandle, fullRef, ct).ConfigureAwait(false);
       Interlocked.Increment(ref _addItemCallCount);
       _itemHandles[fullRef] = handle;
       return handle;
   }
   ```
   (Keep the class-summary remark honest: note the cache may now be augmented by borrowed
   subscription handles that are consulted but not stored.)
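The resolution order in steps 3 and 4 (own cache first, then a borrowed handle, then `AddItem`) can be exercised without any gateway. The sketch below is an illustration only: `MiniWriter` is a hypothetical stand-in, and the injected `addItem` delegate is an assumption that replaces the real `session.AddItemAsync` call so the example stays self-contained.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for GatewayGalaxyDataWriter; the gateway call is an injected delegate.
public sealed class MiniWriter
{
    private readonly ConcurrentDictionary<string, int> _itemHandles =
        new(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, Task<int>> _addItem; // assumption: fakes session.AddItemAsync
    private readonly Func<string, int?>? _subscribedHandleSource;
    private int _addItemCallCount;

    public MiniWriter(Func<string, Task<int>> addItem,
                      Func<string, int?>? subscribedHandleSource = null)
    {
        _addItem = addItem;
        _subscribedHandleSource = subscribedHandleSource;
    }

    public int AddItemCallCount => Volatile.Read(ref _addItemCallCount);
    public int CachedItemHandleCount => _itemHandles.Count;

    // Own cache wins; otherwise borrow a positive handle; otherwise null (caller must AddItem).
    public int? TryResolveCachedOrBorrowed(string fullRef)
    {
        if (_itemHandles.TryGetValue(fullRef, out var existing)) return existing;
        if (_subscribedHandleSource?.Invoke(fullRef) is int borrowed && borrowed > 0) return borrowed;
        return null;
    }

    public async Task<int> EnsureItemHandleAsync(string fullRef)
    {
        if (TryResolveCachedOrBorrowed(fullRef) is int resolved) return resolved;
        var handle = await _addItem(fullRef).ConfigureAwait(false);
        Interlocked.Increment(ref _addItemCallCount);
        _itemHandles[fullRef] = handle; // only writer-added handles are cached
        return handle;
    }
}
```

With a source that returns a handle, `EnsureItemHandleAsync` returns the borrowed value while both `AddItemCallCount` and `CachedItemHandleCount` stay at zero. With no source, the fake `addItem` runs once and its result is cached.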

**Tests** (extend `GatewayGalaxyDataWriterTests`, same no-live-session pattern):
- `subscribedHandleSource` returns a handle for a ref → `TryResolveCachedOrBorrowed` returns it;
  `CachedItemHandleCount` stays `0` (the borrow is not cached) and `AddItemCallCount` stays `0`.
- A `_itemHandles` hit (via `SeedHandleCachesForTest`) wins over the source.
- `null` source → `TryResolveCachedOrBorrowed` returns `null` (today's behavior; would AddItem).
- Source returns `0`/negative (failed subscribe) → treated as no-borrow (`null`).

**Steps:** failing tests → run filter `FullyQualifiedName~GatewayGalaxyDataWriter` (FAIL) → implement → re-run (PASS) → `git add` the two paths + commit `feat(galaxy): writer borrows live subscription item handles (skip redundant AddItem)`.

---

### Task 3: Wire the registry into the production writer in `GalaxyDriver`

**Classification:** small
**Estimated implement time:** ~2 min
**Parallelizable with:** none (blocked by Task 1 + Task 2)

**Files:**
- Modify: `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/GalaxyDriver.cs` (the writer construction at ~`255-256`)

**What:** pass the registry resolver as the new 4th writer arg:
```csharp
_dataWriter = new TracedGalaxyDataWriter(
    new GatewayGalaxyDataWriter(
        _ownedMxSession, _options.MxAccess.WriteUserId, _logger,
        _subscriptions.TryResolveItemHandle),
    ...);
```
`_subscriptions` (line 72) is already a live field at this point. No other change. (The reconnect
path already invalidates the writer cache at `GalaxyDriver.cs:298`; borrowed handles aren't cached,
and the registry is `Rebind`'d in `ReplayAsync`, so the writer naturally re-borrows fresh handles
post-reconnect — nothing to add here.)
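Why passing the method group is enough can be shown with a small sketch. The delegate created from `registry.TryResolveItemHandle` stays bound to the registry instance, so each call sees the registry's current state instead of a snapshot. `RegistryStub` and `Wiring` below are hypothetical names for illustration, and the mutable `CurrentHandle` property is an assumption standing in for a reconnect `Rebind`.

```csharp
#nullable enable
using System;

// Hypothetical stub: one tag whose handle can change, standing in for a Rebind after reconnect.
public sealed class RegistryStub
{
    public int CurrentHandle { get; set; }

    public int? TryResolveItemHandle(string fullRef) =>
        CurrentHandle > 0 ? CurrentHandle : (int?)null;
}

public static class Wiring
{
    // Method-group conversion: the instance method already has the Func<string, int?> shape,
    // and the resulting delegate keeps a reference to the registry instance.
    public static Func<string, int?> SourceFor(RegistryStub registry) =>
        registry.TryResolveItemHandle;
}
```

Because the writer never stores what the delegate returns, changing `CurrentHandle` between two calls is all it takes for the second call to observe the new handle.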

**Steps:** edit → `dotnet build` the Galaxy driver project (0 errors) → run the full
`Driver.Galaxy.Tests` suite (all green, confirms T1+T2+T3 integrate) → `git add` the path + commit
`feat(galaxy): wire SubscriptionRegistry handle resolver into the production writer`.

---

### Task 4: Live-gw borrow smoke + verification + finish

**Classification:** small
**Estimated implement time:** ~5 min (+ live run)
**Parallelizable with:** none (blocked by Task 3)

**Files:**
- Modify: `tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/Runtime/GatewayGalaxyLiveReopenAndWriteTests.cs`
- Modify: `docs/Galaxy.Performance.md` **or** the relevant Galaxy doc — a short note that the writer
  borrows subscription handles (only if a natural home exists; otherwise skip docs — no overclaim).

**What — add one skip-gated `[Fact]`** reusing the harness's `RequireLiveGatewayOrSkip` /
`BuildClientOptions` and the dedicated writable `WriteRef = "TestMachine_002.TestFloat"`:

1. Open a session + `GatewayGalaxySubscriber`; `SubscribeBulkAsync([WriteRef], …)` and capture the
   **real** item handle the gateway returned for `WriteRef` (from the `SubscribeResult` — check how
   `GalaxyDriver` maps subscribe results into `TagBinding`s to extract the handle).
2. Build a `SubscriptionRegistry`, `Register` a binding `(WriteRef, thatHandle)`.
3. Construct the writer **with** `registry.TryResolveItemHandle` as the source; write `WriteRef`.
   Assert: status Good (`0u`) **and `writer.AddItemCallCount == 0`** (the borrow skipped AddItem).
4. **Control** (same safe tag): a writer with the **same session but an empty registry** (or `null`
   source) writes `WriteRef` → Good **and `AddItemCallCount == 1`** (no borrow ⇒ AddItem happened).
   Both write only `TestMachine_002.TestFloat`.

**Verify:**
- `dotnet build ZB.MOM.WW.OtOpcUa.slnx` — 0 errors.
- `dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests` — all green (live test skips
  without env vars).
- **Live run (the gate)** — source the key without echoing and run only the new + existing live smokes:
  ```bash
  KEY=$(docker exec otopcua-dev-central-1-1 printenv GALAXY_MXGW_API_KEY)
  MXGW_ENDPOINT=http://10.100.0.48:5120 GALAXY_MXGW_API_KEY="$KEY" \
  dotnet test tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests \
  --filter FullyQualifiedName~GatewayGalaxyLiveReopenAndWrite
  ```
  Expect: the existing 2 + the new borrow smoke pass (borrow write Good with `AddItemCallCount==0`,
  control with `==1`). If the borrowed handle does NOT commit, the premise is false — **stop, do not
  merge**, report.
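The "skips without env vars" behavior can be pictured with a small gate helper. This is only a sketch of the idea: `LiveGatewayGate` is a hypothetical name, and the assumption that the gate depends on exactly `MXGW_ENDPOINT` and `GALAXY_MXGW_API_KEY` is inferred from the live-run command above, not from the real `RequireLiveGatewayOrSkip` implementation.

```csharp
#nullable enable
using System;

// Hypothetical helper; the real harness gate (RequireLiveGatewayOrSkip) may differ.
public static class LiveGatewayGate
{
    // Returns true only when both variables are present and non-empty.
    // The lookup is injected so the gate can be exercised without touching the process environment.
    public static bool TryRead(Func<string, string?> getEnv, out string endpoint, out string apiKey)
    {
        endpoint = getEnv("MXGW_ENDPOINT") ?? "";
        apiKey = getEnv("GALAXY_MXGW_API_KEY") ?? ""; // never log or echo this value
        return endpoint.Length > 0 && apiKey.Length > 0;
    }
}
```

A live test would call `LiveGatewayGate.TryRead(Environment.GetEnvironmentVariable, out var endpoint, out var apiKey)` and skip when it returns `false`, keeping the key out of any assertion message or log line.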

**Finish:** commit the test (+ optional doc) by explicit path; then
superpowers-extended-cc:finishing-a-development-branch → merge to master + push; delete the branch;
update `project_stillpending_backlog.md` + `MEMORY.md` (mark §2.4 shipped).

**Steps:** add the live `[Fact]` → build → unit suite green → live run PASS → commit → merge + push.