# Pending — open follow-ups & deferrals

As of 2026-06-14. master HEAD `c24abc8a` (synced with origin; feature branch `feat/galaxy-phase-c-historian` ff-merged + deleted). Working tree is clean except the expected DISK-ONLY files: `docker-dev/docker-compose.yml` (M — uncommitted rig config, never staged) and `pending.md` (M — these notes, never staged), plus two untracked pre-existing `docs/plans/2026-06-14-write-outcome-self-correction-plan.md*` docs.

HARD RULE: never `git add .`; never stage `pending.md` / `current.md` / `docker-dev/docker-compose.yml` / `sql_login.txt` / `src/Server/ZB.MOM.WW.OtOpcUa.Host/pki/`; never commit secrets.

**GALAXY PHASE C — SERVER-SIDE OPC UA HistoryRead MERGED + PUSHED to master `c24abc8a` (2026-06-14, fast-forward, 14 commits = 2 design/plan docs + 12 feature/test/doc).** The server now answers OPC UA **HistoryRead** (Raw / Processed / AtTime over historized variable nodes; **Events** over alarm-owning equipment-folder event-notifier nodes) for any equipment tag flagged historized, driver-agnostically, by dispatching to the registered `IHistorianDataSource` (the Wonderware historian TCP client, which already implemented that interface). **NO EF migration** — the flag rides in the existing `TagConfig` JSON blob (`{"FullName":"…","isHistorized":true,"historianTagname":"…"?}`, the Phase-B `alarm`-object carrier), `historianTagname` defaults to the tag's driver `FullName`. Design/plan `docs/plans/2026-06-14-galaxy-phase-c-historian-*.md`; guide `docs/Historian.md`.

Pipeline: `Phase7Composer.ExtractTagHistorize` + byte-parity `DeploymentArtifact.ExtractTagHistorize` → `EquipmentTagPlan.{IsHistorized,HistorianTagname}` → `Phase7Applier` resolves `IsHistorized ? (HistorianTagname ?? FullName) : null` → sink seam (`IOpcUaAddressSpaceSink.EnsureVariable` +`string? historianTagname`) → `OtOpcUaNodeManager.EnsureVariable` sets `Historizing`+`AccessLevels.HistoryRead`+registers `_historizedTagnames`; the node manager overrides the four `CustomNodeManager2` HistoryRead virtuals, block-bridging to the `HistorianDataSource` property (volatile, default `NullHistorianDataSource.Instance`).

DI mirrors `AlarmHistorian`: `AddServerHistorian` (config-gated, `ServerHistorian` appsettings section, Null default via `TryAddSingleton`) + `OtOpcUaSdkServer.SetHistorianDataSource` + Host `Program.cs`/`OtOpcUaServerHostedService` Start/Stop wiring.

**Graceful degrade:** historized node + Null/unconfigured source → `Good_NoData` (empty), non-historized node → `BadHistoryOperationUnsupported`.

**KEY FACTS/GOTCHAS:**

- the SDK base filters event reads by the `EventNotifier.HistoryRead` bit (variable nodes never reach the events arm);
- the SDK master propagates `errors[i].Code → results[i].StatusCode` (confirmed by decompiling `MasterNodeManager.HistoryReadAsync`), so the override signals per-node status via `errors[handle.Index]`;
- the two `HistoryReadResult` types (SDK `Opc.Ua.HistoryReadResult` vs Core.Abstractions DTO) are aliased `SdkHistoryReadResult`/`HistorianRead`;
- `ReadRawModifiedDetails.IsReadModified` defaults TRUE (Initialize() sets it) so a plain raw read must clear it — modified-history is unsupported;
- the events arm registers folder→sourceName (= equipment id) only when a non-Null historian is wired at promotion time (Host wires the source at StartAsync before any deployment materialises, so normal boot ordering is correct).

Built via subagent-driven dev (T1/T2/T5 standard parallel review, T3/T4 high-risk serial spec→code, + a final integration review READY-TO-MERGE).
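The tagname resolution and graceful-degrade rules in this banner can be sketched in a few lines. This is a minimal Python illustration, not the C# pipeline: the function names are hypothetical, the `TagConfig` keys are the ones quoted in the JSON shape above, and the `Good` status for a configured source is an assumption (the real status comes from the historian).

```python
import json


def resolve_historian_tagname(tag_config_json):
    """Mirror of `IsHistorized ? (HistorianTagname ?? FullName) : null`."""
    cfg = json.loads(tag_config_json)
    if not cfg.get("isHistorized", False):
        return None  # not historized: no historian tagname is registered
    name = cfg.get("historianTagname")
    # Null-coalescing: only a missing/None tagname falls back to FullName.
    return name if name is not None else cfg["FullName"]


def history_read_status(historian_tagname, source_configured):
    """Per-node status under the graceful-degrade rules (simplified)."""
    if historian_tagname is None:
        return "BadHistoryOperationUnsupported"  # non-historized node
    if not source_configured:
        return "Good_NoData"  # historized node, Null/unconfigured source
    return "Good"  # assumption: a real source decides the actual status


plain = resolve_historian_tagname('{"FullName":"Tank.Level"}')
defaulted = resolve_historian_tagname('{"FullName":"Tank.Level","isHistorized":true}')
explicit = resolve_historian_tagname(
    '{"FullName":"Tank.Level","isHistorized":true,"historianTagname":"TankLevelHist"}'
)
```

`Tank.Level` and `TankLevelHist` are made-up tag names used only to exercise the three cases.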
**Build clean (0 errors); OpcUaServer.Tests 152/0, Runtime.Tests 234/0, Core.Abstractions.Tests 88/0.** **LIVE `/run` GATE (T7) DEFERRED — operator-driven: it needs the Wonderware sidecar + AVEVA Historian on the WW Historian VM `10.100.0.48`, which is NOT on the local docker-dev rig.** When run: author a historized Galaxy tag (`TagConfig` `"isHistorized":true`), set `ServerHistorian:Enabled=true` → sidecar (Host/Port/SharedSecret/TLS), deploy on `MAIN-galaxy-eq`, then `Client.CLI historyread -n "ns=2;s=/" --start … --end …` → samples; a non-historized tag → `BadHistoryOperationUnsupported`.

KNOWN follow-ups (non-blocking, documented):

- single-shot reads only (no server-managed continuation-point paging yet);
- no modified-value history;
- no explicit timeout at the block-bridge call site (bounded by the `WonderwareHistorianClient` 30s `CallTimeout`);
- the StopAsync→DisposeAsync warm-shutdown window returns `Good_NoData` (mirrors the `SetNodeWriteGateway` pattern).

**DRIVER-RECONFIGURE-WHILE-FAULTED (#7) MERGED + PUSHED to master `56f73e49` (2026-06-14, fast-forward, 5 commits = 2 docs + fix + review-nit + task-status).** A `DriverInstanceActor` stuck `Connecting`/`Reconnecting` now **adopts a corrected config delivered via `ApplyDelta`** and re-initialises with it, instead of dead-lettering the message and retrying the stale config forever (old workaround = restart node). Design/plan `docs/plans/2026-06-14-driver-reconfigure-while-faulted-*.md`.
Mechanism (approach B): a monotonic `_initGeneration` tags each `InitializeAsync`; `InitializeSucceeded(int Generation)`/`InitializeFailed(string Reason, int Generation)` carry it; the `Connecting`/`Reconnecting` result handlers **drop superseded (stale-generation) results** so a corrected config always wins against an old init still in flight; a new `AdoptConfigDuringInit` (wired into both not-connected states) calls `InitializeAsync(newConfig)` (swaps `_currentConfigJson`, bumps the generation, retries immediately) + replies `ApplyResult(true,…)`. Contained to ONE file (`src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs`) + its test — NO host/contract/EF change; `Connected`/`Stubbed` `ApplyDelta` paths untouched; the two result records have ZERO external consumers (grep-verified) so the shape change is fully contained. Built via subagent-driven dev (high-risk full chain: spec ✅ · code ✅ · final integration ✅ — the integration pass traced the no-strand lifecycle in both states, host↔child contract, subscription/desired-refs/alarm safety, health/redundancy, double-adopt, and test fidelity, all clean). **Build clean (0 errors); Runtime 224/224.** **LIVE `/run` GATE DEFERRED per user ("skip the live test part").** When run: put `MAIN-opcua-eq` into a faulted/`Reconnecting` state via a bad `DriverConfig`, deploy a corrected config (`POST http://localhost:9200/api/deployments`, `X-Api-Key: docker-dev-deploy-key`), and confirm from central-1 logs that the driver adopts the new config + connects WITHOUT a node restart. 
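The generation guard described in this banner can be illustrated with a small sketch. It is a simplified Python model, not the actor code: `DriverInitGuard` and `InitResult` are hypothetical names, the two result records are collapsed into one, and the state strings only echo the state names used in these notes.

```python
from dataclasses import dataclass


@dataclass
class InitResult:
    ok: bool
    generation: int  # generation of the init attempt that produced this result
    reason: str = ""


class DriverInitGuard:
    """Toy model of the not-connected states with a monotonic generation."""

    def __init__(self, config_json):
        self.current_config = config_json
        self.generation = 0
        self.state = "Connecting"

    def begin_init(self, config_json=None):
        # Each init attempt gets a fresh generation; a new config replaces the old.
        if config_json is not None:
            self.current_config = config_json
        self.generation += 1
        return self.generation

    def adopt_config(self, new_config):
        # Corrected config arrives mid-init: swap, bump, retry, reply success.
        return True, self.begin_init(new_config)

    def on_result(self, result):
        if result.generation != self.generation:
            return False  # superseded result from an older init: drop it
        self.state = "Connected" if result.ok else "Reconnecting"
        return True


guard = DriverInitGuard('{"Host":"bad"}')
old_gen = guard.begin_init()
accepted, new_gen = guard.adopt_config('{"Host":"good"}')
stale_handled = guard.on_result(InitResult(False, old_gen, "timeout"))
state_after_stale = guard.state
fresh_handled = guard.on_result(InitResult(True, new_gen))
```

The late failure of the old init is ignored because its generation no longer matches, so it cannot knock the actor back onto the stale config.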
**GALAXY PHASE B — NATIVE ALARMS ON THE EQUIPMENT-TAG PATH MERGED + PUSHED to master `f9be3843` (2026-06-14, fast-forward, 12 feature + 2 doc commits).** A Galaxy equipment `Tag` marked as a native alarm via its `TagConfig.alarm` object — `{"FullName":"tag.attr","alarm":{"alarmType":"OffNormalAlarm","severity":700}}`, **NO EF migration** — now materialises a real OPC UA Part 9 `AlarmConditionState` under its equipment folder, driven live by the driver's `IAlarmSource.OnAlarmEvent`; transitions fan out to `/alerts` + the historian (Primary-gated). Design/plan `docs/plans/2026-06-14-galaxy-phase-b-native-alarms-*.md`. NEW seam (mirrors the scripted-alarm seam, reuses the condition sink UNCHANGED): `AlarmEventArgs.Kind` (additive contract; Galaxy populates it) → `DriverInstanceActor` subscribes `OnAlarmEvent` → `AttributeAlarmPublished` → `DriverHostActor._alarmNodeIdByDriverRef` + `NativeAlarmProjector` (transition→`AlarmConditionSnapshot`) → `OpcUaPublishActor.AlarmStateUpdate` → reused `OtOpcUaNodeManager.WriteAlarmCondition`. Built via subagent-driven dev (full per-task review chain). **Build clean; Core.Abstractions 81/0, OpcUaServer 118/0, Runtime 222/0, Galaxy 262/0(+1 live-gw skip).** **THE FINAL INTEGRATION REVIEW CAUGHT A CRITICAL SEAM BUG every unit test missed (fixed in `f9be3843`):** the alarm map is keyed by the dotted `FullName`, but `GalaxyDriver` puts the BARE owning object in `AlarmEventArgs.SourceNodeId` and the DOTTED alarm ref in `ConditionId` (`GalaxyDriver.cs:1148-1149`; `AlarmFullReference`) — so `ForwardNativeAlarm` MUST resolve on `msg.Args.ConditionId` (= `AlarmFullReference` = the authored `FullName`), NOT `SourceNodeId`. The unit test had masked it by setting `SourceNodeId==FullName` (never true in prod); it is now production-shaped (`SourceNodeId="Temp"` ≠ `ConditionId="Temp.HiHi"=FullName`) so it genuinely guards the seam. 
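The seam bug above comes down to which event field is used as the lookup key. A minimal Python sketch, assuming a plain dictionary in place of `_alarmNodeIdByDriverRef` and a hypothetical node id; only the `Temp` / `Temp.HiHi` values come from the notes:

```python
from dataclasses import dataclass


@dataclass
class AlarmEvent:
    source_node_id: str  # bare owning object, e.g. "Temp"
    condition_id: str    # dotted alarm reference, e.g. "Temp.HiHi"


# The map is keyed by the authored FullName (the dotted alarm reference).
# The node id on the right is a made-up placeholder.
alarm_node_by_ref = {"Temp.HiHi": "ns=2;s=EQ-example/TempHiHi"}


def resolve_alarm_node(event, node_by_ref):
    """Resolve on condition_id; source_node_id never matches in production."""
    return node_by_ref.get(event.condition_id)


event = AlarmEvent(source_node_id="Temp", condition_id="Temp.HiHi")
resolved = resolve_alarm_node(event, alarm_node_by_ref)
wrong_key = alarm_node_by_ref.get(event.source_node_id)
```

A test fixture that sets both fields to the same string would pass with either key, which is how the original unit test hid the problem.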
**LIVE `/run` GATE (T9) NOT YET DONE — user-driven, deferred at merge (user choice "merge now").** When run: author a Galaxy alarm equipment tag whose `FullName` EXACTLY matches the gateway's `AlarmFullReference` (discoverable via the Galaxy picker/a probe), deploy on `MAIN-galaxy-eq`, trip the alarm → Part 9 condition goes active under the equipment + the `/alerts` row appears; clear → inactive.

RESIDUAL non-blocking follow-ups (review-surfaced):

- (a) `DetachSubscription` alarm-coupling doc note + a dead-letter-during-reconnect regression test (WS-4b);
- (b) ack/comment-path test + assert `evt.Comment` (WS-5);
- (c) a `docs/ScriptedAlarms.md` note that authored severity (1..1000, seeds the condition at materialise) snaps to the projector's 4-bucket value (200/500/700/900) on the first transition;
- (d) DEFERRED by design: inbound device-ack (client Ack → `IAlarmSource.AcknowledgeAsync` → AVEVA), driving `SubscribeAlarmsAsync` from the materialised alarm-ref set (Galaxy doesn't need it), AdminUI Galaxy-picker `alarm` pre-fill, carrying raw OPC UA severity end-to-end.

Phase **C** (server `HistoryRead`) **DONE — merged `c24abc8a` (2026-06-14); see the Phase C banner at top.**

**WRITE-OUTCOME SELF-CORRECTION (#5) MERGED + PUSHED to master `1d797c1c` (2026-06-14, fast-forward, 6 commits).** A failed inbound device write now reverts the node to its real pre-write value (compare-and-revert) instead of leaving the optimistic-`Good` phantom. Design `docs/plans/2026-06-14-write-outcome-self-correction-design.md`; plan `…-plan.md`. NEW `IOpcUaNodeWriteGateway`/`NodeWriteOutcome` (Commons) + `ActorNodeWriteGateway` (Runtime, Asks `RouteNodeWrite`, returns the outcome) replace the fire-and-forget `Action` router; `OnEquipmentTagWrite` captures the prior value + fires an off-Lock (`RunContinuationsAsynchronously`) continuation that reverts on a failed outcome IF the node still holds the optimistic value (`ShouldRevert`). Build clean; Commons 39/0, Runtime 201/0, OpcUaServer 111/0.
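The compare-and-revert rule can be shown with a short sketch. This is a Python illustration under simplifying assumptions (no locking, no async continuation); `Node`, `should_revert` and `complete_write` are hypothetical names that only echo the `ShouldRevert` idea from the banner.

```python
from dataclasses import dataclass


@dataclass
class Node:
    value: object


def should_revert(current, optimistic, write_succeeded):
    """Revert only a failed write whose optimistic value is still on the node."""
    return (not write_succeeded) and current == optimistic


def complete_write(node, prior, optimistic, write_succeeded):
    """Run when the device outcome arrives; returns True if a revert happened."""
    if should_revert(node.value, optimistic, write_succeeded):
        node.value = prior
        return True
    return False


# Failed write: the node reverts from the optimistic 99 back to 20.
failed = Node(20)
prior = failed.value
failed.value = 99
reverted = complete_write(failed, prior, 99, write_succeeded=False)

# Failed write, but a fresh device value (42) landed first: leave it alone.
raced = Node(20)
raced.value = 99
raced.value = 42
raced_reverted = complete_write(raced, 20, 99, write_succeeded=False)

# Successful write: the written value stays.
good = Node(0)
good.value = 7777
good_reverted = complete_write(good, 0, 7777, write_succeeded=True)
```

The comparison against the optimistic value is what keeps a late failure from overwriting a newer value that the driver has already published.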
High-risk review verified the prior-value capture against the actual UA-.NETStandard source (`OnWriteValue` fires before `m_value=value`). **LIVE-PROVEN end-to-end** via a local Modbus exception-injector (FC06 reject on HR[20]): authorized failing write → gateway-logged `0x808B0000` reject → node reverts 99→20; authorized success (HR200=7777) stays; anon → BadUserAccessDenied. KEY FINDING: the **Galaxy gateway worker's `ExecuteWrite` is fire-and-forget** (returns OK without awaiting the MXAccess commit), so Galaxy writes ALWAYS return Success at the OPC UA layer and can NEVER surface a device-write failure to this revert — only protocol drivers (which await + return real status) can; this is the same gateway-side limitation noted under "optimistic-write phantom" (out of our scope). Two MINOR deferred follow-ups remain: a Bad-quality blip / OPC UA AuditWriteUpdateEvent on failure, and synchronous structural fail-fast (both explicitly out of scope per the chosen mechanism).

**HARDEN MILESTONE 1b cluster MERGED + PUSHED to master `945c2380` (2026-06-14, fast-forward, 9 commits).** Follow-ups #3 (data-plane role docs), #4 (write-pipeline review nits), and #6 (Galaxy driver nits) below are CLOSED. Plan: `docs/plans/2026-06-14-harden-milestone-1b-plan.md`. Build clean; Runtime 197/0, FOCAS 185/0, Galaxy 257/0(+1 pre-existing skip); final integration review READY-TO-MERGE. Two MINOR residual follow-ups surfaced by review (both deferred, non-blocking): (a) a *driver-level* regression test that `GalaxyDriver.ReopenAsync` actually calls `InvalidateHandleCaches` — needs a live gw (`RecreateAsync` can't be faked), so it's an integration test; (b) stub-driver test-class duplication between `DriverInstanceActorTests` + `DriverInstanceActorWriteAndSubscribeTests` (hygiene — extract a shared harness).

## STATE SUMMARY (post-compaction pickup)

**ALL feature work is SHIPPED + PUSHED to master `c24abc8a` (synced with origin). Nothing is blocking.** Milestone 1b (equipment-tag live values: live READ + authorized inbound WRITE across OpcUaClient / the 6 protocol drivers / Galaxy, via the `FullName→NodeId` router) is COMPLETE, and all three Galaxy phases shipped: **A** standard Equipment driver `c3c56172`, **B** native alarms `f9be3843`, **C** server-side HistoryRead `c24abc8a`. The session's cluster-harden / write-outcome-self-correction / driver-reconfigure-while-faulted follow-ups merged too (`945c2380` / `1d797c1c` / `56f73e49`). The six banners above carry each feature's mechanism + gotchas + deferred live gate; the closed open-follow-ups #1–#7 (Phase B, Phase C, data-plane role docs, write-pipeline nits, write-outcome, Galaxy driver nits, reconfigure-while-faulted) are all DONE at those SHAs.

**The ONLY genuinely open items (all user-driven / deferred — pick up here):**

1. **User-driven live `/run` gates** — the agent does NOT sign in; all code is merged + unit-verified; these are the operator's end-to-end confirmations:
   - **Phase C HistoryRead** (T7) — needs the Wonderware sidecar + AVEVA on the WW Historian VM `10.100.0.48` (NOT on the local docker-dev rig). Recipe: Phase C banner + `docs/Historian.md`.
   - **Phase B native alarms** (T9) — author a Galaxy alarm tag whose `FullName` == the gateway `AlarmFullReference`, deploy on `MAIN-galaxy-eq`, trip → Part 9 condition + `/alerts` row. Recipe: Phase B banner.
   - **Driver-reconfigure-while-faulted** — fault `MAIN-opcua-eq` with a bad config, deploy a corrected one (`POST http://localhost:9200/api/deployments`, `X-Api-Key: docker-dev-deploy-key`), confirm it adopts WITHOUT a node restart. Recipe: that banner.
2. **Rig cleanups** (operational, user-deferred) — see "Operational deferral" at the bottom.
3. **Minor non-blocking residual follow-ups** (review-surfaced, all explicitly deferred, none gate anything): Phase B residuals (a)–(d) in its banner; write-outcome residuals (Bad-quality blip / AuditWriteUpdateEvent / synchronous fail-fast); harden-1b two residuals (`945c2380` banner: Galaxy-reopen integration test, stub-driver test-class de-dup); Phase C documented follow-ups (no continuation-point paging, no modified-value history, block-bridge timeout bounded only by the client's 30s `CallTimeout`); the data-plane `GroupToRole` production-default note; Galaxy `_itemHandles`/`_supervisedHandles` not cleared on reconnect + the cosmetic `SubscriptionEstablished` self-dead-letter.

**No queued feature remains** — Milestone 1b + Galaxy A/B/C were the headline deliverables and are all done. Future directions (NOT requested): the Phase C HistoryRead follow-ups above, or new driver/UNS work.

---

The **six historian code follow-ups** (HistorizeToAveva opt-out, drain/capacity/retention config knobs, SharedSecret/DatabasePath/non-positive-knob startup validation, operator-recording for shelve/enable/disable, and the `SqliteStoreAndForwardSink` thread-safety nits) were **all resolved** on branch `feat/alarm-historian-followups` (plan: `docs/plans/2026-06-11-alarm-historian-followups.md`). They are no longer listed here.

## Equipment-tag live values — MILESTONE 1b COMPLETE (2026-06-13)

The Galaxy standard-driver effort shipped Phase A (`c3c56172`) + the **`FullName→NodeId` live-value ROUTER** (`c4435e4f`, both pushed). The router is done + verified (322 tests + integration review READY-TO-MERGE). **All three driver-publish gaps are now CLOSED** — an equipment tag bound to OpcUaClient, any protocol driver, OR Galaxy publishes a live value delivered by the router (full detail in `current.md` "Milestone 1b" + `docs/plans/2026-06-13-equipment-tag-live-values-design.md`):

1. 
~~**OpcUaClient has NO factory (real bug — always stubbed).**~~ **DONE — SHIPPED+PUSHED master `22d553af` 2026-06-13.** Added `OpcUaClientDriverFactoryExtensions` (mirror Modbus) + registered it in `DriverFactoryBootstrap`. **First live equipment-tag value PROVEN end-to-end:** OpcUaClient driver `MAIN-opcua-eq` spawns `stub=False`, connects to opc-plc, subscribes to `ns=3;s=FastUInt1`; the `FullName→NodeId` router (`c4435e4f`) delivers it to the materialised variable `ns=2;s=EQ-55297329838d/FastUInt1`, which reads a live **changing** value (10135→10141, Good) via Client.CLI. Design/plan `docs/plans/2026-06-13-opcuaclient-factory-*.md`. Two incidental findings while live-verifying (see below). 2. ~~**Protocol drivers (Modbus/S7/AbCip/…) — equipment-tag↔driver tag-table linkage unbuilt.**~~ **DONE — SHIPPED+PUSHED master `8d8c05f5` 2026-06-13 (+ full inbound operator WRITE pipeline).** Approach B (driver-side direct-ref): a shared `EquipmentTagRefResolver` (Core.Abstractions) resolves an equipment-tag ref (the raw `TagConfig` JSON blob the router already keys on) into a transient driver tag-def on a `_tagsByName` miss — wired into READ + WRITE for **all six** drivers (Modbus/S7/AbCip/AbLegacy/TwinCAT/FOCAS), each with a hardened never-throw `EquipmentTagParser`. **Part B (write-through):** writable nodes (`Tag.AccessLevel==ReadWrite`→`CurrentReadWrite`, byte-parity in Phase7Composer+DeploymentArtifact), an `OnWriteValue` gate on the `WriteOperate` data-plane role (mirrors the alarm-ack bridge; fire-and-forget dispatch since the SDK holds the node-manager Lock during `OnWriteValue`), a `NodeWriteRouter` on the node manager, and `DriverHostActor.RouteNodeWrite` (NodeId→driver reverse map, primary-gated). **LIVE-PROVEN end-to-end:** Modbus equipment tag (HR[100]) reads a live changing value; an authorized write (`opc-writeop`/WriteOperate) to HR[200] changes the register + persists; an anonymous write → BadUserAccessDenied. 
Design/plan `docs/plans/2026-06-13-protocol-equipment-tag-linkage-*.md`. Findings + rig artifacts below. 3. ~~**Galaxy — needs a reachable mxaccessgw.**~~ **DONE — LIVE-PROVEN 2026-06-13 (no code change; config-only).** The code-investigation confirmed Galaxy was already fully wired: `GalaxyDriverFactoryExtensions` IS registered in `DriverFactoryBootstrap.cs:103` (not the missing-factory bug OpcUaClient had), and the Galaxy driver keys subscriptions on the FullReference (`tag_name.AttributeName`) DIRECTLY (no `_tagsByName` miss). gap (c) was purely a misconfigured dev driver-instance + placeholder tag ref + unset key — ALL data in existing columns, NO EF/schema change. Fixes applied to the dev rig (`otopcua-dev-sql-1`/`OtOpcUa`): `MAIN-galaxy-eq` `DriverConfig` `gateway.endpoint` `https://10.100.0.35:5001`→`http://10.100.0.48:5120`, `useTls` `true`→`false`, `apiKeySecretRef` `env:MX_API_KEY`(unset)→`env:GALAXY_MXGW_API_KEY` (the var the compose already wires on every node); `GalaxyTestTag` `TagConfig.FullName` `TestMachine_002.SomeAttr`(placeholder)→`TestMachine_002.TestDuration` (a real galaxy Float attr). The gateway API key was injected via **ephemeral shell env** at `docker compose up -d --no-deps --force-recreate central-1 central-2` time (NEVER written to a tracked file; the compose's `${GALAXY_MXGW_API_KEY:-stale-default}` substitution picks it up — the running containers carry the real key only until the next recreate-without-the-env-var). **Live (central-1 logs):** `spawned GalaxyMxGateway driver MAIN-galaxy-eq (stub=False)` → `GalaxyMxSession connected — clientName=OtOpcUa` (auth OK) → `initialized — endpoint=http://10.100.0.48:5120` → `subscribed to 1 refs (galaxy-sub-1)` (TestMachine_002.TestDuration accepted, no BadNodeIdUnknown). 
**Value:** `Client.CLI read ns=2;s=EQ-55297329838d/GalaxyTestTag` → Value `0`, Status `0x00000000` (Good), Source Time `2026-05-07T07:14:26Z` (a real galaxy timestamp — a genuine attribute snapshot, NOT BadWaitingForInitialData; static because that attr isn't actively moving). Restore-the-rig SQL saved at `/tmp/galaxy-gapc-snapshot.sql`. **Milestone 1b is now COMPLETE — all three gaps closed.** Findings/follow-ups below.

Then: Phase **B** = native `IAlarmSource` alarms on the equipment-tag path **(DONE — `f9be3843`)**; Phase **C** = server-side `HistoryRead` backend over the Wonderware reader **(DONE — `c24abc8a`, 2026-06-14; design sections in `docs/plans/2026-06-12-galaxy-standard-driver-design.md` + the dedicated `docs/plans/2026-06-14-galaxy-phase-c-historian-*.md`)**.

### Findings + follow-ups from the Galaxy gap-(c) live-verify (2026-06-13)

- **Benign dead-letter (minor, pre-existing in the Galaxy driver — NOT introduced here).** On subscribe the driver logs: `Message [SubscriptionEstablished] from drv-MAIN-galaxy-eq to drv-MAIN-galaxy-eq was unhandled. [N] dead letters`. The Galaxy `DriverInstanceActor`/driver sends itself a `SubscriptionEstablished` message that has no `Receive<>` handler. Harmless (the subscription IS established + delivering values), but noisy — add a handler (or stop self-Telling it). Cosmetic.
- **CHANGING-value read PROVEN (2026-06-13).** Repointed `GalaxyTestTag.FullName`→`TestMachine_002.TestChangingInt` (a script-driven Integer, `sec=Operate`): three Client.CLI reads returned **810 → 787 → 764** with **real galaxy source timestamps advancing ~7s each** (`02:28:41`/`:48`/`:55`) — a genuine live moving Galaxy value through the router (not optimistic/phantom). The dev rig is now left with `GalaxyTestTag` pointing here (DataType `Int32`, AccessLevel `Read`). 
Discovery was done with a throwaway probe (now deleted) using `GalaxyDriverBrowser.OpenAsync`→`session.AttributesAsync("TestMachine_002")`, which lists every attribute's `SecurityClass` (`ViewOnly`=read-only; `FreeAccess`/`Operate`/`Tune`/`Configure`=writable). Useful attrs on `TestMachine_002`: `TestFloat`(Float,Operate), `TestDouble`(Double,Operate), `TestChangingInt`(Integer,Operate,**moves**), `TestDuration`(**ElapsedTime**,Operate), `AlarmInhibit`(Boolean,FreeAccess). - **GALAXY WRITE-THROUGH — FIXED + MERGED to master `f05b5d79` (`AdviseSupervisory` before raw `Write`).** Symptom was every Galaxy operator write returning `MxaccessFailure "ArgumentException: HRESULT 0x80070057"` (`E_INVALIDARG`). **TWO-LAYER root cause (debugged by SSH-reading the gateway source on 10.100.0.48 `C:\Users\dohertj2\Desktop\mxaccessgw` — there is NO live gateway file log: console-only/uncaptured, NSSM `stdout.log` stale, dashboard :5130 is Blazor/no-REST):** (1) the writer `AddItem`'d an UN-advised handle → MXAccess `Write` threw E_INVALIDARG (worker chain `ExecuteWrite`→`MxAccessSession.Write`→`MxAccessComServer.Write`→`AsProxyServer().Write(...)`). (2) DEEPER — a plain `Write` runs with **no user login** (`WriteUserId`=0), and MXAccess only **COMMITS** such a write when the item is advised in **SUPERVISORY** mode; a *regular* `Advise` removed the E_INVALIDARG but never committed (proven by a persistence check: read-back showed the value, but a `--force-recreate`+fresh-resubscribe reverted to the original `0 @ 2026-05-07`; the worker's `ExecuteWrite` is fire-and-forget, returns OK without awaiting `OnWriteComplete`). Confirmed against the sister **ScadaBridge** driver (`~/Desktop/ScadaBridge/.../RealMxGatewayClient.cs`): it commits the OTHER way — a configured **non-zero `WriteUserId`** + regular `Advise` + `WriteBulk`. We have no galaxy login → supervisory context. 
**FIX:** `GatewayGalaxyDataWriter` calls `AdviseSupervisory` (raw `MxCommand{Kind=AdviseSupervisory, AdviseSupervisory=new AdviseSupervisoryCommand{ServerHandle,ItemHandle}}` via `session.InvokeAsync`, mirroring `InvokeWriteSecuredAsync`; idempotent per handle via `_supervisedHandles`) before each raw `WriteRawAsync`; `SecuredWrite`/`VerifiedWrite` tags keep their own user-identity path (`NeedsSecuredWrite` unchanged — WriteSecured is ONLY for those special-security tags). The dead-end "reuse the subscription's advised handle" resolver attempt was reverted. **LIVE-PROVEN:** authorized write (`opc-writeop`/WriteOperate) of `TestMachine_002.TestFloat`=1234.5 then 8888.25 COMMITS + **PERSISTS across recreate/re-subscribe** (galaxy-sourced timestamp); anonymous → `BadUserAccessDenied`. 254 Galaxy tests green; central `--build` clean. **OPEN follow-ups from this:** (a) the worker's fire-and-forget `ExecuteWrite` can't surface an async write failure — with supervisory advise the write commits, but only a read-back confirms a *specific* write (gateway-side; out of our scope). (b) `_itemHandles`/`_supervisedHandles` caches aren't cleared on reconnect (pre-existing for `_itemHandles`) — a write right after a reconnect could use a stale handle; minor. - **OPTIMISTIC-WRITE PHANTOM (open follow-up — surface real write status to the client).** The inbound write dispatch is fire-and-forget: it returns optimistic `Good` before the driver result (required — `OnWriteValue` runs under the node-manager Lock), and the SDK applies the written value to the node locally. So a write whose DEVICE write FAILS still returns `Good`, and for a STATIC attribute that never re-pushes, the wrong value LINGERS (a phantom the device never accepted). The pipeline already computes the real status in `NodeWriteResult.Success/Reason` but only LOGS it — consider surfacing it to the client. 
(How it was caught live: a failed Galaxy write showed the written value on read-back with a SERVER-clock source timestamp + a `rejected` driver log; a committed write shows a GALAXY-clock timestamp + no rejection, and persists across a re-subscribe.) - **Dev-rig Galaxy config is CORRECT + WORKING (left in place).** The `MAIN-galaxy-eq` driver-instance is deployed and connecting to the live gateway `http://10.100.0.48:5120`. `GalaxyTestTag` (on `EQ-55297329838d`/filler-02, `nw-uns` namespace) is currently `{"FullName":"TestMachine_002.TestFloat"}`, DataType `Float`, AccessLevel `ReadWrite` (the write demo; galaxy now holds the last written value `8888.25`). Other useful `TestMachine_002` attrs (from the discovery probe): `TestChangingInt`(Integer,Operate,**moves on its own** — the live-changing READ demo), `TestDouble`(Double,Operate), `TestDuration`(**ElapsedTime**,Operate — reads as Float but a Float write is a type-mismatch), `AlarmInhibit`(Boolean,FreeAccess). To restore the original placeholder tag: `/tmp/galaxy-gapc-snapshot.sql`. The base seed `docker-dev/seed/seed-clusters.sql` still seeds the *legacy* SystemPlatform-namespace Galaxy driver (`MAIN-galaxy-mxgw`, tags `TestMachine_001.TestAlarm001..003`) — pre-Phase-A model, untouched/separate. **The injected gateway key is EPHEMERAL** — key=`mxgw_otopcuakey2_so0…` is supplied via shell env `GALAXY_MXGW_API_KEY='…'` at `docker compose up --no-deps --force-recreate central-1 central-2`; a recreate WITHOUT it re-exported falls back to the compose's stale default and Galaxy auth fails. ORDER on a redeploy: POST deploy FIRST, THEN recreate (a faulted driver ignores `ApplyDelta`). 
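The advise-once-per-handle behaviour from the write-through finding in this section, together with the stale-handle concern on reconnect, can be sketched as follows. This is a Python illustration only: `RecordingSession` and `SupervisoryWriter` are hypothetical stand-ins, the command names are plain strings rather than gateway messages, and the handle numbers are arbitrary.

```python
class RecordingSession:
    """Fake gateway session that just records the commands it is given."""

    def __init__(self):
        self.commands = []

    def invoke(self, kind, handle, value=None):
        self.commands.append((kind, handle, value))


class SupervisoryWriter:
    """Issues a supervisory advise once per handle before raw writes."""

    def __init__(self, session):
        self._session = session
        self._supervised = set()  # handles already advised on this connection

    def write(self, handle, value):
        if handle not in self._supervised:
            self._session.invoke("AdviseSupervisory", handle)
            self._supervised.add(handle)
        self._session.invoke("Write", handle, value)

    def invalidate(self):
        # After a reconnect the old handles are meaningless: forget them.
        self._supervised.clear()


session = RecordingSession()
writer = SupervisoryWriter(session)
writer.write(7, 1234.5)
writer.write(7, 8888.25)   # same handle: no second advise
writer.invalidate()        # simulate a reconnect
writer.write(7, 1.0)       # advise is issued again
kinds = [kind for kind, _, _ in session.commands]
```

Clearing the set on reconnect is the piece that prevents a write from relying on an advise that was made against a handle from the previous connection.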
### Findings + follow-ups from the protocol-linkage + write-through work (2026-06-13)

- **DATA-PLANE ROLE CONFIG REQUIREMENT (important, deployment-facing).** The OPC UA session's roles come from two sources unioned: the **DB `LdapGroupRoleMapping`** (its `Role` column is the **`AdminRole` enum** — Administrator/Designer/Viewer only, for the AdminUI) AND the **appsettings `Security:Ldap:GroupToRole`** baseline (free-form `string→string`). The OPC UA **data-plane** gates (`WriteOperate`, `AlarmAck`, …) read literal role STRINGS that the AdminRole-typed DB mapping **cannot** produce — so a deployment MUST map its LDAP data-plane groups → data-plane role strings via `GroupToRole`, or write-through (and scripted-alarm OPC UA ack) is inert (every write → `BadUserAccessDenied`). The shared dev GLAuth already has dedicated groups+users (group `WriteOperate`, user `opc-writeop`, `multi-role` in all; `opc-readonly`); the dev rig just never seeded the `GroupToRole`. **Consider a docs note (and/or a documented default) so production deployments wire this.** (Same latent requirement applies to the pre-existing alarm-ack gate.)
- **Write-pipeline review follow-ups (non-blocking, from the final integration review):**
  - (a) `DriverHostActor.Stale` (and `DriverInstanceActor.Connecting`/`Reconnecting`) have **no `RouteNodeWrite`/`WriteAttribute` handler** → an operator write while stale/reconnecting dead-letters and the 10s Ask times out with a generic log (client got optimistic Good). Add fast-fail handlers returning a clear status.
  - (b) Drop `TaskContinuationOptions.ExecuteSynchronously` on the router `ContinueWith`; `List.Contains`→`HashSet` in the forward-map build (micro).
  - (c) FOCAS re-parses the address on every equipment-tag write (`_parsedAddressesByTagName` miss; perf only, rare).
  - (d) `DriverHostActorWriteRoutingTests` seeds a Galaxy-style `{"FullName":...}` artifact, not a raw protocol-driver TagConfig blob — add a raw-blob case for belt-and-suspenders (runtime path is identical + live-verified).
  - (e) Task-9 parity test is a faithful simulation of `ConfigComposer` (`ToSnapshot` casts AccessLevel to int) not a through-the-real-serializer proof; add an `InlineData(2,false)` future-enum trap.

### Dev-rig artifacts created for the protocol-linkage live-verify (left in place, NOT committed)

- **`docker-dev/docker-compose.yml`** gained `Security__Ldap__GroupToRole__{ReadOnly,WriteOperate,WriteTune,WriteConfigure,AlarmAck}` identity entries on both central nodes (needed for data-plane roles — see above). **Uncommitted** (rig config; the file was already modified at session start).
- **DB seeds** on `otopcua-dev-sql-1`/`OtOpcUa`: driver `MAIN-modbus-eq` (DriverType=Modbus, `{"Host":"10.100.0.35","Port":5020,"UnitId":1,"Tags":[]}`, namespace `nw-uns`, cluster MAIN) + tags `tag-modbus-hr100` (HR[100] auto-increment, Read — read demo) and `tag-modbus-hr200` (HR[200] scratch, ReadWrite — write demo), both on equipment `EQ-55297329838d` (filler-02). The pymodbus `standard` sim (`10.100.0.35:5020`) serves HR[0..31]=addr-as-value, HR[100]=auto-increment, HR[200..209]=writable scratch.

### Incidental findings from the OpcUaClient live-verify (2026-06-13)

- **Driver-reconfigure-while-faulted gap — FIXED + MERGED `56f73e49` (2026-06-14).** (Was: a `DriverInstanceActor` stuck in `Reconnecting`/`Connecting` had **no `ApplyDelta` handler**, so a corrected config dead-lettered and the actor retried the OLD `_currentConfigJson` forever; workaround = restart the node.) Now `Connecting`/`Reconnecting` handle `ApplyDelta` via `AdoptConfigDuringInit`, re-initialising with the new config; a monotonic `_initGeneration` guard supersedes the in-flight old init so the corrected config always wins. See the top banner + `docs/plans/2026-06-14-driver-reconfigure-while-faulted-*.md`. Live `/run` gate deferred (user choice).
- **Dev-rig config edit applied directly in DB.** The `MAIN-opcua-eq` `DriverConfig.targetNamespaceKind` was `0` (Equipment, which requires a `UnsMappingTable` → `InitializeAsync` rejected it). Set to `1` (SystemPlatform — the direct-ref mode the equipment-tag model wants; empty `unsMappingTable:{}` passes validation) via a direct `JSON_MODIFY` UPDATE on `otopcua-dev-sql-1` (DB `OtOpcUa`, `SET QUOTED_IDENTIFIER ON` required for JSON fns; `sqlcmd -h` and `-y 0` are mutually exclusive — pick one). The AdminUI driver-edit combobox for "Target namespace kind" did **not** persist the change (suspected live-only Blazor binding bug — unverified; the DB edit sidestepped it). Deploy snapshots the **live** config DB directly (`AdminOperationsActor` → `DraftSnapshotFactory.FromConfigDbAsync` + `ConfigComposer.SnapshotAndFlattenAsync`), so a DB edit flows through on the next `POST /api/deployments` (new revisionHash).

## Operational deferral (user choice)

1. **docker-dev rig cleanup (round-1 T9) deferred.** The local docker-dev rig still has the live-verify seed artifacts deployed: the `t12-overheat` scripted alarm, the `SC-ba675b168a85` predicate script, the `layer0-logcheck` vtag/script, and filler-02's modified `cycle-time-s` line. Left as-is to inspect the working double-emit fix. **To clean up:** delete those artifacts in the AdminUI (or DB), revert filler-02's `cycle-time-s` to `return ctx.GetTag("TestMachine_002.TestDuration").Value;`, then redeploy (`POST http://localhost:9200/api/deployments`, header `X-Api-Key: docker-dev-deploy-key`).
2. **Equipment-tag live-value verify artifacts (left in place — all now FUNCTIONAL).** The docker-dev rig carries verify artifacts under the `nw-uns` Equipment namespace on `EQ-55297329838d` (filler-02), all three now working: `MAIN-galaxy-eq` (GalaxyMxGateway → live gateway, `GalaxyTestTag` = `TestMachine_002.TestFloat` RW, write-proven; needs the ephemeral `GALAXY_MXGW_API_KEY` re-exported on recreate — see Galaxy dev-rig note above); `MAIN-opcua-eq` (OpcUaClient, factory shipped `22d553af`) + `FastUInt1` tag (`{"FullName":"ns=3;s=FastUInt1"}`); `MAIN-modbus-eq` (Modbus) + `tag-modbus-hr100` (read) / `tag-modbus-hr200` (RW write demo). docker-dev is **LOCAL on this Mac** (OrbStack); central-1 @ `localhost:4840`/AdminUI+deploy @ `localhost:9200`, sql @ `localhost:14330` (sa/`OtOpcUa!Dev123`), login disabled. Sims for the protocol/opcua verifies run on the docker host `10.100.0.35` (`otopcua-pymodbus-standard` :5020, `otopcua-opc-plc` :50000) — leave up or `docker compose down` per `/opt/otopcua-modbus` + `/opt/otopcua-opcuaclient`. Phase A backup `OtOpcUa-prePhaseA-20260612-224908.bak` is in the SQL volume `/var/opt/mssql/backup/`.