diff --git a/current.md b/current.md
new file mode 100644
index 00000000..9f7bc98c
--- /dev/null
+++ b/current.md
@@ -0,0 +1,53 @@
+# Current status — Galaxy standard-driver effort
+
+_Updated 2026-06-13._
+
+## Where we are
+
+**Phase A (Galaxy = standard Equipment-kind driver) is DONE, merged to master (`c3c56172`), pushed, and live-verified.**
+
+- `GalaxyMxGateway` is now an ordinary **Equipment-kind** driver. Retired: the `SystemPlatform` `NamespaceKind` + its mirror (`Phase7Applier.MaterialiseGalaxyTags`, the composer/artifact `GalaxyTags` contract + `GalaxyTagPlan`), the byte-parity `|| DriverType=="GalaxyMxGateway"` exception clauses, the authz `NodeHierarchyKind.SystemPlatform` + `PermissionTrie.WalkSystemPlatform`, and the **entire alias-tag/relay AdminUI machinery**.
+- A Galaxy point is now a plain equipment `Tag{EquipmentId set, DriverInstanceId=GalaxyMxGateway, TagConfig={"FullName":"tag.attr"}}`, authored through the standard `TagModal` + the "Browse Galaxy" picker.
+- Forward-only data-only migration `CleanupSystemPlatformNamespaces` deletes orphaned SystemPlatform config + `NodeAcl` grants and releases `ExternalIdReservation`s. Kept `UX_Namespace_Cluster_Kind` (Simulated kind still reserved).
+- Gate: 1009 unit tests green + full docker-dev live `/run` (migration applied clean, server boots on migrated DB, alias UI gone, namespace picker drops SystemPlatform, Galaxy driver created **under Equipment**, Galaxy tag authored via the new picker).
+
+Design: `docs/plans/2026-06-12-galaxy-standard-driver-design.md` (3-phase). Phase A plan: `docs/plans/2026-06-12-galaxy-standard-driver-phase-a.md` (+ `.tasks.json`).
+
+---
+
+## The rest of the plan (Milestone 1 done; the rest NOT started)
+
+### ✅ Milestone 1 — Equipment-tag live-value DELIVERY (the `FullName→NodeId` router) — DONE, merged `c4435e4f`, pushed
+
+Design `docs/plans/2026-06-13-equipment-tag-live-values-design.md`; plan `…-plan.md` (+ `.tasks.json`).
Mirrored the proven `VirtualTagHostActor._nodeIdByVtag` pattern for DRIVER values:
+- NEW `ZB.MOM.WW.OtOpcUa.Commons.OpcUa.EquipmentNodeIds` = single source of truth for the folder-scoped NodeId formula (`Variable(eq,fp,name)`); repointed `Phase7Applier` + `VirtualTagHostActor` to it (byte-identical, kills the drift risk).
+- `DriverInstanceId` added to `AttributeValuePublished` (the router key — FullName is unique only WITHIN a driver instance).
+- `DriverHostActor` builds a `(DriverInstanceId, FullName) → NodeId[]` map each apply (in `PushDesiredSubscriptions`, via `EquipmentNodeIds.Variable`) and resolves it in `ForwardToMux`, emitting the existing `AttributeValueUpdate` per resolved NodeId (1→many handled; no-match drops+debug-logs). NO `OpcUaPublishActor` change.
+- Gate: build 0 errors; **322 tests** (incl. 3 Akka.TestKit tests driving the full router through a real deploy/apply); final integration review = READY TO MERGE (end-to-end NodeId coherence confirmed). Live `/run`: the deploy applied + the variable **materialised at the exact router NodeId** + SubscribeBulk pushed refs — the materialisation + router wiring is confirmed live.
+
+### 🔴 Milestone 1b — Per-driver "actually PUBLISH a value" gaps (NOT started — the router has nothing to route until these land; live value could NOT be shown in dev for this reason)
+
+The router is correct, but **no dev driver currently publishes an equipment-tag value**, so a variable still won't show data until the owning driver's publish path is wired. Three orthogonal, independent gaps (each its own follow-up):
+- **OpcUaClient — MISSING FACTORY (real bug).** `src/Server/.../Host/Drivers/DriverFactoryBootstrap.cs:98-107` registers AbCip/AbLegacy/FOCAS/Galaxy/Modbus/S7/TwinCAT but **OMITS OpcUaClient** — and the `OpcUaClient` driver project has **no `…FactoryExtensions` class at all** (only `OpcUaClientDriver.cs` + `OpcUaClientDriverProbe.cs`).
So OpcUaClient is **always stubbed** (`"no factory for driver type OpcUaClient … falling back to stub"`) — completely non-functional. Fix: add an `OpcUaClientDriverFactoryExtensions` (CreateInstance parsing `driverConfigJson`→`OpcUaClientDriverOptions` + `new OpcUaClientDriver(options, id, …)`, mirroring `ModbusDriverFactoryExtensions`) + register it in the bootstrap. **This is the cheapest path to a live value** (OpcUaClient is direct-ref: `TagConfig.FullName="ns=3;s=FastUInt1"` → subscribes to opc-plc → publishes → router delivers). opc-plc sim is up at `opc.tcp://10.100.0.35:50000` (`ns=3;s=FastUInt1`).
+- **Protocol drivers (Modbus/S7/AbCip/…) — equipment-tag↔driver-subscription linkage.** The Modbus driver subscribes by **tag NAME from `DriverConfig.Tags`** (`_tagsByName`, keyed by `t.Name`), but an equipment tag's `TagConfig` (region/address, via `ModbusTagConfigModel`) carries **no top-level `FullName`**, so `ExtractTagFullName` falls back to the raw JSON blob → never matches a `DriverConfig.Tags` name → `SubscribeAsync` silently skips it (`if (!_tagsByName.TryGetValue(...)) continue;`). The equipment tag's register config is never connected to the driver. This is the bigger "equipment tags work for protocol drivers" gap (needs the deploy to build the driver's tag table from equipment tags, or the equipment-tag FullName to address a register directly).
+- **Galaxy — needs a reachable mxaccessgw.** Direct-ref (FullName = MxAccess ref), so the router would deliver — but the gateway must be up + `ApiKeySecretRef` resolvable (dev `MX_API_KEY` unset).
+
+> Recommended next: build the OpcUaClient factory (small, fixes a real bug, gives the first live value). Then the protocol-driver linkage (larger).
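The protocol-driver gap described above can be illustrated with a small model. This is a Python sketch, not the C# driver code: `extract_tag_full_name` and `subscribe` are hypothetical stand-ins for `ExtractTagFullName` and the `SubscribeAsync` loop, the `region`/`address` field names and the `LineSpeed` tag are invented for illustration, and the fallback behaviour is assumed to be exactly what the bullet describes.

```python
import json


def extract_tag_full_name(tag_config_json: str) -> str:
    """Return the top-level FullName if present, else the raw JSON blob (assumed fallback)."""
    try:
        cfg = json.loads(tag_config_json)
    except json.JSONDecodeError:
        return tag_config_json
    if isinstance(cfg, dict) and isinstance(cfg.get("FullName"), str):
        return cfg["FullName"]
    return tag_config_json


def subscribe(refs, tags_by_name):
    """Model of the subscribe loop: a ref with no tag-table entry is silently skipped."""
    subscribed = []
    for ref in refs:
        if ref not in tags_by_name:
            continue  # same effect as `if (!_tagsByName.TryGetValue(...)) continue;`
        subscribed.append(ref)
    return subscribed


# Tag table built from DriverConfig.Tags, keyed by tag name (hypothetical entry).
tags_by_name = {"LineSpeed": {"region": "HoldingRegister", "address": 100}}

# A direct-ref tag (OpcUaClient/Galaxy style) carries a top-level FullName.
direct_ref = extract_tag_full_name('{"FullName":"ns=3;s=FastUInt1"}')

# A Modbus-style equipment tag has only register config, so its ref is the whole blob.
modbus_blob = '{"region":"HoldingRegister","address":100}'
modbus_ref = extract_tag_full_name(modbus_blob)

print(direct_ref)
print(subscribe([modbus_ref], tags_by_name))   # blob never matches a tag name
print(subscribe(["LineSpeed"], tags_by_name))  # a DriverConfig.Tags name does
```

The blob ref is a perfectly good router key, which is why the router itself is correct; the miss happens only inside the driver's name-keyed lookup.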
+
+### 🟡 Phase B — Native alarms on the equipment-tag path (NOT started)
+
+Galaxy implements `IAlarmSource` (native MXAccess alarms), but that wiring lived **only** on the retired SystemPlatform-mirror path (`GenericDriverNodeManager` registers an alarm sink per `MarkAsAlarmCondition` variable keyed by `FullName`, routes `OnAlarmEvent.SourceNodeId` to it). After Phase A, **native Galaxy alarms have no materialization path** until this is ported.
+
+**Work:** teach `MaterialiseEquipmentTags`/`OtOpcUaNodeManager` to register an alarm-condition sink for `IsAlarm` equipment tags (keyed by `FullName`) and subscribe the owning driver's `IAlarmSource`, routing `OnAlarmEvent.SourceNodeId == FullName` → sink. Reuse the existing Part 9 `AlarmConditionState` materialization + `alarm-commands`/ack plumbing (scripted alarms already use it). Extract `GenericDriverNodeManager`'s forwarder so both paths share it. Mirror its unsubscribe-before-rewalk for `IRediscoverable` redeploys.
+
+### 🟢 Phase C — Server-side historian / `HistoryRead` (NOT started)
+
+Driver-agnostic server-side history backend over the existing Wonderware Historian reader (`Historian.Wonderware.Client` already implements `IHistoryProvider`). No mxaccessgw history RPC needed.
+
+**Work:** add an OPC UA `HistoryRead` service override in `OtOpcUaNodeManager` (none exists today) → a router that, for an `IsHistorized` node, resolves a historian tagname (default = the tag's driver `FullName`, optional override) and calls the registered `IHistorianDataSource`. Apply `DriverAttributeInfo.IsHistorized` → UA `Historizing` + `AccessLevel.HistoryRead` at materialization (currently hardcoded `false`). Config: a server `Historian` section (Null default; Wonderware when configured), mirroring the existing `AlarmHistorian` pattern. Works for any `IsHistorized` tag, not just Galaxy.
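The Phase C routing rule (historian tagname defaults to the driver `FullName`, with an optional override, and a Null backend unless one is configured) might be sketched as follows. This is a Python model under stated assumptions, not the planned C# code: `HistorizedNode`, `NullHistorian`, `InMemoryHistorian`, `read_raw`, and the `Hist.Alias` tagname are hypothetical names, and the plan does not say which status a non-historized node should return, so the sketch simply raises.

```python
from dataclasses import dataclass
from typing import Optional


class NullHistorian:
    """Default backend: nothing configured, so every read returns no samples."""

    def read_raw(self, tagname, start, end):
        return []


class InMemoryHistorian:
    """Stand-in for a configured backend; holds {tagname: [(timestamp, value), ...]}."""

    def __init__(self, samples):
        self._samples = samples

    def read_raw(self, tagname, start, end):
        return [(t, v) for (t, v) in self._samples.get(tagname, []) if start <= t <= end]


@dataclass
class HistorizedNode:
    full_name: str                           # the tag's driver FullName
    is_historized: bool
    historian_tagname: Optional[str] = None  # optional override


def resolve_historian_tagname(node):
    """Default to the driver FullName unless an override is set."""
    return node.historian_tagname or node.full_name


def history_read(node, source, start, end):
    """Route a history read for one node to the registered data source."""
    if not node.is_historized:
        # Assumption: the real status code is not specified in the plan.
        raise ValueError("node is not historized")
    return source.read_raw(resolve_historian_tagname(node), start, end)


source = InMemoryHistorian({
    "TestMachine_002.TestFloat": [(1, 1234.5), (5, 8888.25)],
    "Hist.Alias": [(2, 7.0)],
})
plain = HistorizedNode("TestMachine_002.TestFloat", True)
aliased = HistorizedNode("TestMachine_002.TestDouble", True, historian_tagname="Hist.Alias")

print(history_read(plain, source, 0, 10))
print(history_read(aliased, source, 0, 10))
print(history_read(plain, NullHistorian(), 0, 10))
```

Keeping tagname resolution separate from the backend call is what would let the same router serve any `IsHistorized` tag regardless of driver.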
+
+---
+
+## Dev rig state
+
+docker-dev runs **locally** (Docker 29.4.0 on this Mac); login disabled in dev (`Security__Auth__DisableLogin: "true"`); central UI at `http://localhost:9200`, SQL at `localhost:14330` (`sa`/`OtOpcUa!Dev123`). Currently on the **Phase-A build** with the clean-break migration applied. Verification artifacts left in place: `MAIN-galaxy-eq` Galaxy driver + `GalaxyTestTag` under the `nw-uns` Equipment namespace. Prior 1,391-tag Galaxy config is restorable from `OtOpcUa-prePhaseA-20260612-224908.bak` (in the SQL volume `/var/opt/mssql/backup/`). Rebuild central from a branch: `docker compose -f docker-dev/docker-compose.yml up -d --build migrator central-1 central-2`.
diff --git a/pending.md b/pending.md
new file mode 100644
index 00000000..251cf89b
--- /dev/null
+++ b/pending.md
@@ -0,0 +1,82 @@
+# Pending — open follow-ups & deferrals
+
+As of 2026-06-13. master HEAD `f05b5d79` (synced with origin). Working tree: only `docker-dev/docker-compose.yml` (uncommitted rig config) + untracked `current.md`/`pending.md`.
+
+## STATE SUMMARY (for compaction pickup)
+
+**Milestone 1b (equipment-tag live values) is COMPLETE** — an equipment tag bound to OpcUaClient, any of the 6 protocol drivers, OR Galaxy now READS a live value AND (authorized) WRITES it back, all delivered by the `FullName→NodeId` router (`c4435e4f`). Shipped this session, all pushed to master:
+- OpcUaClient factory `22d553af`; protocol-driver linkage + inbound write pipeline `8d8c05f5`; Galaxy gap-(c) config-only (no commit); **Galaxy write-through `f05b5d79` (`AdviseSupervisory` before raw Write)**.
+
+**OPEN FOLLOW-UPS (none blocking; pick up here):**
+1. **Phase B** — native `IAlarmSource` alarms on the equipment-tag path (port `GenericDriverNodeManager`'s forwarder onto `MaterialiseEquipmentTags`). Deferred; design section in `docs/plans/2026-06-12-galaxy-standard-driver-design.md`.
+2. **Phase C** — server-side `HistoryRead` backend over the Wonderware reader.
Deferred; same design doc.
+3. **Data-plane role config (deployment-facing)** — document that `Security:Ldap:GroupToRole` MUST map data-plane LDAP groups → role strings (`WriteOperate`/`AlarmAck`/…), else write-through + OPC UA alarm-ack are silently inert. Detail below.
+4. **Write-pipeline review nits** — fast-fail `RouteNodeWrite`/`WriteAttribute` in `DriverHostActor.Stale` + `DriverInstanceActor.Connecting`/`Reconnecting`; drop `ExecuteSynchronously`; `List.Contains`→`HashSet`; FOCAS per-write reparse; raw-blob routing test; Task-9 parity future-enum trap. Detail below.
+5. **Surface real device-write status to the client** — the inbound write is fire-and-forget optimistic `Good` (the "optimistic-write phantom"); `NodeWriteResult.Success/Reason` exists but is only logged. Detail below.
+6. **Galaxy driver nits** — benign `SubscriptionEstablished` self-dead-letter (cosmetic); writer `_itemHandles`/`_supervisedHandles` caches not cleared on reconnect (stale-handle risk right after a reconnect). Detail in the Galaxy findings below.
+7. **Driver-reconfigure-while-faulted** — a `Reconnecting`/`Connecting` `DriverInstanceActor` ignores `ApplyDelta` (retries old config forever; workaround = restart node). High-risk actor-state-machine change → own design/plan. Detail below.
+8. **Rig cleanups** (operational, user-deferred) — see bottom.
+
+---
+
+The **six historian code follow-ups** (HistorizeToAveva opt-out, drain/capacity/retention config
+knobs, SharedSecret/DatabasePath/non-positive-knob startup validation, operator-recording for
+shelve/enable/disable, and the `SqliteStoreAndForwardSink` thread-safety nits) were **all resolved**
+on branch `feat/alarm-historian-followups` (plan: `docs/plans/2026-06-11-alarm-historian-followups.md`).
+They are no longer listed here.
+
+## Equipment-tag live values — MILESTONE 1b COMPLETE (2026-06-13)
+
+The Galaxy standard-driver effort shipped Phase A (`c3c56172`) + the **`FullName→NodeId` live-value ROUTER** (`c4435e4f`, both pushed). The router is done + verified (322 tests + integration review READY-TO-MERGE). **All three driver-publish gaps are now CLOSED** — an equipment tag bound to OpcUaClient, any protocol driver, OR Galaxy publishes a live value delivered by the router (full detail in `current.md` "Milestone 1b" + `docs/plans/2026-06-13-equipment-tag-live-values-design.md`):
+
+1. ~~**OpcUaClient has NO factory (real bug — always stubbed).**~~ **DONE — SHIPPED+PUSHED master `22d553af` 2026-06-13.** Added `OpcUaClientDriverFactoryExtensions` (mirror Modbus) + registered it in `DriverFactoryBootstrap`. **First live equipment-tag value PROVEN end-to-end:** OpcUaClient driver `MAIN-opcua-eq` spawns `stub=False`, connects to opc-plc, subscribes to `ns=3;s=FastUInt1`; the `FullName→NodeId` router (`c4435e4f`) delivers it to the materialised variable `ns=2;s=EQ-55297329838d/FastUInt1`, which reads a live **changing** value (10135→10141, Good) via Client.CLI. Design/plan `docs/plans/2026-06-13-opcuaclient-factory-*.md`. Two incidental findings while live-verifying (see below).
+2. ~~**Protocol drivers (Modbus/S7/AbCip/…) — equipment-tag↔driver tag-table linkage unbuilt.**~~ **DONE — SHIPPED+PUSHED master `8d8c05f5` 2026-06-13 (+ full inbound operator WRITE pipeline).** Approach B (driver-side direct-ref): a shared `EquipmentTagRefResolver` (Core.Abstractions) resolves an equipment-tag ref (the raw `TagConfig` JSON blob the router already keys on) into a transient driver tag-def on a `_tagsByName` miss — wired into READ + WRITE for **all six** drivers (Modbus/S7/AbCip/AbLegacy/TwinCAT/FOCAS), each with a hardened never-throw `EquipmentTagParser`.
**Part B (write-through):** writable nodes (`Tag.AccessLevel==ReadWrite`→`CurrentReadWrite`, byte-parity in Phase7Composer+DeploymentArtifact), an `OnWriteValue` gate on the `WriteOperate` data-plane role (mirrors the alarm-ack bridge; fire-and-forget dispatch since the SDK holds the node-manager Lock during `OnWriteValue`), a `NodeWriteRouter` on the node manager, and `DriverHostActor.RouteNodeWrite` (NodeId→driver reverse map, primary-gated). **LIVE-PROVEN end-to-end:** Modbus equipment tag (HR[100]) reads a live changing value; an authorized write (`opc-writeop`/WriteOperate) to HR[200] changes the register + persists; an anonymous write → BadUserAccessDenied. Design/plan `docs/plans/2026-06-13-protocol-equipment-tag-linkage-*.md`. Findings + rig artifacts below.
+3. ~~**Galaxy — needs a reachable mxaccessgw.**~~ **DONE — LIVE-PROVEN 2026-06-13 (no code change; config-only).** The code-investigation confirmed Galaxy was already fully wired: `GalaxyDriverFactoryExtensions` IS registered in `DriverFactoryBootstrap.cs:103` (not the missing-factory bug OpcUaClient had), and the Galaxy driver keys subscriptions on the FullReference (`tag_name.AttributeName`) DIRECTLY (no `_tagsByName` miss). Gap (c) was purely a misconfigured dev driver-instance + placeholder tag ref + unset key — ALL data in existing columns, NO EF/schema change. Fixes applied to the dev rig (`otopcua-dev-sql-1`/`OtOpcUa`): `MAIN-galaxy-eq` `DriverConfig` `gateway.endpoint` `https://10.100.0.35:5001`→`http://10.100.0.48:5120`, `useTls` `true`→`false`, `apiKeySecretRef` `env:MX_API_KEY`(unset)→`env:GALAXY_MXGW_API_KEY` (the var the compose already wires on every node); `GalaxyTestTag` `TagConfig.FullName` `TestMachine_002.SomeAttr`(placeholder)→`TestMachine_002.TestDuration` (a real galaxy Float attr).
The gateway API key was injected via **ephemeral shell env** at `docker compose up -d --no-deps --force-recreate central-1 central-2` time (NEVER written to a tracked file; the compose's `${GALAXY_MXGW_API_KEY:-stale-default}` substitution picks it up — the running containers carry the real key only until the next recreate-without-the-env-var). **Live (central-1 logs):** `spawned GalaxyMxGateway driver MAIN-galaxy-eq (stub=False)` → `GalaxyMxSession connected — clientName=OtOpcUa` (auth OK) → `initialized — endpoint=http://10.100.0.48:5120` → `subscribed to 1 refs (galaxy-sub-1)` (TestMachine_002.TestDuration accepted, no BadNodeIdUnknown). **Value:** `Client.CLI read ns=2;s=EQ-55297329838d/GalaxyTestTag` → Value `0`, Status `0x00000000` (Good), Source Time `2026-05-07T07:14:26Z` (a real galaxy timestamp — a genuine attribute snapshot, NOT BadWaitingForInitialData; static because that attr isn't actively moving). Restore-the-rig SQL saved at `/tmp/galaxy-gapc-snapshot.sql`. **Milestone 1b is now COMPLETE — all three gaps closed.** Findings/follow-ups below.
+
+Then: Phase **B** = native `IAlarmSource` alarms on the equipment-tag path; Phase **C** = server-side `HistoryRead` backend over the Wonderware reader (both deferred, design sections in `docs/plans/2026-06-12-galaxy-standard-driver-design.md`).
+
+### Findings + follow-ups from the Galaxy gap-(c) live-verify (2026-06-13)
+
+- **Benign dead-letter (minor, pre-existing in the Galaxy driver — NOT introduced here).** On subscribe the driver logs: `Message [SubscriptionEstablished] from drv-MAIN-galaxy-eq to drv-MAIN-galaxy-eq was unhandled. [N] dead letters`. The Galaxy `DriverInstanceActor`/driver sends itself a `SubscriptionEstablished` message that has no `Receive<>` handler. Harmless (the subscription IS established + delivering values), but noisy — add a handler (or stop self-Telling it). Cosmetic.
+- **CHANGING-value read PROVEN (2026-06-13).** Repointed `GalaxyTestTag.FullName`→`TestMachine_002.TestChangingInt` (a script-driven Integer, `sec=Operate`): three Client.CLI reads returned **810 → 787 → 764** with **real galaxy source timestamps advancing ~7s each** (`02:28:41`/`:48`/`:55`) — a genuine live moving Galaxy value through the router (not optimistic/phantom). The dev rig is now left with `GalaxyTestTag` pointing here (DataType `Int32`, AccessLevel `Read`). Discovery was done with a throwaway probe (now deleted) using `GalaxyDriverBrowser.OpenAsync`→`session.AttributesAsync("TestMachine_002")`, which lists every attribute's `SecurityClass` (`ViewOnly`=read-only; `FreeAccess`/`Operate`/`Tune`/`Configure`=writable). Useful attrs on `TestMachine_002`: `TestFloat`(Float,Operate), `TestDouble`(Double,Operate), `TestChangingInt`(Integer,Operate,**moves**), `TestDuration`(**ElapsedTime**,Operate), `AlarmInhibit`(Boolean,FreeAccess).
+- **GALAXY WRITE-THROUGH — FIXED + MERGED to master `f05b5d79` (`AdviseSupervisory` before raw `Write`).** Symptom was every Galaxy operator write returning `MxaccessFailure "ArgumentException: HRESULT 0x80070057"` (`E_INVALIDARG`). **TWO-LAYER root cause (debugged by SSH-reading the gateway source on 10.100.0.48 `C:\Users\dohertj2\Desktop\mxaccessgw` — there is NO live gateway file log: console-only/uncaptured, NSSM `stdout.log` stale, dashboard :5130 is Blazor/no-REST):** (1) the writer `AddItem`'d an UN-advised handle → MXAccess `Write` threw E_INVALIDARG (worker chain `ExecuteWrite`→`MxAccessSession.Write`→`MxAccessComServer.Write`→`AsProxyServer().Write(...)`).
(2) DEEPER — a plain `Write` runs with **no user login** (`WriteUserId`=0), and MXAccess only **COMMITS** such a write when the item is advised in **SUPERVISORY** mode; a *regular* `Advise` removed the E_INVALIDARG but never committed (proven by a persistence check: read-back showed the value, but a `--force-recreate`+fresh-resubscribe reverted to the original `0 @ 2026-05-07`; the worker's `ExecuteWrite` is fire-and-forget, returns OK without awaiting `OnWriteComplete`). Confirmed against the sister **ScadaBridge** driver (`~/Desktop/ScadaBridge/.../RealMxGatewayClient.cs`): it commits the OTHER way — a configured **non-zero `WriteUserId`** + regular `Advise` + `WriteBulk`. We have no galaxy login → supervisory context. **FIX:** `GatewayGalaxyDataWriter` calls `AdviseSupervisory` (raw `MxCommand{Kind=AdviseSupervisory, AdviseSupervisory=new AdviseSupervisoryCommand{ServerHandle,ItemHandle}}` via `session.InvokeAsync`, mirroring `InvokeWriteSecuredAsync`; idempotent per handle via `_supervisedHandles`) before each raw `WriteRawAsync`; `SecuredWrite`/`VerifiedWrite` tags keep their own user-identity path (`NeedsSecuredWrite` unchanged — WriteSecured is ONLY for those special-security tags). The dead-end "reuse the subscription's advised handle" resolver attempt was reverted. **LIVE-PROVEN:** authorized write (`opc-writeop`/WriteOperate) of `TestMachine_002.TestFloat`=1234.5 then 8888.25 COMMITS + **PERSISTS across recreate/re-subscribe** (galaxy-sourced timestamp); anonymous → `BadUserAccessDenied`. 254 Galaxy tests green; central `--build` clean. **OPEN follow-ups from this:** (a) the worker's fire-and-forget `ExecuteWrite` can't surface an async write failure — with supervisory advise the write commits, but only a read-back confirms a *specific* write (gateway-side; out of our scope). (b) `_itemHandles`/`_supervisedHandles` caches aren't cleared on reconnect (pre-existing for `_itemHandles`) — a write right after a reconnect could use a stale handle; minor.
+- **OPTIMISTIC-WRITE PHANTOM (open follow-up — surface real write status to the client).** The inbound write dispatch is fire-and-forget: it returns optimistic `Good` before the driver result (required — `OnWriteValue` runs under the node-manager Lock), and the SDK applies the written value to the node locally. So a write whose DEVICE write FAILS still returns `Good`, and for a STATIC attribute that never re-pushes, the wrong value LINGERS (a phantom the device never accepted). The pipeline already computes the real status in `NodeWriteResult.Success/Reason` but only LOGS it — consider surfacing it to the client. (How it was caught live: a failed Galaxy write showed the written value on read-back with a SERVER-clock source timestamp + a `rejected` driver log; a committed write shows a GALAXY-clock timestamp + no rejection, and persists across a re-subscribe.)
+- **Dev-rig Galaxy config is CORRECT + WORKING (left in place).** The `MAIN-galaxy-eq` driver-instance is deployed and connecting to the live gateway `http://10.100.0.48:5120`. `GalaxyTestTag` (on `EQ-55297329838d`/filler-02, `nw-uns` namespace) is currently `{"FullName":"TestMachine_002.TestFloat"}`, DataType `Float`, AccessLevel `ReadWrite` (the write demo; galaxy now holds the last written value `8888.25`). Other useful `TestMachine_002` attrs (from the discovery probe): `TestChangingInt`(Integer,Operate,**moves on its own** — the live-changing READ demo), `TestDouble`(Double,Operate), `TestDuration`(**ElapsedTime**,Operate — reads as Float but a Float write is a type-mismatch), `AlarmInhibit`(Boolean,FreeAccess). To restore the original placeholder tag: `/tmp/galaxy-gapc-snapshot.sql`. The base seed `docker-dev/seed/seed-clusters.sql` still seeds the *legacy* SystemPlatform-namespace Galaxy driver (`MAIN-galaxy-mxgw`, tags `TestMachine_001.TestAlarm001..003`) — pre-Phase-A model, untouched/separate.
**The injected gateway key is EPHEMERAL** — key=`mxgw_otopcuakey2_so0…` is supplied via shell env `GALAXY_MXGW_API_KEY='…'` at `docker compose up --no-deps --force-recreate central-1 central-2`; a recreate WITHOUT it re-exported falls back to the compose's stale default and Galaxy auth fails. ORDER on a redeploy: POST deploy FIRST, THEN recreate (a faulted driver ignores `ApplyDelta`).
+
+### Findings + follow-ups from the protocol-linkage + write-through work (2026-06-13)
+
+- **DATA-PLANE ROLE CONFIG REQUIREMENT (important, deployment-facing).** The OPC UA session's roles come from two sources unioned: the **DB `LdapGroupRoleMapping`** (its `Role` column is the **`AdminRole` enum** — Administrator/Designer/Viewer only, for the AdminUI) AND the **appsettings `Security:Ldap:GroupToRole`** baseline (free-form `string→string`). The OPC UA **data-plane** gates (`WriteOperate`, `AlarmAck`, …) read literal role STRINGS that the AdminRole-typed DB mapping **cannot** produce — so a deployment MUST map its LDAP data-plane groups → data-plane role strings via `GroupToRole`, or write-through (and scripted-alarm OPC UA ack) is inert (every write → `BadUserAccessDenied`). The shared dev GLAuth already has dedicated groups+users (group `WriteOperate`, user `opc-writeop`, `multi-role` in all; `opc-readonly`); the dev rig just never seeded the `GroupToRole`. **Consider a docs note (and/or a documented default) so production deployments wire this.** (Same latent requirement applies to the pre-existing alarm-ack gate.)
+- **Write-pipeline review follow-ups (non-blocking, from the final integration review):** (a) `DriverHostActor.Stale` (and `DriverInstanceActor.Connecting`/`Reconnecting`) have **no `RouteNodeWrite`/`WriteAttribute` handler** → an operator write while stale/reconnecting dead-letters and the 10s Ask times out with a generic log (client got optimistic Good). Add fast-fail handlers returning a clear status.
(b) Drop `TaskContinuationOptions.ExecuteSynchronously` on the router `ContinueWith`; `List.Contains`→`HashSet` in the forward-map build (micro). (c) FOCAS re-parses the address on every equipment-tag write (`_parsedAddressesByTagName` miss; perf only, rare). (d) `DriverHostActorWriteRoutingTests` seeds a Galaxy-style `{"FullName":...}` artifact, not a raw protocol-driver TagConfig blob — add a raw-blob case for belt-and-suspenders (runtime path is identical + live-verified). (e) Task-9 parity test is a faithful simulation of `ConfigComposer` (`ToSnapshot` casts AccessLevel to int) not a through-the-real-serializer proof; add an `InlineData(2,false)` future-enum trap.
+
+### Dev-rig artifacts created for the protocol-linkage live-verify (left in place, NOT committed)
+
+- **`docker-dev/docker-compose.yml`** gained `Security__Ldap__GroupToRole__{ReadOnly,WriteOperate,WriteTune,WriteConfigure,AlarmAck}` identity entries on both central nodes (needed for data-plane roles — see above). **Uncommitted** (rig config; the file was already modified at session start).
+- **DB seeds** on `otopcua-dev-sql-1`/`OtOpcUa`: driver `MAIN-modbus-eq` (DriverType=Modbus, `{"Host":"10.100.0.35","Port":5020,"UnitId":1,"Tags":[]}`, namespace `nw-uns`, cluster MAIN) + tags `tag-modbus-hr100` (HR[100] auto-increment, Read — read demo) and `tag-modbus-hr200` (HR[200] scratch, ReadWrite — write demo), both on equipment `EQ-55297329838d` (filler-02). The pymodbus `standard` sim (`10.100.0.35:5020`) serves HR[0..31]=addr-as-value, HR[100]=auto-increment, HR[200..209]=writable scratch.
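The `GroupToRole` identity entries listed above feed the role union described in the data-plane role finding. A minimal Python model of that union is sketched below, assuming one role per group in each source; `resolve_roles` and `can_write` are hypothetical names, the `otopcua-admins` group is invented for illustration, and the real resolution lives in the C# server.

```python
# The AdminRole enum values named in the finding above.
ADMIN_ROLES = {"Administrator", "Designer", "Viewer"}

# Identity entries like the ones added to the dev compose file.
DEV_GROUP_TO_ROLE = {
    name: name
    for name in ("ReadOnly", "WriteOperate", "WriteTune", "WriteConfigure", "AlarmAck")
}


def resolve_roles(ldap_groups, db_mapping, group_to_role):
    """Union the AdminRole-typed DB mapping with the free-form GroupToRole baseline."""
    roles = set()
    for group in ldap_groups:
        db_role = db_mapping.get(group)
        if db_role is not None:
            if db_role not in ADMIN_ROLES:
                # The DB column is enum-typed, so it cannot hold a data-plane string.
                raise ValueError(f"{db_role!r} is not an AdminRole value")
            roles.add(db_role)
        baseline_role = group_to_role.get(group)
        if baseline_role is not None:
            roles.add(baseline_role)
    return roles


def can_write(roles):
    """Model of the write gate: it looks for the literal role string."""
    return "WriteOperate" in roles


db_mapping = {"otopcua-admins": "Administrator"}  # hypothetical AdminUI mapping
operator_groups = ["WriteOperate"]                # e.g. the dev user opc-writeop

with_baseline = resolve_roles(operator_groups, db_mapping, DEV_GROUP_TO_ROLE)
without_baseline = resolve_roles(operator_groups, db_mapping, {})

print(can_write(with_baseline))     # GroupToRole seeded: the gate passes
print(can_write(without_baseline))  # GroupToRole missing: every write is denied
```

Under this model an admin-only user never passes the write gate either, since `Administrator` is not one of the strings the data-plane gates look for.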
+
+### Incidental findings from the OpcUaClient live-verify (2026-06-13)
+
+- **Driver-reconfigure-while-faulted gap (real, pre-existing, NOT fixed).** When a `DriverInstanceActor` is stuck in `Reconnecting` (init keeps failing) and the operator deploys a *corrected* config, `DriverHostActor` sends `ApplyDelta` — but the `Reconnecting` behavior (`DriverInstanceActor.cs` ~L266) has **no `ApplyDelta` handler**, so it's dead-lettered and the actor keeps retrying the OLD `_currentConfigJson` forever. **Workaround: restart the node** (respawns the driver actor fresh from the current deployment artifact). Proper fix = handle `ApplyDelta` in `Reconnecting` (and `Connecting`) to adopt the new config mid-retry. Touches the actor state machine → its own design/plan (high-risk). Surfaced because the dev `MAIN-opcua-eq` driver was already faulted from a prior bad config.
+- **Dev-rig config edit applied directly in DB.** The `MAIN-opcua-eq` `DriverConfig.targetNamespaceKind` was `0` (Equipment, which requires a `UnsMappingTable` → `InitializeAsync` rejected it). Set to `1` (SystemPlatform — the direct-ref mode the equipment-tag model wants; empty `unsMappingTable:{}` passes validation) via a direct `JSON_MODIFY` UPDATE on `otopcua-dev-sql-1` (DB `OtOpcUa`, `SET QUOTED_IDENTIFIER ON` required for JSON fns; `sqlcmd -h` and `-y 0` are mutually exclusive — pick one). The AdminUI driver-edit combobox for "Target namespace kind" did **not** persist the change (suspected live-only Blazor binding bug — unverified; the DB edit sidestepped it). Deploy snapshots the **live** config DB directly (`AdminOperationsActor` → `DraftSnapshotFactory.FromConfigDbAsync` + `ConfigComposer.SnapshotAndFlattenAsync`), so a DB edit flows through on the next `POST /api/deployments` (new revisionHash).
+
+## Operational deferral (user choice)
+
+1. 
**docker-dev rig cleanup (round-1 T9) deferred.** The local docker-dev rig still has the + live-verify seed artifacts deployed: the `t12-overheat` scripted alarm, the + `SC-ba675b168a85` predicate script, the `layer0-logcheck` vtag/script, and filler-02's + modified `cycle-time-s` line. Left as-is to inspect the working double-emit fix. **To clean + up:** delete those artifacts in the AdminUI (or DB), revert filler-02's `cycle-time-s` to + `return ctx.GetTag("TestMachine_002.TestDuration").Value;`, then redeploy + (`POST http://localhost:9200/api/deployments`, header `X-Api-Key: docker-dev-deploy-key`). + +2. **Equipment-tag live-value verify artifacts (left in place — all now FUNCTIONAL).** The docker-dev + rig carries verify artifacts under the `nw-uns` Equipment namespace on `EQ-55297329838d` + (filler-02), all three now working: `MAIN-galaxy-eq` (GalaxyMxGateway → live gateway, `GalaxyTestTag` + = `TestMachine_002.TestFloat` RW, write-proven; needs the ephemeral `GALAXY_MXGW_API_KEY` re-exported + on recreate — see Galaxy dev-rig note above); `MAIN-opcua-eq` (OpcUaClient, factory shipped `22d553af`) + + `FastUInt1` tag (`{"FullName":"ns=3;s=FastUInt1"}`); `MAIN-modbus-eq` (Modbus) + `tag-modbus-hr100` + (read) / `tag-modbus-hr200` (RW write demo). docker-dev is **LOCAL on this Mac** (OrbStack); central-1 + @ `localhost:4840`/AdminUI+deploy @ `localhost:9200`, sql @ `localhost:14330` (sa/`OtOpcUa!Dev123`), + login disabled. Sims for the protocol/opcua verifies run on the docker host `10.100.0.35` + (`otopcua-pymodbus-standard` :5020, `otopcua-opc-plc` :50000) — leave up or `docker compose down` per + `/opt/otopcua-modbus` + `/opt/otopcua-opcuaclient`. Phase A backup + `OtOpcUa-prePhaseA-20260612-224908.bak` is in the SQL volume `/var/opt/mssql/backup/`.