docs(drivers): design — protocol-driver equipment-tag linkage + inbound write pipeline

Joseph Doherty
2026-06-13 10:45:25 -04:00
parent 22d553afd1
commit e58f33584f
@@ -0,0 +1,99 @@
# Protocol-driver equipment-tag linkage + inbound write pipeline — Design
**Date:** 2026-06-13
**Status:** Approved — ready for implementation planning
**Branch:** feature branch off master `22d553af`
**Scope:** Make an equipment `Tag` bound to any protocol driver (Modbus, S7, AbCip, AbLegacy, TwinCAT, Focas) **subscribe + publish a live value** (delivered by the already-shipped `FullName→NodeId` router `c4435e4f`), **and** let an authorized operator **write** that node back to the device. Milestone 1b gap (b) + full operator write-through. **No EF/Configuration schema change.**
## Problem
An equipment `Tag` bound to a protocol driver has `TagConfig = {region, address, dataType, …}` with **no `FullName`**. The shared compose helper `ExtractTagFullName` (`Phase7Composer.cs:426-441`, mirrored in `DeploymentArtifact.cs:624-639`) extracts `TagConfig.FullName`, **falling back to the raw `TagConfig` blob** when absent. So:
- `DriverHostActor.PushDesiredSubscriptions` (`DriverHostActor.cs:611-633`) pushes that raw blob as the driver's subscription `FullReference`, and keys the forward router map `_nodeIdByDriverRef: (DriverInstanceId, FullName) → NodeId[]` on it.
- The protocol driver looks up each incoming ref in `_tagsByName` (keyed by the **authored** `DriverConfig.Tags[].Name`, the legacy Device/PollGroup model). The blob misses → `BadNodeIdUnknown` (Modbus `ReadAsync` `ModbusDriver.cs:300`) → the register is never read.
The authored driver tag-table and the equipment-tag (UNS) model are **separate**; equipment tags never flow into `DriverConfig.Tags`. The OpcUaClient case works only because its `TagConfig.FullName` *is* a directly-resolvable upstream NodeId.
Separately, **inbound operator write is unwired server-wide**: every node is created with `AccessLevel = CurrentRead` (`OtOpcUaNodeManager.cs:658,781`), there is no `Write`/`OnWriteValue` dispatch, no sender of `WriteAttribute`, and no `NodeId → (driver, ref)` reverse map. The driver-side write plumbing (`IWritable.WriteAsync`, `DriverInstanceActor.WriteAttribute`/`HandleWriteAsync` `DriverInstanceActor.cs:316-348`) exists but nothing drives it.
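The read-side miss described above (the fallback blob never matching an authored tag name) can be reproduced in miniature. This is a minimal sketch, not the real `ExtractTagFullName`: the class name, the `FullName` key casing, and the `DriverKnowsRef` helper are assumptions made for illustration.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

public static class FullNameFallbackSketch
{
    // Hypothetical stand-in for ExtractTagFullName: prefer TagConfig.FullName,
    // otherwise fall back to the raw TagConfig blob.
    public static string ExtractTagFullName(string tagConfigJson)
    {
        try
        {
            using var doc = JsonDocument.Parse(tagConfigJson);
            var root = doc.RootElement;
            if (root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("FullName", out var fullName) // key casing is an assumption
                && fullName.ValueKind == JsonValueKind.String)
            {
                return fullName.GetString() ?? tagConfigJson;
            }
        }
        catch (JsonException)
        {
            // Not JSON at all: treated the same as a config without FullName.
        }
        return tagConfigJson;
    }

    // Models the driver-side lookup: the ref only resolves if it equals an authored tag Name.
    public static bool DriverKnowsRef(IReadOnlyDictionary<string, string> tagsByName, string tagConfigJson)
        => tagsByName.ContainsKey(ExtractTagFullName(tagConfigJson));
}
```

With an OpcUaClient-style config the extracted ref is the upstream NodeId; with a Modbus-style config the ref is the whole JSON blob, which no authored `Name` equals, so the lookup misses.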
## Approach (chosen)
**Approach B — driver-side direct-ref parse**, mirroring the OpcUaClient precedent (the ref *is* the address; the driver resolves it). No compose change, no EF change, no byte-parity concern for the ref (the router already keys on exactly the `TagConfig` blob). Rejected alternatives: **A** compose-time tag-table synthesis (invasive: DriverConfig-assembly seam + byte-parity in both compose paths + cross-project shape knowledge + Name collisions); **C** explicit compact address-string `FullName` (new grammar/parser/editor change for marginal gain).
Decided in brainstorming: **all six protocol drivers**, **full write-through now**, Part A merges/verifies before Part B, primary-only writes, `WriteOperate` as the single v1 write role.
---
## Part A — Per-driver equipment-tag resolver (read + write at the driver)
### A1. Shared helper `EquipmentTagRefResolver<TDef>` (Core.Abstractions)
A generic, driver-agnostic resolver every protocol driver instantiates:
```
sealed class EquipmentTagRefResolver<TDef> where TDef : class
ctor(Func<string, TDef?> byName, // authored tag-table lookup (_tagsByName.TryGetValue)
Func<string, TDef?> parseRef) // equipment TagConfig JSON -> transient TDef (driver-specific)
bool TryResolve(string fullRef, out TDef def) // byName(ref) ?? cache.GetOrAdd(ref, parseRef); null result cached
void Clear() // called on ReinitializeAsync so a config change drops stale transients
```
- Disambiguation is automatic: a legacy authored `Name` hits `byName`; an equipment ref is a JSON object that `parseRef` parses; anything else caches `null` → unknown (current skip behaviour preserved).
- The cache is a `ConcurrentDictionary<string, TDef?>` keyed by the ref string, negative entries included. It is cleared on reinit.
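The outline above could be fleshed out roughly as follows. This is a sketch under stated assumptions rather than the final implementation: the `[NotNullWhen]` annotation, the nullable `out` parameter, and the ordinal comparer are choices made here, not decisions recorded in this design.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Diagnostics.CodeAnalysis;

public sealed class EquipmentTagRefResolver<TDef> where TDef : class
{
    private readonly Func<string, TDef?> _byName;   // authored tag-table lookup
    private readonly Func<string, TDef?> _parseRef; // equipment TagConfig JSON -> transient TDef
    // Null values are stored on purpose so a garbage ref is parsed only once.
    private readonly ConcurrentDictionary<string, TDef?> _cache = new(StringComparer.Ordinal);

    public EquipmentTagRefResolver(Func<string, TDef?> byName, Func<string, TDef?> parseRef)
    {
        _byName = byName ?? throw new ArgumentNullException(nameof(byName));
        _parseRef = parseRef ?? throw new ArgumentNullException(nameof(parseRef));
    }

    public bool TryResolve(string fullRef, [NotNullWhen(true)] out TDef? def)
    {
        // Authored names win; otherwise parse once and cache the result (including null).
        def = _byName(fullRef) ?? _cache.GetOrAdd(fullRef, _parseRef);
        return def is not null;
    }

    // Drops cached transients (and cached misses) so a config change is re-parsed.
    public void Clear() => _cache.Clear();
}
```

One property worth knowing: `ConcurrentDictionary.GetOrAdd` may invoke the factory more than once under contention, so `parseRef` should be side-effect free. Only authored-table misses reach the cache, so legacy lookups stay uncached and always reflect the current `_tagsByName`.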
### A2. Per-driver parser + wiring (×6)
For each of **Modbus, S7, AbCip, AbLegacy, TwinCAT, Focas**:
1. **Parser** `TryParseEquipmentTagConfig(string json, out TDef def)` in the driver's `*.Contracts` project, mirroring that driver's AdminUI `…TagConfigModel` (`src/Server/.../AdminUI/Uns/TagEditors/<Driver>TagConfigModel.cs`) — same camelCase keys + enum-name values (case-insensitive). Builds a **transient `TDef` whose identity/`Name` = the ref string itself**, so the value the driver publishes back keys the forward router. Returns false when the JSON lacks the driver's required address fields (so a genuinely-unknown ref still skips).
2. **Wire `TryResolve`** into every `_tagsByName.TryGetValue(ref, …)` site: the subscribe/read path, any coalesced/bulk read path, the write path, and the deadband/ShouldPublish path. (Modbus sites: `ModbusDriver.cs:151,300,714,934`; other drivers have the analogous `_tagsByName` sites — each driver's set is its own task.)
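3. Instantiate the resolver in the driver ctor/init with `byName` wrapping `_tagsByName.TryGetValue` and `parseRef` wrapping `TryParseEquipmentTagConfig` (each Try-pattern call adapted to the `Func<string, TDef?>` shape from A1, returning `null` on a miss); `Clear()` it in `ReinitializeAsync`.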
Result: `ReadAsync` **and** `WriteAsync` resolve equipment-tag refs with one helper — read + write capable per driver.
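As an illustration of step 1, a Modbus parser might look like the sketch below. The `region`, `address`, and `dataType` keys come from the Problem section; everything else (`ModbusTagDef`, both enums and their members, the class name) is a hypothetical stand-in for the driver's real contract types.

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;
using System.Text.Json;

// Hypothetical shapes; the real definitions live in the driver's *.Contracts project.
public enum ModbusRegion { Coils, DiscreteInputs, InputRegisters, HoldingRegisters }
public enum ModbusDataType { Bool, Int16, UInt16, Int32, UInt32, Float32 }
public sealed record ModbusTagDef(string Name, ModbusRegion Region, ushort Address, ModbusDataType DataType);

public static class ModbusEquipmentTagParser
{
    public static bool TryParseEquipmentTagConfig(string json, [NotNullWhen(true)] out ModbusTagDef? def)
    {
        def = null;
        try
        {
            using var doc = JsonDocument.Parse(json);
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return false;
            if (!TryEnum<ModbusRegion>(root, "region", out var region)) return false;
            if (!TryEnum<ModbusDataType>(root, "dataType", out var dataType)) return false;
            if (!root.TryGetProperty("address", out var a)
                || a.ValueKind != JsonValueKind.Number
                || !a.TryGetUInt16(out var address)) return false;

            // Name is the ref string itself, so published values key the forward router.
            def = new ModbusTagDef(json, region, address, dataType);
            return true;
        }
        catch (JsonException)
        {
            return false; // a legacy name or garbage ref is simply "not an equipment tag"
        }
    }

    // Case-insensitive enum parse. Enum.TryParse also accepts numeric strings,
    // so IsDefined rejects values outside the declared members.
    private static bool TryEnum<TEnum>(JsonElement obj, string key, out TEnum value)
        where TEnum : struct, Enum
    {
        value = default;
        return obj.TryGetProperty(key, out var p)
            && p.ValueKind == JsonValueKind.String
            && Enum.TryParse(p.GetString(), ignoreCase: true, out value)
            && Enum.IsDefined(typeof(TEnum), value);
    }
}
```

Under this sketch, step 3's wiring becomes a pair of small lambdas, for example `parseRef: s => ModbusEquipmentTagParser.TryParseEquipmentTagConfig(s, out var d) ? d : null`.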
**Driver order:** Modbus first (sim at `10.100.0.35:5020` → live-verify the read value), then S7 (sim `:1102`), then AbCip/AbLegacy/TwinCAT/Focas (unit-tested; no sims). Tasks are parallelizable across drivers (disjoint files).
---
## Part B — Server inbound operator-write pipeline
### B1. Writable nodes
- Add `bool Writable` to `EquipmentTagPlan` (`Phase7Composer.cs:76-83`), derived `= Tag.AccessLevel == TagAccessLevel.ReadWrite` (`Tag.cs:52`, `TagAccessLevel.cs`). Derive it in **both** `Phase7Composer` (from the `Tag` entity) and `DeploymentArtifact.BuildEquipmentTagPlans` (from the already-snapshotted artifact `Tag` JSON — `ConfigComposer` serialises full `Tag` rows). **Byte-parity; no new persisted field, no EF migration.**
- `IOpcUaAddressSpaceSink.EnsureVariable` / `OtOpcUaNodeManager.EnsureVariable` (`OtOpcUaNodeManager.cs:636-669`) gains a `bool writable` arg. When true (and the driver implements `IWritable`): `AccessLevel = UserAccessLevel = AccessLevels.CurrentReadWrite` and attach an `OnWriteValue` handler; otherwise unchanged (`CurrentRead`). `MaterialiseEquipmentTags` (`Phase7Applier.cs:162-199`) passes the plan's `Writable`.
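The two decisions in B1 reduce to a pair of pure functions, sketched here so the parity test has something concrete to compare. The `Read` member of `TagAccessLevel` and the `WritablePlanSketch` class are assumptions; the byte constants mirror the OPC UA `AccessLevel` bit values (CurrentRead = 0x01, CurrentWrite = 0x02) instead of referencing the SDK's `AccessLevels` type, to keep the block self-contained.

```csharp
#nullable enable

// ReadWrite is named in this design; the Read member is assumed.
public enum TagAccessLevel { Read, ReadWrite }

public static class WritablePlanSketch
{
    public const byte CurrentRead = 0x01;
    public const byte CurrentReadWrite = 0x03; // CurrentRead | CurrentWrite

    // Both compose paths must derive Writable with this same rule to stay byte-identical.
    public static bool DeriveWritable(TagAccessLevel level) => level == TagAccessLevel.ReadWrite;

    // A node is only exposed as writable when the plan says so AND the driver can write.
    public static byte ResolveAccessLevel(bool writable, bool driverIsWritable)
        => writable && driverIsWritable ? CurrentReadWrite : CurrentRead;
}
```

Keeping the derivation in one shared function is one way to make a Composer/Artifact mismatch structurally impossible, though the design only requires that the two paths agree.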
### B2. Reverse routing + primary gate (DriverHostActor)
- In `PushDesiredSubscriptions` (alongside the forward map build), populate `_driverRefByNodeId: Dictionary<string nodeId, (string DriverInstanceId, string FullName)>` (cleared+repopulated each apply; a NodeId maps to exactly one equipment tag).
- New message `RouteNodeWrite(string NodeId, object? Value)` + `NodeWriteResult(bool Success, string? Reason)`. Handler: apply the **primary gate** first. If the local node is not the driver role-leader (reuse the `RedundancyStateActor` Primary determination, `RedundancyStateActor.cs:114-138`), reply `NodeWriteResult(false, "not primary")`; else resolve `_driverRefByNodeId[nodeId]` → `_children[driverId].Actor` (`DriverHostActor.cs:87`) → `Forward(new WriteAttribute(fullName, value))` so the child replies `WriteAttributeResult` straight to the original asker (mapped to `NodeWriteResult`). Unknown nodeId / missing child → `NodeWriteResult(false, …)`.
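The routing decision itself is independent of the actor framework, so it can be sketched as a plain class. The message records follow the shapes given above; `NodeWriteRouter`, `Rebuild`, `Route`, and the exact reason strings other than "not primary" are hypothetical, and the real handler would `Forward` to the child actor rather than return a tuple.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public sealed record RouteNodeWrite(string NodeId, object? Value);
public sealed record NodeWriteResult(bool Success, string? Reason);
public sealed record WriteAttribute(string FullName, object? Value);

public sealed class NodeWriteRouter
{
    private readonly Dictionary<string, (string DriverInstanceId, string FullName)> _driverRefByNodeId =
        new(StringComparer.Ordinal);

    // Cleared and repopulated on every apply; a NodeId maps to exactly one equipment tag.
    public void Rebuild(IEnumerable<(string NodeId, string DriverInstanceId, string FullName)> tags)
    {
        _driverRefByNodeId.Clear();
        foreach (var t in tags)
            _driverRefByNodeId[t.NodeId] = (t.DriverInstanceId, t.FullName);
    }

    // Returns either a rejection, or the driver id plus the message to forward to it.
    public (NodeWriteResult? Rejection, string? DriverInstanceId, WriteAttribute? Forward) Route(
        RouteNodeWrite msg, bool isPrimary, Func<string, bool> hasChild)
    {
        if (!isPrimary)
            return (new NodeWriteResult(false, "not primary"), null, null);
        if (!_driverRefByNodeId.TryGetValue(msg.NodeId, out var target))
            return (new NodeWriteResult(false, "unknown node"), null, null);
        if (!hasChild(target.DriverInstanceId))
            return (new NodeWriteResult(false, "driver not running"), null, null);
        return (null, target.DriverInstanceId, new WriteAttribute(target.FullName, msg.Value));
    }
}
```

Checking the primary gate before the map lookup means a secondary never reveals whether a NodeId is known, and every failure path yields a `NodeWriteResult(false, …)`.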
### B3. Authz gate + write gateway (reuses the alarm-ack bridge)
- New `IOpcUaNodeWriteGateway { Task<NodeWriteOutcome> WriteAsync(string nodeId, object? value, CancellationToken ct) }` + a `Deferred…` wrapper + a production impl that **Asks** `DriverHostActor.RouteNodeWrite` (bounded, ~10 s outer; the driver's `HandleWriteAsync` already bounds to 5 s) and maps `NodeWriteResult` → outcome. DI mirrors `IOpcUaAddressSpaceSink`/`SdkAddressSpaceSink` (singleton + `Deferred` set on the host at `StartAsync`).
- The node manager's `OnWriteValue` handler (mirroring `HandleAlarmCommand` `OtOpcUaNodeManager.cs:527-552`): extract `(context as ISessionOperationContext)?.UserIdentity as RoleCarryingUserIdentity` (`RoleCarryingUserIdentity.cs`); **gate** on the node's required write role: writable equipment tags require `WriteOperate` (`OpcUaDataPlaneRoles.WriteOperate`); `identity is null` or role absent → `BadUserAccessDenied` (fails closed). On pass, call the gateway **blocking** (`.GetAwaiter().GetResult()`, since the SDK write delegate is synchronous and writes are infrequent operator actions), then map outcome → `ServiceResult`: Good → SDK applies the value optimistically and the next poll confirms; Bad → SDK rejects.
- Tune/Configure granularity is **deferred** (equipment tags carry no SecurityClassification today — a future schema item). v1: writable ⇒ `WriteOperate`.
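The gate-then-block sequence in B3 can be sketched without the OPC UA SDK. The `IOpcUaNodeWriteGateway` signature is taken from this section; the `NodeWriteOutcome` members, the `"WriteOperate"` string value, the roles-as-collection parameter, and the `DelegateGateway` adapter are assumptions made so the example stands alone.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public enum NodeWriteOutcome { Good, BadUserAccessDenied, BadTimeout, BadRejected } // hypothetical members

public interface IOpcUaNodeWriteGateway
{
    Task<NodeWriteOutcome> WriteAsync(string nodeId, object? value, CancellationToken ct);
}

// Hypothetical adapter so the gateway can be supplied as a lambda.
public sealed class DelegateGateway : IOpcUaNodeWriteGateway
{
    private readonly Func<string, object?, CancellationToken, Task<NodeWriteOutcome>> _write;
    public DelegateGateway(Func<string, object?, CancellationToken, Task<NodeWriteOutcome>> write) => _write = write;
    public Task<NodeWriteOutcome> WriteAsync(string nodeId, object? value, CancellationToken ct) => _write(nodeId, value, ct);
}

public static class WriteGateSketch
{
    public const string WriteOperate = "WriteOperate"; // stand-in for OpcUaDataPlaneRoles.WriteOperate

    public static NodeWriteOutcome HandleWrite(
        IReadOnlyCollection<string>? roles, IOpcUaNodeWriteGateway gateway,
        string nodeId, object? value, TimeSpan timeout)
    {
        // Fail closed: no identity or no role means the gateway is never called.
        if (roles is null || !roles.Contains(WriteOperate))
            return NodeWriteOutcome.BadUserAccessDenied;

        using var cts = new CancellationTokenSource(timeout);
        try
        {
            // The SDK write delegate is synchronous, so block on the async gateway.
            return gateway.WriteAsync(nodeId, value, cts.Token).GetAwaiter().GetResult();
        }
        catch (OperationCanceledException)
        {
            return NodeWriteOutcome.BadTimeout;
        }
    }
}
```

Blocking with `.GetAwaiter().GetResult()` rethrows the original exception rather than an `AggregateException`, which is why a single `OperationCanceledException` catch is enough to cover the bounded-timeout case.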
## Data flow
- **Read:** driver polls register → publishes keyed by ref (the `TagConfig` blob) → forward router (`ForwardToMux`, `DriverHostActor.cs:390-418`) → folder-scoped equipment NodeId value. Works once Part A resolves the ref.
- **Write:** client write → SDK `OnWriteValue` → role gate → gateway → `Ask RouteNodeWrite` → primary gate + reverse map → `WriteAttribute(fullName,value)` → driver `WriteAsync` resolves the ref via the **same** Part-A helper → register write → `StatusCode` to the client.
## Error handling
- Resolver: unknown/garbage ref → cached `null` → driver skips (current `BadNodeIdUnknown`); never throws on a bad ref.
- Write: not-primary / unknown node / driver-not-`IWritable` / timeout / driver `Bad*` → mapped `Bad*` StatusCode; unauthorized → `BadUserAccessDenied`. Fails closed everywhere.
- Plan parity: a `Writable` derivation mismatch between Composer and Artifact is a deploy-time defect — covered by a parity unit test.
## Testing (no bUnit)
- **Unit (xUnit + Shouldly):** `EquipmentTagRefResolver` (legacy-name hit / equipment-ref parse / garbage→null / cache / Clear); each driver's `TryParseEquipmentTagConfig` + a resolve-on-miss read; `Writable` plan-parity round-trip (Composer == Artifact); the `OnWriteValue` authz gate (role present → routes; null/absent → `BadUserAccessDenied`); `RouteNodeWrite` reverse-map + primary gate (resolves on primary, denies on secondary, unknown node → fail).
- **Live docker-dev `/run` (agent-driven; dev UI login disabled):** Modbus equipment tag on the `:5020` sim → live **changing** value at its equipment NodeId (same proof as the OpcUaClient milestone). Then an authorized **write** to a writable Modbus equipment node → register changes (read-back on the sim) and an unauthorized session is denied. The authorized-write check needs a Client.CLI OPC UA session bound as an LDAP user holding `WriteOperate` against the shared GLAuth (`10.100.0.35:3893`) — OPC UA session auth is separate from the disabled dev UI login; confirm the rig path at verify time. S7 read live-checked if `:1102` cooperates; the other four drivers proven by unit tests.
## Out of scope
- Tune/Configure write-role granularity (needs a SecurityClassification on equipment tags — future schema).
- Unifying the driver-side parser with the AdminUI editor model (YAGNI; they share keys).
- Secondary-node writes / write redirect (primary-only for v1).
- Phase B native alarms / Phase C server historian (separate milestones).
## Hard rules
Stage by path; never `git add .`; never stage `sql_login.txt` / `src/Server/.../Host/pki/` / `pending.md` / `current.md`. Never echo secrets. No force-push, no `--no-verify`. **No Configuration entity / EF migration change.** Build on a feature branch off master `22d553af`; this design doc + the plan are committed on master first.