diff --git a/docs/plans/2026-06-08-driverless-equipment-namespace-design.md b/docs/plans/2026-06-08-driverless-equipment-namespace-design.md
new file mode 100644
index 00000000..7ff9df54
--- /dev/null
+++ b/docs/plans/2026-06-08-driverless-equipment-namespace-design.md
@@ -0,0 +1,151 @@
+# Driver-less Equipment Namespace — Design (2026-06-08)
+
+## Problem
+
+In the OtOpcUa config model, `Equipment.DriverInstanceId` is a **non-null** logical FK — every
+equipment row must reference a `DriverInstance`. But the Northwind company overlay's equipment
+signals are **VirtualTags** (driverless — they link via `EquipmentId` + a `Script`, with no driver
+binding); their live values come from the galaxy mirror over the **GalaxyMxGateway** driver via the
+script `return ctx.GetTag("TestMachine_NNN.Sig").Value;`.
+
+To satisfy the non-null FK *and* the validator rule that **forbids a GalaxyMxGateway driver in an
+Equipment-kind namespace** (`DraftValidator.ValidateDriverNamespaceCompatibility`:
+`NamespaceKind.Equipment => DriverType != "GalaxyMxGateway"`), the loader was cornered into inventing
+a **placeholder `Modbus` driver** (`nw-uns-modbus`, 0 tags bound) and pointing all 40 equipment rows
+at it.
+
+That placeholder is misleading ("equipment tied to Modbus") **and not inert** — on every deploy the
+runtime tries to instantiate a real Modbus driver for it and fails:
+
+```
+WARNING DriverHost: factory for Modbus threw on nw-uns-modbus; stubbing
+Cause: Modbus driver config for 'nw-uns-modbus' missing required Host
+WARNING DriverHost: no factory for driver type Modbus (instance nw-uns-modbus); falling back to stub
+INFO DriverHost: spawned Modbus driver nw-uns-modbus (stub=True)
+```
+
+The data path is correct (the MxGateway *is* the source, via the VirtualTag indirection — `verify-equipment`
+shows 396 Good), but the driver association is a lie and generates per-deploy exception/stub noise.
+
+## Decision
+
+**Option B — make equipment driver-less.** VirtualTag-only equipment genuinely has no field driver, so
+it should reference none: make `Equipment.DriverInstanceId` **nullable**, **delete the placeholder
+driver**, and teach the one cluster-attribution site to tolerate a null driver. This keeps the
+VirtualTag route intact (so overlay-building, hierarchy reorganization, and value transforms all stay
+available — those are the route's strength), removes the misleading Modbus FK, and eliminates the
+stub/exception noise.
+
+Chosen over: **A** (a no-op "Virtual" driver type — keeps a placeholder), and **Option 4** (relax the
+validator so a GalaxyMxGateway driver can serve equipment directly as galaxy tags — rejected because a
+galaxy tag's subscribe ref is `FolderPath.Name`, i.e. tree placement is coupled to galaxy source, so a
+direct-tag overlay could not freely reorganize the company hierarchy without further driver
+re-architecture, and it would also drop transforms/ScriptedAlarms and blur the SystemPlatform/Equipment
+layering).
+
+## Impact map (investigated)
+
+The change is small because almost nothing depends on `Equipment.DriverInstanceId`.
+
+### Schema / entity — the core change
+- `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/Equipment.cs:23` —
+  `public required string DriverInstanceId` → `public string? DriverInstanceId`.
+- EF infers nullability from the C# type: `OtOpcUaConfigDbContext.ConfigureEquipment` has **no
+  `.IsRequired()`** call, so only the property type changes. One EF migration emits a single
+  `AlterColumn(DriverInstanceId, nullable: true)`; the model snapshot loses its `.IsRequired()`.
+- **No FK to drop.** Equipment→DriverInstance is a *logical* FK only — no EF `HasOne/WithMany`, no DB
+  `FK_…` constraint, no `ON DELETE`. Deleting the placeholder driver while equipment still references it
+  is already schema-legal.
+- `IX_Equipment_Driver` is a plain non-unique, non-filtered index — valid on a nullable column, kept
+  as-is (no filtered-index change; YAGNI).
+
+### Validator / composer / applier / runtime — no change required
+- **`DraftValidator`** (all 9 rules read): none dereference `Equipment.DriverInstanceId`; none require an
+  Equipment namespace to have a driver. `ValidateDriverNamespaceCompatibility` iterates
+  `draft.DriverInstances` — with the placeholder gone, that namespace simply has no driver row to check.
+- **`DraftSnapshotFactory.FromConfigDbAsync`** loads the full `Equipment` entity → naturally carries
+  `null`. No projection change.
+- **`Phase7Composer`** — `EquipmentNode` uses `EquipmentId`/`Name`/`UnsLineId`; equipment *Tags* use
+  `Tag.DriverInstanceId` (a separate field), not Equipment's; `EquipmentVirtualTag` uses
+  `EquipmentId`/`Name`/`ScriptId`. `Equipment.DriverInstanceId` is never read.
+- **`Phase7Applier`** — `MaterialiseHierarchy` / `MaterialiseEquipmentVirtualTags` build folders +
+  variable nodes from projected fields only; never reads the driver.
+- **`DriverHostActor`** — spawns one driver actor *per DriverInstance* in the artifact. Deleting the
+  placeholder DriverInstance ⇒ no Modbus spec ⇒ **no stub spawn, no `missing required Host` exception**.
+  The VirtualTag host is spawned unconditionally and streams regardless of driver children.
+- **`VirtualTagHostActor`** — driver-agnostic; depends only on the dependency mux publishing the
+  galaxy-mirror values (from the GalaxyMxGateway driver, untouched).
+
+### The one real code change — `DeploymentArtifact.BuildClusterSets`
+`src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DeploymentArtifact.cs` (~lines 271–297). Equipment carries
+no `ClusterId`; today it is cluster-attributed **via its driver** (`equipmentIds.Add(id)` when
+`di is not null && driverIds.Contains(di)`). With a null driver, equipment — and its VirtualTags — would
+be **silently dropped in multi-cluster (`ScopeTo`) mode**.
+
+Fix using the UNS hierarchy, which already carries cluster identity (`UnsArea.ClusterId` exists, and
+`Equipment → UnsLine.UnsAreaId → UnsArea`):
+- Build a `UnsLineId → UnsAreaId` map from the artifact's `UnsLines` array.
+- **Driver-bound** equipment: unchanged (in-cluster when its driver is in-cluster).
+- **Driver-less** equipment: in-cluster **iff its line's area is in-cluster**
+  (`areaIds.Contains(lineToArea[UnsLineId])`).
+
+This is *more* correct than the driver path: it anchors on the equipment's actual UNS placement, which is
+exactly the same-cluster invariant decision #122 already enforces (driver-cluster must equal
+line-cluster). The existing cross-cluster consistency warning (lines 237–244) is phrased "in cluster by
+its driver"; for driver-less equipment the attribution is by line, so it is consistent by construction
+and the warning won't fire. **Single-cluster (`ClusterFilterMode.None`, the docker-dev topology) never
+calls `BuildClusterSets`**, so this is a correctness fix for multi-cluster, not a dev-path requirement.
+
+### Loader — `scadaproj/otopcua-uns-loader/otopcua_uns.py`
+In `cmd_populate_equipment`:
+- **Remove** the placeholder `DriverInstance` INSERT (`'Northwind UNS placeholder', 'Modbus', …`) and its
+  comment.
+- **Equipment INSERT**: drop `DriverInstanceId` from the column list (→ NULL).
+- Retire the `EQ_DRIVER = "nw-uns-modbus"` constant and the now-dead `DELETE … Tag WHERE DriverInstanceId`
+  / `DELETE … DriverInstance` teardown lines (in both `cmd_populate_equipment` and `cmd_clean`) — they
+  become no-ops; remove for clarity. Keep the `Namespace` INSERT/teardown.
+- Fix the stale `# … "an Equipment namespace has a driver" expectations` comment.
+
+### Noted, not changing (YAGNI)
+- `sp_ComputeGenerationDiff` includes `DriverInstanceId` in a `CHECKSUM(...)`. It is NULL-tolerant for
+  this one-time transition and sits on the **dormant** generation-diff path (the active deploy gate is
+  the C# `DraftValidator` + `ConfigComposer.SnapshotAndFlattenAsync`, not the SP). Flagged verify-only;
+  if it is ever reactivated, wrap with `ISNULL(DriverInstanceId, '')` as a follow-on.
+
+## Testing & verification
+
+1. **Unit (Configuration tests):** a `DraftValidator` test that a draft with driver-less equipment
+   (`Equipment.DriverInstanceId == null`, Equipment namespace with zero drivers) validates clean.
+2. **Unit (Runtime tests):** a `BuildClusterSets` / scoped-`ParseComposition` test proving a null-driver
+   equipment is attributed to the cluster of its line's area in `ScopeTo` mode (and its VirtualTags are
+   kept), while a wrong-cluster line excludes it.
+3. Existing `Configuration`, `OpcUaServer` (Phase7), and `Runtime` (DeploymentArtifact) suites stay green.
+4. **Migration** authored + applied to the docker-dev config DB.
+5. **Live docker-dev:** re-run the loader (`populate-equipment`, now driver-less) → redeploy → confirm
+   **396 Good still flow** on `:4840` (`VERIFY-EQUIPMENT: PASS`) **and** central-1 logs **no longer
+   contain** `spawned Modbus driver nw-uns-modbus` or `missing required Host`.
+
+## Sequencing & risk
+
+| Step | Risk | Notes |
+|---|---|---|
+| Entity nullable + EF migration | medium — schema change | single `AlterColumn`; no FK/constraint to drop; reversible |
+| `BuildClusterSets` null-driver attribution | low–medium — multi-cluster scoping | additive branch; single-cluster path unaffected; covered by a new test |
+| Loader edits | low | drop placeholder + NULL the FK; teardown becomes no-ops |
+| Live redeploy on docker-dev | low | recreate admin nodes only; sites untouched; the proof the wart is gone |
+
+## Branches
+
+- OtOpcUa: `feat/driverless-equipment-namespace` (off `master` `446a456`).
+- Loader: `scadaproj` (`otopcua_uns.py`).
+- Migration authored in the Configuration project; applied to docker-dev. Merge/push only on the user's
+  explicit go (the user manages this repo's integration).
+
+## Related context
+
+- Investigation: this design is grounded in a three-front impact sweep (EF/schema, validator/artifact,
+  runtime/loader) — see the per-layer findings folded into the Impact map above.
+- Decision #122 (same-cluster invariant: equipment's driver-cluster must equal its UNS-line cluster) —
+  the anchor that makes line-based attribution for driver-less equipment correct.
+- `DraftValidator.ValidateDriverNamespaceCompatibility` — the rule that forbids GalaxyMxGateway in
+  Equipment namespaces and forced the original placeholder.
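
The line-based attribution proposed for `DeploymentArtifact.BuildClusterSets` can be sketched in a few lines. This is a minimal illustration only, written in Python (the loader's language) rather than the C# of the real change. The function name `attribute_equipment` and the dict-shaped rows are hypothetical; only the `UnsLineId → UnsAreaId` map and the `driverIds` / `areaIds` sets come from the design text. Treating an equipment row whose line is missing from the map as out-of-cluster is an assumption, not something the design states.

```python
def attribute_equipment(equipment, uns_lines, driver_ids, area_ids):
    """Return the EquipmentIds that belong to the scoped cluster.

    Hypothetical sketch of the proposed rule, not the C# implementation.
    driver_ids / area_ids are the in-cluster driver and UNS-area id sets.
    """
    # UnsLineId -> UnsAreaId map built from the artifact's UnsLines array.
    line_to_area = {line["UnsLineId"]: line["UnsAreaId"] for line in uns_lines}

    kept = set()
    for row in equipment:
        driver = row.get("DriverInstanceId")
        if driver is not None:
            # Driver-bound: unchanged, in-cluster when its driver is in-cluster.
            if driver in driver_ids:
                kept.add(row["EquipmentId"])
        else:
            # Driver-less: in-cluster iff its line's area is in-cluster.
            # Assumption: a line absent from the map means "not in cluster".
            area = line_to_area.get(row.get("UnsLineId"))
            if area is not None and area in area_ids:
                kept.add(row["EquipmentId"])
    return kept


# Example data (all ids invented for illustration).
uns_lines = [
    {"UnsLineId": "line-1", "UnsAreaId": "area-in"},
    {"UnsLineId": "line-2", "UnsAreaId": "area-out"},
]
equipment = [
    {"EquipmentId": "eq-bound", "DriverInstanceId": "drv-1", "UnsLineId": "line-1"},
    {"EquipmentId": "eq-free-in", "DriverInstanceId": None, "UnsLineId": "line-1"},
    {"EquipmentId": "eq-free-out", "DriverInstanceId": None, "UnsLineId": "line-2"},
]
in_cluster = attribute_equipment(equipment, uns_lines, {"drv-1"}, {"area-in"})
```

With this data, the driver-bound row and the driver-less row on `line-1` are kept, and the driver-less row on the wrong-cluster line is excluded. That is the behavior the second unit test in "Testing & verification" is meant to prove.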