Driver-less Equipment Namespace — Design (2026-06-08)

Problem

In the OtOpcUa config model, Equipment.DriverInstanceId is a non-null logical FK — every equipment row must reference a DriverInstance. But the Northwind company overlay's equipment signals are VirtualTags (driverless — they link via EquipmentId + a Script, with no driver binding); their live values come from the galaxy mirror over the GalaxyMxGateway driver via the script return ctx.GetTag("TestMachine_NNN.Sig").Value;.

To satisfy the non-null FK and the validator rule that forbids a GalaxyMxGateway driver in an Equipment-kind namespace (DraftValidator.ValidateDriverNamespaceCompatibility: NamespaceKind.Equipment => DriverType != "GalaxyMxGateway"), the loader was cornered into inventing a placeholder Modbus driver (nw-uns-modbus, 0 tags bound) and pointing all 40 equipment rows at it.

That placeholder is misleading ("equipment tied to Modbus") and not inert — on every deploy the runtime tries to instantiate a real Modbus driver for it and fails:

WARNING  DriverHost: factory for Modbus threw on nw-uns-modbus; stubbing
Cause:   Modbus driver config for 'nw-uns-modbus' missing required Host
WARNING  DriverHost: no factory for driver type Modbus (instance nw-uns-modbus); falling back to stub
INFO     DriverHost: spawned Modbus driver nw-uns-modbus (stub=True)

The data path is correct (the MxGateway is the source, via the VirtualTag indirection — verify-equipment shows 396 Good), but the driver association is a lie and generates per-deploy exception/stub noise.

Decision

Option B — make equipment driver-less. VirtualTag-only equipment genuinely has no field driver, so it should reference none: make Equipment.DriverInstanceId nullable, delete the placeholder driver, and teach the one cluster-attribution site to tolerate a null driver. This keeps the VirtualTag route intact (so overlay-building, hierarchy reorganization, and value transforms all stay available — those are the route's strength), removes the misleading Modbus FK, and eliminates the stub/exception noise.

Chosen over Option A (a no-op "Virtual" driver type, which still keeps a placeholder) and Option 4 (relax the validator so a GalaxyMxGateway driver can serve equipment directly as galaxy tags). Option 4 was rejected for three reasons. First, a galaxy tag's subscribe ref is FolderPath.Name, so tree placement is coupled to the galaxy source; a direct-tag overlay could not freely reorganize the company hierarchy without further driver re-architecture. Second, it would drop transforms and ScriptedAlarms. Third, it would blur the SystemPlatform/Equipment layering.

Impact map (investigated)

The change is small because almost nothing depends on Equipment.DriverInstanceId.

Schema / entity — the core change

  • src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/Equipment.cs:23: public required string DriverInstanceId → public string? DriverInstanceId.
  • EF infers nullability from the C# type: OtOpcUaConfigDbContext.ConfigureEquipment has no .IsRequired() call, so only the property type changes. One EF migration emits a single AlterColumn(DriverInstanceId, nullable: true); the model snapshot loses its .IsRequired().
  • No FK to drop. Equipment→DriverInstance is a logical FK only — no EF HasOne/WithMany, no DB FK_… constraint, no ON DELETE. Deleting the placeholder driver while equipment still references it is already schema-legal.
  • IX_Equipment_Driver is a plain non-unique, non-filtered index — valid on a nullable column, kept as-is (no filtered-index change; YAGNI).

Validator / composer / applier / runtime — no change required

  • DraftValidator (all 9 rules read): none dereference Equipment.DriverInstanceId; none require an Equipment namespace to have a driver. ValidateDriverNamespaceCompatibility iterates draft.DriverInstances — with the placeholder gone, that namespace simply has no driver row to check.
  • DraftSnapshotFactory.FromConfigDbAsync loads the full Equipment entity → naturally carries null. No projection change.
  • Phase7Composer: EquipmentNode uses EquipmentId/Name/UnsLineId; equipment Tags use Tag.DriverInstanceId (a separate field), not Equipment's; EquipmentVirtualTag uses EquipmentId/Name/ScriptId. Equipment.DriverInstanceId is never read.
  • Phase7Applier: MaterialiseHierarchy / MaterialiseEquipmentVirtualTags build folders + variable nodes from projected fields only; neither reads the driver.
  • DriverHostActor — spawns one driver actor per DriverInstance in the artifact. Deleting the placeholder DriverInstance ⇒ no Modbus spec ⇒ no stub spawn, no missing required Host exception. The VirtualTag host is spawned unconditionally and streams regardless of driver children.
  • VirtualTagHostActor — driver-agnostic; depends only on the dependency mux publishing the galaxy-mirror values (from the GalaxyMxGateway driver, untouched).
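
The validator point above can be illustrated with a minimal Python sketch (the production rule is C#, DraftValidator.ValidateDriverNamespaceCompatibility). It assumes, hypothetically, that each DriverInstance row carries a NamespaceId and that "SystemPlatform" is the other NamespaceKind value; the identifiers ns-equipment, ns-galaxy, and galaxy-1 are illustrative only. Because the rule iterates driver rows, an Equipment namespace with no drivers has nothing to reject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    namespace_id: str
    kind: str  # "Equipment" or (assumed) "SystemPlatform"


@dataclass(frozen=True)
class DriverInstance:
    driver_instance_id: str
    driver_type: str
    namespace_id: str  # assumption: drivers reference their namespace by id


def validate_driver_namespace_compatibility(namespaces, driver_instances):
    """Return one error per GalaxyMxGateway driver placed in an Equipment namespace."""
    kind_by_namespace = {ns.namespace_id: ns.kind for ns in namespaces}
    errors = []
    # The rule walks driver rows, not namespaces: no rows means no checks.
    for driver in driver_instances:
        kind = kind_by_namespace.get(driver.namespace_id)
        if kind == "Equipment" and driver.driver_type == "GalaxyMxGateway":
            errors.append(
                f"{driver.driver_instance_id}: GalaxyMxGateway is not allowed "
                f"in Equipment namespace {driver.namespace_id}"
            )
    return errors


namespaces = [
    Namespace("ns-equipment", "Equipment"),
    Namespace("ns-galaxy", "SystemPlatform"),
]

# Placeholder removed: the Equipment namespace has no driver rows at all.
driverless_errors = validate_driver_namespace_compatibility(
    namespaces, [DriverInstance("galaxy-1", "GalaxyMxGateway", "ns-galaxy")]
)

# The rejected direct-tag alternative would trip the rule.
direct_tag_errors = validate_driver_namespace_compatibility(
    namespaces, [DriverInstance("galaxy-1", "GalaxyMxGateway", "ns-equipment")]
)
```

Here driverless_errors is empty, while direct_tag_errors holds a single entry.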

The one real code change — DeploymentArtifact.BuildClusterSets

src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DeploymentArtifact.cs (~lines 271–297). Equipment carries no ClusterId; today it is cluster-attributed via its driver (equipmentIds.Add(id) when di is not null && driverIds.Contains(di)). With a null driver, equipment (and its VirtualTags) would be silently dropped in multi-cluster (ScopeTo) mode.

Fix using the UNS hierarchy, which already carries cluster identity (UnsArea.ClusterId exists, and Equipment → UnsLine.UnsAreaId → UnsArea):

  • Build a UnsLineId → UnsAreaId map from the artifact's UnsLines array.
  • Driver-bound equipment: unchanged (in-cluster when its driver is in-cluster).
  • Driver-less equipment: in-cluster iff its line's area is in-cluster (areaIds.Contains(lineToArea[UnsLineId])).

This is more correct than the driver path: it anchors on the equipment's actual UNS placement, which is exactly the same-cluster invariant decision #122 already enforces (driver-cluster must equal line-cluster). The existing cross-cluster consistency warning (lines 237–244) is phrased "in cluster by its driver"; for driver-less equipment the attribution is by line, so it is consistent by construction and the warning won't fire. Single-cluster (ClusterFilterMode.None, the docker-dev topology) never calls BuildClusterSets, so this is a correctness fix for multi-cluster, not a dev-path requirement.
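
The attribution rule can be sketched in Python as follows. This models the logic only and is not the C# code in DeploymentArtifact.cs. The dictionary shapes, the function name build_cluster_equipment_ids, and the assumption that each DriverInstance row exposes a ClusterId are hypothetical; the UnsLineId → UnsAreaId map and the areaIds membership test follow the steps listed above.

```python
def build_cluster_equipment_ids(cluster_id, driver_instances, uns_areas,
                                uns_lines, equipment):
    """Return the ids of equipment that belong to cluster_id."""
    # Drivers and areas that live in the target cluster.
    driver_ids = {d["DriverInstanceId"] for d in driver_instances
                  if d["ClusterId"] == cluster_id}
    area_ids = {a["UnsAreaId"] for a in uns_areas
                if a["ClusterId"] == cluster_id}
    # Equipment reaches its cluster through UnsLine -> UnsArea.
    line_to_area = {l["UnsLineId"]: l["UnsAreaId"] for l in uns_lines}

    equipment_ids = set()
    for eq in equipment:
        driver_id = eq.get("DriverInstanceId")
        if driver_id is not None:
            # Driver-bound: unchanged, in-cluster when its driver is.
            if driver_id in driver_ids:
                equipment_ids.add(eq["EquipmentId"])
        elif line_to_area.get(eq["UnsLineId"]) in area_ids:
            # Driver-less: in-cluster when its line's area is.
            equipment_ids.add(eq["EquipmentId"])
    return equipment_ids


drivers = [{"DriverInstanceId": "d1", "ClusterId": "c1"}]
areas = [{"UnsAreaId": "a1", "ClusterId": "c1"},
         {"UnsAreaId": "a2", "ClusterId": "c2"}]
lines = [{"UnsLineId": "l1", "UnsAreaId": "a1"},
         {"UnsLineId": "l2", "UnsAreaId": "a2"}]
equipment = [
    {"EquipmentId": "e1", "UnsLineId": "l1", "DriverInstanceId": None},
    {"EquipmentId": "e2", "UnsLineId": "l2", "DriverInstanceId": None},
    {"EquipmentId": "e3", "UnsLineId": "l1", "DriverInstanceId": "d1"},
]

in_c1 = build_cluster_equipment_ids("c1", drivers, areas, lines, equipment)
in_c2 = build_cluster_equipment_ids("c2", drivers, areas, lines, equipment)
```

In this example the driver-less e1 lands in c1 through its line, the driver-less e2 lands in c2, and the driver-bound e3 follows its driver into c1. A line id missing from the map resolves to no area, so such equipment is excluded rather than raising.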

Loader — scadaproj/otopcua-uns-loader/otopcua_uns.py

In cmd_populate_equipment:

  • Remove the placeholder DriverInstance INSERT ('Northwind UNS placeholder', 'Modbus', …) and its comment.
  • Equipment INSERT: drop DriverInstanceId from the column list (→ NULL).
  • Retire the EQ_DRIVER = "nw-uns-modbus" constant and the now-dead DELETE … Tag WHERE DriverInstanceId / DELETE … DriverInstance teardown lines (in both cmd_populate_equipment and cmd_clean) — they become no-ops; remove for clarity. Keep the Namespace INSERT/teardown.
  • Fix the stale # … "an Equipment namespace has a driver" expectations comment.
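
A minimal sketch of the driver-less Equipment INSERT, using an in-memory sqlite3 database purely as a stand-in for the config DB. The table definition is reduced to the columns this design names; the function name populate_equipment_sketch and the sample ids eq-001 and line-1 are hypothetical and do not reflect the real cmd_populate_equipment.

```python
import sqlite3


def populate_equipment_sketch(conn, rows):
    """Insert equipment without naming DriverInstanceId, so it defaults to NULL."""
    conn.executemany(
        "INSERT INTO Equipment (EquipmentId, Name, UnsLineId) VALUES (?, ?, ?)",
        rows,
    )


conn = sqlite3.connect(":memory:")
# Stand-in schema: DriverInstanceId is nullable, as after the migration.
conn.execute(
    "CREATE TABLE Equipment ("
    " EquipmentId TEXT PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " UnsLineId TEXT NOT NULL,"
    " DriverInstanceId TEXT NULL)"
)
populate_equipment_sketch(conn, [("eq-001", "TestMachine_001", "line-1")])

# Omitted column comes back as None (SQL NULL).
stored_driver_ids = [
    row[0] for row in conn.execute("SELECT DriverInstanceId FROM Equipment")
]
```

Dropping the column from the INSERT list, rather than passing an explicit NULL, keeps the loader free of any driver reference at all.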

AdminUI — two production derefs (found at build time, not by the grep sweep)

Making the column nullable surfaced two .razor sites the impact grep missed (caught by TreatWarningsAsErrors):

  • Components/Pages/Clusters/TagEdit.razor:191: db.Equipment.Where(e => driverIds.Contains(e.DriverInstanceId)) (CS8604). Behavior-preserving fix: guard with e.DriverInstanceId != null && … (SQL already excludes NULL from an IN set, so this only satisfies the compiler).
  • Components/Pages/Clusters/EquipmentEdit.razor: the equipment editor loads DriverInstanceId into a non-null FormModel (line 183, CS8601) and mandates a driver on save ("Pick a driver instance."). Decision: give it full driver-less support: FormModel.DriverInstanceId → string?, add a "(none / driver-less)" option to the driver dropdown, relax the mandatory-driver validation, and persist NULL when none is selected (normalize empty → null).
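
Both behaviors can be illustrated with a short Python sketch; the real sites are Razor/C#. The function name normalize_driver_id is hypothetical, and treating whitespace-only input as empty is an assumption beyond "normalize empty → null". The sqlite3 query is a stand-in showing the standard SQL behavior that a NULL column value never matches an IN set.

```python
import sqlite3
from typing import Optional


def normalize_driver_id(selected: Optional[str]) -> Optional[str]:
    """Map the "(none / driver-less)" dropdown value to None before saving."""
    if selected is None:
        return None
    trimmed = selected.strip()  # assumption: whitespace-only counts as empty
    return trimmed or None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Equipment (EquipmentId TEXT PRIMARY KEY, DriverInstanceId TEXT NULL)"
)
conn.executemany(
    "INSERT INTO Equipment VALUES (?, ?)",
    [("eq-bound", normalize_driver_id("drv-1")),
     ("eq-driverless", normalize_driver_id(""))],
)

# NULL IN ('drv-1') evaluates to NULL, so the driver-less row is filtered out.
matched = [
    row[0]
    for row in conn.execute(
        "SELECT EquipmentId FROM Equipment WHERE DriverInstanceId IN (?)",
        ("drv-1",),
    )
]
```

Only eq-bound is matched, which is why the added null guard in TagEdit.razor changes no query results.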

Noted, not changing (YAGNI)

  • sp_ComputeGenerationDiff includes DriverInstanceId in a CHECKSUM(...). It is NULL-tolerant for this one-time transition and sits on the dormant generation-diff path (the active deploy gate is the C# DraftValidator + ConfigComposer.SnapshotAndFlattenAsync, not the SP). Flagged verify-only; if it is ever reactivated, wrap with ISNULL(DriverInstanceId, '') as a follow-on.

Testing & verification

  1. Unit (Configuration tests): a DraftValidator test that a draft with driver-less equipment (Equipment.DriverInstanceId == null, Equipment namespace with zero drivers) validates clean.
  2. Unit (Runtime tests): a BuildClusterSets / scoped-ParseComposition test proving a null-driver equipment is attributed to the cluster of its line's area in ScopeTo mode (and its VirtualTags are kept), while a wrong-cluster line excludes it.
  3. Existing Configuration, OpcUaServer (Phase7), and Runtime (DeploymentArtifact) suites stay green.
  4. Migration authored + applied to the docker-dev config DB.
  5. Live docker-dev: re-run the loader (populate-equipment, now driver-less) → redeploy → confirm 396 Good still flow on :4840 (VERIFY-EQUIPMENT: PASS) and central-1 logs no longer contain spawned Modbus driver nw-uns-modbus or missing required Host.
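
The log check in step 5 could be scripted along these lines. This is a sketch only: the function name find_placeholder_noise is hypothetical, and how the central-1 log text is obtained is left out. The two marker strings and the sample lines are taken from the Problem section.

```python
# Strings that must no longer appear in central-1 logs after the redeploy.
FORBIDDEN_MARKERS = (
    "spawned Modbus driver nw-uns-modbus",
    "missing required Host",
)


def find_placeholder_noise(log_text):
    """Return the log lines that still show placeholder-driver noise."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in FORBIDDEN_MARKERS)
    ]


before_fix = "\n".join([
    "WARNING  DriverHost: factory for Modbus threw on nw-uns-modbus; stubbing",
    "Cause:   Modbus driver config for 'nw-uns-modbus' missing required Host",
    "WARNING  DriverHost: no factory for driver type Modbus (instance nw-uns-modbus); falling back to stub",
    "INFO     DriverHost: spawned Modbus driver nw-uns-modbus (stub=True)",
])

noisy_lines = find_placeholder_noise(before_fix)
clean_lines = find_placeholder_noise("")
```

Against the pre-fix sample, two lines are flagged (the Cause line and the stub-spawn line); a passing post-fix run returns an empty list.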

Sequencing & risk

| Step | Risk | Notes |
| --- | --- | --- |
| Entity nullable + EF migration | medium (schema change) | single AlterColumn; no FK/constraint to drop; reversible |
| BuildClusterSets null-driver attribution | low–medium (multi-cluster scoping) | additive branch; single-cluster path unaffected; covered by a new test |
| Loader edits | low | drop placeholder + NULL the FK; teardown becomes no-ops |
| Live redeploy on docker-dev | low | recreate admin nodes only; sites untouched; the proof the wart is gone |

Branches

  • OtOpcUa: feat/driverless-equipment-namespace (off master 446a456).
  • Loader: scadaproj (otopcua_uns.py).
  • Migration authored in the Configuration project; applied to docker-dev. Merge/push only on the user's explicit go (the user manages this repo's integration).
  • Investigation: this design is grounded in a three-front impact sweep (EF/schema, validator/artifact, runtime/loader) — see the per-layer findings folded into the Impact map above.
  • Decision #122 (same-cluster invariant: equipment's driver-cluster must equal its UNS-line cluster) — the anchor that makes line-based attribution for driver-less equipment correct.
  • DraftValidator.ValidateDriverNamespaceCompatibility — the rule that forbids GalaxyMxGateway in Equipment namespaces and forced the original placeholder.