# Driver-less Equipment Namespace — Design (2026-06-08)
## Problem
In the OtOpcUa config model, `Equipment.DriverInstanceId` is a **non-null** logical FK — every
equipment row must reference a `DriverInstance`. But the Northwind company overlay's equipment
signals are **VirtualTags** (driverless — they link via `EquipmentId` + a `Script`, with no driver
binding); their live values come from the galaxy mirror over the **GalaxyMxGateway** driver via the
script `return ctx.GetTag("TestMachine_NNN.Sig").Value;`.

To satisfy the non-null FK *and* the validator rule that **forbids a GalaxyMxGateway driver in an
Equipment-kind namespace** (`DraftValidator.ValidateDriverNamespaceCompatibility`:
`NamespaceKind.Equipment => DriverType != "GalaxyMxGateway"`), the loader was cornered into inventing
a **placeholder `Modbus` driver** (`nw-uns-modbus`, 0 tags bound) and pointing all 40 equipment rows
at it.

That placeholder is misleading ("equipment tied to Modbus") **and not inert**: on every deploy the
runtime tries to instantiate a real Modbus driver for it and fails:
```
WARNING DriverHost: factory for Modbus threw on nw-uns-modbus; stubbing
Cause: Modbus driver config for 'nw-uns-modbus' missing required Host
WARNING DriverHost: no factory for driver type Modbus (instance nw-uns-modbus); falling back to stub
INFO DriverHost: spawned Modbus driver nw-uns-modbus (stub=True)
```
The data path is correct (the MxGateway *is* the source, via the VirtualTag indirection — `verify-equipment`
shows 396 Good), but the driver association is a lie and generates per-deploy exception/stub noise.
## Decision
**Option B — make equipment driver-less.** VirtualTag-only equipment genuinely has no field driver, so
it should reference none: make `Equipment.DriverInstanceId` **nullable**, **delete the placeholder
driver**, and teach the one cluster-attribution site to tolerate a null driver. This keeps the
VirtualTag route intact (so overlay-building, hierarchy reorganization, and value transforms all stay
available — those are the route's strength), removes the misleading Modbus FK, and eliminates the
stub/exception noise.

Chosen over: **Option A** (a no-op "Virtual" driver type, which still keeps a placeholder), and **Option 4** (relax the
validator so a GalaxyMxGateway driver can serve equipment directly as galaxy tags — rejected because a
galaxy tag's subscribe ref is `FolderPath.Name`, i.e. tree placement is coupled to galaxy source, so a
direct-tag overlay could not freely reorganize the company hierarchy without further driver
re-architecture, and it would also drop transforms/ScriptedAlarms and blur the SystemPlatform/Equipment
layering).
## Impact map (investigated)
The change is small because almost nothing depends on `Equipment.DriverInstanceId`.
### Schema / entity — the core change
- `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/Equipment.cs:23`
`public required string DriverInstanceId` → `public string? DriverInstanceId`.
- EF infers nullability from the C# type: `OtOpcUaConfigDbContext.ConfigureEquipment` has **no
`.IsRequired()`** call, so only the property type changes. One EF migration emits a single
`AlterColumn(DriverInstanceId, nullable: true)`; the model snapshot loses its `.IsRequired()`.
- **No FK to drop.** Equipment→DriverInstance is a *logical* FK only — no EF `HasOne/WithMany`, no DB
`FK_…` constraint, no `ON DELETE`. Deleting the placeholder driver while equipment still references it
is already schema-legal.
- `IX_Equipment_Driver` is a plain non-unique, non-filtered index — valid on a nullable column, kept
as-is (no filtered-index change; YAGNI).
### Validator / composer / applier / runtime — no change required
- **`DraftValidator`** (all 9 rules read): none dereference `Equipment.DriverInstanceId`; none require an
Equipment namespace to have a driver. `ValidateDriverNamespaceCompatibility` iterates
`draft.DriverInstances` — with the placeholder gone, that namespace simply has no driver row to check.
- **`DraftSnapshotFactory.FromConfigDbAsync`** loads the full `Equipment` entity → naturally carries
`null`. No projection change.
- **`Phase7Composer`** — `EquipmentNode` uses `EquipmentId`/`Name`/`UnsLineId`; equipment *Tags* use
`Tag.DriverInstanceId` (a separate field), not Equipment's; `EquipmentVirtualTag` uses
`EquipmentId`/`Name`/`ScriptId`. `Equipment.DriverInstanceId` is never read.
- **`Phase7Applier`** — `MaterialiseHierarchy` / `MaterialiseEquipmentVirtualTags` build folders +
variable nodes from projected fields only; never reads the driver.
- **`DriverHostActor`** — spawns one driver actor *per DriverInstance* in the artifact. Deleting the
placeholder DriverInstance ⇒ no Modbus spec ⇒ **no stub spawn, no `missing required Host` exception**.
The VirtualTag host is spawned unconditionally and streams regardless of driver children.
- **`VirtualTagHostActor`** — driver-agnostic; depends only on the dependency mux publishing the
galaxy-mirror values (from the GalaxyMxGateway driver, untouched).
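
The claim that `ValidateDriverNamespaceCompatibility` has nothing to check once the placeholder is gone follows from the rule iterating driver rows only. A minimal Python sketch of that shape, assuming illustrative field names (`NamespaceId` on the driver row and the `namespace_kinds` lookup are hypothetical, and the real rule lives in the C# `DraftValidator`):

```python
def validate_driver_namespace_compatibility(driver_instances, namespace_kinds):
    """Return one error string per driver that is not allowed in its namespace kind.

    driver_instances: list of dicts with "DriverInstanceId", "DriverType", "NamespaceId"
    namespace_kinds:  dict mapping NamespaceId -> namespace kind, e.g. "Equipment"
    All field names are illustrative assumptions, not the real entity shape.
    """
    errors = []
    for di in driver_instances:
        kind = namespace_kinds.get(di["NamespaceId"])
        # Rule quoted in this document: Equipment => DriverType != "GalaxyMxGateway".
        if kind == "Equipment" and di["DriverType"] == "GalaxyMxGateway":
            errors.append(
                f"{di['DriverInstanceId']}: GalaxyMxGateway is not allowed in an Equipment namespace"
            )
    return errors
```

Because the loop is over drivers, an Equipment namespace with zero driver rows produces no errors: the rule is vacuously satisfied rather than specially exempted.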
### The one real code change — `DeploymentArtifact.BuildClusterSets`
`src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DeploymentArtifact.cs` (~lines 271-297). Equipment carries
no `ClusterId`; today it is cluster-attributed **via its driver** (`equipmentIds.Add(id)` when
`di is not null && driverIds.Contains(di)`). With a null driver, equipment — and its VirtualTags — would
be **silently dropped in multi-cluster (`ScopeTo`) mode**.

The fix uses the UNS hierarchy, which already carries cluster identity (`UnsArea.ClusterId` exists, and
`Equipment → UnsLine.UnsAreaId → UnsArea`):
- Build a `UnsLineId → UnsAreaId` map from the artifact's `UnsLines` array.
- **Driver-bound** equipment: unchanged (in-cluster when its driver is in-cluster).
- **Driver-less** equipment: in-cluster **iff its line's area is in-cluster**
(`areaIds.Contains(lineToArea[UnsLineId])`).

This is *more* correct than the driver path: it anchors on the equipment's actual UNS placement, which is
exactly what the same-cluster invariant of decision #122 already enforces (driver-cluster must equal
line-cluster). The existing cross-cluster consistency warning (lines 237-244) is phrased "in cluster by
its driver"; for driver-less equipment the attribution is by line, so it is consistent by construction
and the warning won't fire. **Single-cluster (`ClusterFilterMode.None`, the docker-dev topology) never
calls `BuildClusterSets`**, so this is a correctness fix for multi-cluster, not a dev-path requirement.
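
The attribution rule can be sketched in a few lines. This is a Python illustration of the logic only, not the C# in `DeploymentArtifact.cs`; the function name, the dict-shaped rows, and the parameter names are assumptions made for the example:

```python
def build_equipment_cluster_set(equipment, uns_lines, area_ids, driver_ids):
    """Return the set of EquipmentIds that belong to the scoped cluster.

    equipment:  iterable of dicts with "EquipmentId", "UnsLineId", "DriverInstanceId" (may be None)
    uns_lines:  iterable of dicts with "UnsLineId", "UnsAreaId"
    area_ids:   set of UnsArea IDs already known to be in the cluster
    driver_ids: set of DriverInstance IDs already known to be in the cluster
    """
    # UnsLineId -> UnsAreaId map built from the artifact's UnsLines array.
    line_to_area = {line["UnsLineId"]: line["UnsAreaId"] for line in uns_lines}

    in_cluster = set()
    for eq in equipment:
        driver = eq.get("DriverInstanceId")
        if driver is not None:
            # Driver-bound equipment: unchanged, in-cluster when its driver is.
            if driver in driver_ids:
                in_cluster.add(eq["EquipmentId"])
        else:
            # Driver-less equipment: in-cluster iff its line's area is in-cluster.
            area = line_to_area.get(eq.get("UnsLineId"))
            if area is not None and area in area_ids:
                in_cluster.add(eq["EquipmentId"])
    return in_cluster
```

A line that is missing from the map is treated as out-of-cluster here; whether the real code should warn in that case is a separate choice this sketch does not make.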
### Loader — `scadaproj/otopcua-uns-loader/otopcua_uns.py`
In `cmd_populate_equipment`:
- **Remove** the placeholder `DriverInstance` INSERT (`'Northwind UNS placeholder', 'Modbus', …`) and its
comment.
- **Equipment INSERT**: drop `DriverInstanceId` from the column list (→ NULL).
- Retire the `EQ_DRIVER = "nw-uns-modbus"` constant and the now-dead `DELETE … Tag WHERE DriverInstanceId`
/ `DELETE … DriverInstance` teardown lines (in both `cmd_populate_equipment` and `cmd_clean`) — they
become no-ops; remove for clarity. Keep the `Namespace` INSERT/teardown.
- Fix the stale `# … "an Equipment namespace has a driver" expectations` comment.
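
As a rough illustration of the Equipment INSERT change, the sketch below omits `DriverInstanceId` from the column list so the database stores NULL. It uses an in-memory SQLite table as a stand-in for the config DB; the table shape, the `populate_equipment` helper, and the sample row are hypothetical and do not reflect the real `otopcua_uns.py` code:

```python
import sqlite3


def populate_equipment(conn, rows):
    """Insert equipment rows without naming DriverInstanceId, so it defaults to NULL."""
    conn.executemany(
        "INSERT INTO Equipment (EquipmentId, Name, UnsLineId) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


# Stand-in schema: only the columns needed to show the nullable driver reference.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Equipment ("
    " EquipmentId TEXT PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " UnsLineId TEXT NOT NULL,"
    " DriverInstanceId TEXT NULL)"
)
populate_equipment(conn, [("eq-001", "TestMachine_001", "line-1")])
```

This only works once the column is nullable; against the old non-null schema the same INSERT would be rejected, which is why the migration has to land before the loader change is run.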
### AdminUI — two production derefs (found at build time, not by the grep sweep)
Making the column nullable surfaced two `.razor` sites the impact grep missed (caught by `TreatWarningsAsErrors`):
- `Components/Pages/Clusters/TagEdit.razor:191`: `db.Equipment.Where(e => driverIds.Contains(e.DriverInstanceId))` (CS8604).
Behavior-preserving fix: guard `e.DriverInstanceId != null && …` (SQL already excludes NULL from an `IN` set, so this only satisfies the compiler).
- `Components/Pages/Clusters/EquipmentEdit.razor` — the equipment editor loads `DriverInstanceId` into a non-null
`FormModel` (line 183, CS8601) and **mandates** a driver on save (`"Pick a driver instance."`). Decision: give it
**full driver-less support**, i.e. `FormModel.DriverInstanceId` → `string?`, add a "(none / driver-less)" option to the
driver dropdown, relax the mandatory-driver validation, and persist NULL when none is selected (normalize empty → null).
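
Two behaviors in the bullets above can be checked in isolation: an `IN` filter never matches a NULL driver, and an empty form value should be normalized to NULL before saving. The sketch below uses SQLite as a stand-in for the config DB; `normalize_driver_id` and the two-column table are hypothetical names chosen for the example, not the Razor code:

```python
import sqlite3


def normalize_driver_id(value):
    """Map None, empty, or whitespace-only form input to None so it persists as NULL."""
    if value is None or value.strip() == "":
        return None
    return value


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Equipment (EquipmentId TEXT PRIMARY KEY, DriverInstanceId TEXT NULL)"
)
conn.executemany(
    "INSERT INTO Equipment VALUES (?, ?)",
    [
        ("eq-bound", normalize_driver_id("drv-1")),
        ("eq-driverless", normalize_driver_id("")),  # stored as NULL
    ],
)

# NULL IN (...) evaluates to NULL, not true, so the driver-less row is filtered out.
matches = conn.execute(
    "SELECT EquipmentId FROM Equipment WHERE DriverInstanceId IN (?, ?)",
    ("drv-1", "drv-2"),
).fetchall()
```

This is why the `TagEdit.razor` guard is behavior-preserving: the explicit null check changes what the compiler accepts, not which rows the query returns.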
### Noted, not changing (YAGNI)
- `sp_ComputeGenerationDiff` includes `DriverInstanceId` in a `CHECKSUM(...)`. It is NULL-tolerant for
this one-time transition and sits on the **dormant** generation-diff path (the active deploy gate is
the C# `DraftValidator` + `ConfigComposer.SnapshotAndFlattenAsync`, not the SP). Flagged verify-only;
if it is ever reactivated, wrap with `ISNULL(DriverInstanceId, '')` as a follow-on.
## Testing & verification
1. **Unit (Configuration tests):** a `DraftValidator` test that a draft with driver-less equipment
(`Equipment.DriverInstanceId == null`, Equipment namespace with zero drivers) validates clean.
2. **Unit (Runtime tests):** a `BuildClusterSets` / scoped-`ParseComposition` test proving a null-driver
equipment is attributed to the cluster of its line's area in `ScopeTo` mode (and its VirtualTags are
kept), while a wrong-cluster line excludes it.
3. Existing `Configuration`, `OpcUaServer` (Phase7), and `Runtime` (DeploymentArtifact) suites stay green.
4. **Migration** authored + applied to the docker-dev config DB.
5. **Live docker-dev:** re-run the loader (`populate-equipment`, now driver-less) → redeploy → confirm
**396 Good still flow** on `:4840` (`VERIFY-EQUIPMENT: PASS`) **and** central-1 logs **no longer
contain** `spawned Modbus driver nw-uns-modbus` or `missing required Host`.
## Sequencing & risk
| Step | Risk | Notes |
|---|---|---|
| Entity nullable + EF migration | medium (schema change) | single `AlterColumn`; no FK/constraint to drop; reversible |
| `BuildClusterSets` null-driver attribution | low-medium (multi-cluster scoping) | additive branch; single-cluster path unaffected; covered by a new test |
| Loader edits | low | drop placeholder + NULL the FK; teardown becomes no-ops |
| Live redeploy on docker-dev | low | recreate admin nodes only; sites untouched; the proof the wart is gone |
## Branches
- OtOpcUa: `feat/driverless-equipment-namespace` (off `master` `446a456`).
- Loader: `scadaproj` (`otopcua_uns.py`).
- Migration authored in the Configuration project; applied to docker-dev. Merge/push only on the user's
explicit go (the user manages this repo's integration).
## Related context
- Investigation: this design is grounded in a three-front impact sweep (EF/schema, validator/artifact,
runtime/loader) — see the per-layer findings folded into the Impact map above.
- Decision #122 (same-cluster invariant: equipment's driver-cluster must equal its UNS-line cluster) —
the anchor that makes line-based attribution for driver-less equipment correct.
- `DraftValidator.ValidateDriverNamespaceCompatibility` — the rule that forbids GalaxyMxGateway in
Equipment namespaces and forced the original placeholder.