Files
lmxopcua/docs/plans/2026-06-08-driverless-equipment-namespace.md
T

245 lines
21 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Driver-less Equipment Namespace — Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.
**Goal:** Let VirtualTag-only equipment reference no field driver — make `Equipment.DriverInstanceId` nullable, delete the misleading `Modbus` placeholder driver, and stop the per-deploy stub/exception noise, while preserving cluster-attribution correctness.
**Architecture:** The data path is unchanged (equipment values stream from the GalaxyMxGateway driver via the galaxy-mirror → VirtualTag indirection). Almost nothing reads `Equipment.DriverInstanceId`, so the change is: (1) make the column nullable (entity + one EF migration), (2) fix the single cluster-attribution site (`DeploymentArtifact.BuildClusterSets`) to anchor driver-less equipment on its UNS line's area cluster instead of its driver, (3) update the loader to stop creating the placeholder and NULL the FK, (4) live-verify on docker-dev.
**Tech Stack:** .NET 10, EF Core (MSSQL), Akka.NET, xUnit + Shouldly. Solution `ZB.MOM.WW.OtOpcUa.slnx`, Central Package Management. Branch `feat/driverless-equipment-namespace` (off `master` `446a456`). Loader: Python (`scadaproj/otopcua-uns-loader/otopcua_uns.py`). Design: `docs/plans/2026-06-08-driverless-equipment-namespace-design.md`.
**Sequencing:** T1 (entity + migration) first — it's the schema foundation. T2 (BuildClusterSets) and T3 (validator test) build on T1; T4 (loader) is a different repo and runs anytime. T5 (live verify) needs T1+T2+T4.
---
### Task 1: Make `Equipment.DriverInstanceId` nullable + EF migration
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 4 (different repo)
**Files:**
- Modify: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/Equipment.cs` (line ~23)
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Clusters/TagEdit.razor` (~line 191 — null-guard the EF predicate)
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Clusters/EquipmentEdit.razor` (FormModel + dropdown + save — full driver-less support)
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/<timestamp>_NullableEquipmentDriverInstanceId.cs` (generated by `dotnet ef`)
- Auto-modify: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/OtOpcUaConfigDbContextModelSnapshot.cs` (regenerated by `dotnet ef`)
**AdminUI surface (found at build time — the nullable change breaks two `.razor` sites under `TreatWarningsAsErrors`):**
- `TagEdit.razor:191`: `db.Equipment.Where(e => driverIds.Contains(e.DriverInstanceId))` → guard `e.DriverInstanceId != null && driverIds.Contains(e.DriverInstanceId)` (behavior-preserving; SQL already excludes NULL).
- `EquipmentEdit.razor`: full driver-less support — `FormModel.DriverInstanceId``string?`; add a "(none / driver-less)" option to the driver `<select>`; remove the mandatory `if (string.IsNullOrEmpty(_form.DriverInstanceId)) { _error = "Pick a driver instance."; return; }` check (~line 209); on save (both the IsNew `Add` ~line 222 and the update ~line 245) write `string.IsNullOrWhiteSpace(_form.DriverInstanceId) ? null : _form.DriverInstanceId`.
**Context:** `DriverInstanceId` is currently `public required string DriverInstanceId { get; set; }`. The EF config `OtOpcUaConfigDbContext.ConfigureEquipment` has **no `.IsRequired()`** call and **no FK relationship** for it (it's a logical FK only) — so EF infers NOT NULL purely from the C# `required string` type. Changing the type to `string?` flips the column to nullable; no fluent-config change is needed. The index `IX_Equipment_Driver` is a plain non-unique index and stays valid on a nullable column.
**Step 1 — Change the entity property.** In `Equipment.cs`, change:
```csharp
public required string DriverInstanceId { get; set; }
```
to:
```csharp
public string? DriverInstanceId { get; set; }
```
(Drop the `required` modifier and make the type nullable. Update the property's XML-doc comment if it asserts non-null — note it is now optional: null = VirtualTag-only / driver-less equipment.)
**Step 2 — Build to confirm no compile break from the nullability change.**
Run: `dotnet build ZB.MOM.WW.OtOpcUa.slnx`
Expected: build succeeds. If any production code dereferences `Equipment.DriverInstanceId` with a non-null assumption (`!` or implicit), the nullable-reference analyzer will flag it under `TreatWarningsAsErrors` — the investigation found NONE in production code, so expect 0 new warnings. If one appears, STOP and report the file:line (it's an undocumented deref the design missed).
**Step 3 — Author the migration.** Confirm the EF tool is available (`dotnet ef --version`; if missing, `dotnet tool restore` or `dotnet tool install --global dotnet-ef`). Set the design-time connection env (no DB connection is made during `migrations add` — it diffs the model), then scaffold:
```bash
export OTOPCUA_CONFIG_CONNECTION="Server=localhost,14330;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;"
dotnet ef migrations add NullableEquipmentDriverInstanceId --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration
```
**Inspect the generated migration.** It MUST be a single `AlterColumn` and its inverse — nothing else:
```csharp
migrationBuilder.AlterColumn<string>(
name: "DriverInstanceId", table: "Equipment",
type: "nvarchar(64)", maxLength: 64, nullable: true,
oldClrType: typeof(string), oldType: "nvarchar(64)", oldMaxLength: 64);
// Down(): the same AlterColumn with nullable: false / oldNullable: true
```
If the migration contains anything beyond this `AlterColumn` (e.g. unrelated model drift), STOP and report — the model snapshot may be out of sync with master and that must be resolved first.
**Step 4 — Build + run the Configuration test suite (behaviour-preserving).**
Run: `dotnet build ZB.MOM.WW.OtOpcUa.slnx` then `dotnet test tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/` (glob `tests` for the exact path if it differs).
Expected: green. Nothing asserts `DriverInstanceId` non-null, so all existing tests pass.
**Step 5 — Commit.**
```bash
git add -A && git commit -m "feat(config): make Equipment.DriverInstanceId nullable (driver-less equipment) + migration"
```
**Acceptance:** entity is `string?`; the new migration is a single `AlterColumn(... nullable: true)`; the model snapshot drops `.IsRequired()` on `DriverInstanceId`; solution builds; Configuration tests green. **Do NOT apply the migration to any DB here** — that happens in Task 5.
---
### Task 2: `BuildClusterSets` — attribute driver-less equipment via its UNS line's area cluster
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 3 (different file), Task 4 (different repo)
**Files:**
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DeploymentArtifact.cs` (`BuildClusterSets`, ~lines 271297, and its `ClusterSets` doc ~260267)
- Test: `tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/...` (find the existing DeploymentArtifact test file by glob, e.g. `*DeploymentArtifact*Tests.cs` or `*ParseComposition*`/`*ClusterScope*`)
**Context:** Equipment carries no `ClusterId`. Today `BuildClusterSets` attributes equipment to a cluster **via its driver**:
```csharp
// Equipment carries no ClusterId — include it when its DriverInstanceId is in-cluster.
var di = el.TryGetProperty("DriverInstanceId", out var diEl) ? diEl.GetString() : null;
var id = el.TryGetProperty("EquipmentId", out var idEl) ? idEl.GetString() : null;
if (!string.IsNullOrWhiteSpace(id) && di is not null && driverIds.Contains(di))
equipmentIds.Add(id!);
```
A null driver ⇒ equipment (and its VirtualTags) silently dropped in multi-cluster (`ScopeTo`) mode. The fix anchors driver-less equipment on its UNS placement: `UnsArea` carries `ClusterId` (collected into `areaIds`), and `Equipment → UnsLine.UnsAreaId → UnsArea`. So a driver-less equipment is in-cluster iff its line's area is in-cluster. Single-cluster (`ClusterFilterMode.None`) never calls `BuildClusterSets`, so this only affects multi-cluster — but it's a correctness fix worth a test.
**Step 1 — Write the failing test.** First READ the existing DeploymentArtifact/cluster-scope test file to learn the harness: how a deployment-artifact JSON blob is built (the helper that assembles `DriverInstances`/`UnsAreas`/`UnsLines`/`Equipment`/`EquipmentVirtualTags` arrays), and how `ParseComposition(blob, scope)` / `BuildClusterSets` results are asserted. Mirror that harness exactly. Add a test like:
- `Driverless_equipment_is_kept_when_its_line_area_is_in_cluster`: blob with an in-cluster `UnsArea` (ClusterId = "C1"), a `UnsLine` in that area, an `Equipment` with **`DriverInstanceId: null`** on that line, plus an `EquipmentVirtualTag` for it. Call the cluster-scoped parse for cluster "C1". Assert the equipment node AND its VirtualTag are present in the result.
- `Driverless_equipment_is_excluded_when_its_line_area_is_in_another_cluster`: same but the area's ClusterId is "C2"; scope to "C1" → equipment + its vtag absent.
- (Keep/confirm an existing driver-bound case still works — driver-bound equipment attribution must be unchanged.)
If the existing tests assert via `ParseComposition(blob, ClusterScope)`, use that public entry; if they unit-test `BuildClusterSets` directly via an internal-visible hook, follow that. Do NOT invent a new harness.
**Step 2 — Run, confirm fail.** `dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ --filter "FullyQualifiedName~Driverless"` → the new tests fail (driver-less equipment currently dropped).
**Step 3 — Implement.** In `BuildClusterSets`, before the Equipment loop, build a `UnsLineId → UnsAreaId` map from the artifact's `UnsLines` array; then attribute driver-less equipment by line-area:
```csharp
// Map each UnsLine to its area so driver-less equipment can be cluster-attributed by its UNS placement.
var lineToArea = new Dictionary<string, string>(StringComparer.Ordinal);
if (root.TryGetProperty("UnsLines", out var lines) && lines.ValueKind == JsonValueKind.Array)
{
foreach (var el in lines.EnumerateArray())
{
if (el.ValueKind != JsonValueKind.Object) continue;
var lineId = el.TryGetProperty("UnsLineId", out var lEl) ? lEl.GetString() : null;
var areaId = el.TryGetProperty("UnsAreaId", out var aEl) ? aEl.GetString() : null;
if (!string.IsNullOrWhiteSpace(lineId) && !string.IsNullOrWhiteSpace(areaId))
lineToArea[lineId!] = areaId!;
}
}
// ... in the Equipment loop, replace the driver-only attribution with:
var di = el.TryGetProperty("DriverInstanceId", out var diEl) ? diEl.GetString() : null;
var id = el.TryGetProperty("EquipmentId", out var idEl) ? idEl.GetString() : null;
var lineId = el.TryGetProperty("UnsLineId", out var luEl) ? luEl.GetString() : null;
if (string.IsNullOrWhiteSpace(id)) continue;
var inClusterByDriver = di is not null && driverIds.Contains(di);
var inClusterByLine = di is null && lineId is not null
&& lineToArea.TryGetValue(lineId, out var areaOfLine) && areaIds.Contains(areaOfLine);
if (inClusterByDriver || inClusterByLine)
equipmentIds.Add(id!);
```
Update the `// Equipment carries no ClusterId — …` comment and the `ClusterSets.EquipmentIds` doc (~line 263) to state: driver-bound equipment is attributed by its driver's cluster; driver-less equipment by its UNS line's area cluster. Keep `driverIds`/`areaIds` collection unchanged. **Note:** `areaIds` is already populated (the `CollectIdsWhereCluster(root, "UnsAreas", …)` call) before the Equipment loop — confirm ordering so `areaIds` is filled first.
**Step 4 — Run, confirm pass** + the broader DeploymentArtifact/Runtime suite stays green:
`dotnet test tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ --filter "FullyQualifiedName~DeploymentArtifact"` (and the `Driverless` filter). Also `dotnet build ZB.MOM.WW.OtOpcUa.slnx`.
**Step 5 — Commit.**
```bash
git add -A && git commit -m "fix(deploy): cluster-attribute driver-less equipment via its UNS line area (BuildClusterSets)"
```
**Acceptance:** driver-less equipment + its VirtualTags are kept in `ScopeTo` mode when the line's area is in-cluster, excluded otherwise; driver-bound attribution unchanged; suite green.
---
### Task 3: `DraftValidator` accepts a driver-less-equipment draft
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 2 (different file), Task 4 (different repo)
**Files:**
- Test: `tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/...` (find the existing `DraftValidator` test file by glob, e.g. `*DraftValidator*Tests.cs`)
**Context:** No validator rule dereferences `Equipment.DriverInstanceId` or requires an Equipment namespace to have a driver. This task LOCKS that in with a regression test (no production change). It proves the design's core safety claim: a draft with driver-less equipment + an Equipment-kind namespace that has zero `DriverInstance` rows validates clean.
**Step 1 — Write the test.** READ the existing `DraftValidator` tests to learn how a `DraftSnapshot` is constructed in this suite (the builder/fixture for Equipment, Namespaces, DriverInstances, UnsAreas/Lines, ServerCluster/ClusterNode). Then add:
- `Validate_accepts_driverless_equipment_in_driverless_equipment_namespace`: build a minimal valid draft with an Equipment-kind namespace that has **zero `DriverInstance` rows**, and one `Equipment` with **`DriverInstanceId = null`** (canonical `EQ-…` id, valid Name/UnsLineId, with the UnsLine/UnsArea/cluster wiring the other rules need), plus its VirtualTag. Assert `DraftValidator.Validate(snapshot)` returns **zero errors** (`result.Errors.ShouldBeEmpty()` or the suite's idiom).
Reuse whatever canonical-equipment helper the suite already has (the EquipmentId must satisfy `ValidateEquipmentIdDerivation` = `"EQ-" + uuid.ToString("N")[:12].ToLowerInvariant()`); copy that derivation from an existing passing test so the test isn't tripped by an unrelated rule.
**Step 2 — Run, confirm pass.** `dotnet test tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/ --filter "FullyQualifiedName~DraftValidator"` → green (this is a characterization test; it should pass immediately on top of Task 1's nullable entity). If it FAILS, that means a rule DOES implicitly require a driver — STOP and report which error code fired (the design would need revisiting).
**Step 3 — Commit.**
```bash
git add -A && git commit -m "test(config): DraftValidator accepts driver-less equipment + driverless equipment namespace"
```
**Acceptance:** the new test passes; existing DraftValidator tests stay green.
---
### Task 4: Loader — stop creating the placeholder, NULL the equipment FK
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 2, Task 3 (different repo)
**Files:**
- Modify: `/Users/dohertj2/Desktop/scadaproj/otopcua-uns-loader/otopcua_uns.py` (`cmd_populate_equipment`, `cmd_clean`, the `EQ_DRIVER` constant + its comment ~lines 8691)
**Context:** The loader invents a placeholder `Modbus` `DriverInstance` (`nw-uns-modbus`) and points all 40 equipment at it. With the column now nullable (Task 1), equipment should reference NO driver.
**Step 1 — Edit `cmd_populate_equipment`.** READ the function first. Then:
- **Remove** the placeholder `DriverInstance` INSERT (the `... VALUES (NEWID(), %s, %s, %s, 'Northwind UNS placeholder', 'Modbus', 1, '{}')` block, ~lines 275278) and its preceding "Placeholder driver…" comment.
- **Equipment INSERT** (~lines 296299): drop `DriverInstanceId` from the column list and its value param entirely, so the row inserts with a NULL driver. E.g.:
```python
cur.execute(
"INSERT INTO dbo.Equipment (EquipmentRowId, EquipmentId, EquipmentUuid, UnsLineId, "
"Name, MachineCode, Manufacturer, Model, Enabled) VALUES (NEWID(), %s, %s, %s, %s, %s, %s, %s, 1)",
(eq_id, eq_uuid, "nw-" + e["unsLineId"], e["name"], e["machineCode"],
e.get("manufacturer"), e.get("model")))
```
(Verify the exact column/param alignment against the current code — remove exactly the `DriverInstanceId` column and the `EQ_DRIVER` arg, keep everything else.)
- In the teardown block at the top of `cmd_populate_equipment` (the DELETEs), **remove** the now-dead `DELETE FROM dbo.Tag WHERE DriverInstanceId=%s, (EQ_DRIVER,)` and `DELETE FROM dbo.DriverInstance WHERE DriverInstanceId=%s, (EQ_DRIVER,)` lines (no such rows are created anymore). Keep the VirtualTag/Script/Equipment/UnsLine/UnsArea/Namespace deletes.
**Step 2 — Edit `cmd_clean`** the same way: remove the `DELETE … Tag WHERE DriverInstanceId=%s` and `DELETE … DriverInstance WHERE DriverInstanceId=%s` lines that reference `EQ_DRIVER`.
**Step 3 — Retire the constant + fix the comment.** Remove the `EQ_DRIVER = "nw-uns-modbus"` constant and the stale comment block (~lines 8691) that explains the placeholder ("A placeholder non-Galaxy driver kept ONLY to satisfy 'an Equipment namespace has a driver' expectations…"). If any other reference to `EQ_DRIVER` remains, grep `EQ_DRIVER` across the file and remove/adjust each.
**Step 4 — Smoke-check the script parses.** `python3 -c "import ast; ast.parse(open('/Users/dohertj2/Desktop/scadaproj/otopcua-uns-loader/otopcua_uns.py').read())"` → no SyntaxError. (Do NOT run populate against the DB here — that's Task 5, after the migration is applied.) Confirm no lingering `EQ_DRIVER` references: `grep -n EQ_DRIVER /Users/dohertj2/Desktop/scadaproj/otopcua-uns-loader/otopcua_uns.py` → no output.
**Step 5 — Commit (in the scadaproj repo).**
```bash
cd /Users/dohertj2/Desktop/scadaproj && git add otopcua-uns-loader/otopcua_uns.py \
&& git commit -m "loader: equipment is driver-less (drop Modbus placeholder, NULL DriverInstanceId)"
```
(Commit on `scadaproj` `main` — the loader lives there; do NOT push.)
**Acceptance:** no `EQ_DRIVER` references remain; the placeholder DriverInstance INSERT is gone; the Equipment INSERT omits `DriverInstanceId`; the script parses; teardown still cleans the overlay.
---
### Task 5: Live docker-dev verification
**Classification:** standard
**Estimated implement time:** ~6 min (+ migration/deploy/settle)
**Parallelizable with:** none (needs T1, T2, T4)
**Steps (no new code — the proof the wart is gone and nothing regressed):**
1. **Build the docker-dev image off this branch** (carries the nullable entity + BuildClusterSets fix):
`cd docker-dev && docker compose build central-1`.
2. **Apply the migration to the docker-dev config DB.** The DB is NOT auto-migrated. Run EF update against the live config DB (port 14330):
```bash
export OTOPCUA_CONFIG_CONNECTION="Server=localhost,14330;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;"
dotnet ef database update --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration
```
Confirm: `Applied migration '<...>_NullableEquipmentDriverInstanceId'`. Sanity-check the column is now nullable:
`docker exec otopcua-dev-sql-1 /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P 'OtOpcUa!Dev123' -C -d OtOpcUa -Q "SELECT is_nullable FROM sys.columns WHERE object_id=OBJECT_ID('Equipment') AND name='DriverInstanceId';"` → `1`.
3. **Re-run the loader driver-less** (clears + reloads the equipment overlay with NULL driver, no placeholder):
`cd /Users/dohertj2/Desktop/scadaproj/otopcua-uns-loader && .venv/bin/python otopcua_uns.py populate-equipment` (use the same invocation the project README/memory documents). Verify in SQL: `SELECT COUNT(*) FROM Equipment WHERE DriverInstanceId IS NULL` = 40; `SELECT COUNT(*) FROM DriverInstance WHERE DriverInstanceId='nw-uns-modbus'` = 0.
4. **Recreate the admin nodes on the new image** (sites untouched):
`cd docker-dev && docker compose up -d --no-deps --force-recreate central-1 central-2`. Wait for `:9200` to answer (302).
5. **Redeploy headless:** `curl -s -X POST http://localhost:9200/api/deployments -H 'X-Api-Key: docker-dev-deploy-key'` → expect **202 Accepted**.
6. **Verify values still flow:** `cd /Users/dohertj2/Desktop/scadaproj/otopcua-uns-loader && .venv/bin/python otopcua_uns.py verify-equipment --require-good 396 --wait` → **`VERIFY-EQUIPMENT: PASS`**, 396 Good.
7. **The wart is gone — confirm the Modbus stub/exception no longer appears:**
`docker logs otopcua-dev-central-1-1 2>&1 | grep -iE "nw-uns-modbus|missing required Host|spawned Modbus driver"` → **no output** (previously this showed `factory for Modbus threw on nw-uns-modbus` + `spawned Modbus driver nw-uns-modbus (stub=True)`).
**Acceptance:** migration applied (column nullable); 40 equipment rows have NULL driver and the `nw-uns-modbus` DriverInstance is gone; deploy 202 Accepted; `verify-equipment` PASS (396 Good); central-1 log clean of the Modbus stub/exception. **This task changes the running docker-dev stack (the user's active env)** — coordinate: recreate only the admin nodes; don't disrupt the site nodes.
---
## After all tasks
Run the affected suites (`Configuration`, `Runtime` DeploymentArtifact) + build the solution, then use **superpowers-extended-cc:finishing-a-development-branch**: verify green, present merge/PR/keep/discard for `feat/driverless-equipment-namespace` (OtOpcUa) and the `scadaproj` loader commit. Merge/push only on the user's explicit go (the user manages this repo's integration).