# Hosts per-driver rows + AbCip nested-struct + Galaxy hygiene — Design **Date:** 2026-06-18 **Branch:** `feat/hosts-rows-abcip-nested-hygiene` (off master `f59680fa`) **Backlog items:** `stillpending.md` §A #8 (Hosts per-driver rows), #6 (AbCip nested-struct), #3/#13 + #10 (Galaxy stale-comment hygiene + reconcile) A three-component backlog phase. One live-provable AdminUI feature, one offline-provable driver enhancement, one doc/comment hygiene sweep. The three touch disjoint projects (AdminUI / Driver.AbCip / Driver.Galaxy + docs), so they are independently implementable. The triggering follow-up ("cursor-based paging WITHIN a single timestamp / oversized tie clusters") is **stale** — it shipped at `2e6c6d3a` (`HistoryPaging.SliceTieCluster`, wired in `OtOpcUaNodeManager.cs:1928`, tested in `HistoryPagingTests.cs`). This phase handles the "+ backlog" half. --- ## Standing constraints (in force) - **NO Commons wire/proto contract change**, NO interface/Core.Abstractions contract change, NO EF migration, NO bUnit (Razor proven only by live `/run`). - Stage by explicit path, never `git add .`; never stage the never-stage files (`sql_login.txt`, `src/Server/.../Host/pki/`, `pending.md`, `current.md`, `stillpending.md`, `docker-dev/docker-compose.yml`). - No force-push, no `--no-verify`. Finish = merge to master + push. - `dangerouslyDisableSandbox: true` for all build/test/rig commands. --- ## Component A — `/hosts` per-driver-instance status (backlog #8) ### The constraint that shapes the architecture `DriverHealthChanged` (the `driver-health` DPS message, `Commons/Messages/Drivers/DriverHealthChanged.cs`) carries `ClusterId` + `DriverInstanceId` + health, but **no hosting-node identity**. The snapshot store (`InMemoryDriverStatusSnapshotStore`) keys by `DriverInstanceId` alone, so in a redundancy pair the two nodes' health snapshots overwrite each other. `ClusterNode.NodeId` is a *logical* id (`"LINE3-OPCUA-A"`), not the Akka member address — the runtime keys "this node" by that logical id (`DriverHostActor.cs:428` filters `s.NodeId == _localNode.Value`). Consequences: - True per-Akka-**member** driver health would need a hosting-node field on `DriverHealthChanged` (a Commons change — forbidden) plus a node-keyed store, OR a per-node `DriverHostActor`-children Ask (new cluster actor messaging). Both are out of scope here. - The **reliable join key both sides share is `ClusterId`.** ### Approach (no Commons change) - **Reuse the existing health pipeline unchanged** (`DriverStatusSignalRBridge` → snapshot store). - **Add one AdminUI-internal store method:** `IReadOnlyCollection GetAll()` on `IDriverStatusSnapshotStore`, implemented in `InMemoryDriverStatusSnapshotStore` by snapshotting the backing `ConcurrentDictionary` values. (AdminUI-internal interface — **not** Commons.) - **New "Driver Instances" section on `/hosts`**, grouped by `ClusterId`: - Each cluster-group header lists that cluster's nodes from ConfigDB `ClusterNode` (logical NodeId + Host). - The body lists that cluster's live driver snapshots, enriched with `Name` + `DriverType` by joining ConfigDB `DriverInstance` on `DriverInstanceId`. Each row: a live status chip (the same state→color mapping `DriverStatusPanel` uses), last successful read, last error, 5-min error count. - Live-refreshed by subscribing to the store's `SnapshotChanged` event (in-process — dodges the Traefik self-hub trap, same pattern as `DriverStatusPanel`). - **Keep the existing Akka member topology rows as-is** — they are authoritative for Akka-level Up/Unreachable/Leader, which the DB does not know. Two sections, each authoritative for its own data. ### Testing Extract the grouping/enrichment into a **pure view-model builder**: `HostsDriverView.Build(snapshots, clusterNodes, driverInstances) -> IReadOnlyList`. xUnit + Shouldly: group-by-cluster, Name/DriverType enrichment, unknown-driver (snapshot with no matching `DriverInstance`) fallback, empty inputs. The Razor is a thin shell over the builder — proven only by live `/run` (no bUnit), per established pattern. ### Live `/run` (Component A) Rebuild **both** central-1 AND central-2 (the `:9200` AdminUI is Traefik round-robined across both — a half-deploy round-robins old/new code). Deploy a Modbus driver (Galaxy too if convenient), open `http://localhost:9200/hosts`, confirm the Driver Instances section lists the deployed drivers grouped by cluster with live state, then drive a Reconnect from a driver page and confirm the chip updates. ### Deferred follow-up (noted, not built) True per-Akka-member nesting needs node identity on the health feed (Commons field) or a per-node `DriverHostActor`-children Ask. Out of scope under the no-Commons constraint. --- ## Component B — AbCip nested-struct member expansion (backlog #6) ### Finding (simpler than the backlog framed) The backlog said the CIP Template Object member block "carries no nested template id." It actually **does**: the per-member info `u16` uses the *same encoding as the Symbol Object* — bit 15 = struct flag, lower 12 bits = template instance id for a struct member (the codebase's own comment in `CipTemplateObjectDecoder` documents this). The prior member-expansion fix (`4e141402`/`4a7b0fde`) **discarded** that id and recursed with `templateInstanceId: null` (`AbCipDriver.cs:~1179`), leaving a test-only `SeedDiscoveredUdtShapeForTest` seam. So the close is to stop discarding it — **no new CIP query needed**. ### Approach (driver-internal only) - Carry `uint? NestedTemplateId` on the `AbCipUdtMember` record (driver-internal — not Commons). - `CipTemplateObjectDecoder` sets it from `info & 0x0FFF` when the struct flag (bit 15) is set; `null` for scalar members. - The driver threads `member.NestedTemplateId` into the existing `FetchUdtShapeAsync` (recursion site `AbCipDriver.cs:~1179`) instead of `null`. Reuses `IAbCipTemplateReader.ReadAsync("@udt/{id}")` — zero new CIP primitives. ### Testing (fully offline) Via the injectable `IAbCipTemplateReader` / `FakeTemplateReader`: feed a Template Object blob with a struct member whose info = `0x8000 | nestedId`; assert (1) the decoder extracts `NestedTemplateId == nestedId`, (2) the driver fetches the nested shape (`@udt/{nestedId}`) so nested members become addressable. Add a scalar-member case asserting `NestedTemplateId == null`. Live-verify stays infra-gated (no nested-UDT ControlLogix rig) — the unit test pins the real risk (decode + threading). ### Risk + fallback Load-bearing assumption: struct-member lower-12 = template id. The codebase comment asserts it and the test pins it. If it proved false on real controller bytes, the fallback is the backlog's original per-member controller query — but the direct path is the confident choice. --- ## Component C — Backlog hygiene + reconcile (backlog #3/#13 + #10) ### Galaxy stale comments (comment-only — verify each claim against current code first) Rewrite 5 sites that reference an unshipped "PR 4.W", a never-added `Galaxy:Backend` switch, and the **retired** "legacy-host" backend (PR 7.2 retired Galaxy.Host/Proxy/Shared): - `GalaxyDriver.cs:~52` — `IGalaxyDataReader` "until then... legacy-host backend handles reads in production" → reads ARE supported in production now (standard driver, Phase A shipped). - `GalaxyDriver.cs:~92` — `_ownedMxSession` "PR 4.W — production runtime owned by InitializeAsync" → confirm it's built in `InitializeAsync` now, relabel to present tense. - `GalaxyDriver.cs:~669` — `DiscoverAsync` "PR 4.W — also refresh the per-platform probe watcher's membership after discovery" → confirm wired, drop the forward-ref. - `GalaxyDriverFactoryExtensions.cs:~19` — "PR 4.W will add a server-side `Galaxy:Backend` switch" → never landed and won't (legacy "Galaxy" proxy type retired; only `GalaxyMxGateway`). Replace with the shipped reality. - `HostStatusAggregator.cs:~21` — "re-raises OnHostStatusChanged as the driver-level event (wired in PR 4.W)" → confirm wired, drop the forward-ref. Each rewrite must verify the claim against the current code before asserting it (e.g. confirm `_ownedMxSession` really is built in `InitializeAsync`); convert to accurate present-tense or a real TODO. ### Reconcile - `stillpending.md` (never-staged): mark item #10 SHIPPED (the `ctx`-receiver guard is in `ScriptAnalysisService.cs:224`, `70e1bde9`); record the #6/#8 outcomes; tidy. - Update the two memory files (`MEMORY.md`, `project_stillpending_backlog.md`). --- ## Task slicing (independent → parallelizable) | Task | Component | Project | Class | Parallel with | |---|---|---|---|---| | T1 | AbCip nested-struct (decoder + record + driver threading + tests) | Driver.AbCip | standard | T2, T4 | | T2 | Hosts store `GetAll()` + pure grouping view-model + tests | AdminUI | standard | T1, T4 | | T3 | Hosts Razor "Driver Instances" section (uses T2) | AdminUI | small | none (after T2) | | T4 | Galaxy stale-comment rewrites | Driver.Galaxy | trivial | T1, T2 | | T5 | Reconcile stillpending #10/#6/#8 + memory | docs (never-staged) | trivial | none | | T6 | Build + driver/AdminUI tests + live `/run` + finish (merge+push) | — | small | none | Parallel implementers will use **worktree isolation** (or serial commits) per the shared-tree git-race lesson. T3 depends on T2; T5/T6 run at the end. ## Done = Build clean + `dotnet test` green (Driver.AbCip + AdminUI + Driver.Galaxy) + Component A live `/run` pass + merged to master + pushed.