# Design: M9 — Templates & Authoring (T22–T26, T28, T30–T32 + CLI cached-call Retry/Discard)

**Date:** 2026-06-18
**Status:** Approved (brainstorming session) — ready for writing-plans
**Milestone:** M9 of the system-completion roadmap (`docs/plans/2026-06-15-stillpending-completion-design.md` line 105)
**Branch:** `worktree-m9-templates-authoring` off `origin/main` @ `72aec3b4`
**Source backlog:** `stillpending.md` Tier 3 — "Templates / Data Connections / Triggers UI" + "Cached-call tracking"

## Goal

Deliver the in-scope authoring and templates backlog: make the template tree searchable and reorderable, let operators move and live-monitor data connections, surface multi-level inheritance + base-change staleness in the template editor, add an opt-in strict trigger-analysis mode, build schema-driven value entry (nested forms + Monaco hover/completion + a reusable `$ref` schema library), and put cached-call Retry/Discard on the CLI. One unified-outbox page item is explicitly deferred.

## Scope

### In scope (10 deliverables)

- **T22** — Template tree search/filter.
- **T23** — Folder sibling reorder + root-level context menu (menu-based; **no drag-drop**).
- **T24** — Move a data connection between sites.
- **T25** — Connection live-status indicators on the design page.
- **T26** — Base-template versioning *authoring*: multi-level inherited-member resolution in the editor + a read-only staleness banner.
- **T28** — Strict expression-trigger analysis *kind* (opt-in escalation).
- **T30** — Schema-driven nested value-entry forms.
- **T31** — Monaco JSON-Schema hover/completion on value-entry.
- **T32** — JSON Schema `$ref` resolver + a template-level schema library.
- **CLI** — `cached-call retry|discard` for site-local cached calls.

### Deferred (logged follow-ups, not in M9)

- **Unified notifications + site-calls outbox page** — the two data models diverge hard (enum vs string status lifecycles, offset vs keyset pagination, `string` vs strong-typed GUID ids, asymmetric provenance). A true union view-model is high-risk for marginal operator gain. Deferred; the CLI Retry/Discard ships instead.
- **Folder drag-drop** (HTML5 DnD via JS interop) — `[PERM]`-tagged; menu-based reorder delivers the capability without the Blazor-Server interop fragility.
- **T27** (promote-derived-to-base, cross-tenant libraries) and **T29** (WhileTrue alarm trigger) remain excluded per the roadmap.

## Locked decisions

- **D1 — T26 is authoring-only.** Instance flattening already re-walks the full inheritance chain fresh on every deploy (`TemplateResolver.BuildInheritanceChain` is arbitrary-depth; `CycleDetector` covers inheritance/composition/cross-graph). The only real gap is in the *editor*: it loads the immediate base only, derived templates carry stale `IsInherited` placeholder rows, and there is no staleness signal. M9 closes that with a **read-only** resolve + staleness banner. **No stored-row mutation, no `RefreshDerivedTemplate` command.**
- **D2 — T32 is fully in scope.** Build a template-level schema library (new entity + idempotent migration + repo) and a custom `$ref` resolver. **No new NuGet package** — `Directory.Packages.props` has no JSON-schema library and CLAUDE.md forbids adding one; everything uses `System.Text.Json` and extends the existing `InboundApiSchema` parser.
- **D3 — Unified outbox page deferred;** ship the CLI Retry/Discard from this cluster.
- **D4 — T23 is menu-based:** sibling Move-up/Move-down (uses the existing `TemplateFolder.SortOrder`) + a root-level context menu (New Folder / New Template at root) + completing the folder context menu. No drag-drop.
- **D5 — Move-connection is guarded.** A `MoveDataConnectionCommand` succeeds only when: (a) the target site exists; (b) no name collision with an existing connection at the target site; (c) **no `InstanceConnectionBinding` references the connection** (instances are site-scoped — a bound connection cannot leave its site without orphaning the binding). On block, return a clear error naming the blocking instances. Also re-point/validate name-based references (`TemplateNativeAlarmSource.ConnectionName`, `InstanceNativeAlarmSourceOverride.ConnectionNameOverride`) for collisions. Every move emits an audit-log row.
- **D6 — T25 reuses existing health transport.** Health already flows DCL → `ISiteHealthCollector.UpdateConnectionHealth` → `SiteHealthReport.DataConnectionStatuses` (name→`ConnectionHealth`) → `ICentralHealthAggregator` → the Health page renders badges. M9 surfaces the same data on the *design* `DataConnections` page (per-node badge + ~10s poll). No new transport, no SignalR.
- **D7 — T28 is an opt-in escalation layer.** Expression triggers already get a real Roslyn semantic compile + forbidden-API + undefined-attribute analysis that **blocks deploy** (delivered in M2/M3). T28 adds a per-trigger `AnalysisKind` (default **Advisory** = today's behavior; **Strict** escalates the currently-advisory findings — blank expression, ambiguous coercion — to deploy-blocking errors). The increment is the toggle + the escalation branch; implementation right-sizes after confirming exact current behavior.

## Current-state map (reconnaissance evidence)

### Cluster A — Template tree UI

- `CentralUI/Components/Shared/TreeView.razor` — generic tree; external-filter model (R8), `ContextMenu` render-fragment (R15); **no built-in DnD**.
- `CentralUI/Components/Shared/TemplateFolderTree.razor:68` — already exposes a `Filter` parameter with recursive substring match + ancestor auto-expand (`ApplyFilter`/`CopyMatching`).
- `CentralUI/Components/Pages/Design/Templates.razor` — uses `TemplateFolderTree` but **wires no search box**; folder context menu present (New Folder/Template, Rename, Move…, Delete); `MoveFolderDialog.razor` exists.
- `Commons/Entities/Templates/TemplateFolder.cs:12` — has `SortOrder`. `TemplateEngine/Services/TemplateFolderService.cs` — Create/Rename/Move (cycle + collision checks)/Delete; **no sort-order update method**.
- `Commons/Messages/Management/TemplateFolderCommands.cs` — Create/Move/Rename/Delete commands; **no reorder command**.

### Cluster B — Data connections

- `Commons/Entities/Sites/DataConnection.cs:8` — `SiteId` FK. `Commons/Messages/Management/DataConnectionCommands.cs` — Create/Update/Delete; `Update` does **not** change `SiteId`; **no move command**.
- `CentralUI/Components/Pages/Design/DataConnectionForm.razor` — site locked after creation. `DataConnections.razor` — site→connection tree, search box present, Edit/Delete actions; **no move, no health badge**.
- FK/blockers for move: `Instances/InstanceConnectionBinding.cs:12` (`DataConnectionId` FK), `Templates/TemplateNativeAlarmSource.cs:21` (`ConnectionName`, name-based), `Instances/InstanceNativeAlarmSourceOverride.cs:22` (`ConnectionNameOverride`, name-based).
- Health: `HealthMonitoring/ISiteHealthCollector.cs:58`, `Commons/Messages/Health/SiteHealthReport.cs:10` (`DataConnectionStatuses`), `ICentralHealthAggregator` (`GetSiteState`), `CentralUI/.../Monitoring/Health.razor` (existing badge render + `GetConnectionHealthBadge`).

### Cluster C — Inheritance authoring

- `Commons/Entities/Templates/Template.cs` — `ParentTemplateId` (inheritance), `IsDerived` + `OwnerCompositionId` (composition-materialized slots). Members carry `IsInherited` + `LockedInDerived` flags (`TemplateAttribute`/`TemplateAlarm`/`TemplateScript`/`TemplateNativeAlarmSource`).
- `TemplateEngine/TemplateResolver.cs:119` — `BuildInheritanceChain` walks arbitrary depth (root-first), cycle-guarded.
  `TemplateEngine/Flattening/FlatteningService.cs` — derived wins, `IsInherited` placeholders skip in favor of the live base value. `CycleDetector.cs` — inheritance/composition/cross-graph checks on save.
- `TemplateEngine/Flattening/RevisionHashService.cs` — deterministic SHA-256 of flattened config (already used for staleness in Transport/M8 via `IStaleInstanceProbe`).
- `CentralUI/Components/Pages/Design/TemplateEdit.razor:58` — loads only the **immediate** base (`_baseTemplate`, `_baseAttributesByName`, …); no multi-level resolution, no staleness banner.
- `ManagementService/ManagementActor.cs:178` — template command block; **no resolve/update-derived command**.

### Cluster D — Triggers + schema entry

- `TemplateEngine/Validation/ValidationService.cs:263` (`CheckExpressionTrigger`) — real Roslyn compile + forbidden-API + undefined-attribute checks; blank expression = warning; errors block deploy. Separate error/warning lists make selective escalation a clean seam.
- JSON Schema is canonical storage (migration `20260512211204_MigrateParametersToJsonSchema`); `Commons/Types/InboundApi/InboundApiSchema.cs` (`Parse`/`ParseSchema` recursive, depth-capped; `Validate`). `CentralUI/Components/Shared/SchemaBuilder.razor` authors schemas; `ParameterValueForm.razor:52` renders scalars but falls back to a **JSON textarea** for object/list. Monaco already integrated (`MonacoEditor.razor`). **No `$ref` resolution anywhere; no schema-library entity.** `Directory.Packages.props` — no JSON-schema package (System.Text.Json only).

### Cluster E — CLI cached-call Retry/Discard

- Backend relay fully exists: `ManagementActor.cs:220,380` (`RetryParkedMessageCommand`/`DiscardParkedMessageCommand`, Deployer-gated) → `SiteCallAuditActor.cs:877,909` (`HandleRetrySiteCall`/`HandleDiscardSiteCall` → `RetryParkedOperation`/`DiscardParkedOperation` relay) → site. Central UI Site Calls page already uses it.
- CLI pattern: `CLI/Commands/CommandHelpers.cs:34` (`ExecuteCommandAsync` → `ManagementHttpClient.SendCommandAsync`), command-name via `ManagementCommandRegistry.GetCommandName`. Model on `NotificationCommands.cs`. **No cached-call command group today** — must verify the registry maps the two commands.

## Design by feature

### T22 — Template tree search (small)

Add a search box to `Templates.razor`, bound to a local field and passed to `TemplateFolderTree.Filter`. The recursive filter + auto-expand already exist. UI-only — no service/entity/command change. Clear-filter restores the full tree and prior expansion state.

### T23 — Folder reorder + context menus (standard)

- `TemplateFolderService.ReorderFolderAsync(folderId, direction, user)` (or `MoveUp`/`MoveDown`) — swap `SortOrder` with the adjacent sibling under the same parent; no-op at the ends. New `ReorderTemplateFolderCommand` + ManagementActor handler (Designer-gated, matching the other folder commands).
- Sibling loads ordered by `SortOrder` (then Name) everywhere the folder tree is built.
- `Templates.razor` — Move-up/Move-down items in the folder context menu; a **root-level** context menu (right-click empty/root → New Folder, New Template at root). Complete any missing folder-menu items.

### T24 — Move connection between sites (high-risk)

- `MoveDataConnectionCommand(DataConnectionId, TargetSiteId)` + ManagementActor `HandleMoveDataConnection` (Designer-gated).
- Guards (D5), all server-side: target site exists; no name collision at target; **reject if any `InstanceConnectionBinding` references the connection** with an error naming blockers; validate name-based native-alarm-source references won't collide/orphan at the target.
- Persist via the existing `ISiteRepository.UpdateDataConnectionAsync` (sets `SiteId`); emit an audit row.
- UI: a "Move to Site…" action + `MoveDataConnectionDialog` (target-site picker, error surface) on `DataConnections.razor`.
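
The D5 guards for T24 can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not the project's C# handler: the `DataConnection` and `Binding` shapes, the `check_move` function, the exact-match name comparison, and the choice to collect every failure rather than stop at the first are all assumptions made for illustration. The name-based native-alarm-source validation and the audit row are left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataConnection:
    """Hypothetical stand-in for the DataConnection entity."""
    id: int
    name: str
    site_id: int


@dataclass(frozen=True)
class Binding:
    """Hypothetical stand-in for InstanceConnectionBinding."""
    instance_name: str
    data_connection_id: int


@dataclass
class MoveCheck:
    allowed: bool
    errors: list = field(default_factory=list)


def check_move(conn, target_site_id, site_ids, connections, bindings):
    """Run guards (a)-(c) from D5 and report every failure found."""
    errors = []

    # (a) The target site must exist.
    if target_site_id not in site_ids:
        errors.append(f"Target site {target_site_id} does not exist.")

    # (b) No other connection at the target site may share the name
    #     (exact string match is an assumption of this sketch).
    if any(c.site_id == target_site_id and c.name == conn.name and c.id != conn.id
           for c in connections):
        errors.append(
            f"A connection named '{conn.name}' already exists at site {target_site_id}.")

    # (c) A bound connection cannot leave its site; name the blocking instances.
    blockers = sorted(b.instance_name for b in bindings
                      if b.data_connection_id == conn.id)
    if blockers:
        errors.append("Connection is bound by instances: " + ", ".join(blockers))

    return MoveCheck(allowed=not errors, errors=errors)


if __name__ == "__main__":
    plc = DataConnection(id=10, name="PLC-A", site_id=1)
    clash = DataConnection(id=11, name="PLC-A", site_id=2)
    result = check_move(plc, 2, {1, 2}, [plc, clash], [Binding("Pump-01", 10)])
    print(result.allowed)
    for message in result.errors:
        print(message)
```

Returning all failures at once lets the move dialog show the operator everything that must be resolved in one pass; a first-failure-wins variant would be equally consistent with D5.
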

### T25 — Connection live-status (standard)

- A central-side query (extend the health query service or inject `ICentralHealthAggregator`) returning a `connectionId → ConnectionHealth` map for a site: read the latest `SiteHealthReport.DataConnectionStatuses`, resolve names→ids via the repo.
- `DataConnections.razor` — render a health badge per connection node (reuse `GetConnectionHealthBadge`/`AlarmStateBadges`-style classes), refresh on a ~10s poll timer (mirror the Health page). Register the injected service in the existing `DataConnections` bUnit fixtures.

### T26 — Inheritance authoring resolve + staleness banner (high-risk)

- A resolve service/method that, given a derived or child template, walks the full inheritance chain (`BuildInheritanceChain`) and returns the **effective inherited member set** — including base members added *after* the derived template was created, across ≥2 inheritance levels — annotated per member with origin (own override / inherited-from-X / locked).
- A new read-only query command (e.g. `GetResolvedTemplateMembersCommand`) + ManagementActor handler returning that set (plus a staleness summary).
- `TemplateEdit.razor` renders the **full** resolved inherited set (not just the immediate base) and a read-only banner when the stored derived rows differ from the freshly-resolved chain ("Base changed — N inherited members differ"). **No mutation** — flattening at deploy is already correct; the banner is informational and the editor's own override actions are unchanged.

### T28 — Strict expression-trigger kind (small)

- Add `AnalysisKind` (Advisory default / Strict) to the trigger config (carried in the existing `TriggerConfiguration` JSON or a small dedicated field — chosen to stay additive to the flattened model and avoid a migration if feasible).
- `CheckExpressionTrigger` — when Strict, promote the currently-advisory findings to errors (deploy-blocking); Advisory preserves today's behavior exactly.
- Trigger editor selector (alarm/script trigger UI) + CLI flag (`--trigger-kind`/`--strict`). Right-size after confirming exact current advisory set.

### T30 — Schema-driven nested forms (standard)

- Extend `ParameterValueForm.razor` to recursively render object fields and list items as typed inputs (replacing the JSON textarea for object/list), driven by the parsed `InboundApiSchema` (including `$ref`-resolved schemas from T32). Per-field validation via `InboundApiSchema.Validate`; collect to canonical JSON. Re-register in existing fixtures.

### T31 — Monaco hover/completion (standard)

- Feed the resolved JSON Schema to the existing Monaco editor's JSON language config so the value-entry JSON surface gets schema-driven hover + completion. Reuses `MonacoEditor.razor`; no new package (Monaco's built-in JSON schema support).

### T32 — `$ref` resolver + template-level schema library (high-risk; build first)

- New `SharedSchema` entity (Id, Name unique, optional scope, `SchemaJson`) + EF config + **idempotent** migration + repository.
- Custom `$ref` resolver in `InboundApiSchema.Parse` (resolve `{"$ref":"lib:Name"}`-style pointers to library entries; depth/cycle-guarded, System.Text.Json only).
- ManagementActor CRUD commands (Designer-gated) + a Central UI schema-library page (reuse `SchemaBuilder`).
- Deploy-time validation that every `$ref` target exists (block on dangling ref), wired into the existing validation pipeline.

### CLI — cached-call Retry/Discard (small)

- New `CachedCallCommands.cs` (`cached-call retry|discard --site-id … --tracked-operation-id …`) calling the existing Deployer-gated `RetryParkedMessageCommand`/`DiscardParkedMessageCommand` via `CommandHelpers.ExecuteCommandAsync`. **Verify `ManagementCommandRegistry` maps both command names** (the CLI's `GetCommandName` depends on it). Update `CLI/README.md` + `Component-CLI.md`.

## Dependencies & wave plan

Execute **subagent-driven** in the `worktree-m9-templates-authoring` worktree.
Implementers do **not** create worktrees; commit **pathspec** form (`-m` before `--`, never `git add -A`); keep ≤2–3 concurrent committers with a post-wave HEAD-presence check; targeted builds/tests per task; full-solution build + docker rebuild only at integration.

- **Wave 1 (low-risk, parallel — disjoint files):** T22 (`Templates.razor`) ‖ CLI Retry/Discard (CLI) ‖ T28 (Template Engine validation + trigger editor).
- **Wave 2:** T23 (`Templates.razor`, after T22 — same file) ‖ T25 (`DataConnections.razor`, additive).
- **Wave 3:** T24 (`DataConnections.razor`, after T25 — same file) ‖ T32 foundation (entity + migration + `$ref` resolver — Commons/ConfigDB/ManagementActor).
- **Wave 4:** T30 ‖ T31 (consume the resolver) ‖ T26 (`TemplateEdit.razor` + resolve service).
- **Wave 5 — integration.**

Classifications: T24, T26, T32, integration = **high-risk**; T23, T25, T30, T31 = **standard**; T22, T28, CLI = **small**.

## Integration (first-class verification phase)

Per `integration-catches-cross-cutting-gaps`:

- Full-solution `dotnet build ZB.MOM.WW.ScadaBridge.slnx`; **EF model-drift check** for the new `SharedSchema` entity (the M2-pre `PendingModelChangesWarning` lesson — idempotent migration, no pending changes).
- **Trace every new ManagementActor command end-to-end** through the registry + handler routing: `ReorderTemplateFolderCommand`, `MoveDataConnectionCommand`, the `GetResolvedTemplateMembersCommand`, the schema-library CRUD commands, and the CLI's two cached-call commands (confirm `ManagementCommandRegistry` mappings so the CLI resolves names).
- **Re-run the full bUnit suites of every shared component touched** (TreeView, `TemplateFolderTree`, `TemplateEdit`, `DataConnections`, `ParameterValueForm`, `SchemaBuilder`) — register substitutes for any newly-injected service in their existing fixtures.
- `bash docker/deploy.sh` rebuild + `/health/ready` smoke on central-a/central-b/LB; Playwright coverage for the new UI surfaces (search, reorder menu, move dialog, connection health badge, schema library, schema-driven form).

## Testing strategy

- **T22:** filter unit/bUnit (match, auto-expand, clear).
- **T23:** reorder swap (ends no-op, ordering persists); root-menu render.
- **T24:** guard tests — binding-blocks (error names instances), name-collision-blocks, success path; audit row asserted.
- **T25:** health map query (name→id resolution, missing report), badge render.
- **T26:** multi-level chain (A→B→C), base member added after derive shows in editor, locked member display, staleness banner true/false; adversarial chain/composition-derived cases.
- **T28:** Advisory preserves current pass/fail; Strict escalates each advisory finding to a deploy-block.
- **T30:** nested object/list render + per-field validation (incl. `$ref`-resolved schema).
- **T31:** schema fed to Monaco (smoke/bUnit where feasible).
- **T32:** `$ref` resolution (valid, dangling→deploy-block, depth/cycle guard); migration idempotency; CRUD round-trip.
- **CLI:** command-name registry mapping; retry/discard happy-path + not-parked/unreachable mapping.

## Risks

- **T32 migration ↔ EF model drift** — idempotent migration + a model-drift assertion in integration.
- **T26 resolution semantics** on multi-level + locked + composition-derived templates — adversarial chain tests; keep it strictly read-only to avoid any deploy-path regression.
- **T28 may be near-complete** — confirm the exact current advisory set before sizing; the deliverable is the toggle + escalation, not re-building analysis.
- **Shared-component injection regressions** (T25/T26/T30 inject into reused components) — the integration wave re-runs each touched component's full fixture suite.
- **CLI registry gap** — if `RetryParkedMessageCommand`/`DiscardParkedMessageCommand` aren't registered for name resolution, the CLI call fails; verified in Wave 1 and re-asserted at integration.

## Next step

Hand off to the writing-plans skill to produce the bite-sized, per-task implementation plan and `.tasks.json`, then execute subagent-driven wave-by-wave. Finish via finishing-a-development-branch (FF-merge to main + push, no force; docker rebuild to match main).