Files
ScadaBridge/docs/plans/2026-06-07-template-alarm-cli-and-override-coverage-design.md
T

111 lines
14 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Template-Alarm CLI Ergonomics + InstanceConfigure Alarm-Override Coverage — Design
**Date:** 2026-06-07
**Status:** Approved (brainstorming complete) → ready for writing-plans
**Component:** #9 Central UI (`InstanceConfigure`) + the `scadabridge` CLI + the Playwright harness
**Predecessor / context:** closes the one functional gap deferred across the [4-wave coverage-fill effort](2026-06-06-playwright-coverage-fill-design.md) (alarm-override UI), plus the cheap Wave-4 review-hygiene items.
## Goal
Three sequential pieces of work:
- **(A)** Make template alarms *ergonomically* CLI-provisionable — add typed setpoint flags to the existing `template alarm add` verb so callers don't hand-write trigger-config JSON; add a `CliRunner.AddAlarmAsync` test helper.
- **(B)** Add the deferred **alarm-override UI coverage** on `InstanceConfigure` (Edit → set priority override → Save → badge → Clear), gated on additive `data-test` hooks.
- **(C)** Tackle the cheap, safe Wave-4 hygiene items.
## Premise correction (verified during brainstorming)
The Wave-1 deferral TODO (`InstanceConfigureTests.cs`) says *"template alarms are not CLI-provisionable today."* **That is STALE.** The CLI already has a fully-wired `template alarm add | update | delete`:
```
template alarm add --template-id <id> --name <n> \
--trigger-type <ValueMatch|RangeViolation|RateOfChange|HiLo|Expression> \
--priority <0-1000> [--description <d>] [--trigger-config <json>] [--locked]
```
End-to-end path (no cluster change needed to add an alarm):
`CLI BuildAlarm``AddTemplateAlarmCommand` DTO → `POST /management``ManagementActor.HandleAddAlarm` (role-gated `Designer`) → `TemplateService.AddAlarmAsync``ITemplateEngineRepository.AddTemplateAlarmAsync` + audit.
So the only *real* gaps are: (1) no ergonomic typed flags (you must hand-write `--trigger-config` JSON for setpoints); (2) no `CliRunner.AddAlarmAsync` test helper; (3) the `InstanceConfigure` alarm-override section is hook-poor and untested.
## Verified domain facts (grounding for the plan)
**Alarm entity — `TemplateAlarm`** (`src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Templates/TemplateAlarm.cs`, table `TemplateAlarms`):
- Child of **`Template`** (FK `TemplateId`), NOT of `TemplateAttribute`. Binds to an attribute *by name* inside the `TriggerConfiguration` JSON (`attributeName` key).
- Fields: `Id`, `TemplateId`, `Name` (unique within template), `Description?`, `PriorityLevel` (01000), `IsLocked`, `TriggerType` (`AlarmTriggerType` enum), `TriggerConfiguration?` (JSON ≤4000 chars), `OnTriggerScriptId?`, `IsInherited`, `LockedInDerived`.
- `AlarmTriggerType` enum: `ValueMatch | RangeViolation | RateOfChange | HiLo | Expression`.
- Minimal valid alarm: `Name` + `TriggerType` + `PriorityLevel`. No trigger-config required to make a row exist (it just won't *fire*); an unlocked alarm is enough to render an override row.
**Trigger-config JSON shapes** (authoritative key names from `src/ZB.MOM.WW.ScadaBridge.CentralUI/Components/Shared/AlarmTriggerConfigCodec.cs`, which is `internal`):
- `ValueMatch``{ attributeName, matchValue }` (a `"!=X"` prefix on matchValue means not-equals)
- `RangeViolation``{ attributeName, min, max }`
- `RateOfChange``{ attributeName, thresholdPerSecond, windowSeconds, direction }`
- `HiLo``{ attributeName, loLo, lo, hi, hiHi, loLoPriority, loPriority, hiPriority, hiHiPriority, *Deadband, *Message }`
- `Expression``{ expression }` (no attributeName)
**Instance alarm override — `InstanceAlarmOverride`** (`src/.../Commons/Entities/Instances/InstanceAlarmOverride.cs`, table `InstanceAlarmOverrides`):
- Fields: `Id`, `InstanceId` (FK, cascade), `AlarmCanonicalName` (matches `TemplateAlarm.Name` for direct alarms), `TriggerConfigurationOverride?` (JSON; partial-merge for HiLo, whole-replace otherwise), `PriorityLevelOverride?`. Unique `(InstanceId, AlarmCanonicalName)`. No "enabled" flag — only trigger-config and/or priority.
- Save/clear service: `InstanceService.SetAlarmOverrideAsync` / `DeleteAlarmOverrideAsync`. CLI verbs already exist: `instance alarm-override set|delete|list`.
- **Read-back:** `GetInstanceByIdAsync` eager-loads `.Include(i => i.AlarmOverrides)`, so the existing `CliRunner.GetInstanceDocumentAsync` (`instance get`) surfaces an `alarmOverrides` array alongside the `connectionBindings`/`attributeOverrides` the current tests already read. (Plan must confirm the exact camelCase JSON key.)
**InstanceConfigure alarm-override UI** (`src/.../CentralUI/Components/Pages/Deployment/InstanceConfigure.razor`, single-file `@code`):
- Section identified only by header text `<strong>Alarm Overrides</strong>`**no `data-test` hooks anywhere in the alarm section** (the whole file has only 4 hooks: `audit-link`, `instance-error-alert`, `binding-bulk-select`, `area-select`). `AlarmTriggerEditor.razor` has zero hooks.
- Rows render from `_overridableAlarms = alarms.Where(a => !a.IsLocked)`. Empty → `"No overridable (non-locked) alarms on this template."` So the fixture template **must carry a non-locked alarm**.
- Per row: `Edit` button → opens a modal (`@if (_editingAlarm != null)``.modal.show.d-block`). Modal has an `<AlarmTriggerEditor>` + a **priority override** `<input type=number min=0 max=1000>` (placeholder = inherited priority) + footer `Cancel` / `Save Override` (`.btn-success.btn-sm`) / `Clear Override` (when an override exists).
- **Save semantics:** if the trigger-config diff is empty AND the priority field is empty, Save **deletes** any existing override. So the test's reliable "create override" delta is **setting the priority field**.
- Override-set badge: `<span class="badge bg-warning text-dark">●</span>` in the row, shown only when `HasOverride(name)`. Row-level `Clear` button (`.btn-outline-danger.btn-sm`) → `ClearAlarmOverride``DeleteAlarmOverrideAsync` **immediately, no confirm dialog**; toast `"Cleared override on '{name}'."`
**Existing fixture/test** (`tests/.../Deployment/InstanceConfigureFixture.cs` / `InstanceConfigureTests.cs`):
- Fixture (CLI-provisioned on real `site-a`, instance **not deployed**): template + one `Double` attribute `"Value"` (`AddAttributeAsync`, `--data-source "Value"`) + data-connection + area + instance. Public surface: `int SiteAId/TemplateId/ConnectionId/AreaId/InstanceId`, `string AttributeName => "Value"`, `string ConnectionName`, `bool Available`.
- Test idioms: `[Collection("Playwright")]` + `IClassFixture<InstanceConfigureFixture>`; `Skip.IfNot(_cfg.Available, ClusterAvailability.SkipReason)`; `_fixture.NewAuthenticatedPageAsync("multi-role","password")`; web-first `Assertions.Expect(...).ToBeVisibleAsync(new(){Timeout=15_000})`; toast `Expect(.toast).ToHaveCountAsync(1)`; persistence via `using var doc = await CliRunner.GetInstanceDocumentAsync(InstanceId)`.
**Build/transport reality:**
- CLI is a **host-side** client (`OutputType=Exe`, only references `Commons`); runs from `src/.../CLI/bin/Debug/net10.0/scadabridge.dll`. A CLI-only change needs just `dotnet build src/ZB.MOM.WW.ScadaBridge.CLI`.
- `ManagementActor`/`TemplateService`/CentralUI run in the cluster image; a `.razor` change needs `bash docker/deploy.sh` to reach the running cluster the tests hit.
## Decisions (settled during brainstorming)
| # | Decision | Choice | Rationale |
|---|----------|--------|-----------|
| D1 | Part-A scope | **Typed CLI flags + test helper** (not just the helper) | User wants the ergonomic enhancement; CLI serializes the JSON. CLI-only, no server change. |
| D2 | Trigger-config serializer location | **New CLI-side `AlarmTriggerConfigJson`** (don't move the UI's `internal` codec) | Keeps the change CLI-only / no docker rebuild; the round-trip test verifies the JSON against the live server (the real contract). Cross-reference the codec for key names. |
| D3 | Part-B selectors | **Add 6 additive `data-test` hooks**`bash docker/deploy.sh` | Matches the existing InstanceConfigure convention (`binding-bulk-select`/`area-select`); removes selector drift (esp. the priority input, which collides with `AlarmTriggerEditor` number inputs). |
| D4 | Override-test delta | **Priority override** (not trigger-config) | Simplest reliable "create override" path; a no-delta save deletes instead. |
| D5 | Fixture alarm type | **HiLo** via the new typed flags (`--attribute Value --hi 80 --hihi 95`) | Dogfoods the Part-A flags end-to-end. (Any non-locked alarm works for the override row.) |
| D6 | Part-C scope | **Cheap & safe only** | `plant-a→site-a`, locator scoping, next-enabled re-assert, doc note. Leave the harmless suite-wide `WaitForLoadState` pattern. |
## Architecture / components
### Part A — CLI ergonomic alarm flags + test helper *(host-side build only)*
1. **`AlarmTriggerConfigJson` (new, CLI project):** `Build(triggerType, attribute, matchValue, notEquals, min, max, thresholdPerSecond, windowSeconds, direction, loLo, lo, hi, hiHi, expression) → string?` emitting the exact camelCase keys from `AlarmTriggerConfigCodec`. Only emits fields relevant to the trigger type; returns `null` when no typed flags are given. Comment cross-references the codec as the source of truth.
2. **`template alarm add` typed options** in `CLI/Commands/TemplateCommands.cs::BuildAlarm`: `--attribute`, `--match-value`, `--not-equals`, `--min`, `--max`, `--threshold-per-second`, `--window-seconds`, `--direction`, `--lolo`, `--lo`, `--hi`, `--hihi`, `--expression`. When any are present, the handler builds the JSON and sets it as `AddTemplateAlarmCommand.TriggerConfiguration`. Raw `--trigger-config` (if supplied) takes precedence. **No new DTO, no server change.** (HiLo per-level priorities/deadbands/messages stay JSON-only — YAGNI.)
3. **`CliRunner.AddAlarmAsync(int templateId, string name, string triggerType = "HiLo", int priority = 500, string? attribute = null, double? hi = null, double? hiHi = null, ...)`** in `CliRunner.Helpers.cs` (mirrors `AddAttributeAsync`; throws via `RequireId`).
4. **Round-trip `[SkippableFact]`** in `CliRunnerHelpersTests.cs`: create a `zztest` template → `AddAlarmAsync` a HiLo via typed flags → assert returned id > 0 (proves the server accepted the serialized JSON) → `finally` delete template (cascade).
### Part B — alarm-override coverage *(app change → one docker rebuild)*
1. **`InstanceConfigure.razor` hooks (additive, non-functional):** `data-test="alarm-override-row-{alarm.Name}"` (row), `alarm-edit-btn` (Edit), `alarm-priority-input` (modal priority input), `alarm-save-override` (Save Override), `alarm-override-badge` (●), `alarm-clear-btn` (row Clear). **`bash docker/deploy.sh`** so the live UI exposes them.
2. **`InstanceConfigureFixture`:** after `AddAttributeAsync`, call `CliRunner.AddAlarmAsync(TemplateId, AlarmName, "HiLo", priority:500, attribute:"Value", hi:80, hiHi:95)`; expose `string AlarmName`. No teardown (template delete cascades the alarm).
3. **`AlarmOverride_SetPriority_ThenClear_RoundTrips` (extends `InstanceConfigureTests.cs`):** navigate to the configure page → assert the alarm row shows "inherited" → `alarm-edit-btn` → fill `alarm-priority-input` = `750``alarm-save-override` → assert `.toast` count 1 + `alarm-override-badge` visible → CLI read-back `GetInstanceDocumentAsync` shows `alarmOverrides[]` with `priorityLevelOverride == 750``alarm-clear-btn` → assert badge gone + read-back empty. `finally`: best-effort `instance alarm-override delete` (new `CliRunner.DeleteInstanceAlarmOverrideAsync` helper) so the **shared fixture instance is left as found**.
### Part C — hygiene (cheap & safe)
- `sourceSite:"plant-a"``"site-a"` in the older SiteCalls/Notification seeding tests.
- Scope the pager-indicator locators (`span.text-muted.small` → pager-container-scoped) in the SiteCalls + Notification pagination tests.
- Add the `next` enabled re-assert on the return-to-page-1 leg of both pagination tests.
- Add the "update all three-word deletes together" maintenance note to `DeleteRoleMappingAsync`.
- Test-only; verified by re-running the affected suites.
## Error handling & verification
- **Validation-behavior protocol** still applies: before asserting the override badge/persistence, the implementer reads the real save handler (a no-delta save deletes; priority is the reliable delta) and the `alarmOverrides` JSON key.
- **Build map:** A → `dotnet build src/...CLI` (host) + test-project build; B → **`bash docker/deploy.sh`** (the only docker rebuild, for the hooks) then test-project build; C → test-project build.
- **Residue & shared-state:** alarm round-trip uses a `zztest` template (cascade-deleted). The override test mutates the *shared* fixture instance's alarm override and clears it (+ `finally` cleanup) → instance left as found; the fixture's `DisposeAsync` tears down template/connection/area/instance. No other InstanceConfigure test touches alarm overrides; serial collection.
- **Per-part gate:** each part ends with its new tests green against the live cluster, the full suite at 0 failed, zero residue, clean build under `TreatWarningsAsErrors`. Part B additionally confirms the `data-test` additions don't alter rendered behavior.
## Scope guard (YAGNI)
No new server command/DTO (Part A rides the existing `AddTemplateAlarmCommand`). No HiLo per-level priority/deadband/message flags (raw JSON escape hatch remains). No suite-wide `WaitForLoadState` refactor. The `data-test` hooks are the only app-code change and are purely additive. No new fixtures beyond extending `InstanceConfigureFixture`.
## Success criteria
`template alarm add` accepts typed setpoint flags (verified by a round-trip test); `InstanceConfigure`'s alarm-override flow (set-priority → badge → clear) has functional coverage asserting persistence via CLI read-back; the cheap hygiene items are applied; full suite green with logged skips; zero residue; `site-a` and the shared fixture instance left as found.