Files
ScadaBridge/docs/plans/2026-06-07-template-alarm-cli-and-override-coverage-design.md
T

14 KiB
Raw Blame History

Template-Alarm CLI Ergonomics + InstanceConfigure Alarm-Override Coverage — Design

Date: 2026-06-07 Status: Approved (brainstorming complete) → ready for writing-plans Component: #9 Central UI (InstanceConfigure) + the scadabridge CLI + the Playwright harness Predecessor / context: closes the one functional gap deferred across the 4-wave coverage-fill effort (alarm-override UI), plus the cheap Wave-4 review-hygiene items.

Goal

Three sequential pieces of work:

  • (A) Make template alarms ergonomically CLI-provisionable — add typed setpoint flags to the existing template alarm add verb so callers don't hand-write trigger-config JSON; add a CliRunner.AddAlarmAsync test helper.
  • (B) Add the deferred alarm-override UI coverage on InstanceConfigure (Edit → set priority override → Save → badge → Clear), gated on additive data-test hooks.
  • (C) Tackle the cheap, safe Wave-4 hygiene items.

Premise correction (verified during brainstorming)

The Wave-1 deferral TODO (InstanceConfigureTests.cs) says "template alarms are not CLI-provisionable today." That is STALE. The CLI already has a fully-wired template alarm add | update | delete:

template alarm add --template-id <id> --name <n> \
  --trigger-type <ValueMatch|RangeViolation|RateOfChange|HiLo|Expression> \
  --priority <0-1000> [--description <d>] [--trigger-config <json>] [--locked]

End-to-end path (no cluster change needed to add an alarm): CLI BuildAlarmAddTemplateAlarmCommand DTO → POST /managementManagementActor.HandleAddAlarm (role-gated Designer) → TemplateService.AddAlarmAsyncITemplateEngineRepository.AddTemplateAlarmAsync + audit.

So the only real gaps are: (1) no ergonomic typed flags (you must hand-write --trigger-config JSON for setpoints); (2) no CliRunner.AddAlarmAsync test helper; (3) the InstanceConfigure alarm-override section is hook-poor and untested.

Verified domain facts (grounding for the plan)

Alarm entity — TemplateAlarm (src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Templates/TemplateAlarm.cs, table TemplateAlarms):

  • Child of Template (FK TemplateId), NOT of TemplateAttribute. Binds to an attribute by name inside the TriggerConfiguration JSON (attributeName key).
  • Fields: Id, TemplateId, Name (unique within template), Description?, PriorityLevel (01000), IsLocked, TriggerType (AlarmTriggerType enum), TriggerConfiguration? (JSON ≤4000 chars), OnTriggerScriptId?, IsInherited, LockedInDerived.
  • AlarmTriggerType enum: ValueMatch | RangeViolation | RateOfChange | HiLo | Expression.
  • Minimal valid alarm: Name + TriggerType + PriorityLevel. No trigger-config required to make a row exist (it just won't fire); an unlocked alarm is enough to render an override row.

Trigger-config JSON shapes (authoritative key names from src/ZB.MOM.WW.ScadaBridge.CentralUI/Components/Shared/AlarmTriggerConfigCodec.cs, which is internal):

  • ValueMatch{ attributeName, matchValue } (a "!=X" prefix on matchValue means not-equals)
  • RangeViolation{ attributeName, min, max }
  • RateOfChange{ attributeName, thresholdPerSecond, windowSeconds, direction }
  • HiLo{ attributeName, loLo, lo, hi, hiHi, loLoPriority, loPriority, hiPriority, hiHiPriority, *Deadband, *Message }
  • Expression{ expression } (no attributeName)

Instance alarm override — InstanceAlarmOverride (src/.../Commons/Entities/Instances/InstanceAlarmOverride.cs, table InstanceAlarmOverrides):

  • Fields: Id, InstanceId (FK, cascade), AlarmCanonicalName (matches TemplateAlarm.Name for direct alarms), TriggerConfigurationOverride? (JSON; partial-merge for HiLo, whole-replace otherwise), PriorityLevelOverride?. Unique (InstanceId, AlarmCanonicalName). No "enabled" flag — only trigger-config and/or priority.
  • Save/clear service: InstanceService.SetAlarmOverrideAsync / DeleteAlarmOverrideAsync. CLI verbs already exist: instance alarm-override set|delete|list.
  • Read-back: GetInstanceByIdAsync eager-loads .Include(i => i.AlarmOverrides), so the existing CliRunner.GetInstanceDocumentAsync (instance get) surfaces an alarmOverrides array alongside the connectionBindings/attributeOverrides the current tests already read. (Plan must confirm the exact camelCase JSON key.)

InstanceConfigure alarm-override UI (src/.../CentralUI/Components/Pages/Deployment/InstanceConfigure.razor, single-file @code):

  • Section identified only by header text <strong>Alarm Overrides</strong>no data-test hooks anywhere in the alarm section (the whole file has only 4 hooks: audit-link, instance-error-alert, binding-bulk-select, area-select). AlarmTriggerEditor.razor has zero hooks.
  • Rows render from _overridableAlarms = alarms.Where(a => !a.IsLocked). Empty → "No overridable (non-locked) alarms on this template." So the fixture template must carry a non-locked alarm.
  • Per row: Edit button → opens a modal (@if (_editingAlarm != null).modal.show.d-block). Modal has an <AlarmTriggerEditor> + a priority override <input type=number min=0 max=1000> (placeholder = inherited priority) + footer Cancel / Save Override (.btn-success.btn-sm) / Clear Override (when an override exists).
  • Save semantics: if the trigger-config diff is empty AND the priority field is empty, Save deletes any existing override. So the test's reliable "create override" delta is setting the priority field.
  • Override-set badge: <span class="badge bg-warning text-dark">●</span> in the row, shown only when HasOverride(name). Row-level Clear button (.btn-outline-danger.btn-sm) → ClearAlarmOverrideDeleteAlarmOverrideAsync immediately, no confirm dialog; toast "Cleared override on '{name}'."

Existing fixture/test (tests/.../Deployment/InstanceConfigureFixture.cs / InstanceConfigureTests.cs):

  • Fixture (CLI-provisioned on real site-a, instance not deployed): template + one Double attribute "Value" (AddAttributeAsync, --data-source "Value") + data-connection + area + instance. Public surface: int SiteAId/TemplateId/ConnectionId/AreaId/InstanceId, string AttributeName => "Value", string ConnectionName, bool Available.
  • Test idioms: [Collection("Playwright")] + IClassFixture<InstanceConfigureFixture>; Skip.IfNot(_cfg.Available, ClusterAvailability.SkipReason); _fixture.NewAuthenticatedPageAsync("multi-role","password"); web-first Assertions.Expect(...).ToBeVisibleAsync(new(){Timeout=15_000}); toast Expect(.toast).ToHaveCountAsync(1); persistence via using var doc = await CliRunner.GetInstanceDocumentAsync(InstanceId).

Build/transport reality:

  • CLI is a host-side client (OutputType=Exe, only references Commons); runs from src/.../CLI/bin/Debug/net10.0/scadabridge.dll. A CLI-only change needs just dotnet build src/ZB.MOM.WW.ScadaBridge.CLI.
  • ManagementActor/TemplateService/CentralUI run in the cluster image; a .razor change needs bash docker/deploy.sh to reach the running cluster the tests hit.

Decisions (settled during brainstorming)

# Decision Choice Rationale
D1 Part-A scope Typed CLI flags + test helper (not just the helper) User wants the ergonomic enhancement; CLI serializes the JSON. CLI-only, no server change.
D2 Trigger-config serializer location New CLI-side AlarmTriggerConfigJson (don't move the UI's internal codec) Keeps the change CLI-only / no docker rebuild; the round-trip test verifies the JSON against the live server (the real contract). Cross-reference the codec for key names.
D3 Part-B selectors Add 6 additive data-test hooksbash docker/deploy.sh Matches the existing InstanceConfigure convention (binding-bulk-select/area-select); removes selector drift (esp. the priority input, which collides with AlarmTriggerEditor number inputs).
D4 Override-test delta Priority override (not trigger-config) Simplest reliable "create override" path; a no-delta save deletes instead.
D5 Fixture alarm type HiLo via the new typed flags (--attribute Value --hi 80 --hihi 95) Dogfoods the Part-A flags end-to-end. (Any non-locked alarm works for the override row.)
D6 Part-C scope Cheap & safe only plant-a→site-a, locator scoping, next-enabled re-assert, doc note. Leave the harmless suite-wide WaitForLoadState pattern.

Architecture / components

Part A — CLI ergonomic alarm flags + test helper (host-side build only)

  1. AlarmTriggerConfigJson (new, CLI project): Build(triggerType, attribute, matchValue, notEquals, min, max, thresholdPerSecond, windowSeconds, direction, loLo, lo, hi, hiHi, expression) → string? emitting the exact camelCase keys from AlarmTriggerConfigCodec. Only emits fields relevant to the trigger type; returns null when no typed flags are given. Comment cross-references the codec as the source of truth.
  2. template alarm add typed options in CLI/Commands/TemplateCommands.cs::BuildAlarm: --attribute, --match-value, --not-equals, --min, --max, --threshold-per-second, --window-seconds, --direction, --lolo, --lo, --hi, --hihi, --expression. When any are present, the handler builds the JSON and sets it as AddTemplateAlarmCommand.TriggerConfiguration. Raw --trigger-config (if supplied) takes precedence. No new DTO, no server change. (HiLo per-level priorities/deadbands/messages stay JSON-only — YAGNI.)
  3. CliRunner.AddAlarmAsync(int templateId, string name, string triggerType = "HiLo", int priority = 500, string? attribute = null, double? hi = null, double? hiHi = null, ...) in CliRunner.Helpers.cs (mirrors AddAttributeAsync; throws via RequireId).
  4. Round-trip [SkippableFact] in CliRunnerHelpersTests.cs: create a zztest template → AddAlarmAsync a HiLo via typed flags → assert returned id > 0 (proves the server accepted the serialized JSON) → finally delete template (cascade).

Part B — alarm-override coverage (app change → one docker rebuild)

  1. InstanceConfigure.razor hooks (additive, non-functional): data-test="alarm-override-row-{alarm.Name}" (row), alarm-edit-btn (Edit), alarm-priority-input (modal priority input), alarm-save-override (Save Override), alarm-override-badge (●), alarm-clear-btn (row Clear). bash docker/deploy.sh so the live UI exposes them.
  2. InstanceConfigureFixture: after AddAttributeAsync, call CliRunner.AddAlarmAsync(TemplateId, AlarmName, "HiLo", priority:500, attribute:"Value", hi:80, hiHi:95); expose string AlarmName. No teardown (template delete cascades the alarm).
  3. AlarmOverride_SetPriority_ThenClear_RoundTrips (extends InstanceConfigureTests.cs): navigate to the configure page → assert the alarm row shows "inherited" → alarm-edit-btn → fill alarm-priority-input = 750alarm-save-override → assert .toast count 1 + alarm-override-badge visible → CLI read-back GetInstanceDocumentAsync shows alarmOverrides[] with priorityLevelOverride == 750alarm-clear-btn → assert badge gone + read-back empty. finally: best-effort instance alarm-override delete (new CliRunner.DeleteInstanceAlarmOverrideAsync helper) so the shared fixture instance is left as found.

Part C — hygiene (cheap & safe)

  • sourceSite:"plant-a""site-a" in the older SiteCalls/Notification seeding tests.
  • Scope the pager-indicator locators (span.text-muted.small → pager-container-scoped) in the SiteCalls + Notification pagination tests.
  • Add the next enabled re-assert on the return-to-page-1 leg of both pagination tests.
  • Add the "update all three-word deletes together" maintenance note to DeleteRoleMappingAsync.
  • Test-only; verified by re-running the affected suites.

Error handling & verification

  • Validation-behavior protocol still applies: before asserting the override badge/persistence, the implementer reads the real save handler (a no-delta save deletes; priority is the reliable delta) and the alarmOverrides JSON key.
  • Build map: A → dotnet build src/...CLI (host) + test-project build; B → bash docker/deploy.sh (the only docker rebuild, for the hooks) then test-project build; C → test-project build.
  • Residue & shared-state: alarm round-trip uses a zztest template (cascade-deleted). The override test mutates the shared fixture instance's alarm override and clears it (+ finally cleanup) → instance left as found; the fixture's DisposeAsync tears down template/connection/area/instance. No other InstanceConfigure test touches alarm overrides; serial collection.
  • Per-part gate: each part ends with its new tests green against the live cluster, the full suite at 0 failed, zero residue, clean build under TreatWarningsAsErrors. Part B additionally confirms the data-test additions don't alter rendered behavior.

Scope guard (YAGNI)

No new server command/DTO (Part A rides the existing AddTemplateAlarmCommand). No HiLo per-level priority/deadband/message flags (raw JSON escape hatch remains). No suite-wide WaitForLoadState refactor. The data-test hooks are the only app-code change and are purely additive. No new fixtures beyond extending InstanceConfigureFixture.

Success criteria

template alarm add accepts typed setpoint flags (verified by a round-trip test); InstanceConfigure's alarm-override flow (set-priority → badge → clear) has functional coverage asserting persistence via CLI read-back; the cheap hygiene items are applied; full suite green with logged skips; zero residue; site-a and the shared fixture instance left as found.