Files
ScadaBridge/docs/plans/2026-06-06-playwright-coverage-fill-design.md
T

214 lines
13 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Playwright Coverage Fill — Design
**Date:** 2026-06-06
**Status:** Approved (brainstorming complete) → ready for writing-plans
**Component:** #9 Central UI — `tests/ZB.MOM.WW.ScadaBridge.CentralUI.PlaywrightTests`
**Predecessor:** [2026-06-05 Playwright Coverage Expansion](2026-06-05-playwright-coverage-expansion-design.md) (the first wave; this is the close-out)
## Goal
Close the remaining functional and edge-case gaps found in the 2026-06-06 coverage
re-audit — every untested Tier 13 page plus a cross-cutting edge-case sweep on the
already-covered pages — delivered as **4 risk-tiered waves** inside the existing
xunit + `PlaywrightFixture` harness, against the live 8-node docker cluster.
## Background — the re-audit
After the first expansion (suite at 83 passing), a re-audit mapped all 38 routable pages
against the test catalog. Findings:
- ~14 of 38 pages have real functional coverage; the rest are nav-only (h4 heading) or
untested.
- **Biggest blind spot:** `InstanceConfigure` (`/deployment/instances/{id}/configure`) —
the most complex mutating page (5 override subsystems, browse dialog, test-bindings) —
has zero coverage.
- Edge cases are systematically uncovered across *all* pages: no duplicate-name validation,
no cancel flows, no empty states, no pagination, no filter-combination, no error/500
paths, no wrong-passphrase import.
- The suite is happy-path + outcome-tolerant by design (a deliberate, defensible choice
that matches the relay reality) — but a page that renders-but-misbehaves on a non-default
input can still stay green.
## Decisions (settled during brainstorming)
| # | Decision | Choice | Rationale |
|---|----------|--------|-----------|
| D1 | How much to take on | **Everything** — Tier 13 functional gaps + the edge sweep | User chose full close-out; phased so it stays tractable. |
| D2 | Structure | **Risk-tiered phased waves** (4) | Front-loads highest-risk surfaces; each wave is a shippable, green, zero-residue increment; mirrors the structure that worked for wave 1. |
| D3 | Test selectors on hook-poor pages | **Add `data-test` hooks as needed** (minimal, additive, non-functional) | Deep assertions on the complex forms (esp. `InstanceConfigure`) are otherwise brittle; matches the convention the audited Audit/SiteCalls/Transport pages already follow. |
| D4 | Real-time / streaming tests | **Drive real behavior, outcome-tolerant, generous timeouts, `SkippableFact`** | Highest value; managed flake risk on a handful of tests, with a documented downgrade-to-render-guard fallback. |
## Key enabler — the CLI mirrors the whole surface
The `scadabridge` CLI exposes create/list/delete (and read-back) for **every** entity the
new suites need, including the entire `InstanceConfigure` surface:
- `instance set-bindings | set-overrides | alarm-override set/delete/list |
native-alarm-source set/clear | set-area | diff | get`
- `security api-key create/list/delete/update/set-methods`
- `data-connection`, `db-connection`, `external-system` (+ `method`), `api-method`,
`shared-script`, `notification` (list/create/delete + `smtp`) create/list/delete
→ New functional suites need **no DB seeders**. Fixtures are CLI-provisioned and
persistence is verified by CLI read-back (`instance get`, `security api-key list`). DB
seeding remains only in the existing Audit/SiteCalls/Notification *report* edge tests that
require a `Parked` row (site SQLite isn't reachable, so those stay direct-SQL).
## Wave structure
| Wave | Theme | New suites | New fixtures |
|------|-------|-----------|--------------|
| **1** | Tier 1 — deepest mutating surfaces | `InstanceConfigureTests`, `ApiKeyCrudTests`, `TransportExportTests` (+ wrong-passphrase import negative) | `InstanceConfigureFixture`, `ApiSurfaceFixture` |
| **2** | Tier 2 — real-time / relay | `DeploymentsRealtimeTests`, `TopologyAreaTests`, `DebugViewTests`, `ParkedMessagesActionTests`, Discard click-throughs (SiteCalls + Notification) | reuse `DeploymentFixture` / `InstanceConfigureFixture` |
| **3** | Tier 3 — config CRUD breadth | `NotificationListCrudTests`, `SmtpConfigTests`, `NotificationKpisTests`, `DataConnectionCrudTests`, `ExternalSystemCrudTests`, `SharedScriptCrudTests`, `ApiMethodFormTests`, `EventLogsTests`, `AuditConfigurationTests` | small per-suite CLI helpers |
| **4** | Cross-cutting edge sweep | extend `SiteCrudTests`, `TemplateCrudTests`, `LdapMappingCrudTests`, `AuditLogPageTests`, `SiteCallsPageTests`, `NotificationActionTests` | none |
Each wave ends green with zero residue — a clean stop/ship point. Inside a wave, suites
are independent (disjoint fixtures/files) → parallel-dispatchable. Rough size: W1 ≈ 3 files
/~12 tests, W2 ≈ 5 files/~14, W3 ≈ 9 files/~20, W4 ≈ 6 extended files/~20. Total ≈
**23 files, ~65 new cases.**
## Shared infrastructure
**CLI helper extensions** (`CliRunner.Helpers.cs`, same throw-vs-swallow split):
- *Provision (throw):* `CreateDataConnectionAsync`, `CreateExternalSystemAsync`,
`CreateApiMethodAsync`, `CreateNotificationListAsync`, `CreateSharedScriptAsync`.
- *Verify (read-back, throw):* `GetInstanceAsync(id)`, `ListApiKeysAsync`.
- *Teardown (best-effort):* `DeleteApiKeyAsync`, `DeleteDataConnectionAsync`,
`DeleteExternalSystemAsync`, `DeleteNotificationListAsync`, `DeleteSharedScriptAsync` —
all keyed on `zztest-*`.
**New fixtures** (`IAsyncLifetime` + `IClassFixture`, partial-init guard + `Available`
flag, mirroring `DeploymentFixture`):
- **`InstanceConfigureFixture`** — site-a: `zztest` template + attribute(s) + `zztest`
data-connection + an instance, **deployed**. Disposes instance → connection → template.
- **`ApiSurfaceFixture`** — `zztest` external-system + one api-method, so the API-key form
renders method checkboxes. Disposes api-method + external-system; created keys deleted
per-test via `security api-key delete`.
**`data-test` hooks** (minimal, additive, each its own tiny commit so the app diff is
auditable/revertable): `InstanceConfigure.razor` subsystem anchors; row anchors on
`EventLogs.razor` / `ConfigurationAuditLog.razor`; save/result anchors on a few Design
forms. Only where no stable existing selector (id/aria-label/text) exists.
**Reused as-is:** `ClusterAvailability` skip gate + `SkipSummaryReporter`;
`PlaywrightFixture`; toast = `.toast` web-first `ToHaveCountAsync(1)`; confirm =
`.modal-footer .btn-danger` (Delete) / `.btn-primary` (Confirm); `zztest-<kind>-<8hex>`
naming; `PlaywrightDbConnection` (only Wave-4 report-page Parked-row edge tests).
## Per-wave test shapes
### Wave 1 — Tier 1 mutating surfaces
**`InstanceConfigureTests`** (`InstanceConfigureFixture`):
- *Bindings round-trip* — bulk "Assign all to" the zztest connection → Save Bindings →
toast; verify persisted via `GetInstanceAsync` (not just a toast).
- *Attribute-override round-trip* — type text override → Save Overrides → toast →
`GetInstanceAsync` shows it.
- *Area reassignment* — select area → Set Area → toast → `GetInstanceAsync` confirms.
- *(⚠ feasibility)* Alarm-override Edit→Save→badge→Clear — needs a template attribute with
an alarm; resolve via `instance alarm-override set` precondition or downgrade to a
section-renders assertion. Other three tests don't depend on alarms.
- *Edge* — `/deployment/instances/999999/configure` → `alert-danger` not-found.
**`ApiKeyCrudTests`** (`ApiSurfaceFixture`; teardown `security api-key delete`):
- Create→token reveal (`data-test="created-token"`, Copy works); Enable/Disable toast +
badge; Delete confirm `.btn-danger` → row gone; Edit (name disabled, toggle method).
- Edges: empty name → validation; uncheck all methods → "select at least one".
**`TransportExportTests`** + negative import:
- Export — CLI-create zztest template → `/design/transport/export` → select → passphrase →
Export → assert the bundle download (filename/size).
- Wrong-passphrase import — feed the exported bundle to the import wizard with a bad
passphrase → Unlock error state (passphrase input stays, error shown), no `diff-summary`.
### Wave 2 — Tier 2 real-time / relay (drive behavior, tolerant, generous timeouts, `SkippableFact`)
- **`DeploymentsRealtimeTests`** — on `/deployment/deployments`, CLI `instance deploy` a
fixture instance → status row appears within ~20s (SignalR push). Second: Pause → deploy
→ row absent → Refresh → row present.
- **`TopologyAreaTests`** — create area (toolbar + site context-menu); inline rename
(Enter commits + toast; Escape reverts); move area; move instance; *(⚠)* Diff dialog
opens for a deployed instance (tolerant).
- **`DebugViewTests`** — select site-a + enabled instance → Connect → badge "Live" +
snapshot/table region renders (tolerant: rows *or* "Waiting for snapshot"). Disconnect →
tables gone, selects re-enabled.
- **`ParkedMessagesActionTests`** — render+controls guard on the page's own Retry/Discard
button presence/enablement (site SQLite not seedable → no click-through here).
- **Discard click-throughs** — add the symmetric Discard-clicked test to
`SiteCallsPageTests` and `NotificationActionTests` (Retry is already click-through-tested).
### Wave 3 — Tier 3 config CRUD (breadth; CLI teardown by zztest name)
- **`NotificationListCrudTests`** — create → add recipient → delete recipient (no confirm)
→ delete list (confirm + "Deleted." toast).
- **`SmtpConfigTests`** (RequireAdmin) — add (host+from required) → saved toast →
credentials "(stored)"; edit cancel/save.
- **`NotificationKpisTests`** — load → 5 tiles resolve (or "—") → Refresh spinner.
- **`DataConnectionCrudTests` / `ExternalSystemCrudTests` / `SharedScriptCrudTests`** —
create→(edit)→delete round-trips.
- **`ApiMethodFormTests`** — create method (name+timeout, minimal Monaco script, default
schema) → Save → appears under external systems (Monaco interaction minimal/tolerant).
- **`EventLogsTests`** — Search disabled until site selected; select site → Search; filter
by Severity; row expand/collapse; Load more appends.
- **`AuditConfigurationTests`** — load `/audit/configuration`; search narrows; Prev disabled
on page 1; large-state modal open/close; copy-entity-id toast; `?bundleImportId=` chip
drill-in.
### Wave 4 — cross-cutting edge sweep (⚠ verify each page's real validation first)
- **Sites** — duplicate-identifier (assert whatever the app surfaces), cancel-from-edit,
invalid Akka/gRPC URL *if validated*.
- **Templates** — duplicate-name, cancel, edit existing attribute, delete-when-instances-
exist (blocking).
- **LDAP** — duplicate group name, missing-field validation.
- **Audit Log** — filter combination (channel+site+time), empty-results-after-Apply, drawer
close (X/Escape), non-API row has no cURL button, pagination.
- **Site Calls** — filter by status, empty state, keyset pagination (`site-calls-prev/next`
hooks exist, unused today).
- **Notification Report** — filter combos, "Stuck only", detail modal open/close, pagination.
## Error handling & verification strategy
**Validation-behavior protocol (the ⚠ items).** Before asserting a specific failure mode,
the implementer reads the page code-behind to learn what it actually does — inline
`_formError` vs error toast vs DB-constraint bubble vs silent success — and asserts that
reality. Where the app doesn't validate (relies on a DB constraint with no friendly
message), the test asserts the real surfaced behavior and a code comment notes the gap.
Same protocol resolves the alarm-override and Diff-dialog feasibility ⚠s.
**Flake management (real-time/streaming).** Web-first assertions only (`Expect(...)` with
explicit timeouts), never `WaitForTimeout` + read. Generous ceilings (~20s). Outcome-
tolerant where the cluster's response isn't deterministic. All `SkippableFact` +
`ClusterAvailability` gated. Documented fallback: if a real-time test is irreducibly flaky,
downgrade *that test only* to a render+controls guard.
**Teardown & residue.** Best-effort cleanup in `DisposeAsync`/`finally`, keyed on
`zztest-*`. Each wave ends with the no-residue check (`site/template/instance/
data-connection/api-key/notification list` via CLI + DB marker scan) → zero. Mutating tests
that target real site-a leave it as found.
**Per-wave gate.** Wave is "done" only when: new tests pass against the live cluster, the
full suite stays at **0 failed** (skips logged), zero residue, and `dotnet build` is clean
(`TreatWarningsAsErrors=true`). `data-test` additions verified not to alter rendered
behavior.
## Scope guard (YAGNI)
No new page-object framework, no CI wiring, no parallelization/runner changes, no
visual-regression/screenshot testing, no perf testing. New `data-test` attributes are the
only app-code change and are purely additive. Everything slots into the existing structure.
## Success criteria
All 4 waves merged; ~23 files / ~65 new cases; every Tier 13 page from the audit has
functional coverage; the edge sweep adds duplicate/cancel/empty/filter-combo/pagination
assertions to the previously happy-path-only pages; suite green with logged skips; zero
residue.
## Native tasks
Brainstorming checklist tasks #73#78 track this design through to writing-plans. The
implementation plan (produced next by the writing-plans skill) carries its own task set,
one wave at a time.