docs(tests): design for Playwright coverage expansion (7 audit recs)

Captures the 2026-06-05 coverage audit's gaps and the approved approach for
closing them: ephemeral CLI-provisioned fixtures with outcome-tolerant asserts
for the mutating suites (deploy lifecycle, retry/discard, transport import),
UI CRUD round-trips, nav render hardening, a Health KPI load test, and a
standardized skip-and-log policy. Next: writing-plans turns this into tasks.
This commit is contained in:
Joseph Doherty
2026-06-05 09:39:35 -04:00
parent d33617d65d
commit cb3b3bf373
@@ -0,0 +1,156 @@
# Playwright Coverage Expansion — Design
**Date:** 2026-06-05
**Status:** Approved (brainstorming complete) → ready for writing-plans
**Component:** #9 Central UI — `tests/ZB.MOM.WW.ScadaBridge.CentralUI.PlaywrightTests`
## Goal
Close the functional-coverage gaps found in the 2026-06-05 Playwright coverage audit by
implementing all 7 audit recommendations: add ~1518 functional E2E tests, upgrade the
shallow navigation tests, and standardize the skip policy — all against the live 8-node
docker cluster, inside the existing xunit + `PlaywrightFixture` structure.
## Background — the audit
The suite (~68 tests) is **bimodal**: a deep, well-built audit/site-calls core wrapped in
a thin shell of navigation + nav-visibility tests, with a large blind spot over the app's
**mutating actions** (deploy, import, retry/discard, all CRUD writes).
Key gaps the 7 recommendations target:
1. **Topology instance lifecycle** (Deploy/Enable/Disable/Delete) — Akka-singleton relays,
the exact surface the recent report-page hang lived in — untested beyond URL change.
2. **Parked-message / notification Retry/Discard** relays — untested (Site-Calls relay *is*
tested; it is the pattern to copy).
3. **Transport Import → Apply** — bulk writes across all central config, Admin-only — zero
coverage.
4. **Navigation tests assert URL only** — never that the destination rendered; a route
could 500 after navigation and stay green.
5. **No Health-dashboard load test** — the page that fans out to three singleton `Ask`s
every 10s has no assertion its KPI tiles resolve vs. hang/degrade.
6. **No successful persisted write through the UI** anywhere — the entire create/edit/delete
surface is functionally unverified end-to-end (`SiteCrudTests` only covers validation
failure; audit/site-calls "writes" are direct SQL seeds).
7. **Silent coverage cliffs** — DB-dependent tests are inconsistent (`AuditLogPageTests`
*throw* when MSSQL is down; Site-Calls/grid tests *skip*), and skips aren't surfaced.
## Decisions (settled during brainstorming)
| # | Decision | Choice | Rationale |
|---|---|---|---|
| D1 | Mutation fidelity for state-changing tests | **Ephemeral fixtures + outcome-tolerant** | High fidelity, isolated; matches the existing `SiteCallsPageTests` real-relay test that asserts the round-trip happened (toast: `Applied`/`NotParked`/`SiteUnreachable`), not a deep cluster side-effect. |
| D2 | How fixtures are created/torn down | **Shell out to the `scadabridge` CLI** | CLAUDE.md prefers the CLI for state setup; the CLI exposes every needed verb with `--format json` + clean `0`/`1` exit codes. |
| D3 | Skip vs fail when DB/cluster unavailable | **Standardize on Skip + log** | Consistent `SkippableFact` everywhere; a logged skipped-summary prevents a downed dependency from masquerading as full green. Local dev without the cluster still gets green on the rest. |
### CLI surface confirmed (host → `http://localhost:9000`, `multi-role`/`password`)
- `site create --name --identifier [--description]`, `site delete --id`, `site list/get`,
`site area create --site-id --name`, `site area delete --id`, `site deploy-artifacts`.
- `template create --name [--description] [--parent-id]`, `template delete --id`,
`template attribute add --template-id --name --data-type`, `template validate --id`.
- `instance create --name --template-id --site-id [--area-id]`, `instance deploy --id`,
`instance enable --id`, `instance disable --id`, `instance delete --id`.
- `bundle export --output --passphrase [--all|--include-dependencies]`, `bundle preview`,
`bundle import --input --passphrase --on-conflict`.
## Design
### Section 1 — Shared infrastructure (underpins recs 1, 2, 3, 6, 7)
**`CliRunner`** (new helper):
- Resolves the CLI via the built DLL of `src/ZB.MOM.WW.ScadaBridge.CLI` (invoked as
`dotnet <cli>.dll …`), falling back to a `scadabridge` on PATH. The test `.csproj` adds a
build-order `ProjectReference` to the CLI project so the binary always exists.
- Fixed args: `--url http://localhost:9000 --username multi-role --password password --format json`.
- `Task<JsonDocument> RunAsync(params string[] args)` — runs the subprocess, captures
stdout/stderr, throws on non-zero exit (stderr in the message), parses JSON stdout.
- Typed helpers: `CreateSiteAsync`, `CreateAreaAsync`, `CreateTemplateAsync` +
`AddAttributeAsync` (so the template validates), `CreateInstanceAsync`,
`DeleteInstanceAsync` / `DeleteTemplateAsync` / `DeleteSiteAsync`,
`BundleExportAsync(path, templateId, passphrase)`.
**Naming + teardown convention:** every provisioned entity is named `zztest-<8charguid>`
(sorts last; unambiguous for `LIKE 'zztest%'` safety-net deletes). Teardown is best-effort
(swallow errors), mirroring `AuditDataSeeder`/`SiteCallDataSeeder`.
**`ClusterAvailability` gate + skip logging (rec 7):**
- One shared probe (CLI `site list` succeeds *and* the existing DB `IsAvailableAsync`) →
`Skip.IfNot(...)` used uniformly across all DB/cluster tests.
- Convert `AuditLogPageTests`' 11 throw-on-unavailable tests to `SkippableFact`.
- A collection fixture's `DisposeAsync` writes one summary line
(`SKIPPED N tests — cluster/DB unavailable`) so skips are visible.
### Section 2 — Mutating action suites (recs 1, 2, 3)
**`DeploymentActionTests` (rec 1):** a `DeploymentFixture` (collection fixture) provisions
**one** `zztest` site + valid template once; each test creates its **own** throwaway
instance on it (cheap), acts via the Topology UI, deletes it:
- `Deploy_Instance_ConfirmsAndShowsOutcome` — Topology → context-menu Deploy → confirm
dialog → assert exactly one outcome toast (tolerating `Deployed`/`SiteUnreachable`) and
the status badge transitions off "not deployed".
- `Enable_Instance_ShowsOutcome`, `Disable_Instance_ShowsOutcome` — same shape.
- `Delete_Instance_RemovesFromTree` — UI delete → confirm → node disappears.
**`RetryDiscardActionTests` (rec 2):** reuse the existing direct-SQL seeders to seed a
`Parked` row, then drive the UI relay outcome-tolerantly (the `SiteCallsPageTests` pattern):
- Parked Messages: `Retry_ParkedMessage_ShowsOutcomeToast`,
`Discard_ParkedMessage_ShowsOutcomeToast` (seed a parked S&F message for a `zztest`
target on `site-a`).
- Notification Report: `Retry_ParkedNotification_ShowsOutcome`,
`Discard_ParkedNotification_ShowsOutcome` (seed a parked `Notifications` row).
- Each: confirm dialog → exactly one outcome toast (`Applied`/`NotParked`/`SiteUnreachable`);
best-effort row teardown.
**`TransportImportTests` (rec 3):** round-trip via CLI export + UI import:
1. CLI creates a `zztest` template → `bundle export --output /tmp/zztest-<guid>.bundle
--passphrase <p>` (1-template synthetic bundle).
2. UI Import wizard: upload the exported file → passphrase → diff/resolve (Add) →
type-env-name confirm → **Apply**.
3. Assert the result screen shows success, the imported template appears, and the audit
drill-in `?bundleImportId=` link is present.
4. CLI deletes the imported template(s).
- **Impl risk to de-risk first:** remote file upload — Playwright `SetInputFiles` streams the
host file to the container browser; fine for a tiny bundle, but the plan's first transport
step verifies the upload end-to-end before building the rest.
### Section 3 — Happy-path CRUD round-trips (rec 6)
Three pure-UI create→edit→delete tests (the suite currently verifies *no* successful
persisted write). Each ends by deleting what it made; a `zztest`-name safety-net teardown
guards mid-test failure:
- **Site** (extend `SiteCrudTests`): create via `/admin/sites/create` (name + identifier +
node addresses) → appears in list → edit description → delete → gone.
- **Template**: `/design/templates/create` → add an attribute on `/design/templates/{id}` →
delete.
- **LDAP mapping**: `/admin/ldap-mappings/create` (group + role) → edit role → delete.
### Section 4 — Shallow-coverage hardening (recs 4, 5)
**Nav render assertions (rec 4):** upgrade the 16 `NavigationTests` theory cases from "URL
changed" to *also* assert the destination's heading/content renders (a per-route
expected-heading map). No new tests — a strengthened helper.
**Health load test (rec 5):** `HealthDashboardTests.KpiTiles_ResolveToValues` — load
`/monitoring/health`, assert the three KPI tile groups (Notification-Outbox, Site-Call,
Audit) resolve to numeric values (not the em-dash degrade) within a timeout. A direct
regression guard for the singleton-hang class of bug.
## Verification
- Each new suite runs green against the live cluster; the full run stays at **0 failed**,
with any skips logged.
- Mutating tests leave **no residual** `zztest-*` entities — verified by a post-run
`site list` / `template list` check.
- Build: `dotnet build` the test project (picks up the new CLI `ProjectReference`), then
`dotnet test`.
## Scope guard (YAGNI)
No new page-object framework, no CI wiring, no parallelization changes. Everything slots
into the existing xunit + `PlaywrightFixture` structure.
## Native tasks
Brainstorming checklist tasks #51#56 track this design through to writing-plans. The
implementation plan (produced next by the writing-plans skill) will carry its own task set.