ScadaBridge/docs/plans/2026-06-05-playwright-coverage-expansion-design.md

# Playwright Coverage Expansion — Design

**Date:** 2026-06-05
**Status:** Approved (brainstorming complete) → ready for writing-plans
**Component:** #9 Central UI — `tests/ZB.MOM.WW.ScadaBridge.CentralUI.PlaywrightTests`

## Goal

Close the functional-coverage gaps found in the 2026-06-05 Playwright coverage audit by
implementing all 7 audit recommendations: add ~15–18 functional E2E tests, upgrade the
shallow navigation tests, and standardize the skip policy — all against the live 8-node
docker cluster, inside the existing xunit + `PlaywrightFixture` structure.

## Background — the audit

The suite (~68 tests) is **bimodal**: a deep, well-built audit/site-calls core wrapped in
a thin shell of navigation + nav-visibility tests, with a large blind spot over the app's
**mutating actions** (deploy, import, retry/discard, all CRUD writes).

Key gaps the 7 recommendations target:

1. **Topology instance lifecycle** (Deploy/Enable/Disable/Delete) — Akka-singleton relays,
   the exact surface the recent report-page hang lived in — untested beyond URL change.
2. **Parked-message / notification Retry/Discard** relays — untested (Site-Calls relay *is*
   tested; it is the pattern to copy).
3. **Transport Import → Apply** — bulk writes across all central config, Admin-only — zero
   coverage.
4. **Navigation tests assert URL only** — never that the destination rendered; a route
   could 500 after navigation and stay green.
5. **No Health-dashboard load test** — the page that fans out to three singleton `Ask`s
   every 10s has no assertion its KPI tiles resolve vs. hang/degrade.
6. **No successful persisted write through the UI** anywhere — the entire create/edit/delete
   surface is functionally unverified end-to-end (`SiteCrudTests` only covers validation
   failure; audit/site-calls "writes" are direct SQL seeds).
7. **Silent coverage cliffs** — DB-dependent tests are inconsistent (`AuditLogPageTests`
   *throw* when MSSQL is down; Site-Calls/grid tests *skip*), and skips aren't surfaced.

## Decisions (settled during brainstorming)

| # | Decision | Choice | Rationale |
|---|---|---|---|
| D1 | Mutation fidelity for state-changing tests | **Ephemeral fixtures + outcome-tolerant** | High fidelity, isolated; matches the existing `SiteCallsPageTests` real-relay test that asserts the round-trip happened (toast: `Applied`/`NotParked`/`SiteUnreachable`), not a deep cluster side-effect. |
| D2 | How fixtures are created/torn down | **Shell out to the `scadabridge` CLI** | CLAUDE.md prefers the CLI for state setup; the CLI exposes every needed verb with `--format json` + clean `0`/`1` exit codes. |
| D3 | Skip vs fail when DB/cluster unavailable | **Standardize on Skip + log** | Consistent `SkippableFact` everywhere; a logged skipped-summary prevents a downed dependency from masquerading as full green. Local dev without the cluster still gets green on the rest. |

### CLI surface confirmed (host → `http://localhost:9000`, `multi-role`/`password`)

- `site create --name --identifier [--description]`, `site delete --id`, `site list/get`,
  `site area create --site-id --name`, `site area delete --id`, `site deploy-artifacts`.
- `template create --name [--description] [--parent-id]`, `template delete --id`,
  `template attribute add --template-id --name --data-type`, `template validate --id`.
- `instance create --name --template-id --site-id [--area-id]`, `instance deploy --id`,
  `instance enable --id`, `instance disable --id`, `instance delete --id`.
- `bundle export --output --passphrase [--all|--include-dependencies]`, `bundle preview`,
  `bundle import --input --passphrase --on-conflict`.

## Design

### Section 1 — Shared infrastructure (underpins recs 1, 2, 3, 6, 7)

**`CliRunner`** (new helper):
- Resolves the CLI via the built DLL of `src/ZB.MOM.WW.ScadaBridge.CLI` (invoked as
  `dotnet <cli>.dll …`), falling back to a `scadabridge` on PATH. The test `.csproj` adds a
  build-order `ProjectReference` to the CLI project so the binary always exists.
- Fixed args: `--url http://localhost:9000 --username multi-role --password password --format json`.
- `Task<JsonDocument> RunAsync(params string[] args)` — runs the subprocess, captures
  stdout/stderr, throws on non-zero exit (stderr in the message), parses JSON stdout.
- Typed helpers: `CreateSiteAsync`, `CreateAreaAsync`, `CreateTemplateAsync` +
  `AddAttributeAsync` (so the template validates), `CreateInstanceAsync`,
  `DeleteInstanceAsync` / `DeleteTemplateAsync` / `DeleteSiteAsync`,
  `BundleExportAsync(path, templateId, passphrase)`.

**Naming + teardown convention:** every provisioned entity is named `zztest-<8charguid>`
(sorts last; unambiguous for `LIKE 'zztest%'` safety-net deletes). Teardown is best-effort
(swallow errors), mirroring `AuditDataSeeder`/`SiteCallDataSeeder`.

**`ClusterAvailability` gate + skip logging (rec 7):**
- One shared probe (CLI `site list` succeeds *and* the existing DB `IsAvailableAsync`) →
  `Skip.IfNot(...)` used uniformly across all DB/cluster tests.
- Convert `AuditLogPageTests`' 11 throw-on-unavailable tests to `SkippableFact`.
- A collection fixture's `DisposeAsync` writes one summary line
  (`SKIPPED N tests — cluster/DB unavailable`) so skips are visible.

### Section 2 — Mutating action suites (recs 1, 2, 3)

**`DeploymentActionTests` (rec 1):** a `DeploymentFixture` (collection fixture) provisions
**one** `zztest` site + valid template once; each test creates its **own** throwaway
instance on it (cheap), acts via the Topology UI, deletes it:
- `Deploy_Instance_ConfirmsAndShowsOutcome` — Topology → context-menu Deploy → confirm
  dialog → assert exactly one outcome toast (tolerating `Deployed`/`SiteUnreachable`) and
  the status badge transitions off "not deployed".
- `Enable_Instance_ShowsOutcome`, `Disable_Instance_ShowsOutcome` — same shape.
- `Delete_Instance_RemovesFromTree` — UI delete → confirm → node disappears.

**`RetryDiscardActionTests` (rec 2):** reuse the existing direct-SQL seeders to seed a
`Parked` row, then drive the UI relay outcome-tolerantly (the `SiteCallsPageTests` pattern):
- Parked Messages: `Retry_ParkedMessage_ShowsOutcomeToast`,
  `Discard_ParkedMessage_ShowsOutcomeToast` (seed a parked S&F message for a `zztest`
  target on `site-a`).
- Notification Report: `Retry_ParkedNotification_ShowsOutcome`,
  `Discard_ParkedNotification_ShowsOutcome` (seed a parked `Notifications` row).
- Each: confirm dialog → exactly one outcome toast (`Applied`/`NotParked`/`SiteUnreachable`);
  best-effort row teardown.

**`TransportImportTests` (rec 3):** round-trip via CLI export + UI import:
1. CLI creates a `zztest` template → `bundle export --output /tmp/zztest-<guid>.bundle
   --passphrase <p>` (1-template synthetic bundle).
2. UI Import wizard: upload the exported file → passphrase → diff/resolve (Add) →
   type-env-name confirm → **Apply**.
3. Assert the result screen shows success, the imported template appears, and the audit
   drill-in `?bundleImportId=` link is present.
4. CLI deletes the imported template(s).
- **Impl risk to de-risk first:** remote file upload — Playwright `SetInputFiles` streams the
  host file to the container browser; fine for a tiny bundle, but the plan's first transport
  step verifies the upload end-to-end before building the rest.

### Section 3 — Happy-path CRUD round-trips (rec 6)

Three pure-UI create→edit→delete tests (the suite currently verifies *no* successful
persisted write). Each ends by deleting what it made; a `zztest`-name safety-net teardown
guards mid-test failure:
- **Site** (extend `SiteCrudTests`): create via `/admin/sites/create` (name + identifier +
  node addresses) → appears in list → edit description → delete → gone.
- **Template**: `/design/templates/create` → add an attribute on `/design/templates/{id}` →
  delete.
- **LDAP mapping**: `/admin/ldap-mappings/create` (group + role) → edit role → delete.

### Section 4 — Shallow-coverage hardening (recs 4, 5)

**Nav render assertions (rec 4):** upgrade the 16 `NavigationTests` theory cases from "URL
changed" to *also* assert the destination's heading/content renders (a per-route
expected-heading map). No new tests — a strengthened helper.

**Health load test (rec 5):** `HealthDashboardTests.KpiTiles_ResolveToValues` — load
`/monitoring/health`, assert the three KPI tile groups (Notification-Outbox, Site-Call,
Audit) resolve to numeric values (not the em-dash degrade) within a timeout. A direct
regression guard for the singleton-hang class of bug.

## Verification

- Each new suite runs green against the live cluster; the full run stays at **0 failed**,
  with any skips logged.
- Mutating tests leave **no residual** `zztest-*` entities — verified by a post-run
  `site list` / `template list` check.
- Build: `dotnet build` the test project (picks up the new CLI `ProjectReference`), then
  `dotnet test`.

## Scope guard (YAGNI)

No new page-object framework, no CI wiring, no parallelization changes. Everything slots
into the existing xunit + `PlaywrightFixture` structure.

## Native tasks

Brainstorming checklist tasks #51–#56 track this design through to writing-plans. The
implementation plan (produced next by the writing-plans skill) will carry its own task set.