AdminUI Certificate Store Actions — Design

Date: 2026-06-18
Status: Approved
Backlog item: AdminUI Certificates page actions (reconciled ranked-OPEN list)

Goal

Turn the read-only /certificates AdminUI page into an operator surface that can trust, untrust, and delete peer certificates in the OPC UA server's PKI directory stores. Changes are honored live by the running server's certificate validator — no restart.

Background — current state

src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor (@page "/certificates", [Authorize], @rendermode InteractiveServer) reads four PKI directory stores directly from the filesystem in OnInitialized() and renders a table per store (subject / issuer / thumbprint / validity). It is display-only — zero action buttons, no backend service.

The four stores live under {OpcUa:PkiStoreRoot} (default pki), each with a certs/ subdir holding .der/.cer/.crt files:

Store label      Subdir          Actionable?
Own              own/certs       no (read-only)
Trusted peers    trusted/certs   yes
Trusted issuers  issuer/certs    no (read-only)
Rejected         rejected/certs  yes

The running server (OpcUaApplicationHost.BuildConfigurationAsync) points its SecurityConfiguration trust lists (TrustedPeerCertificates, RejectedCertificateStore, …) at these same on-disk paths. The SDK's DirectoryStore re-enumerates certs/ on each validation, so a file moved into trusted/certs is trusted live, and one removed is no longer trusted — no process restart required.

Action set (approved)

Symmetric trust/untrust + delete:

  • Rejected rows: [Trust] (→ trusted) and [Delete].
  • Trusted rows: [Untrust] (→ rejected) and [Delete].
  • Own / Issuer rows: unchanged (read-only).

Upload-to-trust is explicitly deferred (would add file-upload plumbing + an upload approval gate; the primary operator workflow — approve a peer that already tried to connect and landed in rejected — does not need it).

Architecture

Why an in-process service called directly (not a minimal-API endpoint)

A Blazor Server component cannot reliably dial its own HTTP endpoint server-side behind Traefik (durable lesson: project_blazor_server_self_hubconnection). The page already reads the stores via direct in-process filesystem access in OnInitialized; the write actions follow the same seam — the component calls an injected CertificateStoreManager directly, server-side, within the authenticated circuit. No HttpClient self-dial, no new endpoint.

Why pure filesystem (not the OPC UA SDK store API)

A directory-store cert is just a .der/.cer/.crt file in {store}/certs/. Trust = move the file rejected/certs → trusted/certs; delete = remove it. The filesystem approach:

  • adds no new dependency (the page already uses BCL X509CertificateLoader);
  • is identical to the existing read path, so behavior is predictable;
  • is trivially unit-testable on a temp directory;
  • finds the file by enumerate-and-match-thumbprint — never by building a path from caller input — so there is no path-injection surface;
  • is honored live because the SDK DirectoryStore re-enumerates certs/.

The SDK store API would add an async-store dependency for no behavioral gain on Directory stores.
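The enumerate-and-match lookup described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the class name CertFileLocator and the method FindByThumbprint are hypothetical, and the sketch assumes the store subdirectory is always one of the hard-coded values ("trusted" or "rejected") rather than caller input. Only the thumbprint comes from the caller, and it is compared against loaded certificates, never placed into a path.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper name; the design only specifies the behavior.
public static class CertFileLocator
{
    private static readonly string[] Extensions = { ".der", ".cer", ".crt" };

    // Returns the full path of the first certificate file under
    // {pkiRoot}/{storeSubdir}/certs whose thumbprint matches, or null if none does.
    // storeSubdir is assumed to be a hard-coded constant, not user input.
    public static string? FindByThumbprint(string pkiRoot, string storeSubdir, string thumbprint)
    {
        var certsDir = Path.Combine(pkiRoot, storeSubdir, "certs");
        if (!Directory.Exists(certsDir)) return null;

        foreach (var path in Directory.EnumerateFiles(certsDir))
        {
            if (!Extensions.Contains(Path.GetExtension(path), StringComparer.OrdinalIgnoreCase))
                continue;

            try
            {
                using var cert = X509CertificateLoader.LoadCertificateFromFile(path);
                // X509Certificate2.Thumbprint is the SHA-1 hash as uppercase hex.
                if (string.Equals(cert.Thumbprint, thumbprint, StringComparison.OrdinalIgnoreCase))
                    return path;
            }
            catch (CryptographicException)
            {
                // Not a parseable certificate; skip it rather than fail the whole lookup.
            }
        }

        return null;
    }
}
```

Because the path is always one that the enumeration returned, a hostile thumbprint string can at worst fail to match. One caveat of this sketch: X509Certificate2.Thumbprint is a SHA-1 value, so a 64-character SHA-256 input would never match here; matching it would need cert.GetCertHashString(HashAlgorithmName.SHA256) instead.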

Components

1. CertificateStoreManager (NEW)

src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Certificates/CertificateStoreManager.cs

public sealed record CertActionResult(bool Success, string? Error)
{
    public static CertActionResult Ok() => new(true, null);
    public static CertActionResult Fail(string error) => new(false, error);
}

public sealed class CertificateStoreManager
{
    private readonly string _pkiRoot;
    public CertificateStoreManager(IConfiguration config)
        => _pkiRoot = config.GetValue<string?>("OpcUa:PkiStoreRoot") ?? "pki";

    // test seam — explicit root
    internal CertificateStoreManager(string pkiRoot) => _pkiRoot = pkiRoot;

    public CertActionResult Trust(string thumbprint)   => Move("rejected", "trusted", thumbprint);
    public CertActionResult Untrust(string thumbprint) => Move("trusted", "rejected", thumbprint);
    public CertActionResult Delete(string store, string thumbprint) {  }
    private CertActionResult Move(string fromSub, string toSub, string thumbprint) {  }
}

  • Operations are synchronous filesystem moves/deletes (fast, local). Returning CertActionResult keeps the UI exception-free.
  • Delete(store, …) accepts only "trusted" or "rejected" (else Fail("unknown store")).
  • Thumbprint is validated as hex of length 40 (SHA-1) or 64 (SHA-256) before use — a cheap guard; path safety comes from enumerate-and-match, not from validation.
  • Move: validate thumbprint → enumerate {root}/{fromSub}/certs/*.{der,cer,crt}, load each, match .Thumbprint (case-insensitive); not found → Fail("certificate not found in {fromSub}"). Ensure {root}/{toSub}/certs exists; File.Move(src, dest) preserving the SDK filename, with a thumbprint suffix on name collision. If a cert with the same thumbprint already exists in the destination, the source is removed and the op is idempotent success.
  • All IOException/UnauthorizedAccessException caught → Fail(ex.Message).
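The thumbprint guard and the collision-safe move from the bullets above might look something like this. It is a sketch under assumptions: CertMoveSketch, IsValidThumbprint, ResolveDestination, and MoveInto are hypothetical names, the result is a plain tuple rather than CertActionResult to keep the block self-contained, and the "-{thumbprint}" suffix format is one possible reading of "thumbprint suffix on name collision". The idempotent same-thumbprint-in-destination case is left out.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper; illustrates validation, destination naming, and error mapping only.
public static class CertMoveSketch
{
    // True only for 40 (SHA-1) or 64 (SHA-256) ASCII hex characters.
    public static bool IsValidThumbprint(string? thumbprint) =>
        thumbprint is { Length: 40 or 64 } && thumbprint.All(char.IsAsciiHexDigit);

    // Keeps the source file name; appends the thumbprint when that name is already taken.
    public static string ResolveDestination(string destCertsDir, string sourcePath, string thumbprint)
    {
        var name = Path.GetFileName(sourcePath);
        var dest = Path.Combine(destCertsDir, name);
        if (!File.Exists(dest)) return dest;

        var stem = Path.GetFileNameWithoutExtension(name);
        var ext = Path.GetExtension(name);
        return Path.Combine(destCertsDir, $"{stem}-{thumbprint}{ext}");
    }

    // Moves an already-located certificate file; I/O failures become a result, not an exception.
    public static (bool Success, string? Error) MoveInto(string sourcePath, string destCertsDir, string thumbprint)
    {
        if (!IsValidThumbprint(thumbprint)) return (false, "invalid thumbprint");

        try
        {
            Directory.CreateDirectory(destCertsDir); // no-op when it already exists
            File.Move(sourcePath, ResolveDestination(destCertsDir, sourcePath, thumbprint));
            return (true, null);
        }
        catch (Exception ex) when (ex is IOException or UnauthorizedAccessException)
        {
            return (false, ex.Message);
        }
    }
}
```

Here sourcePath is expected to come from the enumerate-and-match step, so the validation is only the cheap early rejection the design describes; a value such as "../../x" fails the hex check before any filesystem call is made.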

2. Certificates.razor (MODIFY)

  • Tag each StoreView with a CertStoreKind (Own/Trusted/Issuer/Rejected) so the table knows which actions to render.
  • Refactor OnInitialized → a reusable LoadAll() (re-run after each action).
  • Add an Actions column rendered only for Trusted/Rejected, wrapped in <AuthorizeView Policy="FleetAdmin"> (Administrator role — the most-privileged existing policy, matching the ScriptAnalysis endpoints). Buttons set a _pending action record (kind, thumbprint, subject, verb).
  • Inline Blazor confirmation — a banner "Confirm {verb} of {subject}? [Confirm] [Cancel]". No window.confirm (a JS modal dialog would block browser-automation live-verify and is disallowed by the harness).
  • On Confirm: call the injected CertificateStoreManager, set a status banner (green success / red error), LoadAll(), clear _pending.
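The pending/confirm flow can be modeled independently of Razor markup. The following is a rough sketch with hypothetical names (PendingAction, CertActionFlow, Request, Confirm, Cancel); in the real component this state would presumably live in the @code block, with the delegate replaced by calls to the injected CertificateStoreManager and the reload callback by LoadAll().

```csharp
using System;

public enum CertStoreKind { Own, Trusted, Issuer, Rejected }

// Mirrors the "_pending action record (kind, thumbprint, subject, verb)" from the design.
public sealed record PendingAction(CertStoreKind Kind, string Thumbprint, string Subject, string Verb);

// Hypothetical state holder for the inline confirmation banner.
public sealed class CertActionFlow
{
    private readonly Func<PendingAction, (bool Success, string? Error)> _execute;
    private readonly Action _loadAll;

    public CertActionFlow(Func<PendingAction, (bool Success, string? Error)> execute, Action loadAll)
    {
        _execute = execute;
        _loadAll = loadAll;
    }

    public PendingAction? Pending { get; private set; }
    public string? Status { get; private set; }
    public bool StatusIsError { get; private set; }

    // Text for the inline banner; null when nothing is awaiting confirmation.
    public string? ConfirmPrompt =>
        Pending is null ? null : $"Confirm {Pending.Verb} of {Pending.Subject}?";

    // Button click: only Trusted/Rejected rows are actionable.
    public void Request(CertStoreKind kind, string thumbprint, string subject, string verb)
    {
        if (kind is not (CertStoreKind.Trusted or CertStoreKind.Rejected)) return;
        Pending = new PendingAction(kind, thumbprint, subject, verb);
    }

    public void Cancel() => Pending = null;

    // [Confirm]: run the action, set the status banner, reload the tables, clear the pending action.
    public void Confirm()
    {
        if (Pending is not { } action) return;

        var (success, error) = _execute(action);
        StatusIsError = !success;
        Status = success ? $"{action.Verb} succeeded for {action.Subject}" : error;
        _loadAll();
        Pending = null;
    }
}
```

Keeping the flow in server-side state means the confirmation is an ordinary re-render, which is what lets browser automation drive it without a JS dialog.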

3. DI registration (MODIFY)

AddAdminUI(IServiceCollection) in EndpointRouteBuilderExtensions.cs: services.AddSingleton<Certificates.CertificateStoreManager>(); (the manager holds no state beyond the configured PKI root, so a singleton is safe).

Authorization

The page stays [Authorize] for viewing. The action buttons and handler are gated by <AuthorizeView Policy="FleetAdmin"> (= RequireRole("Administrator")). On the docker-dev rig (DisableLogin=true) the auto-admin satisfies the policy, so the buttons render and the actions are live-provable.

Data flow

operator clicks [Trust] on a rejected cert
  → component sets _pending=(Rejected, thumbprint, subject, "trust")
  → operator clicks [Confirm]
  → component (server circuit) calls CertificateStoreManager.Trust(thumbprint)
  → File.Move  rejected/certs/<f>.der → trusted/certs/<f>.der
  → live OPC UA CertificateValidator now enumerates the cert under trusted
  → component LoadAll() → cert now shows under "Trusted peers", gone from "Rejected"

Error handling

Condition                                                     Result
invalid/missing thumbprint                                    Fail → red banner, no change
cert not in source store (concurrent admin already moved it)  Fail("not found")
dest already has the thumbprint                               source removed, idempotent success
Delete with store ∉ {trusted, rejected}                       Fail("unknown store")
IOException / access denied                                   caught → Fail(message)

Testing

xUnit + Shouldly in tests/Server/ZB.MOM.WW.OtOpcUa.AdminUI.Tests/Certificates/, driving CertificateStoreManager against a temp PkiStoreRoot seeded with ephemeral self-signed DER certs (generated in-test via CertificateRequest/X509Certificate2, written to rejected/certs):

  • Trust moves rejected → trusted (file gone from source, present in dest).
  • Untrust moves trusted → rejected.
  • Delete("rejected", …) / Delete("trusted", …) removes the file.
  • unknown thumbprint → Fail, no change.
  • path-traversal thumbprint ("../../x") → rejected by hex validation, nothing touched.
  • Delete("own", …) (disallowed store) → Fail("unknown store").
  • idempotent re-trust (thumbprint already in trusted) → Success, source removed.
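Seeding the temp PkiStoreRoot could be done with a small helper along these lines. This is a sketch using only BCL calls; TestPki and SeedRejected are hypothetical names, and the key size, validity window, and file naming are arbitrary choices for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

// Hypothetical test helper; not part of the design's touched-code list.
public static class TestPki
{
    // Writes an ephemeral self-signed DER certificate into {pkiRoot}/rejected/certs
    // and returns its path together with its SHA-1 thumbprint.
    public static (string FilePath, string Thumbprint) SeedRejected(string pkiRoot, string commonName)
    {
        using var key = RSA.Create(2048);
        var request = new CertificateRequest(
            $"CN={commonName}", key, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);

        using var cert = request.CreateSelfSigned(
            DateTimeOffset.UtcNow.AddDays(-1), DateTimeOffset.UtcNow.AddDays(30));

        var certsDir = Path.Combine(pkiRoot, "rejected", "certs");
        Directory.CreateDirectory(certsDir);

        var filePath = Path.Combine(certsDir, $"{commonName}.der");
        // X509ContentType.Cert exports the public certificate only, DER-encoded.
        File.WriteAllBytes(filePath, cert.Export(X509ContentType.Cert));

        return (filePath, cert.Thumbprint);
    }
}
```

A Trust test would then seed one certificate, call Trust with the returned thumbprint via the internal explicit-root constructor, and assert that the file no longer exists under rejected/certs and that a file with the same thumbprint exists under trusted/certs.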

No bUnit — the Razor changes are proven only by live /run.

Live-verify (/run)

docker-dev rig (login disabled → auto-admin). Seed a DER into a central node's rejected/certs/, open http://localhost:9200/certificates, click [Trust] and then [Confirm], and verify the cert appears under Trusted peers and is gone from Rejected. Then click [Delete] and [Confirm] on a cert and verify it is removed.

Rig caveat (durable): :9200 is Traefik-round-robined across central-1 and central-2, and each node has its own on-disk PKI dir. Pin the verify to a single node — seed the rejected cert into one container and drive that same container — so the read and the write line up. (docker exec into the chosen central container to seed + confirm the file move on disk.)

Constraints honored

No EF migration, no Commons/proto/wire change, no bUnit. Stage by explicit path (never git add .); never stage the never-stage files. Finish = merge to master + push.

Touched code

  • NEW src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Certificates/CertificateStoreManager.cs
  • MODIFY src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor
  • MODIFY src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/EndpointRouteBuilderExtensions.cs (DI)
  • NEW tests/Server/ZB.MOM.WW.OtOpcUa.AdminUI.Tests/Certificates/CertificateStoreManagerTests.cs
  • (optional) one-line note in docs/security.md cert-store section