# Design — Auth + Audit normalization across all three sister projects

**Date:** 2026-06-02
**Status:** Approved (brainstorming complete) — handing off to writing-plans.
**Scope owner decision:** full two-library normalization (see [Scope decisions](#scope-decisions)).

## Summary

Bring two shared libraries that already live in `scadaproj` but are **unpublished and adopted by no app** — `ZB.MOM.WW.Auth` (4 packages) and `ZB.MOM.WW.Audit` (1 package) — to **full adoption across OtOpcUa, MxAccessGateway, and ScadaBridge**, ending with every audit emit site carrying the genuine Auth-resolved principal as `AuditEvent.Actor`.

The original request was "implement the audit component in all sister projects." Because audit GAPS #4 (Actor = the `ZB.MOM.WW.Auth` principal) requires an authenticated principal at every emit site, and because the owner chose the maximal scope at every fork, the job expands to a **two-library program**: full Auth adoption (auth GAPS #1–#8) first, then full Audit adoption (audit GAPS #1–#6) with #4 wiring `Actor` from the now-live principal.

## Verified starting state (source-checked 2026-06-02)

- **Both libraries exist and are pack-ready** in `scadaproj/ZB.MOM.WW.Auth/` (4 csproj + `build/pack.sh` + `build/push.sh`, 172 tests) and `scadaproj/ZB.MOM.WW.Audit/` (`build/pack.sh`, 19 tests). Both at version `0.1.0`, both central-package-management.
- **Neither is on the Gitea feed.** All five package registration endpoints return **HTTP 404**. No `.nupkg` is built locally.
- **Adopted by zero apps.** No sibling repo references `ZB.MOM.WW.Auth*` or `ZB.MOM.WW.Audit`.
- **Feed source-mapping is missing in all three repos.** Each `NuGet.config` `packageSourceMapping` lists Health/Telemetry/Configuration but **not** Auth or Audit, so each repo needs mapping lines added (mirror MxGateway commit `437ab65`, which did this for Configuration).
- **The MxGateway audit coordination gate (audit GAPS #2) is CLEAR.** `MxGateway.Server` already references `ZB.MOM.WW.Telemetry.Serilog 0.1.0`; the Serilog/Telemetry/Configuration work is merged to `main`. MxGateway audit adoption is unblocked.
- Established adoption rhythm (Telemetry, Configuration): publish lib to feed → add feed mapping + version pin → behaviour-preserving consumer cutover → land on the repo's local default branch (not pushed to remote).

> Per repo memory, prior "published"/"adopted" claims in this workspace have repeatedly been
> optimistic; every claim above was re-verified against the feed and source on 2026-06-02.

## Scope decisions

| Fork | Decision |
|---|---|
| How deep into the audit GAPS backlog? | **Everything incl. #4 Actor→Auth** (all of #1–#6). |
| How to satisfy #4 given Auth is unadopted? | **Adopt Auth first, then audit** (two-library program). |
| How much of the Auth backlog? | **Full Auth normalization** (auth GAPS #1–#8, all 3 repos). |
| How to walk the work matrix? | **Library-major waterfall** (Phase 1 Auth → Phase 2 Audit → Phase 3 wiring). |
| Remote integration model | **Local-only**; no `git push`, no PRs (safest for production auth paths; flip per repo later if desired). |

## Architecture — four phases

```
Phase 0  Publish & feed-map        pack + push both libs to Gitea feed (fix the 404s);
         (foundation)              add NuGet.config source-mappings + version pins in all 3 repos.

Phase 1  Auth adoption             auth GAPS #1–#8 across all 3 repos, in GAPS sequence:
         (largest, sec-sensitive)  #3 IGroupRoleMapper seam → #1 Ldap + #2 ApiKeys cutover →
                                   #4 config schema (A1/A2) + #5 claims/cookies →
                                   #6 dev base DN → #8 canonical roles. Each lands behind tests.

Phase 2  Audit adoption            audit GAPS #1–#3 core + #5/#6 cleanups across all 3 repos.
         (behaviour-preserving)

Phase 3  Actor→Auth wiring         audit GAPS #4: route the now-live Auth principal into Actor
         (the payoff)              at every emit site. Closes the loop Audit.Actor == Auth principal.
```

The waterfall is enforced by task dependencies (Phase 0 → 1 → 2 → 3). Phase 1 must fully land before Phase 3 can wire a *stable* principal; Phase 2 sits after Phase 1 so emit sites aren't touched twice.

### Delivery model

- One **feature branch per repo per library phase** (`feat/adopt-zb-auth`, then `feat/adopt-zb-audit`), behaviour-preserving except where a GAPS item is explicitly net-new.
- **Publish-first**: both packages on the feed and verified resolvable before any consumer edit.
- **Land on each repo's local default branch**, gated by that repo's tests + new contract tests.
- **Local-only** (no push). Each phase is a revertable branch merge.
- The libraries themselves are plain files in `scadaproj` (not nested git repos) — publishing is `pack` + `push` only; no commits to the libs unless a parity gap forces a fix.

## Phase 0 — publish & feed-map *(task #7)*

1. `dotnet pack -c Release` both libraries; `push.sh` to the Gitea feed (`https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json`).
2. Verify all five packages return HTTP 200 from the registration endpoint.
3. In each repo: add `packageSourceMapping` patterns (`ZB.MOM.WW.Auth`, `ZB.MOM.WW.Auth.*`, `ZB.MOM.WW.Audit`) to the gitea source, and version pins (`Directory.Packages.props` for OtOpcUa/ScadaBridge; inline `Version="0.1.0"` for MxGateway).
4. `dotnet restore` resolves the new patterns in all three repos.

## Phase 1 — Auth adoption *(task #8, blocked by #7)*

Consumer cutover (libs are already extracted). GAPS order: #3 seam → #1 Ldap + #2 ApiKeys → #4 config schema + #5 claims/cookies → #6 dev base DN → #8 canonical roles.

| | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| Packages | Abstractions + Ldap + AspNetCore (no ApiKeys — OPC UA transport security) | all 4 (**source** for ApiKeys — cuts over first) | all 4 (**source** for Ldap; ApiKeys consumer after gw) |
| Role mapper (#3) | config-backed (`GroupToRole`) | config-backed | **DB-backed** (`LdapGroupMapping`) |
| Config migration (#4) | A1: `UseTls`→`Transport` enum (section already nested) | A1: `UseTls`→`Transport` enum | **A2 (biggest)**: flat `Security:Ldap*`→nested; rename `LdapUserIdAttribute`→`UserNameAttribute`, `LdapGroupAttribute`→`GroupAttribute` |
| Cookies/claims (#5) | Blazor Admin control-plane cookie | keep `MxGatewayDashboard` name, share claims | keep `ZB.MOM.WW.ScadaBridge.Auth` name, share claims |
| Canonical roles (#8) | no first-class `Deployer` (publish ⊂ `FleetAdmin`) | no `Designer`/`Deployer` | **roles collapse**: `AuditReadOnly`→Viewer, `Audit`→Administrator (auditor/admin SoD loss — GAPS-accepted) |

**Two deliberate behaviour changes (accepted):**

1. **ScadaBridge API-key token format** (D2): raw `X-API-Key` → structured `__`. A genuine wire change for inbound API clients — acceptable pre-prod, requires an interop check.
2. **Canonical-roles collapse** in ScadaBridge removes auditor/admin separation-of-duties.

**Known live issue to fix during OtOpcUa cutover:** `LdapAuthService` `Enabled`/double-singleton wiring is still open even though the `Security:Ldap` section binding was fixed — fold the fix into the OtOpcUa LDAP cutover.

**Risk gate:** parity tests reproducing each app's current authn decisions (bind-then-search, fail-closed group lookup, RFC-4514 + filter escaping, constant-time compare, peppered HMAC-SHA256) must be green before any cutover merges.

## Phase 2 — Audit adoption *(task #9, blocked by #8)*

Behaviour-preserving seam/record/enum adoption.

| Repo | Core work (GAPS #1–#3) | Keep bespoke |
|---|---|---|
| **OtOpcUa** (#1, #5) | Replace `Commons/.../AuditEvent.cs` with canonical record; `AuditWriterActor : IAuditWriter`; derive `Outcome` at emit sites (`OpcUaAccessDenied`/`CrossClusterNamespaceAttempt`→Denied, config verbs→Success); bridge `NodeId`/`CorrelationId` value-types | Akka singleton transport, 500/5s batching, two-layer dedup, `ConfigAuditLog` EF entity + idempotency index |
| **MxGateway** (#2, #6) | Map `IApiKeyAuditStore`/`ApiKeyAuditEntry`→`IAuditWriter`/`AuditEvent`; generate `EventId`; `"system"`/`"cli"` Actor fallback; `Category="ApiKey"`; `constraint-denied`→Denied | SQLite store, 3 producer call sites (only injected type changes), append-only table |
| **ScadaBridge** (#3) | Outright rename `IAuditPayloadFilter`→`IAuditRedactor`; adopt canonical `AuditOutcome` enum; confirm writer contract (byte-identical) — keep bespoke ~25-field record as storage shape | Entire Site/Central pipeline, 4 domain enums, CLI export/verify, Blazor UI, redaction policy |

**Resolved open GAPS decisions:**

1. **ScadaBridge rename vs. alias** → **outright rename** (compiler-verified across the HIGH blast radius).
2. **MxGateway `Details`→`DetailsJson`** → **wrap as a small JSON object** (keeps the field valid JSON).
3. **OtOpcUa `Outcome` storage** → **new nullable `Outcome` column + EF migration** (first-class, queryable).
4. **OtOpcUa SP path** → **leave bespoke + document**; *do* fix the `ClusterId`-filter/actor mismatch in `ClusterAudit.razor` so structured rows are visible.

**Cleanups in scope:** #5 (OtOpcUa SP reconcile + `ClusterId` visibility fix), #6 (MxGateway `CorrelationId` capture + structured `Target`).

**Behaviour fix:** MxGateway's `AppendAsync` may currently propagate exceptions; wrap it so the adopted `IAuditWriter` never throws (honors the best-effort contract).

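
The never-throw wrap described above can be sketched as a small decorator. This is a minimal illustration under stated assumptions, not the library's API: the `AuditEvent` record and the `IAuditWriter.WriteAsync` signature are hypothetical stand-ins for the canonical contract in `ZB.MOM.WW.Audit`, `NeverThrowAuditWriter` is an invented name, and the `append` delegate stands in for the existing store's `AppendAsync`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins: the canonical record and interface live in
// ZB.MOM.WW.Audit and may have a different shape.
public sealed record AuditEvent(Guid EventId, string Actor, string Category);

public interface IAuditWriter
{
    Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken = default);
}

// Wraps an append operation so that audit failures never reach the caller.
public sealed class NeverThrowAuditWriter : IAuditWriter
{
    private readonly Func<AuditEvent, CancellationToken, Task> _append;
    private readonly Action<Exception> _onFailure;

    public NeverThrowAuditWriter(
        Func<AuditEvent, CancellationToken, Task> append,
        Action<Exception> onFailure)
    {
        _append = append ?? throw new ArgumentNullException(nameof(append));
        _onFailure = onFailure ?? throw new ArgumentNullException(nameof(onFailure));
    }

    public async Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken = default)
    {
        try
        {
            // Synchronous throws and faulted tasks both surface inside this try.
            await _append(auditEvent, cancellationToken).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            // Best-effort contract: report the failure, then swallow it.
            // Assumption: cancellation is swallowed too, like any other failure.
            try { _onFailure(ex); }
            catch { /* the failure hook must not break the caller either */ }
        }
    }
}
```

In this sketch the three producer call sites would keep calling `WriteAsync` unchanged; only the injected instance differs, and `onFailure` is where a log line or failure counter could go.
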
## Phase 3 — Actor→Auth wiring *(task #10, blocked by #8 + #9)*

With Auth live (Phase 1) and the canonical record adopted (Phase 2), route the resolved principal into `AuditEvent.Actor` everywhere:

- **Seam:** one small `IAuditActorAccessor` — HTTP paths read `HttpContext.User`; non-HTTP paths (Akka actors, CLI) thread the operation principal or fall back. The single place that changes if the principal source ever changes again.
- OtOpcUa → LDAP-resolved user. MxGateway → API-key name (system/cli fallback retained for keyless CLI events). ScadaBridge → principal at `ManagementActor`/inbound boundary.

## Contracts, testing & risk gates

**Hard seam contracts:**

- `IAuditWriter` — best-effort, MUST NOT throw, swallow internal failures. OtOpcUa actor ✅, ScadaBridge ✅; MxGateway needs the never-throw wrap (above).
- `IAuditRedactor` — pure, never throws, over-redacts on failure. ScadaBridge's `SafeDefaultAuditPayloadFilter` is the reference; rename preserves it.

**Cross-boundary surface:** Auth/Audit adoption is in-process and does **not** touch the cross-repo wire contracts (gateway `.proto` files, OPC UA address-space shape) — **except** the ScadaBridge API-key token-format change, the one item needing an interop check rather than just a green unit build. A green build in one repo does not prove interop.

**Per-phase verification (evidence before "done"):**

- **Phase 0:** all 5 packages HTTP 200; `dotnet restore` green in all 3 repos.
- **Phase 1:** existing auth tests + new parity tests green per repo before merge; SB token-format integration check.
- **Phase 2:** existing audit tests + new `Outcome`/`EventId`/rename tests; OtOpcUa `Outcome` migration applies forward.
- **Phase 3:** `Actor == authenticated principal` on authenticated paths; fallback retained on keyless/system paths.
- **Library suites** (Audit 19, Auth 172) re-run if any lib is touched. If a parity gap forces a lib fix, bump `0.1.0`→`0.1.1` and re-publish rather than editing a published version.

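
The Phase 3 check (`Actor == authenticated principal` on authenticated paths, fallback on keyless/system paths) suggests roughly the following behaviour for the actor seam. This is a sketch under assumptions, not the real contract: the `GetActor()` member, the `PrincipalAuditActorAccessor` class name, and the delegate-based principal source are all hypothetical. Only `ClaimsPrincipal` and `ClaimsIdentity` are standard .NET types.

```csharp
#nullable enable
using System;
using System.Security.Claims;

// Hypothetical shape for the seam named in this design; the real
// IAuditActorAccessor may expose a different member.
public interface IAuditActorAccessor
{
    string GetActor();
}

// Resolves the audit actor from whatever principal the host supplies,
// falling back to a fixed name (for example "system" or "cli").
public sealed class PrincipalAuditActorAccessor : IAuditActorAccessor
{
    private readonly Func<ClaimsPrincipal?> _currentPrincipal;
    private readonly string _fallbackActor;

    public PrincipalAuditActorAccessor(Func<ClaimsPrincipal?> currentPrincipal, string fallbackActor)
    {
        _currentPrincipal = currentPrincipal ?? throw new ArgumentNullException(nameof(currentPrincipal));
        _fallbackActor = fallbackActor ?? throw new ArgumentNullException(nameof(fallbackActor));
    }

    public string GetActor()
    {
        var identity = _currentPrincipal()?.Identity;
        var name = identity?.Name;

        // Only an authenticated identity with a usable name counts as the actor;
        // a missing or anonymous principal takes the fallback.
        return identity is { IsAuthenticated: true } && !string.IsNullOrWhiteSpace(name)
            ? name
            : _fallbackActor;
    }
}
```

On an HTTP path the delegate could be `() => httpContextAccessor.HttpContext?.User` (via ASP.NET Core's `IHttpContextAccessor`). A keyless CLI path could pass `() => null` with `"cli"` as the fallback. Keeping the principal source behind a delegate is one way to leave a single place to change if the source moves again.
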
## Tasks

| Task | Item | Blocked by |
|---|---|---|
| #7 | Phase 0 — publish both libs + feed-map all 3 repos | — |
| #8 | Phase 1 — adopt ZB.MOM.WW.Auth across all 3 repos (auth GAPS #1–#8) | #7 |
| #9 | Phase 2 — adopt ZB.MOM.WW.Audit across all 3 repos (audit GAPS #1–#3, #5, #6) | #8 |
| #10 | Phase 3 — wire Actor from the Auth principal (audit GAPS #4) | #8, #9 |

## References

- `components/auth/GAPS.md`, `components/auth/spec/`, `components/auth/current-state/*`
- `components/audit/GAPS.md`, `components/audit/shared-contract/ZB.MOM.WW.Audit.md`, `components/audit/current-state/*`
- Libraries: `ZB.MOM.WW.Auth/`, `ZB.MOM.WW.Audit/`
- Prior adoption precedent: `components/configuration/GAPS.md`, `components/observability/GAPS.md`