Auth + Audit Normalization Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

Goal: Publish ZB.MOM.WW.Auth (4 pkgs) + ZB.MOM.WW.Audit (1 pkg) to the Gitea feed and adopt both across OtOpcUa, MxAccessGateway, and ScadaBridge, ending with every audit emit site carrying the Auth-resolved principal as AuditEvent.Actor.

Architecture: Library-major waterfall: Phase 0 publish/feed-map → Phase 1 full Auth adoption (auth GAPS #1–#8) → Phase 2 full Audit adoption (audit GAPS #1–#3, #5, #6) → Phase 3 wire Actor from the principal. Behaviour-preserving cutover except two accepted changes (ScadaBridge token format, canonical-roles collapse). One feature branch per repo per library phase; local-only delivery (no git push).

Tech Stack: .NET 10, NuGet (Gitea feed + central package management), Akka.NET (OtOpcUa/ScadaBridge), EF Core + SQL Server (OtOpcUa) / SQLite (MxGateway, ScadaBridge site), Blazor admin UIs, gRPC (gateway), LDAP/GLAuth, peppered HMAC API keys, xUnit.

Design doc: 2026-06-02-auth-audit-normalization-design.md

Fidelity note: Phase 0 tasks are command-exact and executable as written. Phase 1–3 cutover tasks name exact files-to-edit and acceptance criteria, but their per-step diffs are elaborated just-in-time by the per-phase "explore + elaborate" gate task (the implementer reads the named source first), because these repos' auth source was not opened during planning, only the normalized components/*/current-state/ docs. Audit (Phase 2) tasks cite the exact paths/lines those docs provide.

Prerequisite the executor must supply: Phase 0 push needs GITEA_NUGET_KEY (Gitea token with package:write). The agent cannot mint this — the user exports it, or runs the push step via !.


PHASE 0 — Publish & feed-map (executable now)

Branch: work on docs/auth-audit-normalization (current) or a fresh chore/publish-auth-audit. The library packs happen in scadaproj; the feed-map edits happen in the three sibling repos (each on its own feat/adopt-zb-auth branch — created here, reused in Phase 1).

Task 0.1: Add a push script for ZB.MOM.WW.Audit

Classification: trivial Estimated implement time: ~2 min Parallelizable with: none (blocks 0.3)

Files:

  • Create: ZB.MOM.WW.Audit/build/push.sh

Step 1: Create the script (mirror ZB.MOM.WW.Auth/build/push.sh)

#!/usr/bin/env bash
# push.sh — pack and push the ZB.MOM.WW.Audit NuGet package to the Gitea feed.
#
# Required environment variables:
#   GITEA_NUGET_SOURCE  — full URL of the Gitea NuGet feed
#   GITEA_NUGET_KEY     — Gitea access token with package:write permission
set -euo pipefail
: "${GITEA_NUGET_SOURCE:?set GITEA_NUGET_SOURCE to your Gitea NuGet feed URL}"
: "${GITEA_NUGET_KEY:?set GITEA_NUGET_KEY to your Gitea access token}"
dotnet pack -c Release -o ./artifacts
dotnet nuget push "./artifacts/*.nupkg" \
  --source "$GITEA_NUGET_SOURCE" \
  --api-key "$GITEA_NUGET_KEY" \
  --skip-duplicate

Step 2: chmod +x ZB.MOM.WW.Audit/build/push.sh

Step 3: Commit

git add ZB.MOM.WW.Audit/build/push.sh && git commit -m "build(audit): add Gitea push.sh"

Task 0.2: Build + test both libraries green before publishing

Classification: small Estimated implement time: ~4 min Parallelizable with: 0.1

Files: none (verification only)

Step 1: from the scadaproj root, (cd ZB.MOM.WW.Auth && dotnet test); expect all 172 tests to pass.

Step 2: from the scadaproj root, (cd ZB.MOM.WW.Audit && dotnet test); expect all 19 tests to pass.

Acceptance: both suites green. If either fails, STOP. Do not publish a red library.
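The stop-on-red rule above can be sketched as a small gate function. This is a minimal sketch, not part of the plan's tooling: run_test_gate and the TEST_CMD override are hypothetical names introduced here so the gate can be dry-run without the .NET SDK; the default command is the plan's own dotnet test.

```shell
#!/usr/bin/env bash
# Sketch: run each library's test suite and stop at the first failure.
# TEST_CMD is a hypothetical override (assumption); it defaults to `dotnet test`.
TEST_CMD="${TEST_CMD:-dotnet test}"

# run_test_gate <dir>...: returns 0 only if every directory's suite passes.
run_test_gate() {
  local dir
  for dir in "$@"; do
    echo "testing $dir"
    # Subshell keeps the caller's working directory unchanged.
    if ! ( cd "$dir" && $TEST_CMD ); then
      echo "STOP: $dir is red, do not publish" >&2
      return 1
    fi
  done
  echo "all suites green"
}
```

From the scadaproj root it would be called as run_test_gate ZB.MOM.WW.Auth ZB.MOM.WW.Audit, and Task 0.3 would only proceed on a zero exit status.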

Task 0.3: Pack + push both libraries to the Gitea feed

Classification: standard Estimated implement time: ~4 min (+ network) Parallelizable with: none (blocked by 0.1, 0.2)

Files: none (publishes artifacts)

Step 1: Export credentials (user-supplied token)

export GITEA_NUGET_SOURCE="https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json"
export GITEA_NUGET_KEY="<gitea token with package:write>"

Step 2: from the scadaproj root, (cd ZB.MOM.WW.Auth && ./build/push.sh)

Step 3: from the scadaproj root, (cd ZB.MOM.WW.Audit && ./build/push.sh)

Step 4: Verify all 5 resolve (HTTP 200)

for p in zb.mom.ww.auth.abstractions zb.mom.ww.auth.ldap zb.mom.ww.auth.apikeys \
         zb.mom.ww.auth.aspnetcore zb.mom.ww.audit; do
  printf '%s -> ' "$p"
  curl -s -o /dev/null -w "%{http_code}\n" \
    "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/registration/$p/index.json"
done

Acceptance: all five print 200 (currently all 404).
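The loop above prints the status codes but always exits 0, so a script cannot gate on it. One possible way to turn the same probe into a pass/fail check is sketched below; the function names http_code_for and verify_packages are assumptions introduced for illustration, while the feed URL and package IDs are the ones used in Step 4.

```shell
#!/usr/bin/env bash
# Sketch: make the five-package probe fail loudly when any package is missing.
FEED_BASE="https://gitea.dohertylan.com/api/packages/dohertj2/nuget"
PACKAGES="zb.mom.ww.auth.abstractions zb.mom.ww.auth.ldap zb.mom.ww.auth.apikeys zb.mom.ww.auth.aspnetcore zb.mom.ww.audit"

# http_code_for <package-id>: prints the HTTP status of the registration index.
http_code_for() {
  curl -s -o /dev/null -w "%{http_code}" "$FEED_BASE/registration/$1/index.json"
}

# verify_packages: prints one line per package; returns 0 only if all answer 200.
verify_packages() {
  local p code missing=0
  for p in $PACKAGES; do
    code="$(http_code_for "$p")"
    printf '%s -> %s\n' "$p" "$code"
    [ "$code" = "200" ] || missing=$((missing + 1))
  done
  [ "$missing" -eq 0 ]
}
```

Calling verify_packages after the push steps gives the same per-package output as the original loop, plus a non-zero exit status if any package still returns something other than 200.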

Task 0.4: Feed-map + restore OtOpcUa

Classification: small Estimated implement time: ~4 min Parallelizable with: 0.5, 0.6 (different repos)

Files:

  • Modify: ~/Desktop/OtOpcUa/NuGet.config (add patterns under dohertj2-gitea)
  • Modify: ~/Desktop/OtOpcUa/Directory.Packages.props (add PackageVersion entries)

Step 1: create branch feat/adopt-zb-auth in OtOpcUa.

Step 2: under the dohertj2-gitea packageSource, add:

<package pattern="ZB.MOM.WW.Auth" />
<package pattern="ZB.MOM.WW.Auth.*" />
<package pattern="ZB.MOM.WW.Audit" />

Step 3: in Directory.Packages.props add PackageVersion entries (version 0.1.0) for ZB.MOM.WW.Auth.Abstractions, ZB.MOM.WW.Auth.Ldap, ZB.MOM.WW.Auth.AspNetCore, and ZB.MOM.WW.Audit. (No ZB.MOM.WW.Auth.ApiKeys: OtOpcUa uses OPC UA transport security.)

Step 4: dotnet restore ZB.MOM.WW.OtOpcUa.slnx (expect success, with the new packages downloaded from gitea).

Step 5: Commit with message "build: add ZB.MOM.WW.Auth/Audit feed mapping + version pins".

Acceptance: restore succeeds; obj/project.assets.json lists the new packages from the gitea source.
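The acceptance check on project.assets.json can be sketched as a grep over the file's "Id/Version" library keys. This is a rough sketch under stated assumptions: has_package and check_otopcua_packages are hypothetical helper names, and the path to the assets file is passed in because it depends on which project's obj/ folder is inspected. The check only covers presence and version; confirming the originating feed is a separate step.

```shell
#!/usr/bin/env bash
# Sketch: confirm the four OtOpcUa packages appear in a project.assets.json.

# has_package <assets.json> <PackageId> <version>
# Matches the quoted "Id/Version" key, case-insensitively, as a fixed string.
has_package() {
  grep -qiF "\"$2/$3\"" "$1"
}

# check_otopcua_packages <assets.json> <version>
check_otopcua_packages() {
  local assets="$1" ver="$2" pkg rc=0
  for pkg in ZB.MOM.WW.Auth.Abstractions ZB.MOM.WW.Auth.Ldap \
             ZB.MOM.WW.Auth.AspNetCore ZB.MOM.WW.Audit; do
    if has_package "$assets" "$pkg" "$ver"; then
      echo "ok      $pkg/$ver"
    else
      echo "MISSING $pkg/$ver" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

A call such as check_otopcua_packages path/to/obj/project.assets.json 0.1.0 returns non-zero if any of the four pins did not restore.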

Task 0.5: Feed-map + restore MxAccessGateway

Classification: small Estimated implement time: ~4 min Parallelizable with: 0.4, 0.6

Files:

  • Modify: ~/Desktop/MxAccessGateway/nuget.config
  • Modify: ~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj (inline Version= style — no CPM)

Step 1: create branch feat/adopt-zb-auth in MxAccessGateway.

Step 2: add the same three <package pattern> lines under dohertj2-gitea.

Step 3: dotnet restore src/MxGateway.sln. PackageReferences are added in Phase 1, so this step only proves the feed resolves; optionally add a throwaway reference and remove it, or defer the restore proof to Phase 1's first add.

Step 4: Commit with message "build: add ZB.MOM.WW.Auth/Audit feed mapping".

Acceptance: nuget.config maps the new patterns; restore of an added Auth package succeeds.
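The optional throwaway-reference proof in Step 3 could look roughly like the following. It is a sketch, not a prescribed procedure: prove_feed_resolves and the DOTNET override are hypothetical names introduced here (the override allows a dry run with DOTNET=echo), and the choice of which package to probe is left to the caller.

```shell
#!/usr/bin/env bash
# Sketch: add a package reference, restore, then always remove the reference.
# DOTNET is a hypothetical override (assumption); it defaults to the real CLI.
DOTNET="${DOTNET:-dotnet}"

# prove_feed_resolves <project.csproj> <PackageId> <version>
prove_feed_resolves() {
  local proj="$1" pkg="$2" ver="$3" rc=0
  $DOTNET add "$proj" package "$pkg" --version "$ver" || return 1
  $DOTNET restore "$proj" || rc=1
  # Remove the throwaway reference even when the restore failed,
  # so the working tree is left as it was found.
  $DOTNET remove "$proj" package "$pkg" || rc=1
  return "$rc"
}
```

For example, prove_feed_resolves src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj ZB.MOM.WW.Auth.Abstractions 0.1.0, followed by git status to confirm the project file is unchanged.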

Task 0.6: Feed-map + restore ScadaBridge

Classification: small Estimated implement time: ~4 min Parallelizable with: 0.4, 0.5

Files:

  • Modify: ~/Desktop/ScadaBridge/nuget.config
  • Modify: ~/Desktop/ScadaBridge/Directory.Packages.props

Step 1: create branch feat/adopt-zb-auth in ScadaBridge.

Step 2: add the three <package pattern> lines under dohertj2-gitea.

Step 3: add PackageVersion entries @ 0.1.0 for all 4 Auth packages + ZB.MOM.WW.Audit.

Step 4: dotnet restore ZB.MOM.WW.ScadaBridge.slnx.

Step 5: Commit with message "build: add ZB.MOM.WW.Auth/Audit feed mapping + version pins".

Acceptance: restore succeeds.

Phase 0 exit gate: all 5 packages HTTP 200; all 3 repos restore green with the new feed mappings. Only then start Phase 1.


PHASE 1 — Auth adoption (auth GAPS #1–#8) [HIGH-RISK PHASE]

Order within the phase (per components/auth/GAPS.md sequencing): #3 seam → #1 Ldap + #2 ApiKeys → #4 config + #5 claims/cookies → #6 base DN → #8 canonical roles. Every cutover is gated by parity tests before merge.

Task 1.0: Explore auth source + elaborate Phase 1 steps (GATE — do first)

Classification: standard Estimated implement time: ~5 min (read-only) Parallelizable with: none (blocks all 1.x)

Files (read-only):

  • components/auth/current-state/{otopcua,mxaccessgw,scadabridge}/CURRENT-STATE.md
  • components/auth/spec/SPEC.md, components/auth/spec/CANONICAL-ROLES.md, components/auth/shared-contract/ZB.MOM.WW.Auth.md
  • ZB.MOM.WW.Auth/src/** (the public surface being adopted)
  • Each repo's LDAP auth service, API-key pipeline, role mapper, and auth DI wiring (paths surfaced by the current-state docs).

Action: read the above; for each task below fill in the concrete diff, exact file paths, and the parity-test assertions. Append the elaborated steps to this plan section (or a …-phase1.md companion). No code changes in this task. This gate exists because the per-repo auth source was not opened during planning.

Task 1.1: IGroupRoleMapper<TRole> seam — config + DB mappers (GAPS #3, all 3 repos)

Classification: standard Estimated implement time: ~5 min/repo (split per repo if needed) Parallelizable with: 1.2 within a repo only after the seam type is referenced

Files: per-repo role-mapping call sites (config-backed for OtOpcUa + MxGateway; DB-backed LdapGroupMapping for ScadaBridge); exact paths come from Task 1.0.

Steps: TDD. Write a mapper test asserting current group→role outputs are preserved → wire the app to the library's IGroupRoleMapper<TRole> (config mapper for OtOpcUa/gw, DB/delegate mapper for SB) → green → commit.

Acceptance: existing role-resolution behaviour byte-identical; #3 done (cheap, unblocks the rest).

Task 1.2: Adopt ZB.MOM.WW.Auth.Ldap — cutover (GAPS #1, all 3 repos)

Classification: high-risk (security; LDAP) Estimated implement time: split per repo (~5 min each) Parallelizable with: 1.3 (different repos) — but within a repo, serial after 1.1

Files: each repo's LDAP authentication service + DI (ScadaBridge is the donor baseline; OtOpcUa/gw cut over to it). For OtOpcUa also fix the open LdapAuthService Enabled/double-singleton wiring (repo memory).

Steps (per repo): write parity tests reproducing current authn decisions (bind-then-search, fail-closed-on-group-lookup, RFC-4514 + filter escaping, username trim, service-account-bind distinction) → run red against the library path → replace bespoke LDAP with Auth.Ldap → green → commit.

Acceptance: parity tests green; bespoke LDAP code removed/delegated; OtOpcUa singleton bug fixed.

Task 1.3: Adopt ZB.MOM.WW.Auth.ApiKeys — cutover (GAPS #2; MxGateway then ScadaBridge)

Classification: high-risk (security; API keys) Estimated implement time: ~5 min/repo Parallelizable with: 1.2 (different files) — MxGateway first (source), then ScadaBridge

Files: MxGateway Security/Authentication/ API-key verifier/store DI; ScadaBridge Inbound API X-API-Key path.

Steps: parity tests (peppered HMAC-SHA256, constant-time compare, scope/constraint enforcement) → cutover to Auth.ApiKeys → green → commit.

ScadaBridge behaviour change (accepted): raw X-API-Key → structured <prefix>_<id>_<secret>; add an interop check that an inbound client using the new token format authenticates and the old format is rejected.

Acceptance: parity + interop green; gateway is the proven source before SB cuts over.
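To make the token-format change concrete, the shape check behind the interop test can be sketched as a pattern match. This is only an illustration of the <prefix>_<id>_<secret> structure: is_structured_key is a hypothetical helper, the alphanumeric character set for each part is an assumption (the plan does not specify it), and real authentication is done by Auth.ApiKeys, not by a shape check.

```shell
#!/usr/bin/env bash
# Sketch: distinguish a structured key from a raw legacy key by shape only.
# Assumption: each of the three parts is a non-empty alphanumeric run.

# is_structured_key <token>: returns 0 if token looks like <prefix>_<id>_<secret>.
is_structured_key() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9]+_[A-Za-z0-9]+_[A-Za-z0-9]+$'
}
```

With made-up sample values, is_structured_key "mxg_ab12_s3cr3tvalue" succeeds while is_structured_key "rawlegacykey" fails, which mirrors the two sides of the interop check: the new format is accepted and the old one is rejected.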

Task 1.4: Config schema migration (GAPS #4 / A1–A2, all 3 repos)

Classification: standard Estimated implement time: ~4 min/repo Parallelizable with: bundled with 1.2 per the GAPS note ("mechanical; do with #1")

Files: OtOpcUa + MxGateway: UseTls→Transport enum binding + appsettings. ScadaBridge: flat Security:Ldap*→nested section; rename LdapUserIdAttribute→UserNameAttribute, LdapGroupAttribute→GroupAttribute (+ appsettings + any validators).

Steps: update options class + binding + appsettings + (ScadaBridge) ConfigPreflight/validator messages → run config-validation tests → commit.

Acceptance: apps bind the new schema; no behaviour change beyond key names/enum.

Task 1.5: ZB.MOM.WW.Auth.AspNetCore claims/cookie conventions (GAPS #5, all 3 UIs)

Classification: standard Estimated implement time: ~4 min/repo Parallelizable with: 1.4

Files: each UI's cookie/claims wiring (OtOpcUa Blazor Admin control-plane; MxGateway MxGatewayDashboard; ScadaBridge ZB.MOM.WW.ScadaBridge.Auth). Keep each cookie name; share canonical claim types + attributes.

Steps: adopt the shared claim-type constants + cookie attribute defaults → auth-flow test (login sets canonical claims) → commit.

Acceptance: each app keeps its cookie name but emits canonical claims.

Task 1.6: Unify dev GLAuth base DN (GAPS #6, all 3 + fixtures)

Classification: small (dev-only) Estimated implement time: ~3 min Parallelizable with: 1.5

Files: dev appsettings + LDAP/GLAuth fixtures/infra in each repo.

Steps: pick one shared base DN (open decision A3, to be resolved in Task 1.0) and apply it to the dev fixtures and all 3 apps.

Acceptance: dev fixtures + all 3 apps share one base DN; dev login still works.

Task 1.7: Canonical roles — canonical → native expansion (GAPS #8, all 3 repos)

Classification: high-risk (security policy) Estimated implement time: ~5 min/repo Parallelizable with: none (after 1.1)

Files: each repo's role-enforcement mapping. ScadaBridge accepted collapse: AuditReadOnly→Viewer, Audit→Administrator (auditor/admin SoD removed). OtOpcUa: publish ⊂ FleetAdmin (no first-class Deployer). MxGateway: assign applicable subset (no Designer/Deployer).

Steps: map each canonical role to native enforcement; test that each LDAP group still authorizes its expected actions; document the SoD change → commit.

Acceptance: canonical six standardized org-wide; per-project native enforcement unchanged except the documented ScadaBridge collapse.

Phase 1 exit gate: all 3 repos consume ZB.MOM.WW.Auth.* from the feed; bespoke LDAP/ApiKey/role code removed or delegated; existing auth tests + new parity tests green per repo; SB token-format interop check green. Merge each feat/adopt-zb-auth to the repo's local default branch (no push).


PHASE 2 — Audit adoption (audit GAPS #1–#3, #5, #6)

⚠️ RE-SCOPED 2026-06-02: the task specs below are SUPERSEDED. The Task 2.0 gate (verified against live code) found these specs materially wrong: MxGateway's audit files moved into the shared library (Phase 1), OtOpcUa's structured audit path is dormant (zero emit sites), and the ScadaBridge "outright rename" is structurally impossible (its filter is typed to its own 24-field record, not the library's 9-field one). The user chose DEEP adopt (canonical record) + pause for review. The corrected, gate-grounded deep design is in 2026-06-02-auth-audit-normalization-phase2-deep.md; implementation is PAUSED pending user review of that doc (especially the ScadaBridge audit-subsystem re-architecture cost). The original specs below are kept for historical context only.

Branch feat/adopt-zb-audit per repo. Behaviour-preserving except the OtOpcUa Outcome column + ClusterId visibility fix. Concrete paths below come from components/audit/current-state/*.

Task 2.0: Explore audit source + confirm elaboration (GATE — light, paths already known)

Classification: trivial Estimated implement time: ~3 min (read-only) Parallelizable with: none (blocks 2.x)

Files (read-only): the exact files cited in the tasks below (OtOpcUa AuditWriterActor.cs, Commons/Messages/Audit/AuditEvent.cs, ConfigAuditLog.cs, OtOpcUaConfigDbContext.cs, ClusterAudit.razor; MxGateway IApiKeyAuditStore.cs, SqliteApiKeyAuditStore.cs, ApiKeyAuditEntry.cs, ConstraintEnforcer.cs, the 3 producers; ScadaBridge IAuditPayloadFilter.cs, IAuditWriter.cs, AuditEvent.cs, the 4 enums). Confirm line refs still hold; adjust if drifted.

Task 2.1: OtOpcUa — canonical record + AuditWriterActor : IAuditWriter + Outcome (GAPS #1)

Classification: high-risk (actor model + data contract) Estimated implement time: split (record swap ~5 min; actor seam ~5 min; Outcome derivation ~5 min) Parallelizable with: 2.3, 2.5 (different repos)

Files:

  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs (replace with canonical record usage; bridge NodeId/CorrelationId value-types at construction)
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs (implement IAuditWriter; map at :75-84)
  • Modify: tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AuditWriterActorTests.cs

Steps: TDD. Extend actor tests to assert Outcome derivation (OpcUaAccessDenied/CrossClusterNamespaceAttempt→Denied, config verbs→Success) and the canonical record mapping → red → swap record + implement seam + derive Outcome at emit sites → keep 500/5s batching + two-layer dedup → green → commit.

Acceptance: existing tests + new Outcome tests green; transport/dedup unchanged.

Task 2.2: OtOpcUa — Outcome column migration + ClusterId visibility fix (GAPS #1 storage, #5)

Classification: high-risk (EF migration + UI query) Estimated implement time: ~5 min Parallelizable with: none (after 2.1)

Files:

  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs (add nullable Outcome)
  • Modify: .../OtOpcUaConfigDbContext.cs (mapping ~:429-463)
  • Create: Migrations/<ts>_AddConfigAuditLogOutcome.cs
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Clusters/ClusterAudit.razor:78 (so structured actor rows — which set NodeId not ClusterId — are discoverable)

Steps: add column + migration → dotnet ef migrations add + apply on a test DB → adjust the query so structured rows appear under a cluster → commit. Leave the SP path bespoke (documented).

Acceptance: migration applies forward; structured AuditEvent rows now visible in ClusterAudit.razor.

Task 2.3: MxGateway — IApiKeyAuditStore → IAuditWriter adapter (GAPS #2, #6)

Classification: standard Estimated implement time: ~5 min Parallelizable with: 2.1, 2.5

Files:

  • Modify: src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/IApiKeyAuditStore.cs, SqliteApiKeyAuditStore.cs, ApiKeyAuditEntry.cs, AuthStoreServiceCollectionExtensions.cs:23, and the 3 producers (ApiKeyAdminCliRunner, DashboardApiKeyManagementService, ConstraintEnforcer.cs:117)
  • Test: gateway audit tests (SqliteAuthStoreTests, ApiKeyAdminCliRunnerTests)

Steps: map to canonical AuditEvent: generate EventId; KeyId→Actor with "system"/"cli" fallback; EventType→Action; CreatedUtc→OccurredAtUtc; RemoteAddress→SourceNode; constraint-denied→Outcome.Denied else Success; Category="ApiKey"; Details→DetailsJson wrapped as a JSON object; add CorrelationId capture + structured Target (#6). Wrap AppendAsync so it never throws (best-effort contract). Producers keep call sites; only the injected type changes. Then tests green → commit.

Acceptance: writes produce canonical events; writer never propagates exceptions; tests green.

Task 2.5: ScadaBridge — rename IAuditPayloadFilter → IAuditRedactor + adopt AuditOutcome (GAPS #3)

Classification: high-risk (HIGH blast radius rename across site/central/wiring) Estimated implement time: ~5 min (compiler-driven) Parallelizable with: 2.1, 2.3

Files:

  • Modify: src/ZB.MOM.WW.ScadaBridge.AuditLog/Payload/IAuditPayloadFilter.cs → adopt ZB.MOM.WW.Audit.IAuditRedactor (outright rename; DefaultAuditPayloadFilter/SafeDefaultAuditPayloadFilter implement it unchanged)
  • Modify: all references across AuditLog/Site, AuditLog/Central, wiring, Commons
  • Adopt canonical AuditOutcome enum; confirm IAuditWriter signature is byte-identical (keep the bespoke ~25-field record as storage shape — option (a))

Steps: outright rename (let the compiler enumerate sites) → adopt AuditOutcome and the Status→Outcome projection (Delivered→Success; Failed/Parked/Discarded→Failure; InboundAuthFailure→Denied) for cross-project reporting → build + full audit test suite green → commit.

Acceptance: compiles clean; no transport/storage/CLI/UI behaviour change; enum + interface names canonical.
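The Status→Outcome projection listed in the steps is a fixed lookup, which can be written out as a table for checking report fixtures. The sketch below is only a shell restatement of that mapping, not the library's code: status_to_outcome is a hypothetical name, and treating any other status as an error is an assumption, since the plan does not say how unlisted values are handled.

```shell
#!/usr/bin/env bash
# Sketch: the Status -> Outcome projection as a lookup.
# Assumption: statuses not named in the plan are rejected rather than guessed.

# status_to_outcome <Status>: prints Success, Failure, or Denied.
status_to_outcome() {
  case "$1" in
    Delivered)               echo Success ;;
    Failed|Parked|Discarded) echo Failure ;;
    InboundAuthFailure)      echo Denied ;;
    *) echo "unmapped status: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown statuses keeps a newly added ScadaBridge status from being silently reported as a success in cross-project views.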

Phase 2 exit gate: all 3 repos consume ZB.MOM.WW.Audit; seams/record/enum canonical; existing audit suites green; OtOpcUa Outcome migration applies; ScadaBridge rename clean. Merge each feat/adopt-zb-audit locally (no push).


PHASE 3 — Wire Actor from the Auth principal (audit GAPS #4)

Task 3.1: Introduce IAuditActorAccessor seam

Classification: standard Estimated implement time: ~4 min Parallelizable with: none (blocks 3.2–3.4)

Files: a small accessor per app (HTTP impl reads HttpContext.User; non-HTTP returns a threaded/fallback principal). Exact location decided in Task 1.0/3.1 from the now-adopted Auth.AspNetCore principal plumbing.

Steps: define the interface + an HTTP-backed impl + a fallback impl → unit test both → commit.

Acceptance: accessor returns the Auth principal on authenticated paths, a fallback otherwise.

Task 3.2 / 3.3 / 3.4: Wire emit sites — OtOpcUa / MxGateway / ScadaBridge

Classification: standard (each) Estimated implement time: ~4 min each Parallelizable with: each other (different repos), after 3.1

Files: each repo's audit emit sites (OtOpcUa config-write/authz emitters; MxGateway 3 producers, keeping "system"/"cli" for keyless CLI; ScadaBridge ManagementActor/inbound boundary).

Steps: inject IAuditActorAccessor; set AuditEvent.Actor = accessor.CurrentPrincipal at each emit site → test Actor == authenticated principal on authenticated paths, fallback retained otherwise → commit.

Acceptance: every authenticated emit carries the real Auth principal; keyless/system paths retain explicit fallbacks.

Program exit gate: Audit.Actor == Auth principal end-to-end across all 3 repos; all suites green; everything on local default branches (no push). Update components/auth/GAPS.md and components/audit/GAPS.md to mark the adopted items done, and refresh the relevant CLAUDE.md status rows.


Risk gates (cross-cutting)

  • Never publish a red library (Task 0.2 gates 0.3). If a parity gap forces a lib fix, bump 0.1.0 → 0.1.1 and re-publish; don't edit a published version.
  • Phase 1 parity tests must be green before any auth cutover merges — this is the security gate.
  • A green build in one repo does not prove interop. The ScadaBridge token-format change (Task 1.3) is the one cross-boundary contract change and needs the explicit interop check.
  • Waterfall enforced by deps: Phase 1 fully lands before Phase 2; Phase 3 after both.
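The "bump, don't edit" rule in the first risk gate comes down to incrementing the patch number before re-running push.sh. A minimal sketch of that step follows; bump_patch is a hypothetical helper, and it assumes a plain numeric MAJOR.MINOR.PATCH version with no pre-release suffix.

```shell
#!/usr/bin/env bash
# Sketch: compute the next patch version for a re-publish.
# Assumption: input is MAJOR.MINOR.PATCH with a plain decimal patch number.

# bump_patch <version>: prints the version with PATCH incremented by one.
bump_patch() {
  local base="${1%.*}"    # everything before the last dot, e.g. 0.1
  local patch="${1##*.}"  # everything after the last dot, e.g. 0
  echo "$base.$((patch + 1))"
}
```

Because push.sh passes --skip-duplicate, re-pushing an unchanged version number is skipped rather than overwritten, so the bump is what actually gets the fix onto the feed.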