Phase 1 (Auth adoption) — elaborated steps + Task 1.0 findings

Companion to 2026-06-02-auth-audit-normalization.md. Produced by the Task 1.0 read-only exploration gate (4 parallel explorers: library surface + 3 repos). All paths verified 2026-06-02 against source.

Cutover target — ZB.MOM.WW.Auth public surface

| Package | Consumer entry points |
|---|---|
| .Abstractions | NB: IGroupRoleMapper<TRole>/GroupRoleMapping<TRole>/CanonicalRole live in namespace ZB.MOM.WW.Auth.Abstractions.Roles (verified during Task 1.1). ILdapAuthService, LdapOptions (Transport: LdapTransport{Ldaps,StartTls,None}, AllowInsecure, UserNameAttribute, GroupAttribute, ServiceAccountDn/Password, SearchBase, ConnectionTimeoutMs, ServerCertificateValidationCallback), LdapAuthResult(Succeeded,Username,DisplayName,Groups,Failure), LdapAuthFailure, CanonicalRole{Viewer,Operator,Engineer,Designer,Deployer,Administrator}, IGroupRoleMapper<TRole> (no default impl; consumer writes it) → GroupRoleMapping<TRole>(Roles, Scope:object?), plus API-key abstractions (IApiKeyVerifier, ApiKeyVerification, ApiKeyIdentity, IApiKeyStore/IApiKeyAdminStore/IApiKeyAuditStore, ApiKeyOptions{TokenPrefix,PepperSecretName,SqlitePath,RunMigrationsOnStartup}) |
| .Ldap | LdapAuthService(LdapOptions) : ILdapAuthService. Bind-then-search, fail-closed, never throws. LdapOptionsValidator (TLS-or-AllowInsecure) auto-registered. |
| .ApiKeys | ApiKeyVerifier(ApiKeyOptions, IApiKeyStore, IApiKeyPepperProvider, TimeProvider?), ApiKeyParser.TryParse (<prefix>_<keyId>_<secret>), ApiKeySecretGenerator.NewSecret(), default SQLite stores, ConfigurationApiKeyPepperProvider. Extracted from MxGateway, near-1:1 with its pipeline. |
| .AspNetCore | ZbClaimTypes{Name,Role,DisplayName,Username,ScopeId}, ZbCookieDefaults.Apply(opts, requireHttps, idleTimeout), DI: AddZbLdapAuth(services, config, sectionPath), AddZbApiKeyAuth(services, config, sectionPath). |
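
The token shape `<prefix>_<keyId>_<secret>` handled by ApiKeyParser.TryParse can be illustrated with a short sketch. It is written in Python only to keep it compact (the libraries themselves are C#), and it is not the library's code: the helper name `try_parse`, the `ParsedApiKey` type, and the rule that underscores may appear inside the secret but not inside the key id are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class ParsedApiKey(NamedTuple):
    prefix: str
    key_id: str
    secret: str


def try_parse(token: str, expected_prefix: str) -> Optional[ParsedApiKey]:
    """Split '<prefix>_<keyId>_<secret>'; return None on any malformed input."""
    # maxsplit=2 keeps any underscores inside the secret intact (assumption).
    parts = token.split("_", 2)
    if len(parts) != 3:
        return None
    prefix, key_id, secret = parts
    if prefix != expected_prefix or not key_id or not secret:
        return None
    return ParsedApiKey(prefix, key_id, secret)
```

Returning `None` instead of raising mirrors the Try-pattern in the method name: a malformed token is an expected input, and the caller turns it into a 401.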

Per-app current state (verified) and elaborated cutover

OtOpcUa — packages: Abstractions + Ldap + AspNetCore (no ApiKeys)

Current LDAP: src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs (impl), ILdapAuthService.cs, LdapOptions.cs (section Security:Ldap, UseTls bool, Enabled, DevStubMode, embedded GroupToRole dict), LdapAuthResult.cs (already carries Roles). Role mapping is config + DB: RoleMapper.Map (config GroupToRole) + RoleMapper.Merge with DB LdapGroupRoleMappingService/LdapGroupRoleMapping (system-wide rows). Native roles AdminRole{ConfigViewer,ConfigEditor,FleetAdmin} (control-plane only; data-plane is a separate NodePermissions bitmask). DI: two TryAddSingleton<ILdapAuthService,LdapAuthService> sites (Security/ServiceCollectionExtensions.cs:42 + Host/Program.cs:106). Cookie ZB.MOM.WW.OtOpcUa.Auth, single Cookie scheme (JWT inside cookie). Second LDAP consumer: OPC UA data-plane LdapOpcUaUserAuthenticator + OpcUaApplicationHost.HandleImpersonation call the LDAP service too.

  • 1.1 mapper: implement IGroupRoleMapper<AdminRole> (or <string>) wrapping RoleMapper.Map + DB Merge.
  • 1.2 Ldap: replace LdapAuthService with Auth.Ldap; restructure flow to ILdapAuthService → Groups → IGroupRoleMapper → roles → claims; preserve DevStubMode app-side (library has no stub); wire BOTH consumers (login endpoint + OPC UA impersonation).
  • 1.4 config: UseTls→Transport enum (section already Security:Ldap — see Finding #1).
  • 1.5 cookie/claims: use ZbClaimTypes + ZbCookieDefaults.Apply; keep cookie name.
  • 1.7 roles: ConfigViewer→Viewer, ConfigEditor→Designer, FleetAdmin→Administrator(+Deployer; publish⊂FleetAdmin). Data-plane NodePermissions unaffected.
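
Steps 1.1 and 1.7 for OtOpcUa can be sketched together: groups resolve to native roles from two sources (the config GroupToRole map and the DB rows), and native roles then translate to canonical ones. This is an illustrative Python sketch, not the C# implementation; the function names, the case-insensitive group comparison, and the `(group, role)` row shape are assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Native-to-canonical table taken from step 1.7.
NATIVE_TO_CANONICAL: Dict[str, List[str]] = {
    "ConfigViewer": ["Viewer"],
    "ConfigEditor": ["Designer"],
    "FleetAdmin": ["Administrator", "Deployer"],
}


def map_groups_to_native(
    groups: Iterable[str],
    config_group_to_role: Dict[str, str],
    db_rows: Iterable[Tuple[str, str]],
) -> Set[str]:
    """Union of config-derived and DB-derived roles for the user's groups."""
    wanted = {g.lower() for g in groups}  # case-insensitive match (assumption)
    native = {role for group, role in config_group_to_role.items()
              if group.lower() in wanted}
    native |= {role for group, role in db_rows if group.lower() in wanted}
    return native


def to_canonical(native_roles: Iterable[str]) -> Set[str]:
    """Translate native roles; unknown roles contribute nothing (fail-closed)."""
    canonical: Set[str] = set()
    for role in native_roles:
        canonical.update(NATIVE_TO_CANONICAL.get(role, []))
    return canonical
```

A user whose groups match neither source ends up with an empty role set, which is the zero-role outcome the login path has to treat as a denial.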

MxAccessGateway — packages: all 4 (ApiKeys source, cuts over first)

Current API keys (src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/): ApiKeyParser (mxgw_<id>_<secret>), ApiKeySecretHasher (HMAC-SHA256 + pepper MxGateway:ApiKeyPepper), ApiKeySecretGenerator, ApiKeyVerifier (FixedTimeEquals), SQLite stores, ConstraintEnforcer + rich ApiKeyConstraints, gRPC GatewayGrpcAuthorizationInterceptor + GatewayScopes. DI AddSqliteAuthStore(). → near-1:1 with Auth.ApiKeys. LDAP: Dashboard/DashboardAuthenticator.cs (MxGateway:Ldap, UseTls), GroupToRole under MxGateway:Dashboard, roles Admin/Viewer, cookie MxGatewayDashboard.

  • 1.1 mapper: IGroupRoleMapper<string> wrapping DashboardAuthenticator.MapGroupsToRoles.
  • 1.2 Ldap: replace DashboardAuthenticator's LDAP internals with Auth.Ldap (keep dashboard claims/principal build).
  • 1.3 ApiKeys: delete the local parser/hasher/generator/verifier/stores; re-point to Auth.ApiKeys; keep ConstraintEnforcer + gRPC interceptor + scopes on top (constraints carried as the opaque blob). Lowest-risk ApiKeys cutover (it's the donor).
  • 1.4 config: UseTls→Transport.
  • 1.5/1.7: ZbClaimTypes/cookie defaults; Viewer→Viewer, Admin→Administrator.

ScadaBridge — packages: all 4 (Ldap source; ApiKeys consumer)

Current LDAP (src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs): the hardened reference (RFC-4514 DN escape, filter escape, per-op timeout, fail-closed group lookup, username trim, service-account-bind distinction). Config is flat ScadaBridge:Security:Ldap* in SecurityOptions.cs with LdapTransport enum already (Ldaps/StartTls/None), AllowInsecureLdap, LdapUserIdAttribute, LdapGroupAttribute, validated by SecurityOptionsValidator : OptionsValidatorBase. Role mapping DB-backed with site-scoping: RoleMapper.MapGroupsToRolesAsync → RoleMappingResult(Roles, PermittedSiteIds, IsSystemWideDeployment) over LdapGroupMapping + SiteScopeRule (SQL Server). Roles Admin/Design/Deployment/Audit/AuditReadOnly; SoD via OperationalAudit{Admin,Audit,AuditReadOnly} + AuditExport{Admin,Audit}. Cookie ZB.MOM.WW.ScadaBridge.Auth; JWT-in-cookie via JwtTokenService. Inbound API keys (InboundAPI/ApiKeyValidator.cs): raw X-API-Key, deterministic HMAC (ApiKeyHasher, no per-row salt, by-value lookup), ApiKey{Name,KeyHash,IsEnabled} in SQL Server, per-method approval via ApiMethod.ApprovedApiKeyIds → architecturally different from the library's keyId/scope/SQLite model.

  • 1.1 mapper: IGroupRoleMapper<string> wrapping RoleMapper.MapGroupsToRolesAsync, carrying PermittedSiteIds/IsSystemWideDeployment in GroupRoleMapping.Scope.
  • 1.2 Ldap: ScadaBridge is the donor — confirm Auth.Ldap behaviour-matches, then re-point LdapAuthService usages to the library type. Lowest-risk Ldap cutover.
  • 1.3 ApiKeys: see Finding #3 — bigger than a token reformat; needs a scope decision.
  • 1.4 config: nest flat Security:Ldap* under a sub-section + rename LdapUserIdAttribute→UserNameAttribute, LdapGroupAttribute→GroupAttribute, LdapTransport→Transport (+ SecurityOptionsValidator + appsettings). Enum already matches.
  • 1.7 roles: Admin→Administrator, Design→Designer, Deployment→Deployer, Audit→Administrator (collapse), AuditReadOnly→Viewer (collapse) — removes the OperationalAudit/AuditExport SoD (accepted).
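
The ScadaBridge 1.4 change (flat Ldap* keys nested under a sub-section, with three renames) amounts to a small key transformation. The Python sketch below is illustrative only: the sub-section name "Ldap" is a placeholder because the plan does not name it here, the sample key JwtIssuer in the usage note is invented, and only the three renames listed in step 1.4 are applied.

```python
from typing import Any, Dict

# Renames listed in step 1.4; any other key is left where it is.
RENAMES: Dict[str, str] = {
    "LdapTransport": "Transport",
    "LdapUserIdAttribute": "UserNameAttribute",
    "LdapGroupAttribute": "GroupAttribute",
}


def nest_ldap_settings(flat: Dict[str, Any], section: str = "Ldap") -> Dict[str, Any]:
    """Move renamed LDAP keys under `section`; `section` is a placeholder name."""
    nested: Dict[str, Any] = {}
    remaining: Dict[str, Any] = {}
    for key, value in flat.items():
        if key in RENAMES:
            nested[RENAMES[key]] = value
        else:
            remaining[key] = value
    if nested:
        remaining[section] = nested
    return remaining
```

For example, `{"LdapTransport": "Ldaps", "JwtIssuer": "x"}` becomes `{"JwtIssuer": "x", "Ldap": {"Transport": "Ldaps"}}`. Because the enum values already match, only key names move and no value needs translating.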

Key findings that change the plan

  1. OtOpcUa LDAP section is Security:Ldap, not Authentication:Ldap. Both components/auth/GAPS.md §1 and the auth current-state doc are wrong; the code (and the prior fix in memory) use Security:Ldap. → Task 1.4 for OtOpcUa is only UseTls→Transport, not a section move.
  2. OtOpcUa "double-singleton bug" is already mitigated. Both registration sites use TryAddSingleton (dedupes); the Enabled flag is an intentional fail-closed master switch. → Not a blocking fix; verify and keep Enabled. Removes a risk the plan flagged.
  3. ScadaBridge inbound API keys are a re-architecture, not a token reformat. The library's ApiKeys model (<prefix>_<keyId>_<secret> Bearer, keyId lookup + constant-time compare, SQLite store, scopes + opaque constraints) is fundamentally different from ScadaBridge's (raw X-API-Key, deterministic by-value HMAC lookup, SQL Server ApiKey{Name,KeyHash}, per-method approval list). Wholesale adoption means re-architecting inbound-API auth AND resolving a SQLite-vs-SQL-Server storage mismatch. Needs a scope decision (Decision A).
  4. OtOpcUa role mapping is config + DB, not just config (RoleMapper.Map baseline + DB Merge). The IGroupRoleMapper impl must combine both. OtOpcUa also has DevStubMode (no library equivalent — keep app-side) and a second LDAP consumer (OPC UA data-plane impersonation) that must be re-wired too.
  5. MxGateway ApiKeys cutover is the donor path — lowest risk (delete locals, re-point to library; keep ConstraintEnforcer/gRPC/scopes on top). Confirms the GAPS sequencing (gateway first).
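
Finding 2 relies on try-add semantics: a registration is added only if the service type has none yet, so the second registration site is a no-op. A minimal Python model of that behaviour follows; the `ServiceRegistry` class is hypothetical and stands in for the .NET service collection, not for any type in these repos.

```python
from typing import Any, Dict


class ServiceRegistry:
    """Toy container that models 'add only if not already registered'."""

    def __init__(self) -> None:
        self._singletons: Dict[str, Any] = {}

    def try_add_singleton(self, service: str, implementation: Any) -> bool:
        # A second registration for the same service is silently skipped.
        if service in self._singletons:
            return False
        self._singletons[service] = implementation
        return True

    def resolve(self, service: str) -> Any:
        return self._singletons[service]
```

Whichever site runs first wins, which is why the two registration sites do not produce two competing singletons.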

Task 1.2 (LDAP cutover) — implemented + reviewed (2026-06-02)

Commits: OtOpcUa 257caa7, MxGateway c3b466e, ScadaBridge ac34dac. All targeted tests green. Security review verdict: sound, no credential-leak regression in any repo (insecure-transport guards fire correctly; DevStubMode cannot leak to prod; claim shapes preserved). All three returned CHANGES-REQUESTED for fixable issues:

  • OtOpcUa (no Critical): (I1) insecure-transport guard is login-time only — add startup validation gated on Enabled for defense-in-depth, verify prod overlays still boot; (I2) integration stub pre-populates Roles so the Groups→mapper path isn't actually exercised — fix the stub; (I3) document/test the zero-role fail-closed fallback.
  • MxGateway (2 Critical): (C1) library strips group DNs to short RDN names before the LdapGroupClaimType claim → verify prior behaviour, document, drop the now-dead full-DN branch in the mapper, add a claim-value assertion; (C2) gateway's local LdapOptions is now a shadow copy (validated but unused at runtime) → fold to the shared type or document the drift. (I1) shared LdapOptionsValidator has no Enabled=false guard → validates even when LDAP is disabled (real for MxGateway, which can disable dashboard LDAP).
  • ScadaBridge (2 Critical): (C1) ConfigSecretsTests still checks the OLD flat key → passes vacuously, no longer guards secret-in-config — repoint to nested key; (C2) production-checklist.md still lists deleted flat keys → update; (I) unsafe (RoleMappingResult)Scope! cast → null-guard.

Cross-cutting decision — shared library LdapOptionsValidator Enabled guard: the validator runs regardless of Enabled, requiring Server/SearchBase/ServiceAccountDn even when LDAP is off. Correct fix = add an if (!Enabled) return Success guard to the shared validator and republish 0.1.1, re-pinning all consumers. (Alternative: each consumer always supplies those fields. The library fix is the principled one.)
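
The proposed guard can be sketched as follows. This Python model is an illustration under stated assumptions, not the shared validator: the `LdapSettings` dataclass, its field names, and the error strings are invented here, while the rules (required Server/SearchBase/ServiceAccountDn, TLS-or-AllowInsecure, skip everything when disabled) come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LdapSettings:
    enabled: bool = True
    server: str = ""
    search_base: str = ""
    service_account_dn: str = ""
    transport: str = "Ldaps"  # one of "Ldaps", "StartTls", "None"
    allow_insecure: bool = False


def validate(settings: LdapSettings) -> List[str]:
    """Return a list of validation errors; an empty list means success."""
    # The guard: a disabled LDAP section is not validated at all.
    if not settings.enabled:
        return []
    errors: List[str] = []
    for name in ("server", "search_base", "service_account_dn"):
        if not getattr(settings, name):
            errors.append(f"{name} is required")
    # TLS-or-AllowInsecure: plaintext transport needs an explicit opt-in.
    if settings.transport == "None" and not settings.allow_insecure:
        errors.append("transport None requires allow_insecure")
    return errors
```

Placing the guard first means a consumer that disables LDAP does not have to supply placeholder values just to pass startup validation.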

Task 1.2/1.4 — DONE (reviewed + fixed, 2026-06-02)

Library hardened to 0.1.1 (LdapOptionsValidator skips when Enabled=false), republished, re-pinned in all 3 repos. Fix commits: OtOpcUa c4f315e (startup insecure-transport guard gated on Enabled/DevStub + Transport: Ldaps declared in the 3 prod overlays + test fidelity), MxGateway f4dc11b (group-claim shape documented as non-breaking — claim read nowhere in prod; shadow LdapOptions kept with a drift-warning doc), ScadaBridge 4db8c37 (secret-test repointed to nested key, prod checklist updated, Scope cast guarded). All targeted suites green. 1.2 (LDAP) + 1.4 (config) complete across all 3 repos.

Remaining Phase 1: 1.3 ApiKeys (MxGateway donor cutover — low risk; ScadaBridge full re-architecture — largest single item: SQLite store + Bearer format + scopes + key re-issuance), 1.5 claims/cookies, 1.6 dev base DN, 1.7 canonical roles.

Task 1.3 ApiKeys — MxGateway DONE; ScadaBridge pending (2026-06-02)

Library bumped to 0.1.2: Auth.ApiKeys SQLite migrator now stamps schema version 2 (was 1) to match the donor gateway's deployed gateway-auth.db — without it the gateway would fail to boot (migrator threw on a newer on-disk version). Final schema byte-identical since v1; no key re-issuance. Republished, re-pinned in MxGateway. (+2 migrator tests.)
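
The boot failure described here is the usual "on-disk schema is newer than the code" check. The Python sketch below shows the mechanism under assumptions: SQLite's `PRAGMA user_version` stands in for however the library actually records its schema version, and the table definition is a placeholder.

```python
import sqlite3

CURRENT_VERSION = 2  # the value the 0.1.2 migrator stamps, per the text


class SchemaTooNewError(RuntimeError):
    pass


def migrate(conn: sqlite3.Connection, current_version: int = CURRENT_VERSION) -> int:
    """Bring the database up to current_version, refusing newer databases."""
    on_disk = conn.execute("PRAGMA user_version").fetchone()[0]
    if on_disk > current_version:
        # A migrator that only knows v1 would stop here on a v2 database.
        raise SchemaTooNewError(f"on-disk version {on_disk} > {current_version}")
    if on_disk < current_version:
        # Placeholder schema; the real tables are not described in this plan.
        conn.execute("CREATE TABLE IF NOT EXISTS api_keys (key_id TEXT PRIMARY KEY)")
        conn.execute(f"PRAGMA user_version = {int(current_version)}")
    return current_version
```

With the donor database already stamped 2, a migrator whose current version is 1 raises, while one whose current version is 2 finds nothing to do and leaves existing keys untouched.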

MxGateway 1.3 — DONE + APPROVED (commit 05009d7): deleted 28 local pipeline files, adopted Auth.ApiKeys 0.1.2 via AddZbApiKeyAuth; kept ConstraintEnforcer/gRPC interceptor/scopes/CLI/dashboard on top via a GatewayApiKeyIdentityMapper (library identity → gateway identity-with-EffectiveConstraints). Review: no Critical; no auth bypass, schema compat + crypto parity + gRPC status mapping verified. Non-blocking follow-ups: (a) dashboard mutations now write two audit rows (library + dashboard-*) — fine, note for Phase 2 audit bridging; (b) nit: GatewayApiKeyIdentityMapper uses Constraints as string (opaque coupling) — consider a guard/contract test.

ScadaBridge 1.3 — PENDING: the full inbound-API re-architecture (SQL Server → SQLite store, X-API-Key → Bearer, per-method-approval → scopes/constraints, all inbound keys re-issued). Largest/highest-risk single item in the program; warrants its own focused pass (likely decomposed).

ScadaBridge ApiKeys re-architecture — spec (FULL ADOPT, 2026-06-02)

Decision: full adopt the library SQLite store + scopes model. Single consistent contract all layers build to:

  • Token format: Authorization: Bearer sbk_<keyId>_<secret> (prefix sbk). Replaces the raw X-API-Key header.
  • Scope model = method name. A key's Scopes set = the API-method names it may call. ApiMethod.ApprovedApiKeyIds (CSV of key int IDs) is retired; per-method approval moves to the key's scopes. Auth check at the endpoint: identity.Scopes.Contains(methodName).
  • Storage: inbound keys move to the library's SQLite store (new ScadaBridge:InboundApi:ApiKeyStore sqlite path + pepper via ApiKeyOptions.PepperSecretName, RunMigrationsOnStartup). The SQL Server ApiKey entity is retired; ApiMethod is KEPT minus ApprovedApiKeyIds (EF migration drops the column). InboundApiRepository loses its ApiKey methods + GetApprovedKeysForMethodAsync.
  • Auth path (InboundAPI): endpoint reads Bearer, calls library IApiKeyVerifier.VerifyAsync, then the scope check. PRESERVE the security invariants: 401 (missing/invalid/disabled), 403 identical message for both "method not found" and "not in scope" (enumeration-safety, InboundAPI-011), constant-time compare (library does it), active-node 503 + body-cap 413 filters unchanged, audit actor = key DisplayName. Delete ApiKeyValidator hashing + ApiKeyHasher.
  • Management (ManagementActor + CLI security api-key + Commons messages): drive the library IApiKeyAdminStore + ApiKeySecretGenerator. create returns sbk_<keyId>_<secret> once (plaintext-once preserved); methods a key may call = its scopes, set on create/update (e.g. --methods a,b or grant/revoke-method commands). list returns id/name/enabled (no secret), update --enabled, delete/revoke. Audit preserved.
  • CentralUI: ApiKeys.razor (list/create/toggle/delete via admin store; show token once), ApiKeyForm.razor (edit the key's method-scopes), ApiMethodForm.razor (method-side "approved keys" now reads/writes key scopes across keys).
  • Breaking change: all inbound keys re-issued (new format); clients switch X-API-Key → Authorization: Bearer. Needs a runbook + CHANGELOG. Re-pin ScadaBridge Auth packages to 0.1.2.
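
The auth-path contract in this spec (Bearer token, verifier, then scope check, with one indistinguishable 403) can be sketched as a pure function. This Python version is illustrative only: `authorize`, the `verify` callback shape, and the response bodies are placeholders, not the endpoint's real code or its pinned literal.

```python
from typing import Callable, Optional, Set, Tuple

FORBIDDEN_BODY = "Forbidden"  # placeholder; the real literal is pinned in tests


def authorize(
    authorization_header: Optional[str],
    method_name: str,
    verify: Callable[[str], Optional[Set[str]]],
    known_methods: Set[str],
) -> Tuple[int, str]:
    """Return (status, body). `verify` yields the key's scopes, or None if invalid."""
    if not authorization_header or not authorization_header.startswith("Bearer "):
        return 401, "Unauthorized"
    scopes = verify(authorization_header[len("Bearer "):].strip())
    if scopes is None:  # unknown, malformed, or disabled key
        return 401, "Unauthorized"
    # Scope is checked first; `or` short-circuits, so an out-of-scope caller
    # never triggers the method lookup. Both failures share one response.
    if method_name not in scopes or method_name not in known_methods:
        return 403, FORBIDDEN_BODY
    return 200, "OK"
```

Because "method not found" and "not in scope" return the same status and body, a caller cannot use the endpoint to enumerate which method names exist.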

Sub-tasks (sequential where files overlap): (A) storage retire + EF migration + library wiring/options; (B) auth-path rewrite (Bearer + verifier + scope check); (C) management (ManagementActor + CLI + messages); (D) CentralUI pages; (E) runbook/CHANGELOG + integration test sweep. A→(B,C)→D→E. Sequencing note: doing it additively (add library path, switch auth, rewire mgmt/UI, retire SQL Server entity LAST) keeps the build green at each step.

Re-arch progress

  • A+B foundation — DONE + reviewed+fixed (commits a94558c, 1fcc4f5; re-pinned to 0.1.2). Library AddZbApiKeyAuth wired additively (ScadaBridge:InboundApi:ApiKeyStore, prefix sbk, reuses inbound pepper); inbound endpoint now uses the library verifier + Bearer + Scopes.Contains(methodName). Security invariants preserved: 401 generic / 403 identical body for not-found AND not-in-scope (enumeration-safe, pinned to a literal in tests), scope-check-before-DB (no timing oracle), fail-fast pepper preflight (Central), audit actor = DisplayName. Old SQL Server path still compiles (retired in E). 163/163 InboundAPI tests green. NOTE for E: the library's ApiKeySecretGenerator.NewSecret() is internal — seed/create keys via the public ApiKeyAdminCommands.CreateKeyAsync seam (returns the assembled sbk_… token).
  • Library 0.1.3 — DONE + reviewed + PUBLISHED (scadaproj commits 468959c impl, 290e85c tests; pushed to Gitea, ApiKeys 0.1.3 nupkg verified HTTP 200). Added IApiKeyAdminStore.SetScopesAsync(keyId, scopes, ct) + SetEnabledAsync(keyId, enabled, whenUtc, ct) (+ audited facade verbs ApiKeyAdminCommands.SetScopesAsync/SetEnabledAsync → eventTypes set-scopes/enable-key/disable-key). No schema change (CurrentVersion stays 2): scopes column already exists; revoked_utc doubles as the enabled flag (null = enabled), so enable/disable is a reversible toggle that preserves the secret (proven by test asserting SecretHash.SequenceEqual + unchanged last_used_utc). This is what lets C/D edit a key's method-scopes and toggle enabled WITHOUT re-issuing the token. ScadaBridge must re-pin Auth packages 0.1.2 → 0.1.3.
  • C (management), D (CentralUI), E (retire SQL Server ApiKey + ApiMethod.ApprovedApiKeyIds migration + runbook/CHANGELOG) — IN PROGRESS. Mapping: CreateApiKeyCommand → CreateKeyAsync (keyId = Guid.NewGuid().ToString("N"), DisplayName = name, scopes = --methods); ListApiKeysCommand → ListKeysAsync (enabled = RevokedUtc is null); UpdateApiKeyCommand(IsEnabled) → SetEnabledAsync; new set-scopes path → SetScopesAsync; DeleteApiKeyCommand → revoke-then-DeleteKeyAsync. All management message keys switch int ApiKeyId → string KeyId.
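
The "revoked_utc doubles as the enabled flag" design from the 0.1.3 entry can be modelled with a few lines of SQL. The Python sketch below is an assumption-laden illustration: the table name `api_keys`, the `secret_hash` column name, and the helper signatures are invented; only the revoked_utc and scopes columns are named in the text.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE api_keys ("
        " key_id TEXT PRIMARY KEY,"
        " secret_hash BLOB NOT NULL,"
        " scopes TEXT NOT NULL,"
        " revoked_utc TEXT NULL)"  # NULL means the key is enabled
    )
    return conn


def set_enabled(conn: sqlite3.Connection, key_id: str, enabled: bool, when_utc: str) -> bool:
    """Toggle the key by writing only revoked_utc; False if the key is unknown."""
    revoked = None if enabled else when_utc
    cur = conn.execute(
        "UPDATE api_keys SET revoked_utc = ? WHERE key_id = ?", (revoked, key_id)
    )
    return cur.rowcount == 1


def is_enabled(conn: sqlite3.Connection, key_id: str) -> bool:
    row = conn.execute(
        "SELECT revoked_utc FROM api_keys WHERE key_id = ?", (key_id,)
    ).fetchone()
    return row is not None and row[0] is None
```

Since the update touches a single column, the stored secret hash survives a disable/enable round trip, which is what allows toggling without re-issuing the token.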

Discovered architecture (CentralUI Explore, 2026-06-02) — expands C/D/E

Two facts the original A-E spec missed:

  1. CentralUI bypasses the ManagementActor. Components/Pages/Admin/ApiKeys.razor, ApiKeyForm.razor, and Components/Pages/Design/ApiMethodForm.razor call IInboundApiRepository (SQL Server EF) directly — they do NOT send the CreateApiKeyCommand/etc. management messages. So there are two management entry points to rewire (CLI→ManagementActor uses the messages; CentralUI→repository uses the entities). Decoupling: introduce one app-side IInboundApiKeyAdmin seam over the library ApiKeyAdminCommands, and route BOTH CLI and CentralUI through it (DRY + single audit path). The message-contract change (int→string) touches only CLI+ManagementActor; the entity/repository change (ApiKey.Id, ApiMethod.ApprovedApiKeyIds) touches CentralUI + TransportExport.
  2. TransportExport couples API keys + methods into config export/import (Components/Pages/Design/TransportExport.razor + .razor.cs, HashSet<int> selections, ExportSelection). With keys now in the library SQLite store (per-env pepper, secret-once), a key can't be exported/re-imported usefully. Decision (user, 2026-06-02): EXCLUDE inbound API keys from transport — export API methods only; keys are re-created + method-scopes re-granted per environment.

CentralUI blast radius (string keyId + scopes replace int Id + ApprovedApiKeyIds CSV): Admin/ApiKeys.razor, Admin/ApiKeyForm.razor, Design/ApiMethodForm.razor (approved-keys ↔ key-scopes), Design/TransportExport.razor(.cs), Design/ExternalSystems.razor (uses method int id — methods STAY int in SQL Server, so unaffected for keys), Dashboard.razor (key count), test Admin/ApiKeyFormAuditDrillinTests.cs.

C/D/E decomposition — 5 reviewed green sub-commits (user: "coordinated multi-commit now", 2026-06-02)

  • C1 — re-pin ScadaBridge Auth 0.1.2→0.1.3; add app-side IInboundApiKeyAdmin seam (string-keyId model: Create(name,methods)→(keyId,token) / List / SetEnabled / SetMethods / Delete[=revoke+delete] / GetMethodsForKey / GetKeysForMethod) over the library facade; register ApiKeyAdminCommands + the seam in Host and CentralUI DI; seam unit tests. Purely additive — build green.
  • C2 — Commons Messages/Management/SecurityCommands.cs contracts int→string keyId + add Methods + new SetApiKeyMethodsCommand; rewire ManagementActor handlers + CLI security api-key onto the seam; update ManagementActor tests. (CentralUI unaffected — it doesn't use these messages.)
  • C3 — CentralUI ApiKeys.razor/ApiKeyForm.razor/ApiMethodForm.razor (+ Dashboard count) off IInboundApiRepository-for-keys onto the seam; string keyId; method-scope editing replaces ApprovedApiKeyIds; update bUnit test. (Methods stay in SQL Server; just stop using the ApprovedApiKeyIds column — dropped in C5.)
  • C4 — TransportExport: remove API-key selection/export (methods-only); drop key HashSet<int> + ExportSelection keys; tests.
  • C5 (=E) — retire SQL Server ApiKey entity + DbContext reg + IInboundApiRepository key methods + GetApprovedKeysForMethodAsync; drop ApiMethod.ApprovedApiKeyIds; EF migration (drop ApiKeys table + column); delete residual ApiKeyValidator/ApiKeyHasher; runbook + CHANGELOG (breaking: re-issue keys, X-API-Key → Authorization: Bearer); full build+test sweep.
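
The shape of the C1 seam can be sketched with an in-memory stand-in. This Python class is hypothetical and is not the IInboundApiKeyAdmin implementation: the method names are snake_case renderings of the operations listed in C1, secret generation uses Python's `secrets` module instead of the library generator, and rejecting an empty method set inside the seam is an assumption made for brevity.

```python
import secrets
import uuid
from typing import Dict, Iterable, List, Set, Tuple


class InMemoryInboundApiKeyAdmin:
    """Hypothetical in-memory model of the string-keyId admin seam."""

    def __init__(self) -> None:
        self._keys: Dict[str, dict] = {}

    def create(self, name: str, methods: Iterable[str]) -> Tuple[str, str]:
        scopes = set(methods)
        if not scopes:
            raise ValueError("a key needs at least one method")
        key_id = uuid.uuid4().hex  # 32 hex chars, no underscores
        token = f"sbk_{key_id}_{secrets.token_hex(32)}"  # returned once, not kept
        self._keys[key_id] = {"name": name, "enabled": True, "methods": scopes}
        return key_id, token

    def list_keys(self) -> List[Tuple[str, str, bool]]:
        return [(k, v["name"], v["enabled"]) for k, v in self._keys.items()]

    def set_enabled(self, key_id: str, enabled: bool) -> bool:
        if key_id not in self._keys:
            return False
        self._keys[key_id]["enabled"] = enabled
        return True

    def set_methods(self, key_id: str, methods: Iterable[str]) -> bool:
        scopes = set(methods)
        if not scopes:
            raise ValueError("a key needs at least one method")
        if key_id not in self._keys:
            return False
        self._keys[key_id]["methods"] = scopes
        return True

    def delete(self, key_id: str) -> bool:
        return self._keys.pop(key_id, None) is not None

    def get_methods_for_key(self, key_id: str) -> Set[str]:
        return set(self._keys[key_id]["methods"]) if key_id in self._keys else set()

    def get_keys_for_method(self, method: str) -> List[str]:
        # Linear scan over all keys; acceptable for small key counts.
        return [k for k, v in self._keys.items() if method in v["methods"]]
```

Routing both the CLI and CentralUI through one object like this is what gives the two entry points a single behaviour and a single audit path.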

Re-arch sub-commit progress (2026-06-02)

  • C1 — DONE + reviewed (ScadaBridge commits d09def2 seam+re-pin-0.1.3, 7f7ea3f review polish). IInboundApiKeyAdmin seam (interface in Commons, LibraryInboundApiKeyAdmin impl in the Security project over ApiKeyAdminCommands), DI in Host (CentralUI shares that container). Spec PASS + code-review APPROVED (guard name, doc throws/O(n) contract). Two pre-existing Host.Tests reds from the prior session's Auth work (uncaught because Host.Tests weren't run) fixed as part of restoring a green baseline: (a) 7e25efa — A+B's Central pepper preflight (1fcc4f5) needs a ≥16-char test ApiKeyPepper; supplied via env vars in the Central test fixtures (test-only) + 3 guard tests; Host.Tests 86 fail → 1. (b) 55099b1 — LDAP cutover (ac34dac) made component-lib AddSecurity(IConfiguration) violate ScadaBridge's OptionsTests arch rule; moved AddZbLdapAuth to the Host composition root, dropped the param (behaviour-preserving); Host.Tests 1 fail → 0. Green baseline now: build 0/0, Host.Tests 228, Security.Tests 89, InboundAPI 163, CentralUI 584. NOTE for Phase 2: AuditLog.AddAuditLog(IConfiguration) also takes IConfiguration but is intentionally NOT in the OptionsTests scanned set — revisit during audit adoption (Task 2.5), don't silently "fix".
  • C2 — DONE + reviewed (SB commits 6518e93 rewire, 8219b8e review fixes). Commons messages int→string keyId + Methods + new SetApiKeyMethodsCommand; ManagementActor's 5 API-key handlers + CLI security api-key now drive IInboundApiKeyAdmin; ScadaBridge management audit preserved (actor = user.Username; secret/token never audited/logged). Spec PASS, code-review APPROVED after fixes: not-found now throws ManagementCommandException BEFORE audit (no spurious audit on no-op update/delete/set-methods); empty Methods rejected server-side (prevents unusable key on create + stealth-disable via set-methods ""); token advisory→stderr. Green: ManagementService 125, CLI 188, + Security/InboundAPI/Host/CentralUI unchanged. CentralUI + SQL Server ApiKey entity/repo untouched (C3/C5).
  • C3 — DONE + reviewed (SB commits 107e524 rewire, d1191fd review fixes). CentralUI Admin/ApiKeys.razor, Admin/ApiKeyForm.razor, Design/ApiMethodForm.razor, Dashboard.razor onto IInboundApiKeyAdmin: string keyId, method-NAME scopes replace the ApprovedApiKeyIds CSV, one-time token display on create, key Name fixed-after-create (no rename in the lib model). The "approved keys ↔ key scopes" inversion is a pure tested helper CentralUI/Services/ApiMethodKeyScopeReconciler.cs (save method entity first, then reconcile each affected key's full scope set fresh; empty-last-scope revoke is blocked with a clear message, never pushes an empty set). Spec PASS, code-review APPROVED after fixes: seam bool not-found now surfaced (no silent success), partial-reconcile-failure guidance ("method saved, key scopes partially applied — review on API Keys page"), create validation order, concurrent-edit reconciler test. CentralUI.Tests 595 green; all other suites unchanged. TransportExport + SQL Server entities/repo untouched (C4/C5). (Also removed a stray Name artifact file from an accidental redirect — not committed.)
  • C4 — DONE + reviewed (SB commits 731cfd3 rewire, b13d7b3 review polish). TransportExport excludes inbound API keys (methods-only) end-to-end — UI selection, ExportSelection, DependencyResolver, EntitySerializer/DTOs, BundleExporter, manifest/summary, CLI --api-keys, ManagementActor HandleExportBundle, and the IMPORT path (BundleImporter/ArtifactDiff: no key creation; method overwrite PRESERVES the destination's existing ApprovedApiKeyIds, doesn't clobber). Method export drops ApprovedApiKeyIds. Backward-compat: legacy bundles with an apiKeys section still deserialize (tolerant ApiKeys? field via shared BundleJsonOptions + WhenWritingNull) and are IGNORED on import with an ImportResult.ApiKeysIgnored count + audit stamp; new exports omit the field. UI info note added. Spec PASS, code-review APPROVED (note: review I-1 "added-unrestricted count" intentionally SKIPPED — wrong model: inbound auth is scope-based, the verifier ignores ApprovedApiKeyIds, so a new method is callable by NO key until a scope is granted). Transport.Tests 60, IntegrationTests 34 green. SQL Server ApiKey/ApiMethod entities + repo untouched (C5).
  • C5 (=E) — DONE + reviewed (SB commit afa5598). Retired SQL Server ApiKey entity + 7 IInboundApiRepository key methods + ApiMethod.ApprovedApiKeyIds + DbSet<ApiKey>/fluent config + residual ApiKeyHasher/IApiKeyHasher/ApiKeyValidator (+ their tests). EF migration RetireInboundApiKeyStore (DropTable ApiKeys + DropColumn ApprovedApiKeyIds; Down recreates both byte-faithfully; ModelSnapshot consistent). CHANGELOG.md + tracked runbook docs/operations/inbound-api-key-reissue.md (BREAKING: X-API-Key → Authorization: Bearer sbk_…, all keys re-issued; per-env SqlitePath + ≥16-char ApiKeyPepper). Spec PASS, code-review APPROVED: migration Down/snapshot verified, inbound verifier path (A+B) intact, no live consumer broke. Green: ConfigurationDatabase 241, InboundAPI 148 (was 163: removed validator/hasher tests), Security 89, Host 227 (was 228: removed validator DI test), ManagementService 125, CLI 188, CentralUI 595, Transport 60+34. (Pre-existing infra-dependent failures — IntegrationTests ×11, AuditLog ×1, needing live LDAP/SQL/SMTP — proven identical at baseline b13d7b3 via git-stash; StaleTagMonitor flaky timer tests pass 13/13 isolated.) Installer/secret note: the C5 code-review flagged the (untracked, intentionally .gitignored /deploy/) install.ps1 not injecting the pepper — fixed ON DISK (the on-disk installer now takes -ApiKeyPepper); a subagent had force-committed the ignored deploy script (which embeds a real default JWT key) — that commit was RESET (git reset --mixed), keeping the edit on disk and the secret OUT of git history (branch was never pushed). The pepper requirement is documented in the tracked runbook.
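
The "approved keys ↔ key scopes" inversion described for C3 can be sketched as a pure function. This is a hedged Python illustration, not ApiMethodKeyScopeReconciler.cs: the function name, argument shapes, and return type are assumptions; the behaviour (compute each affected key's full scope set, and block a revoke that would leave a key with no scopes) follows the C3 description.

```python
from typing import Dict, List, Set, Tuple


def reconcile(
    method_name: str,
    desired_key_ids: Set[str],
    current_scopes: Dict[str, Set[str]],
) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Return (full new scope set per changed key, keys whose revoke was blocked)."""
    updates: Dict[str, Set[str]] = {}
    blocked: List[str] = []
    for key_id in sorted(current_scopes):
        scopes = current_scopes[key_id]
        has_scope = method_name in scopes
        wanted = key_id in desired_key_ids
        if wanted and not has_scope:
            updates[key_id] = scopes | {method_name}
        elif has_scope and not wanted:
            remaining = scopes - {method_name}
            if remaining:
                updates[key_id] = remaining
            else:
                # Removing the last scope would leave an unusable key: refuse.
                blocked.append(key_id)
    return updates, blocked
```

Keeping this logic free of I/O is what makes it testable in isolation; the caller saves the method first and then pushes each entry of `updates` through the admin seam.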

Task 1.3 (Adopt ZB.MOM.WW.Auth.ApiKeys) COMPLETE across all repos

MxGateway donor cutover + ScadaBridge full re-architecture (C1 seam → C2 mgmt/CLI → C3 CentralUI → C4 TransportExport → C5 retire+migration+runbook), all reviewed, lib at 0.1.3. ScadaBridge inbound API is now 100% on the shared library (Bearer sbk_<keyId>_<secret>, scope = method name, per-key SQLite store + per-env pepper); the SQL Server key model is fully retired. Remaining Phase 1: 1.5 (AspNetCore claims/cookies, 3 UIs), 1.6 (dev GLAuth base DN), 1.7 (canonical roles, 3 repos). Then Phase 2 (audit) + Phase 3 (Actor wiring).

Resolved decisions (2026-06-02)

  • Decision A — ScadaBridge inbound API keys depth → (a) FULL ADOPT. Re-architect inbound-API auth to the library's model: <prefix>_<keyId>_<secret> Bearer token format, keyId lookup + constant-time compare, scopes/constraints, and move inbound API keys into the library's SQLite store (separate from the SQL Server config DB). This is the largest, highest-risk item in Phase 1. Implications to handle in Task 1.3:
    • New SQLite auth DB for ScadaBridge inbound keys (path via ApiKeyOptions.SqlitePath); migrate/retire the SQL Server ApiKey{Name,KeyHash} table + ApiMethod.ApprovedApiKeyIds linkage.
    • Re-model per-method approval as the library's scopes/constraints (or the opaque constraint blob) — the ApiMethod.ApprovedApiKeyIds set becomes per-key scope grants.
    • Switch the inbound transport from X-API-Key header to Authorization: Bearer <token> (a client-visible contract change — extends the already-accepted token-format change; needs the interop check + a doc/CHANGELOG note).
    • Existing raw keys cannot be migrated (deterministic-by-value hash, no keyId/secret split) → re-issue all inbound API keys; call this out in the cutover runbook.
  • Decision B — canonical role mappings → confirmed as tabled above (OtOpcUa ConfigViewer→Viewer, ConfigEditor→Designer, FleetAdmin→Administrator+Deployer; MxGateway Viewer/Admin; ScadaBridge Admin→Administrator, Design→Designer, Deployment→Deployer, Audit→Administrator, AuditReadOnly→Viewer).
  • Decision C — dev escape hatches → keep app-side, unchanged. OtOpcUa DevStubMode and MxGateway AllowAnonymousLocalhost/loopback bypass have no library equivalent; preserve them in each app outside the shared Auth.Ldap path.

Phase 1 tail — decisions + current state (2026-06-02, resumed)

Task 1.0 gate read-only re-exploration confirmed the post-cutover state for 1.5/1.6/1.7 (3 parallel Explore agents):

  • None of the 3 repos reference ZbClaimTypes/ZbCookieDefaults yet. ZbClaimTypes.Name/Role alias the framework URIs (ClaimTypes.Name/.Role); DisplayName/Username/ScopeId = new zb:-prefixed strings.
  • Claim mints today: OtOpcUa AuthEndpoints.cs uses ClaimTypes.NameIdentifier + JwtTokenService.{Username,DisplayName}ClaimType ("Username"/"DisplayName") + ClaimTypes.Role (JWT-in-cookie). MxGateway DashboardAuthenticator.CreatePrincipal uses ClaimTypes.{NameIdentifier,Name,Role} + custom mxgateway:ldap_group. ScadaBridge CentralUI/Auth/AuthEndpoints.cs + JwtTokenService use plain "DisplayName"/"Username"/"Role"/"SiteId"/"LastActivity" strings — "Role"/"SiteId" are load-bearing in TokenValidationParameters + every AuthorizationPolicies RequireClaim.
  • Cookie names confirmed: ZB.MOM.WW.OtOpcUa.Auth / MxGatewayDashboard / ZB.MOM.WW.ScadaBridge.Auth. All three apps already do HttpOnly+SameSite=Strict+sliding+SecurePolicy via hand-rolled PostConfigure (no ZbCookieDefaults.Apply).
  • Dev base DNs today: OtOpcUa + MxGateway = dc=lmxopcua,dc=local; ScadaBridge = dc=scadabridge,dc=local.
  • CanonicalRole is referenced nowhere in any repo yet (Task 1.7 is its first use).

Decision A3 (Task 1.6 dev base DN) → dc=zb,dc=local (product-neutral, matches the ZB.MOM.WW family; all 3 dev fixtures + dev appsettings move to it — prod directories untouched). ScadaBridge GLAuth user DNs become cn=<user>,ou=<group>,ou=users,dc=zb,dc=local; OtOpcUa/MxGateway move off dc=lmxopcua,dc=local.

Decision (Task 1.5 ScadaBridge depth) → FULL canonical incl. role/scope. Migrate ScadaBridge's role claim to the framework URI (ZbClaimTypes.Role) and the site claim to ZbClaimTypes.ScopeId across cookie + JWT mint + TokenValidationParameters + every policy RequireClaim + tests (cleanest: redefine the JwtTokenService.*ClaimType constants to alias ZbClaimTypes.* so all existing references inherit canonical values). Treated as high-risk for the ScadaBridge slice (serial spec→code review, full ScadaBridge suite). OtOpcUa/MxGateway slices stay standard.

Task 1.5 (AspNetCore claims/cookies) COMPLETE across all 3 repos (reviewed)

  • OtOpcUa 83856b7 + review-fix d0777ee (spec + code reviewed): .Security adds the Auth.AspNetCore pkg ref; JwtTokenService.{Username,DisplayName}ClaimType alias ZbClaimTypes.{Username,DisplayName}; cookie principal emits ZbClaimTypes.Name (replaced NameIdentifier; grep-confirmed no other reader) + ZbClaimTypes.Role; cookie via ZbCookieDefaults.Apply, name kept. Issued JWT is documented as issue-only (no AddJwtBearer in OtOpcUa; role stays short "Role"; BuildValidationParameters pins RoleClaimType/NameClaimType for forward-compat). 35/35.
  • MxGateway 7e1af37 (spec + code reviewed): DashboardAuthenticator emits ZbClaimTypes.{Username,DisplayName} + identity nameType/roleType=ZbClaimTypes.{Name,Role}; keeps mxgateway:ldap_group + NameIdentifier (HubTokenService reads it); cookie via ZbCookieDefaults.Apply(requireHttps:true, idleTimeout:8h) (8h preserved), RequireHttpsCookie=false dev-HTTP override kept, name kept. Dashboard 85/85; full 575/578 (3 pre-existing FakeWorker reds).
  • ScadaBridge a0938f7 + spelling-fix c185a56 (high-risk; spec + code reviewed): JwtTokenService.*ClaimType constants aliased to ZbClaimTypes.* (RoleClaimType=framework URI, SiteIdClaimType=ScopeId); JWT mint MapInboundClaims=false+OutboundClaimTypeMap.Clear() (instance-isolated, reviewer-verified) and validate MapInboundClaims=false+pinned RoleClaimType/NameClaimType → byte-symmetric round-trip (claim types come back exactly as minted); cookie identity roleType=RoleClaimType; every site-scope read on SiteIdClaimType; cookie via ZbCookieDefaults.Apply (30-min idle), name kept. No AddJwtBearer middleware (sole JWT path = JwtTokenService.ValidateToken). Role VALUES unchanged. Security 93/93, CentralUI 595/595, ManagementService 125/125, Host 227/227; infra reds (Integration ×11, AuditLog ×1, flaky StaleTagMonitor) confirmed pre-existing by stash-at-HEAD. Minor (deferred): a stale "PostConfigure" comment word; JWT-validated principals have null Identity.Name (no regression, no bearer path).
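
The ScadaBridge mint/validate symmetry above can be illustrated with a toy model. This is a sketch only, not the real JWT handler: the single map entry imitates the familiar short-name/long-URI claim-type mapping, and the assumption that the pinned role claim type equals that long URI is ours (the source only says "framework URI"). All function names are hypothetical.

```python
# Toy model of claim-type mapping on JWT mint/validate (NOT the real handler).
# Assumption: the pinned role claim type is the long role URI below.
LONG_ROLE = "http://schemas.microsoft.com/ws/2008/06/identity/claims/role"
INBOUND_MAP = {"role": LONG_ROLE}                       # short -> long on read
OUTBOUND_MAP = {v: k for k, v in INBOUND_MAP.items()}   # long -> short on write


def mint(claims, map_outbound):
    """Serialize (type, value) claims, optionally shortening claim types."""
    return [(OUTBOUND_MAP.get(t, t) if map_outbound else t, v) for t, v in claims]


def validate(token_claims, map_inbound):
    """Read claims back, optionally expanding short claim types."""
    return [(INBOUND_MAP.get(t, t) if map_inbound else t, v) for t, v in token_claims]


def has_role(claims, role_claim_type, role):
    """Role check against one pinned claim type, like a pinned RoleClaimType."""
    return (role_claim_type, role) in claims


issued = [(LONG_ROLE, "Administrator")]

# Mapping disabled on both sides: claim types come back exactly as issued.
symmetric = validate(mint(issued, map_outbound=False), map_inbound=False)

# Mapping left on for mint only: the token carries "role", so a check pinned
# to the long URI no longer finds the role after validation.
skewed = validate(mint(issued, map_outbound=True), map_inbound=False)
```

In this model the only safe configurations are the ones where both sides agree, which is why the change disables mapping on mint and on validate together and then pins the role/name claim types explicitly.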

Task 1.6 (unify dev LDAP base DN → dc=zb,dc=local) COMPLETE across all 3 repos (reviewed, code-review-only per small class)

Mechanical, grep-verified substitution of each repo's dev directory base DN to the neutral dc=zb,dc=local; prod left untouched (no in-repo prod overlay carries the dev DN; /deploy is gitignored and was not touched). OU structure preserved throughout.

  • OtOpcUa 8ba289f: LdapOptions.SearchBase default, integration docker-compose.yml LDAP_ROOT + TwoNodeClusterHarness SearchBase/ServiceAccountDn, AclEdit.razor placeholder, docs/v2/{dev-environment,phase-7-e2e-smoke}. grep dc=lmxopcua→empty. Security 35, AdminUI 121, ControlPlane 29, Runtime 74 green.
  • MxGateway 9572045: LdapOptions defaults, appsettings.json, dashboard test group-DNs, glauth.md (dev DNs only — the DC=corp,… prod-example column left intact), CLAUDE.md index line. grep dc=lmxopcua→empty. 575/578 (3 pre-existing FakeWorker).
  • ScadaBridge 6ae6051 (14 files): app appsettings.Central.json, the 4 docker/docker-env2 central-node configs, infra/glauth/config.toml baseDN, infra/tools/ldap_tool.py, 4 test fixtures, docs/test_infra/*. Cluster nodes use the shared scadabridge-ldap container backed by the now-updated infra/glauth/config.toml (no separate seed). grep dc=scadabridge→only the 2 excluded historical docs/plans/* records + synthetic dc=example left. Full non-infra suite green (Security 93, CentralUI 595, ManagementService 125, Host 227, ConfigurationDatabase 241).
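
The substitution plus grep check described above amounts to a suffix rewrite followed by a residue scan. A minimal sketch, assuming DNs are plain comma-separated strings; the helper names, the user `alice`, and the `dc=example,dc=org` DN are hypothetical and not taken from any of the repos.

```python
def rebase_dn(dn: str, old_base: str, new_base: str) -> str:
    """Swap the base-DN suffix of `dn`, leaving the cn/ou prefix untouched."""
    if dn.lower() == old_base.lower():
        return new_base
    suffix = "," + old_base
    if dn.lower().endswith(suffix.lower()):
        return dn[: len(dn) - len(suffix)] + "," + new_base
    return dn  # not under old_base (e.g. a prod-example DN): leave as-is


def residual_hits(lines, old_base):
    """Rough stand-in for the grep check: lines still mentioning old_base."""
    return [line for line in lines if old_base.lower() in line.lower()]


OLD, NEW = "dc=scadabridge,dc=local", "dc=zb,dc=local"
before = [
    "cn=alice,ou=Engineers,ou=users,dc=scadabridge,dc=local",  # hypothetical user
    "dc=scadabridge,dc=local",                                 # the base itself
    "cn=svc,dc=example,dc=org",                                # unrelated DN
]
after = [rebase_dn(dn, OLD, NEW) for dn in before]
leftovers = residual_hits(after, OLD)
```

The OU path in front of the base is never rewritten, which matches the "OU structure preserved" statement, and DNs outside the old base pass through unchanged.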

Task 1.7 (canonical roles) — inventory + decisions (2026-06-02)

Read-only role inventory (3 parallel Explore agents) found the canonical-role standardization is bigger than the plan's "~5 min/repo": it changes role string VALUES (claims + config-DB + enforcement), needs config-DB DATA migrations, and makes the ScadaBridge SoD collapse real. EF persistence confirmed: OtOpcUa AdminRole is HasConversion<string>().HasMaxLength(32) (stores the enum MEMBER NAME); ScadaBridge LdapGroupMappings.Role is free-text nvarchar(500) with HasData seed. Both → renaming role values requires a data migration.

Resolved per-repo mapping (Decision B + filled gaps):

  • MxGateway: Viewer→Viewer (no-op), Admin→Administrator. Clean rename of DashboardRoles.Admin VALUE + GroupToRole config + GatewayOptionsValidator allowed-set. NO DB (dashboard roles not persisted). ⚠️ MUST NOT touch the separate gRPC GatewayScopes.Admin = "admin" data-plane scope.
  • OtOpcUa: ConfigViewer→Viewer, ConfigEditor→Designer, FleetAdmin→Administrator, DriverOperator→Operator (plan-omitted gap). Rename AdminRole members + DevStub/appsettings GroupToRole values + every [Authorize(Roles=)]/RequireRole role string. Config-DB data migration on LdapGroupRoleMappings.Role (raw SQL UPDATE old→new; column is the same string col so it's a data, not schema, change). Data-plane NodePermissions bitmask UNTOUCHED. Enforcement preserved: Designer(←ConfigEditor) keeps the deploy access it has today (Deployments.razor Roles="FleetAdmin,ConfigEditor" → "Administrator,Designer"). Policy NAMES (e.g. "DriverOperator"/"FleetAdmin" policy keys) may stay as internal indirections; only the role STRINGS they check become canonical.
  • ScadaBridge (heaviest): Admin→Administrator, Design→Designer, Deployment→Deployer, Audit→Administrator (collapse), AuditReadOnly→Viewer (collapse). Requires: config-DB data migration (LdapGroupMappings.Role UPDATE + HasData seed + ModelSnapshot); ~20 hard-coded role-string sites (ManagementActor site-scope bypass ×6 + GetRequiredRole, DebugStreamHub ×2, BrowseService/BindingTester, policy arrays); SoD policy rework OperationalAuditRoles→{Administrator,Viewer} + AuditExportRoles→{Administrator} so former AuditReadOnly(→Viewer) keeps audit-READ but still can't export; all role-asserting tests. Real security consequence (accepted): Audit→Administrator grants former audit-only users the full admin surface (create sites, manage LDAP mappings/API keys, import bundles). Site-scoping stays orthogonal (computed from PermittedSiteIds, Deployment-only).
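
The resolved mapping and the reworked ScadaBridge SoD sets can be written down as plain tables and checked mechanically. This is a sketch of the decision data only, not code from any repo; the role names and set contents come from the bullets above, while `canonicalize` and `collapsed` are hypothetical helper names.

```python
# The canonical six, as listed for CanonicalRole in the library surface.
CANONICAL = {"Viewer", "Operator", "Engineer", "Designer", "Deployer", "Administrator"}

# Legacy role -> canonical role, per repo (Decision B + filled gaps).
ROLE_MAP = {
    "MxGateway": {"Viewer": "Viewer", "Admin": "Administrator"},
    "OtOpcUa": {
        "ConfigViewer": "Viewer",
        "ConfigEditor": "Designer",
        "FleetAdmin": "Administrator",
        "DriverOperator": "Operator",
    },
    "ScadaBridge": {
        "Admin": "Administrator",
        "Design": "Designer",
        "Deployment": "Deployer",
        "Audit": "Administrator",
        "AuditReadOnly": "Viewer",
    },
}

# ScadaBridge SoD policy sets after the rework.
OPERATIONAL_AUDIT_ROLES = {"Administrator", "Viewer"}
AUDIT_EXPORT_ROLES = {"Administrator"}


def canonicalize(repo: str, legacy_role: str) -> str:
    """Look up the canonical role for a legacy role in one repo."""
    return ROLE_MAP[repo][legacy_role]


def collapsed(repo: str) -> dict:
    """Canonical roles that absorb more than one legacy role in a repo."""
    sources = {}
    for legacy, canonical in ROLE_MAP[repo].items():
        sources.setdefault(canonical, []).append(legacy)
    return {role: olds for role, olds in sources.items() if len(olds) > 1}


former_read_only = canonicalize("ScadaBridge", "AuditReadOnly")
former_auditor = canonicalize("ScadaBridge", "Audit")
```

Evaluating `collapsed("ScadaBridge")` surfaces the Admin/Audit merge into Administrator, which is the accepted SoD escalation, while the two policy sets show that a former AuditReadOnly user can still read audit data but cannot export it.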

Decisions (2026-06-02): depth = FULL canonical (values change, incl. config-DB migrations + real SoD escalation); cadence = proceed now. Execution: MxGateway + OtOpcUa single high-risk commits each (parallel); ScadaBridge as a focused atomic change (1–2 coupled commits: the rename + seed + migration are coupled, so it does not cleanly split into 1.3-style green sub-increments). High-risk serial review (spec→code) per repo + full ScadaBridge suite.

Task 1.7 (canonical roles) COMPLETE across all 3 repos (high-risk; spec + code each)

  • MxGateway 04bce3ff (spec + code reviewed): DashboardRoles.Admin value "Admin"→"Administrator" (Viewer unchanged) + GroupToRole config; validator/enforcement inherit the constant. NO DB (dashboard roles not persisted). gRPC GatewayScopes.Admin="admin" proven untouched. 577/580 (3 pre-existing FakeWorker).
  • OtOpcUa c1619d9 (spec + code reviewed): AdminRole enum members → Viewer/Designer/Administrator; DriverOperator role string → Operator (policy NAMES kept stable); DevStub ["Administrator"]. Data migration 20260602112419_CanonicalizeAdminRoles (UPDATE LdapGroupRoleMapping old→new, reverse Down, snapshot unchanged, no pending model changes). Deployments.razor [Authorize(Roles="Administrator,Designer")] (deploy access preserved). Data-plane NodePermissions/NodeAcl/evaluator untouched (proven). Security 45, Configuration 90, AdminUI 121 green. (Minor non-issues: an ou=FleetAdmin placeholder DN + a data-plane doc-comment; both are LDAP-group/doc text, not role values.)
  • ScadaBridge b104760 + doc-fix 4118452 (high-risk; spec + code reviewed): Roles → canonical {Administrator,Designer,Deployer,Viewer} (Audit/AuditReadOnly removed); SoD reworked OperationalAudit={Administrator,Viewer}, AuditExport={Administrator} (Viewer reads-not-exports audit; Administrator does both + full admin). All enforcement literals moved incl. the 6 ManagementActor site-scope bypasses + DebugStreamHub + BrowseService/BindingTester. Migration 20260602113822_CanonicalizeRoles (seed UpdateData + idempotent raw catch-all for operator rows; lossy Down documented; snapshot consistent). Real SoD escalation (Audit→Administrator gains full admin) documented in CHANGELOG. Full non-infra suite green (Security 93, CentralUI 595, ManagementService 125, Host 227, ConfigurationDatabase 241); infra reds pre-existing (stash-at-HEAD confirmed). 4118452 corrected stale role-name prose in NavMenu comments (comment-only; CentralUI rebuild 0/0).
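
The "idempotent raw catch-all" and the "lossy Down" in the ScadaBridge migration can be demonstrated with a small stand-in. This sketch uses an in-memory SQLite table in place of the real config database; only the table name LdapGroupMappings, the Role column, and the old→new values come from the notes above. The `LdapGroupName` column and the group names are hypothetical.

```python
import sqlite3

# Legacy -> canonical role values for ScadaBridge (from the resolved mapping).
UP = {
    "Admin": "Administrator",
    "Design": "Designer",
    "Deployment": "Deployer",
    "Audit": "Administrator",
    "AuditReadOnly": "Viewer",
}


def migrate_up(conn: sqlite3.Connection) -> None:
    """Rewrite every legacy role value; safe to run more than once."""
    for old, new in UP.items():
        conn.execute(
            "UPDATE LdapGroupMappings SET Role = ? WHERE Role = ?", (new, old)
        )


def roles_after_two_runs() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE LdapGroupMappings (LdapGroupName TEXT, Role TEXT)")
    conn.executemany(
        "INSERT INTO LdapGroupMappings VALUES (?, ?)",
        [
            ("SCADA-Admins", "Admin"),           # hypothetical group names
            ("SCADA-Auditors", "Audit"),
            ("SCADA-ReadOnly", "AuditReadOnly"),
        ],
    )
    migrate_up(conn)
    migrate_up(conn)  # second run matches no rows: no canonical value is a legacy key
    rows = conn.execute(
        "SELECT Role FROM LdapGroupMappings ORDER BY LdapGroupName"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


result = roles_after_two_runs()
```

After the run, the former Admin and Audit rows both hold Administrator, so a reverse migration cannot tell which rows to turn back into Audit. That many-to-one step is what makes the Down direction lossy.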

PHASE 1 COMPLETE (2026-06-02)

All of Tasks 1.0–1.7 done across OtOpcUa, MxAccessGateway, ScadaBridge, each on its local-only feat/adopt-zb-auth branch, nothing pushed. The three apps now consume ZB.MOM.WW.Auth.* from the Gitea feed (OtOpcUa 0.1.1 Abstractions+Ldap+AspNetCore; MxGateway 0.1.2 all-four; ScadaBridge 0.1.3 all-four): shared LDAP (Auth.Ldap), shared API-key model (Auth.ApiKeys, ScadaBridge fully re-architected), IGroupRoleMapper<TRole> seam, nested/Transport-enum config, canonical ZbClaimTypes/ZbCookieDefaults, unified dev base DN dc=zb,dc=local, and the canonical-six role vocabulary (with ScadaBridge's accepted auditor/admin SoD collapse). Every task spec- and code-reviewed; high-risk ones via the serial chain + full-suite runs. Phase 1 exit gate met. Next: Phase 2 (audit component, the original ask) starting at the Task 2.0 gate, then Phase 3 (wire audit Actor from the Auth principal).