scadaproj/docs/plans/2026-06-02-auth-audit-normalization-phase1.md

Phase 1 (Auth adoption) — elaborated steps + Task 1.0 findings

Companion to 2026-06-02-auth-audit-normalization.md. Produced by the Task 1.0 read-only exploration gate (4 parallel explorers: library surface + 3 repos). All paths verified 2026-06-02 against source.

Cutover target — ZB.MOM.WW.Auth public surface

| Package | Consumer entry points |
|---|---|
| .Abstractions | ILdapAuthService, LdapOptions (Transport: LdapTransport{Ldaps,StartTls,None}, AllowInsecure, UserNameAttribute, GroupAttribute, ServiceAccountDn/Password, SearchBase, ConnectionTimeoutMs, ServerCertificateValidationCallback), LdapAuthResult(Succeeded,Username,DisplayName,Groups,Failure), LdapAuthFailure, CanonicalRole{Viewer,Operator,Engineer,Designer,Deployer,Administrator}, IGroupRoleMapper<TRole> (no default impl; the consumer writes it) → GroupRoleMapping<TRole>(Roles, Scope:object?), plus API-key abstractions (IApiKeyVerifier, ApiKeyVerification, ApiKeyIdentity, IApiKeyStore/IApiKeyAdminStore/IApiKeyAuditStore, ApiKeyOptions{TokenPrefix,PepperSecretName,SqlitePath,RunMigrationsOnStartup}). NB: IGroupRoleMapper<TRole>/GroupRoleMapping<TRole>/CanonicalRole live in namespace ZB.MOM.WW.Auth.Abstractions.Roles (verified during Task 1.1). |
| .Ldap | LdapAuthService(LdapOptions) : ILdapAuthService. Bind-then-search, fail-closed, never throws. LdapOptionsValidator (TLS-or-AllowInsecure) auto-registered. |
| .ApiKeys | ApiKeyVerifier(ApiKeyOptions, IApiKeyStore, IApiKeyPepperProvider, TimeProvider?), ApiKeyParser.TryParse (<prefix>_<keyId>_<secret>), ApiKeySecretGenerator.NewSecret(), default SQLite stores, ConfigurationApiKeyPepperProvider. Extracted from MxGateway; near-1:1 with its pipeline. |
| .AspNetCore | ZbClaimTypes{Name,Role,DisplayName,Username,ScopeId}, ZbCookieDefaults.Apply(opts, requireHttps, idleTimeout), DI: AddZbLdapAuth(services, config, sectionPath), AddZbApiKeyAuth(services, config, sectionPath). |
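
The <prefix>_<keyId>_<secret> token shape listed for ApiKeyParser.TryParse can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not the library's C# implementation; `try_parse_api_key` and `ParsedApiKey` are hypothetical names, and the sketch assumes that neither the prefix nor the keyId contains an underscore while the secret may.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedApiKey:
    prefix: str
    key_id: str
    secret: str


def try_parse_api_key(token: str, expected_prefix: str) -> Optional[ParsedApiKey]:
    """Split '<prefix>_<keyId>_<secret>' into parts, or return None if malformed."""
    # Split at most twice so underscores inside the secret are preserved.
    parts = token.split("_", 2)
    if len(parts) != 3:
        return None
    prefix, key_id, secret = parts
    # Reject a wrong prefix or empty segments instead of raising.
    if prefix != expected_prefix or not key_id or not secret:
        return None
    return ParsedApiKey(prefix, key_id, secret)
```

Returning None rather than raising mirrors the "Try" naming: a malformed token is an ordinary authentication failure, not an exceptional condition.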

Per-app current state (verified) and elaborated cutover

OtOpcUa — packages: Abstractions + Ldap + AspNetCore (no ApiKeys)

Current LDAP: src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs (impl), ILdapAuthService.cs, LdapOptions.cs (section Security:Ldap, UseTls bool, Enabled, DevStubMode, embedded GroupToRole dict), LdapAuthResult.cs (already carries Roles). Role mapping is config + DB: RoleMapper.Map (config GroupToRole) + RoleMapper.Merge with DB LdapGroupRoleMappingService/LdapGroupRoleMapping (system-wide rows). Native roles AdminRole{ConfigViewer,ConfigEditor,FleetAdmin} (control-plane only; data-plane is a separate NodePermissions bitmask). DI: two TryAddSingleton<ILdapAuthService,LdapAuthService> sites (Security/ServiceCollectionExtensions.cs:42 + Host/Program.cs:106). Cookie ZB.MOM.WW.OtOpcUa.Auth, single Cookie scheme (JWT inside cookie). Second LDAP consumer: OPC UA data-plane LdapOpcUaUserAuthenticator + OpcUaApplicationHost.HandleImpersonation call the LDAP service too.

  • 1.1 mapper: implement IGroupRoleMapper<AdminRole> (or <string>) wrapping RoleMapper.Map + DB Merge.
  • 1.2 Ldap: replace LdapAuthService with Auth.Ldap; restructure flow to ILdapAuthService → Groups → IGroupRoleMapper → roles → claims; preserve DevStubMode app-side (library has no stub); wire BOTH consumers (login endpoint + OPC UA impersonation).
  • 1.4 config: UseTls → Transport enum (section is already Security:Ldap; see Finding #1).
  • 1.5 cookie/claims: use ZbClaimTypes + ZbCookieDefaults.Apply; keep cookie name.
  • 1.7 roles: ConfigViewer→Viewer, ConfigEditor→Designer, FleetAdmin→Administrator(+Deployer; publish⊂FleetAdmin). Data-plane NodePermissions unaffected.
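
Steps 1.1 and 1.7 together describe a two-stage mapping: LDAP groups to native roles (config baseline plus DB rows), then native roles to canonical roles. The following Python sketch shows that flow under stated assumptions: the merge is modelled as a plain union, group names match exactly, and `map_groups_to_roles` is a hypothetical name, not RoleMapper's real signature. Only the native-to-canonical table comes from step 1.7.

```python
from typing import Dict, Iterable, List, Set

# Native-to-canonical mapping taken from step 1.7.
NATIVE_TO_CANONICAL: Dict[str, List[str]] = {
    "ConfigViewer": ["Viewer"],
    "ConfigEditor": ["Designer"],
    "FleetAdmin": ["Administrator", "Deployer"],
}


def map_groups_to_roles(
    groups: Iterable[str],
    config_group_to_role: Dict[str, str],
    db_group_to_role: Dict[str, str],
) -> Set[str]:
    """Resolve groups via config and DB tables, then translate to canonical roles."""
    native: Set[str] = set()
    for group in groups:
        # Assumption: config and DB mappings are merged as a union.
        for table in (config_group_to_role, db_group_to_role):
            role = table.get(group)
            if role is not None:
                native.add(role)
    canonical: Set[str] = set()
    for role in native:
        # Unknown native roles contribute nothing (fail closed).
        canonical.update(NATIVE_TO_CANONICAL.get(role, []))
    return canonical
```

An empty result means the user matched no mapping; the caller is expected to deny access in that case rather than fall back to a default role.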

MxAccessGateway — packages: all 4 (ApiKeys source, cuts over first)

Current API keys (src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/): ApiKeyParser (mxgw_<id>_<secret>), ApiKeySecretHasher (HMAC-SHA256 + pepper MxGateway:ApiKeyPepper), ApiKeySecretGenerator, ApiKeyVerifier (FixedTimeEquals), SQLite stores, ConstraintEnforcer + rich ApiKeyConstraints, gRPC GatewayGrpcAuthorizationInterceptor + GatewayScopes. DI AddSqliteAuthStore(). → near-1:1 with Auth.ApiKeys. LDAP: Dashboard/DashboardAuthenticator.cs (MxGateway:Ldap, UseTls), GroupToRole under MxGateway:Dashboard, roles Admin/Viewer, cookie MxGatewayDashboard.
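
The hashing and comparison steps named above (HMAC-SHA256 keyed with a pepper, then a fixed-time comparison) can be sketched as follows. This is a Python illustration of the technique, not the gateway's C# code; `hash_secret` and `verify_secret` are hypothetical names, and how the pepper is loaded is left out.

```python
import hashlib
import hmac


def hash_secret(secret: str, pepper: bytes) -> bytes:
    """HMAC-SHA256 of the secret, keyed with the server-side pepper."""
    return hmac.new(pepper, secret.encode("utf-8"), hashlib.sha256).digest()


def verify_secret(presented_secret: str, stored_hash: bytes, pepper: bytes) -> bool:
    """Recompute the HMAC and compare in constant time."""
    candidate = hash_secret(presented_secret, pepper)
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, stored_hash)
```

Because the stored row is found by keyId first, the comparison runs against exactly one stored hash, which is what makes a constant-time compare meaningful here.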

  • 1.1 mapper: IGroupRoleMapper<string> wrapping DashboardAuthenticator.MapGroupsToRoles.
  • 1.2 Ldap: replace DashboardAuthenticator's LDAP internals with Auth.Ldap (keep dashboard claims/principal build).
  • 1.3 ApiKeys: delete the local parser/hasher/generator/verifier/stores; re-point to Auth.ApiKeys; keep ConstraintEnforcer + gRPC interceptor + scopes on top (constraints carried as the opaque blob). Lowest-risk ApiKeys cutover (it's the donor).
  • 1.4 config: UseTls → Transport.
  • 1.5/1.7: ZbClaimTypes/cookie defaults; Viewer→Viewer, Admin→Administrator.

ScadaBridge — packages: all 4 (Ldap source; ApiKeys consumer)

Current LDAP (src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs): the hardened reference (RFC-4514 DN escape, filter escape, per-op timeout, fail-closed group lookup, username trim, service-account-bind distinction). Config is flat ScadaBridge:Security:Ldap* in SecurityOptions.cs with LdapTransport enum already (Ldaps/StartTls/None), AllowInsecureLdap, LdapUserIdAttribute, LdapGroupAttribute, validated by SecurityOptionsValidator : OptionsValidatorBase. Role mapping DB-backed with site-scoping: RoleMapper.MapGroupsToRolesAsync → RoleMappingResult(Roles, PermittedSiteIds, IsSystemWideDeployment) over LdapGroupMapping + SiteScopeRule (SQL Server). Roles Admin/Design/Deployment/Audit/AuditReadOnly; SoD via OperationalAudit{Admin,Audit,AuditReadOnly} + AuditExport{Admin,Audit}. Cookie ZB.MOM.WW.ScadaBridge.Auth; JWT-in-cookie via JwtTokenService. Inbound API keys (InboundAPI/ApiKeyValidator.cs): raw X-API-Key, deterministic HMAC (ApiKeyHasher, no per-row salt, by-value lookup), ApiKey{Name,KeyHash,IsEnabled} in SQL Server, per-method approval via ApiMethod.ApprovedApiKeyIds; this is architecturally different from the library's keyId/scope/SQLite model.

  • 1.1 mapper: IGroupRoleMapper<string> wrapping RoleMapper.MapGroupsToRolesAsync, carrying PermittedSiteIds/IsSystemWideDeployment in GroupRoleMapping.Scope.
  • 1.2 Ldap: ScadaBridge is the donor — confirm Auth.Ldap behaviour-matches, then re-point LdapAuthService usages to the library type. Lowest-risk Ldap cutover.
  • 1.3 ApiKeys: see Finding #3 — bigger than a token reformat; needs a scope decision.
  • 1.4 config: nest flat Security:Ldap* under a sub-section + rename LdapUserIdAttribute→UserNameAttribute, LdapGroupAttribute→GroupAttribute, LdapTransport→Transport (+ SecurityOptionsValidator + appsettings). Enum already matches.
  • 1.7 roles: Admin→Administrator, Design→Designer, Deployment→Deployer, Audit→Administrator (collapse), AuditReadOnly→Viewer (collapse) — removes the OperationalAudit/AuditExport SoD (accepted).
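
The config change in step 1.4 (nest the flat keys under a sub-section and rename them) amounts to a key-by-key transformation, sketched below in Python. The three renames for LdapUserIdAttribute, LdapGroupAttribute and LdapTransport come from step 1.4; the sub-section name "Ldap" and the AllowInsecureLdap → AllowInsecure rename are assumptions made for illustration, and any other flat Ldap* keys would need their own entries.

```python
from typing import Any, Dict

# Flat key -> nested key. The first three come from step 1.4;
# the AllowInsecureLdap entry is an assumption.
RENAMES: Dict[str, str] = {
    "LdapUserIdAttribute": "UserNameAttribute",
    "LdapGroupAttribute": "GroupAttribute",
    "LdapTransport": "Transport",
    "AllowInsecureLdap": "AllowInsecure",
}


def nest_ldap_settings(flat: Dict[str, Any], section: str = "Ldap") -> Dict[str, Any]:
    """Move known flat LDAP keys into a nested section, renaming them on the way."""
    nested: Dict[str, Any] = {}
    result: Dict[str, Any] = {}
    for key, value in flat.items():
        if key in RENAMES:
            nested[RENAMES[key]] = value
        else:
            # Unrelated settings stay where they are.
            result[key] = value
    if nested:
        result[section] = nested
    return result
```

Any test or checklist that still looks up the old flat names will find nothing after this transformation, so such lookups need to move to the nested names in the same change.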

Key findings that change the plan

  1. OtOpcUa LDAP section is Security:Ldap, not Authentication:Ldap. Both components/auth/GAPS.md §1 and the auth current-state doc are wrong; the code (and the prior fix in memory) use Security:Ldap. → Task 1.4 for OtOpcUa is only the UseTls → Transport change, not a section move.
  2. OtOpcUa "double-singleton bug" is already mitigated. Both registration sites use TryAddSingleton (dedupes); the Enabled flag is an intentional fail-closed master switch. → Not a blocking fix; verify and keep Enabled. Removes a risk the plan flagged.
  3. ScadaBridge inbound API keys are a re-architecture, not a token reformat. The library's ApiKeys model (<prefix>_<keyId>_<secret> Bearer, keyId lookup + constant-time compare, SQLite store, scopes + opaque constraints) is fundamentally different from ScadaBridge's (raw X-API-Key, deterministic by-value HMAC lookup, SQL Server ApiKey{Name,KeyHash}, per-method approval list). Wholesale adoption means re-architecting inbound-API auth AND resolving a SQLite-vs-SQL-Server storage mismatch. Needs a scope decision (Decision A).
  4. OtOpcUa role mapping is config + DB, not just config (RoleMapper.Map baseline + DB Merge). The IGroupRoleMapper impl must combine both. OtOpcUa also has DevStubMode (no library equivalent — keep app-side) and a second LDAP consumer (OPC UA data-plane impersonation) that must be re-wired too.
  5. MxGateway ApiKeys cutover is the donor path — lowest risk (delete locals, re-point to library; keep ConstraintEnforcer/gRPC/scopes on top). Confirms the GAPS sequencing (gateway first).
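
Finding 2 rests on "add only if absent" registration: the second TryAddSingleton call for the same service is a no-op, so two registration sites do not produce two instances. A toy Python registry illustrating that semantics follows; `ServiceRegistry` is a hypothetical stand-in, not the .NET dependency-injection container.

```python
from typing import Callable, Dict


class ServiceRegistry:
    """Toy registry with 'add only if absent' singleton registration."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], object]] = {}
        self._instances: Dict[str, object] = {}

    def try_add_singleton(self, service: str, factory: Callable[[], object]) -> bool:
        """Register the factory unless the service is already registered."""
        if service in self._factories:
            return False  # First registration wins; later ones are ignored.
        self._factories[service] = factory
        return True

    def resolve(self, service: str) -> object:
        """Create the instance on first use and reuse it afterwards."""
        if service not in self._instances:
            self._instances[service] = self._factories[service]()
        return self._instances[service]
```

With this behaviour the order of the two registration sites decides which factory is used, which is worth confirming when verifying the mitigation.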

Task 1.2 (LDAP cutover) — implemented + reviewed (2026-06-02)

Commits: OtOpcUa 257caa7, MxGateway c3b466e, ScadaBridge ac34dac. All targeted tests green. Security review verdict: sound, no credential-leak regression in any repo (insecure-transport guards fire correctly; DevStubMode cannot leak to prod; claim shapes preserved). All three returned CHANGES-REQUESTED for fixable issues:

  • OtOpcUa (no Critical): (I1) insecure-transport guard is login-time only — add startup validation gated on Enabled for defense-in-depth, verify prod overlays still boot; (I2) integration stub pre-populates Roles so the Groups→mapper path isn't actually exercised — fix the stub; (I3) document/test the zero-role fail-closed fallback.
  • MxGateway (2 Critical): (C1) library strips group DNs to short RDN names before the LdapGroupClaimType claim → verify prior behaviour, document, drop the now-dead full-DN branch in the mapper, add a claim-value assertion; (C2) gateway's local LdapOptions is now a shadow copy (validated but unused at runtime) → fold to the shared type or document the drift. (I1) shared LdapOptionsValidator has no Enabled=false guard → validates even when LDAP is disabled (real for MxGateway, which can disable dashboard LDAP).
  • ScadaBridge (2 Critical): (C1) ConfigSecretsTests still checks the OLD flat key → passes vacuously, no longer guards secret-in-config — repoint to nested key; (C2) production-checklist.md still lists deleted flat keys → update; (I) unsafe (RoleMappingResult)Scope! cast → null-guard.
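
MxGateway C1 concerns group DNs being reduced to short RDN names before they reach the claim. The Python sketch below shows what such a reduction can look like, so the claim-value assertion has a concrete shape to test against. It is an assumption about the behaviour, not the library's code: `short_group_name` is a hypothetical name, and only backslash-escaped single characters are handled (hex escapes such as \2C are not).

```python
def short_group_name(group_dn: str) -> str:
    """Return the value of the leading RDN, e.g. 'CN=Ops,OU=Groups' -> 'Ops'."""
    first_rdn = []
    i = 0
    while i < len(group_dn):
        ch = group_dn[i]
        if ch == "\\" and i + 1 < len(group_dn):
            # Keep the escaped character and drop the backslash.
            first_rdn.append(group_dn[i + 1])
            i += 2
            continue
        if ch == ",":
            break  # An unescaped comma ends the first RDN.
        first_rdn.append(ch)
        i += 1
    rdn = "".join(first_rdn)
    _, separator, value = rdn.partition("=")
    # Input that is already a short name (no '=') is returned unchanged.
    return value if separator else rdn
```

A mapper that only ever receives output of this kind never sees a full DN, which is why a full-DN branch in the mapper becomes dead code.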

Cross-cutting decision — shared library LdapOptionsValidator Enabled guard: the validator runs regardless of Enabled, requiring Server/SearchBase/ServiceAccountDn even when LDAP is off. Correct fix = add an if (!Enabled) return Success guard to the shared validator and republish 0.1.1, re-pinning all consumers. (Alternative: each consumer always supplies those fields. The library fix is the principled one.)
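
The guard described above can be sketched in a few lines. This Python version only models the rules stated in this document (required Server/SearchBase/ServiceAccountDn, TLS-or-AllowInsecure, skip everything when disabled); `LdapSettings` and `validate` are hypothetical names, and the error messages are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LdapSettings:
    enabled: bool = True
    server: str = ""
    search_base: str = ""
    service_account_dn: str = ""
    transport: str = "Ldaps"  # One of "Ldaps", "StartTls", "None".
    allow_insecure: bool = False


def validate(settings: LdapSettings) -> List[str]:
    """Return a list of validation errors; an empty list means success."""
    if not settings.enabled:
        # The guard: a disabled LDAP section is never validated.
        return []
    errors: List[str] = []
    for name in ("server", "search_base", "service_account_dn"):
        if not getattr(settings, name).strip():
            errors.append(f"{name} is required")
    # TLS-or-AllowInsecure: plaintext transport needs an explicit opt-in.
    if settings.transport == "None" and not settings.allow_insecure:
        errors.append("transport None requires allow_insecure")
    return errors
```

Placing the early return first keeps the disabled case independent of every other field, so a consumer that turns LDAP off does not have to supply placeholder values.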

Task 1.2/1.4 — DONE (reviewed + fixed, 2026-06-02)

Library hardened to 0.1.1 (LdapOptionsValidator skips when Enabled=false), republished, re-pinned in all 3 repos. Fix commits: OtOpcUa c4f315e (startup insecure-transport guard gated on Enabled/DevStub + Transport: Ldaps declared in the 3 prod overlays + test fidelity), MxGateway f4dc11b (group-claim shape documented as non-breaking — claim read nowhere in prod; shadow LdapOptions kept with a drift-warning doc), ScadaBridge 4db8c37 (secret-test repointed to nested key, prod checklist updated, Scope cast guarded). All targeted suites green. 1.2 (LDAP) + 1.4 (config) complete across all 3 repos.

Remaining Phase 1: 1.3 ApiKeys (MxGateway donor cutover — low risk; ScadaBridge full re-architecture — largest single item: SQLite store + Bearer format + scopes + key re-issuance), 1.5 claims/cookies, 1.6 dev base DN, 1.7 canonical roles.

Resolved decisions (2026-06-02)

  • Decision A — ScadaBridge inbound API keys depth → (a) FULL ADOPT. Re-architect inbound-API auth to the library's model: <prefix>_<keyId>_<secret> Bearer token format, keyId lookup + constant-time compare, scopes/constraints, and move inbound API keys into the library's SQLite store (separate from the SQL Server config DB). This is the largest, highest-risk item in Phase 1. Implications to handle in Task 1.3:
    • New SQLite auth DB for ScadaBridge inbound keys (path via ApiKeyOptions.SqlitePath); migrate/retire the SQL Server ApiKey{Name,KeyHash} table + ApiMethod.ApprovedApiKeyIds linkage.
    • Re-model per-method approval as the library's scopes/constraints (or the opaque constraint blob) — the ApiMethod.ApprovedApiKeyIds set becomes per-key scope grants.
    • Switch the inbound transport from X-API-Key header to Authorization: Bearer <token> (a client-visible contract change — extends the already-accepted token-format change; needs the interop check + a doc/CHANGELOG note).
    • Existing raw keys cannot be migrated (deterministic-by-value hash, no keyId/secret split) → re-issue all inbound API keys; call this out in the cutover runbook.
  • Decision B — canonical role mappings → confirmed as tabled above (OtOpcUa ConfigViewer→Viewer, ConfigEditor→Designer, FleetAdmin→Administrator+Deployer; MxGateway Viewer/Admin; ScadaBridge Admin→Administrator, Design→Designer, Deployment→Deployer, Audit→Administrator, AuditReadOnly→Viewer).
  • Decision C — dev escape hatches → keep app-side, unchanged. OtOpcUa DevStubMode and MxGateway AllowAnonymousLocalhost/loopback bypass have no library equivalent; preserve them in each app outside the shared Auth.Ldap path.
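
For Decision A, re-modelling per-method approval as per-key scope grants is essentially an inversion of the ApiMethod.ApprovedApiKeyIds relation. The Python sketch below shows that inversion; the "method:<name>" scope format and the function name are hypothetical, and because existing keys must be re-issued, the key IDs are assumed to be those of the newly issued keys.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def approvals_to_scopes(
    approved_key_ids_by_method: Dict[str, Iterable[str]],
) -> Dict[str, Set[str]]:
    """Invert 'method -> approved key IDs' into 'key ID -> granted scopes'."""
    scopes_by_key: Dict[str, Set[str]] = defaultdict(set)
    for method, key_ids in approved_key_ids_by_method.items():
        for key_id in key_ids:
            # Hypothetical scope naming: one scope per approved method.
            scopes_by_key[key_id].add(f"method:{method}")
    return dict(scopes_by_key)
```

A key that was approved for no method does not appear in the result at all, so it ends up with no scopes and is denied everywhere after cutover.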