# Phase 1 (Auth adoption) — elaborated steps + Task 1.0 findings
Companion to `2026-06-02-auth-audit-normalization.md`. Produced by the Task 1.0 read-only
exploration gate (4 parallel explorers: library surface + 3 repos). All paths verified
2026-06-02 against source.
## Cutover target — `ZB.MOM.WW.Auth` public surface
| Package | Consumer entry points |
|---|---|
| `.Abstractions` | **NB: `IGroupRoleMapper<TRole>`/`GroupRoleMapping<TRole>`/`CanonicalRole` live in namespace `ZB.MOM.WW.Auth.Abstractions.Roles`** (verified during Task 1.1). `ILdapAuthService`, `LdapOptions` (`Transport: LdapTransport{Ldaps,StartTls,None}`, `AllowInsecure`, `UserNameAttribute`, `GroupAttribute`, `ServiceAccountDn/Password`, `SearchBase`, `ConnectionTimeoutMs`, `ServerCertificateValidationCallback`), `LdapAuthResult(Succeeded,Username,DisplayName,Groups,Failure)`, `LdapAuthFailure`, `CanonicalRole{Viewer,Operator,Engineer,Designer,Deployer,Administrator}`, `IGroupRoleMapper<TRole>` (**no default impl — consumer writes it**) → `GroupRoleMapping<TRole>(Roles, Scope:object?)`, plus API-key abstractions (`IApiKeyVerifier`, `ApiKeyVerification`, `ApiKeyIdentity`, `IApiKeyStore`/`IApiKeyAdminStore`/`IApiKeyAuditStore`, `ApiKeyOptions{TokenPrefix,PepperSecretName,SqlitePath,RunMigrationsOnStartup}`) |
| `.Ldap` | `LdapAuthService(LdapOptions)` : `ILdapAuthService`. Bind-then-search, fail-closed, never throws. `LdapOptionsValidator` (TLS-or-AllowInsecure) auto-registered. |
| `.ApiKeys` | `ApiKeyVerifier(ApiKeyOptions, IApiKeyStore, IApiKeyPepperProvider, TimeProvider?)`, `ApiKeyParser.TryParse` (`<prefix>_<keyId>_<secret>`), `ApiKeySecretGenerator.NewSecret()`, default SQLite stores, `ConfigurationApiKeyPepperProvider`. **Extracted from MxGateway — near-1:1 with its pipeline.** |
| `.AspNetCore` | `ZbClaimTypes{Name,Role,DisplayName,Username,ScopeId}`, `ZbCookieDefaults.Apply(opts, requireHttps, idleTimeout)`, DI: `AddZbLdapAuth(services, config, sectionPath)`, `AddZbApiKeyAuth(services, config, sectionPath)`. |
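
The `<prefix>_<keyId>_<secret>` shape listed for `ApiKeyParser.TryParse` can be illustrated with a small stand-in. This is a minimal sketch, not the library's parser: `TokenSketch` is a hypothetical name, and the splitting rule (prefix and keyId never contain an underscore, everything after the second underscore is the secret) is an assumption.

```csharp
using System;

// Hypothetical stand-in for illustration; not the ZB.MOM.WW.Auth.ApiKeys parser.
public static class TokenSketch
{
    // Splits "<prefix>_<keyId>_<secret>" on the first two underscores.
    // Assumption: prefix and keyId never contain '_'; the secret may.
    public static bool TryParse(string? token, string expectedPrefix,
                                out string keyId, out string secret)
    {
        keyId = string.Empty;
        secret = string.Empty;
        if (string.IsNullOrWhiteSpace(token)) return false;

        // At most three parts, so underscores inside the secret are preserved.
        string[] parts = token.Split('_', 3);
        if (parts.Length != 3) return false;
        if (!string.Equals(parts[0], expectedPrefix, StringComparison.Ordinal)) return false;
        if (parts[1].Length == 0 || parts[2].Length == 0) return false;

        keyId = parts[1];
        secret = parts[2];
        return true;
    }
}
```

A token with the wrong prefix (for example an `sbk_…` token presented to a verifier configured for `mxgw`) is rejected before any store lookup in this sketch.
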
## Per-app current state (verified) and elaborated cutover
### OtOpcUa — packages: Abstractions + Ldap + AspNetCore (no ApiKeys)
Current LDAP: `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs` (impl), `ILdapAuthService.cs`,
`LdapOptions.cs` (**section `Security:Ldap`**, `UseTls` bool, `Enabled`, `DevStubMode`, embedded `GroupToRole` dict),
`LdapAuthResult.cs` (already carries `Roles`). Role mapping is **config + DB**: `RoleMapper.Map` (config
`GroupToRole`) + `RoleMapper.Merge` with DB `LdapGroupRoleMappingService`/`LdapGroupRoleMapping` (system-wide rows).
Native roles `AdminRole{ConfigViewer,ConfigEditor,FleetAdmin}` (control-plane only; data-plane is a separate
`NodePermissions` bitmask). DI: two `TryAddSingleton<ILdapAuthService,LdapAuthService>` sites
(`Security/ServiceCollectionExtensions.cs:42` + `Host/Program.cs:106`). Cookie `ZB.MOM.WW.OtOpcUa.Auth`,
single Cookie scheme (JWT inside cookie). **Second LDAP consumer:** OPC UA data-plane
`LdapOpcUaUserAuthenticator` + `OpcUaApplicationHost.HandleImpersonation` call the LDAP service too.
- **1.1 mapper:** implement `IGroupRoleMapper<AdminRole>` (or `<string>`) wrapping `RoleMapper.Map` + DB `Merge`.
- **1.2 Ldap:** replace `LdapAuthService` with `Auth.Ldap`; restructure flow to `ILdapAuthService → Groups → IGroupRoleMapper → roles → claims`; **preserve `DevStubMode` app-side** (library has no stub); wire BOTH consumers (login endpoint + OPC UA impersonation).
- **1.4 config:** `UseTls` → `Transport` enum (section is already `Security:Ldap`; see Finding #1).
- **1.5 cookie/claims:** use `ZbClaimTypes` + `ZbCookieDefaults.Apply`; keep cookie name.
- **1.7 roles:** `ConfigViewer→Viewer`, `ConfigEditor→Designer`, `FleetAdmin→Administrator(+Deployer; publish⊂FleetAdmin)`. Data-plane `NodePermissions` unaffected.
### MxAccessGateway — packages: all 4 (ApiKeys **source**, cuts over first)
Current API keys (`src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/`): `ApiKeyParser` (`mxgw_<id>_<secret>`),
`ApiKeySecretHasher` (HMAC-SHA256 + pepper `MxGateway:ApiKeyPepper`), `ApiKeySecretGenerator`, `ApiKeyVerifier`
(`FixedTimeEquals`), SQLite stores, `ConstraintEnforcer` + rich `ApiKeyConstraints`, gRPC
`GatewayGrpcAuthorizationInterceptor` + `GatewayScopes`. DI `AddSqliteAuthStore()`. → **near-1:1 with `Auth.ApiKeys`.**
LDAP: `Dashboard/DashboardAuthenticator.cs` (`MxGateway:Ldap`, `UseTls`), `GroupToRole` under `MxGateway:Dashboard`,
roles `Admin`/`Viewer`, cookie `MxGatewayDashboard`.
- **1.1 mapper:** `IGroupRoleMapper<string>` wrapping `DashboardAuthenticator.MapGroupsToRoles`.
- **1.2 Ldap:** replace `DashboardAuthenticator`'s LDAP internals with `Auth.Ldap` (keep dashboard claims/principal build).
- **1.3 ApiKeys:** delete the local parser/hasher/generator/verifier/stores; re-point to `Auth.ApiKeys`; **keep** `ConstraintEnforcer` + gRPC interceptor + scopes on top (constraints carried as the opaque blob). Lowest-risk ApiKeys cutover (it's the donor).
- **1.4 config:** `UseTls` → `Transport`.
- **1.5/1.7:** `ZbClaimTypes`/cookie defaults; `Viewer→Viewer`, `Admin→Administrator`.
### ScadaBridge — packages: all 4 (Ldap **source**; ApiKeys consumer)
Current LDAP (`src/ZB.MOM.WW.ScadaBridge.Security/LdapAuthService.cs`): the hardened reference (RFC-4514 DN escape,
filter escape, per-op timeout, fail-closed group lookup, username trim, service-account-bind distinction). Config is
**flat** `ScadaBridge:Security:Ldap*` in `SecurityOptions.cs` with **`LdapTransport` enum already** (`Ldaps/StartTls/None`),
`AllowInsecureLdap`, `LdapUserIdAttribute`, `LdapGroupAttribute`, validated by `SecurityOptionsValidator : OptionsValidatorBase`.
Role mapping **DB-backed** with **site-scoping**: `RoleMapper.MapGroupsToRolesAsync` → `RoleMappingResult(Roles, PermittedSiteIds, IsSystemWideDeployment)` over `LdapGroupMapping` + `SiteScopeRule` (SQL Server). Roles
`Admin/Design/Deployment/Audit/AuditReadOnly`; SoD via `OperationalAudit{Admin,Audit,AuditReadOnly}` + `AuditExport{Admin,Audit}`.
Cookie `ZB.MOM.WW.ScadaBridge.Auth`; JWT-in-cookie via `JwtTokenService`.
**Inbound API keys** (`InboundAPI/ApiKeyValidator.cs`): **raw `X-API-Key`**, **deterministic** HMAC (`ApiKeyHasher`, no per-row salt, by-value lookup), `ApiKey{Name,KeyHash,IsEnabled}` in **SQL Server**, **per-method approval** via `ApiMethod.ApprovedApiKeyIds` → **architecturally different from the library's keyId/scope/SQLite model.**
- **1.1 mapper:** `IGroupRoleMapper<string>` wrapping `RoleMapper.MapGroupsToRolesAsync`, carrying `PermittedSiteIds`/`IsSystemWideDeployment` in `GroupRoleMapping.Scope`.
- **1.2 Ldap:** ScadaBridge is the donor — confirm `Auth.Ldap` behaviour-matches, then re-point `LdapAuthService` usages to the library type. Lowest-risk Ldap cutover.
- **1.3 ApiKeys:** **see Finding #3 — bigger than a token reformat; needs a scope decision.**
- **1.4 config:** nest flat `Security:Ldap*` under a sub-section + rename `LdapUserIdAttribute→UserNameAttribute`, `LdapGroupAttribute→GroupAttribute`, `LdapTransport→Transport` (+ `SecurityOptionsValidator` + appsettings). Enum already matches.
- **1.7 roles:** `Admin→Administrator`, `Design→Designer`, `Deployment→Deployer`, `Audit→Administrator` (collapse), `AuditReadOnly→Viewer` (collapse) — removes the `OperationalAudit`/`AuditExport` SoD (accepted).
## Key findings that change the plan
1. **OtOpcUa LDAP section is `Security:Ldap`, not `Authentication:Ldap`.** Both `components/auth/GAPS.md §1`
and the auth current-state doc are wrong; the code (and the prior fix in memory) use `Security:Ldap`.
→ Task 1.4 for OtOpcUa is only `UseTls` → `Transport`, not a section move.
2. **OtOpcUa "double-singleton bug" is already mitigated.** Both registration sites use `TryAddSingleton`
(dedupes); the `Enabled` flag is an intentional fail-closed master switch. → Not a blocking fix; verify and
keep `Enabled`. Removes a risk the plan flagged.
3. **ScadaBridge inbound API keys are a re-architecture, not a token reformat.** The library's ApiKeys model
(`<prefix>_<keyId>_<secret>` Bearer, keyId lookup + constant-time compare, SQLite store, scopes + opaque
constraints) is fundamentally different from ScadaBridge's (raw `X-API-Key`, deterministic by-value HMAC
lookup, SQL Server `ApiKey{Name,KeyHash}`, per-method approval list). Wholesale adoption means re-architecting
inbound-API auth AND resolving a SQLite-vs-SQL-Server storage mismatch. **Needs a scope decision (Decision A).**
4. **OtOpcUa role mapping is config + DB**, not just config (`RoleMapper.Map` baseline + DB `Merge`). The
`IGroupRoleMapper` impl must combine both. OtOpcUa also has `DevStubMode` (no library equivalent — keep app-side)
and a **second LDAP consumer** (OPC UA data-plane impersonation) that must be re-wired too.
5. **MxGateway ApiKeys cutover is the donor path — lowest risk** (delete locals, re-point to library; keep
`ConstraintEnforcer`/gRPC/scopes on top). Confirms the GAPS sequencing (gateway first).
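
Finding 4 says the OtOpcUa mapper has to combine the config baseline with the DB rows. A minimal sketch of that combination follows, under stated assumptions: `CombinedRoleMapperSketch` is a hypothetical name, "merge" is taken to mean the union of both sources, and group matching is case-insensitive. The real `RoleMapper.Map`/`Merge` and `IGroupRoleMapper<TRole>` signatures may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the OtOpcUa RoleMapper or the library interface.
public static class CombinedRoleMapperSketch
{
    // configMap: group -> role from configuration (the GroupToRole baseline).
    // dbRows:    (group, role) pairs standing in for system-wide DB mapping rows.
    // Assumption: the merged result is the union of both sources.
    public static HashSet<string> Map(
        IEnumerable<string> groups,
        IReadOnlyDictionary<string, string> configMap,
        IEnumerable<(string Group, string Role)> dbRows)
    {
        var groupSet = new HashSet<string>(groups, StringComparer.OrdinalIgnoreCase);
        var roles = new HashSet<string>(StringComparer.Ordinal);

        foreach (var kv in configMap)
            if (groupSet.Contains(kv.Key)) roles.Add(kv.Value);

        foreach (var row in dbRows)
            if (groupSet.Contains(row.Group)) roles.Add(row.Role);

        // An empty set means "no roles"; the caller is expected to fail closed.
        return roles;
    }
}
```
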
## Task 1.2 (LDAP cutover) — implemented + reviewed (2026-06-02)
Commits: OtOpcUa `257caa7`, MxGateway `c3b466e`, ScadaBridge `ac34dac`. All targeted tests green.
Security review verdict: **sound, no credential-leak regression** in any repo (insecure-transport
guards fire correctly; DevStubMode cannot leak to prod; claim shapes preserved). All three returned
CHANGES-REQUESTED for fixable issues:
- **OtOpcUa** (no Critical): (I1) insecure-transport guard is login-time only — add startup
validation gated on `Enabled` for defense-in-depth, verify prod overlays still boot; (I2) integration
stub pre-populates `Roles` so the Groups→mapper path isn't actually exercised — fix the stub; (I3)
document/test the zero-role fail-closed fallback.
- **MxGateway** (2 Critical): (C1) library strips group DNs to short RDN names before the
`LdapGroupClaimType` claim → verify prior behaviour, document, drop the now-dead full-DN branch in the
mapper, add a claim-value assertion; (C2) gateway's local `LdapOptions` is now a shadow copy (validated
but unused at runtime) → fold to the shared type or document the drift. (I1) shared `LdapOptionsValidator`
has **no `Enabled=false` guard** → validates even when LDAP is disabled (real for MxGateway, which can
disable dashboard LDAP).
- **ScadaBridge** (2 Critical): (C1) `ConfigSecretsTests` still checks the OLD flat key → passes
vacuously, no longer guards secret-in-config — repoint to nested key; (C2) `production-checklist.md`
still lists deleted flat keys → update; (I) unsafe `(RoleMappingResult)Scope!` cast → null-guard.
**Cross-cutting decision — shared library `LdapOptionsValidator` `Enabled` guard:** the validator runs
regardless of `Enabled`, requiring Server/SearchBase/ServiceAccountDn even when LDAP is off. Correct fix =
add an `if (!Enabled) return Success` guard to the shared validator and republish `0.1.1`, re-pinning all
consumers. (Alternative: each consumer always supplies those fields. The library fix is the principled one.)
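
The proposed guard can be sketched as follows. The types below are hypothetical stand-ins that model only the fields named above; the shared `LdapOptions` and `LdapOptionsValidator` carry more members and plug into the options-validation pipeline, which this sketch leaves out.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in; models only the fields mentioned in the text.
public sealed class LdapOptionsSketch
{
    public bool Enabled { get; set; }
    public string? Server { get; set; }
    public string? SearchBase { get; set; }
    public string? ServiceAccountDn { get; set; }
}

public static class LdapOptionsValidatorSketch
{
    // Returns the validation failures; an empty list means success.
    public static IReadOnlyList<string> Validate(LdapOptionsSketch options)
    {
        var failures = new List<string>();

        // The guard: when LDAP is switched off, required fields are not checked.
        if (!options.Enabled) return failures;

        if (string.IsNullOrWhiteSpace(options.Server))
            failures.Add("Server is required.");
        if (string.IsNullOrWhiteSpace(options.SearchBase))
            failures.Add("SearchBase is required.");
        if (string.IsNullOrWhiteSpace(options.ServiceAccountDn))
            failures.Add("ServiceAccountDn is required.");

        return failures;
    }
}
```
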
## Task 1.2/1.4 — DONE (reviewed + fixed, 2026-06-02)
Library hardened to **`0.1.1`** (`LdapOptionsValidator` skips when `Enabled=false`), republished, re-pinned in all 3 repos.
Fix commits: OtOpcUa `c4f315e` (startup insecure-transport guard gated on Enabled/DevStub + `Transport: Ldaps`
declared in the 3 prod overlays + test fidelity), MxGateway `f4dc11b` (group-claim shape documented as
non-breaking — claim read nowhere in prod; shadow `LdapOptions` kept with a drift-warning doc), ScadaBridge
`4db8c37` (secret-test repointed to nested key, prod checklist updated, `Scope` cast guarded). All targeted
suites green. **1.2 (LDAP) + 1.4 (config) complete across all 3 repos.**
Remaining Phase 1: **1.3 ApiKeys** (MxGateway donor cutover — low risk; ScadaBridge full re-architecture —
largest single item: SQLite store + Bearer format + scopes + key re-issuance), **1.5** claims/cookies,
**1.6** dev base DN, **1.7** canonical roles.
## Task 1.3 ApiKeys — MxGateway DONE; ScadaBridge pending (2026-06-02)
**Library bumped to `0.1.2`**: `Auth.ApiKeys` SQLite migrator now stamps schema version **2** (was 1) to
match the donor gateway's deployed `gateway-auth.db` — without it the gateway would fail to boot (migrator
threw on a newer on-disk version). Final schema byte-identical since v1; no key re-issuance. Republished,
re-pinned in MxGateway. (+2 migrator tests.)
**MxGateway 1.3 — DONE + APPROVED** (commit `05009d7`): deleted 28 local pipeline files, adopted
`Auth.ApiKeys 0.1.2` via `AddZbApiKeyAuth`; kept `ConstraintEnforcer`/gRPC interceptor/scopes/CLI/dashboard
on top via a `GatewayApiKeyIdentityMapper` (library identity → gateway identity-with-EffectiveConstraints).
Review: no Critical; no auth bypass, schema compat + crypto parity + gRPC status mapping verified. Non-blocking
follow-ups: (a) dashboard mutations now write two audit rows (library + `dashboard-*`) — fine, note for Phase 2
audit bridging; (b) nit: `GatewayApiKeyIdentityMapper` uses `Constraints as string` (opaque coupling) — consider
a guard/contract test.
**ScadaBridge 1.3 — PENDING**: the full inbound-API re-architecture (SQL Server → SQLite store, `X-API-Key`
→ Bearer, per-method-approval → scopes/constraints, **all inbound keys re-issued**). Largest/highest-risk
single item in the program; warrants its own focused pass (likely decomposed).
## ScadaBridge ApiKeys re-architecture — spec (FULL ADOPT, 2026-06-02)
Decision: **full adopt** the library SQLite store + scopes model. Single consistent contract all layers build to:
- **Token format**: `Authorization: Bearer sbk_<keyId>_<secret>` (prefix `sbk`). Replaces the raw `X-API-Key` header.
- **Scope model = method name.** A key's `Scopes` set = the API-method names it may call. `ApiMethod.ApprovedApiKeyIds`
(CSV of key int IDs) is **retired**; per-method approval moves to the key's scopes. Auth check at the endpoint:
`identity.Scopes.Contains(methodName)`.
- **Storage**: inbound keys move to the library's SQLite store (new `ScadaBridge:InboundApi:ApiKeyStore` sqlite path
+ pepper via `ApiKeyOptions.PepperSecretName`, `RunMigrationsOnStartup`). The SQL Server `ApiKey` entity is retired;
`ApiMethod` is KEPT minus `ApprovedApiKeyIds` (EF migration drops the column). `InboundApiRepository` loses its ApiKey
methods + `GetApprovedKeysForMethodAsync`.
- **Auth path** (`InboundAPI`): endpoint reads Bearer, calls library `IApiKeyVerifier.VerifyAsync`, then the scope check.
PRESERVE the security invariants: 401 (missing/invalid/disabled), **403 identical message for both "method not found"
and "not in scope"** (enumeration-safety, InboundAPI-011), constant-time compare (library does it), active-node 503 +
body-cap 413 filters unchanged, audit actor = key DisplayName. Delete `ApiKeyValidator` hashing + `ApiKeyHasher`.
- **Management** (`ManagementActor` + CLI `security api-key` + Commons messages): drive the library `IApiKeyAdminStore` +
`ApiKeySecretGenerator`. `create` returns `sbk_<keyId>_<secret>` once (plaintext-once preserved); methods a key may call
= its scopes, set on create/update (e.g. `--methods a,b` or grant/revoke-method commands). `list` returns id/name/enabled
(no secret), `update --enabled`, `delete`/revoke. Audit preserved.
- **CentralUI**: `ApiKeys.razor` (list/create/toggle/delete via admin store; show token once), `ApiKeyForm.razor` (edit the
key's method-scopes), `ApiMethodForm.razor` (method-side "approved keys" now reads/writes key scopes across keys).
- **Breaking change**: all inbound keys re-issued (new format); clients switch `X-API-Key` → `Authorization: Bearer`.
Needs a runbook + CHANGELOG. Re-pin ScadaBridge Auth packages to **0.1.2**.
Sub-tasks (sequential where files overlap): **(A)** storage retire + EF migration + library wiring/options;
**(B)** auth-path rewrite (Bearer + verifier + scope check); **(C)** management (ManagementActor + CLI + messages);
**(D)** CentralUI pages; **(E)** runbook/CHANGELOG + integration test sweep. A→(B,C)→D→E.
Sequencing note: doing it **additively** (add library path, switch auth, rewire mgmt/UI, retire SQL Server entity LAST)
keeps the build green at each step.
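
The auth-path invariants in the spec (401 for a missing, invalid, or disabled key; one identical 403 for both "method not found" and "not in scope"; scope check before any method lookup) reduce to a small decision function. A minimal sketch, assuming hypothetical names (`InboundAuthSketch`, `Decide`) and hypothetical message text; the real endpoint also applies the 503 and 413 filters, which are omitted here.

```csharp
using System.Collections.Generic;

// Hypothetical sketch of the endpoint's decision order; not the ScadaBridge code.
public static class InboundAuthSketch
{
    // One shared literal so "method not found" and "not in scope" are indistinguishable.
    public const string ForbiddenMessage = "Forbidden."; // hypothetical text

    // verifiedScopes is null when the Bearer token is missing, invalid, or disabled.
    // knownMethods stands in for the method lookup against the config database.
    public static (int Status, string? Message) Decide(
        ISet<string>? verifiedScopes, string methodName, ISet<string> knownMethods)
    {
        if (verifiedScopes is null) return (401, "Unauthorized.");

        // Scope check first: an out-of-scope caller never reaches the method lookup.
        if (!verifiedScopes.Contains(methodName)) return (403, ForbiddenMessage);
        if (!knownMethods.Contains(methodName)) return (403, ForbiddenMessage);

        return (200, null);
    }
}
```

Because both 403 branches return the same literal, a caller cannot use the response body to enumerate which method names exist.
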
### Re-arch progress
- **A+B foundation — DONE + reviewed+fixed** (commits `a94558c`, `1fcc4f5`; re-pinned to 0.1.2). Library `AddZbApiKeyAuth`
wired additively (`ScadaBridge:InboundApi:ApiKeyStore`, prefix `sbk`, reuses inbound pepper); inbound endpoint now uses
the library verifier + Bearer + `Scopes.Contains(methodName)`. Security invariants preserved: 401 generic / 403 identical
body for not-found AND not-in-scope (enumeration-safe, pinned to a literal in tests), scope-check-before-DB (no timing
oracle), fail-fast pepper preflight (Central), audit actor = DisplayName. Old SQL Server path still compiles (retired in E).
163/163 InboundAPI tests green. **NOTE for E:** the library's `ApiKeySecretGenerator.NewSecret()` is `internal` — seed/create
keys via the public `ApiKeyAdminCommands.CreateKeyAsync` seam (returns the assembled `sbk_…` token).
- **Library 0.1.3 — DONE + reviewed + PUBLISHED** (scadaproj commits `468959c` impl, `290e85c` tests; pushed to Gitea,
ApiKeys 0.1.3 nupkg verified HTTP 200). Added `IApiKeyAdminStore.SetScopesAsync(keyId, scopes, ct)` + `SetEnabledAsync(keyId,
enabled, whenUtc, ct)` (+ audited facade verbs `ApiKeyAdminCommands.SetScopesAsync`/`SetEnabledAsync` → eventTypes
`set-scopes`/`enable-key`/`disable-key`). **No schema change** (`CurrentVersion` stays 2): scopes column already exists;
`revoked_utc` doubles as the enabled flag (null = enabled), so enable/disable is a reversible toggle that preserves the
secret (proven by test asserting `SecretHash.SequenceEqual` + unchanged `last_used_utc`). This is what lets C/D edit a key's
method-scopes and toggle enabled WITHOUT re-issuing the token. **ScadaBridge must re-pin Auth packages 0.1.2 → 0.1.3.**
- **C (management), D (CentralUI), E (retire SQL Server ApiKey + ApiMethod.ApprovedApiKeyIds migration + runbook/CHANGELOG): IN PROGRESS.**
Mapping: `CreateApiKeyCommand` → `CreateKeyAsync` (keyId = `Guid.NewGuid().ToString("N")`,
DisplayName = name, scopes = `--methods`); `ListApiKeysCommand` → `ListKeysAsync` (enabled = `RevokedUtc is null`);
`UpdateApiKeyCommand(IsEnabled)` → `SetEnabledAsync`; new set-scopes path → `SetScopesAsync`; `DeleteApiKeyCommand` →
revoke-then-`DeleteKeyAsync`. All management message keys switch `int ApiKeyId` → `string KeyId`.
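
The "`revoked_utc` doubles as the enabled flag" behaviour from the 0.1.3 notes can be pictured with an in-memory row. This is a hypothetical sketch (`ApiKeyRowSketch` is not a library type, and the real store is SQLite); it only shows why the toggle is reversible and leaves the secret hash alone.

```csharp
using System;

// Hypothetical in-memory row; property names follow the columns named in the text.
public sealed class ApiKeyRowSketch
{
    public ApiKeyRowSketch(string keyId, byte[] secretHash)
    {
        KeyId = keyId;
        SecretHash = secretHash;
    }

    public string KeyId { get; }
    public byte[] SecretHash { get; }
    public DateTimeOffset? RevokedUtc { get; private set; }

    // null revoked_utc means the key is enabled.
    public bool IsEnabled => RevokedUtc is null;

    // Disabling stamps revoked_utc; enabling clears it. SecretHash is never touched,
    // so the already-issued token keeps working after a re-enable.
    public void SetEnabled(bool enabled, DateTimeOffset whenUtc)
    {
        RevokedUtc = enabled ? (DateTimeOffset?)null : whenUtc;
    }
}
```
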
### Discovered architecture (CentralUI Explore, 2026-06-02) — expands C/D/E
Two facts the original A–E spec missed:
1. **CentralUI bypasses the ManagementActor.** `Components/Pages/Admin/ApiKeys.razor`, `ApiKeyForm.razor`, and
`Components/Pages/Design/ApiMethodForm.razor` call `IInboundApiRepository` (SQL Server EF) **directly** — they do NOT
send the `CreateApiKeyCommand`/etc. management messages. So there are **two** management entry points to rewire
(CLI→ManagementActor uses the messages; CentralUI→repository uses the entities). Decoupling: introduce one app-side
**`IInboundApiKeyAdmin` seam** over the library `ApiKeyAdminCommands`, and route BOTH CLI and CentralUI through it
(DRY + single audit path). The message-contract change (int→string) touches only CLI+ManagementActor; the
entity/repository change (`ApiKey.Id`, `ApiMethod.ApprovedApiKeyIds`) touches CentralUI + TransportExport.
2. **TransportExport couples API keys + methods into config export/import** (`Components/Pages/Design/TransportExport.razor`
+ `.razor.cs`, `HashSet<int>` selections, `ExportSelection`). With keys now in the library SQLite store (per-env pepper,
secret-once), a key can't be exported/re-imported usefully. **Decision (user, 2026-06-02): EXCLUDE inbound API keys from
transport — export API methods only; keys are re-created + method-scopes re-granted per environment.**
CentralUI blast radius (string keyId + scopes replace int Id + ApprovedApiKeyIds CSV): `Admin/ApiKeys.razor`,
`Admin/ApiKeyForm.razor`, `Design/ApiMethodForm.razor` (approved-keys ↔ key-scopes), `Design/TransportExport.razor(.cs)`,
`Design/ExternalSystems.razor` (uses method `int` id — methods STAY int in SQL Server, so unaffected for keys),
`Dashboard.razor` (key count), test `Admin/ApiKeyFormAuditDrillinTests.cs`.
### C/D/E decomposition — 5 reviewed green sub-commits (user: "coordinated multi-commit now", 2026-06-02)
- **C1** — re-pin ScadaBridge Auth 0.1.2→0.1.3; add app-side `IInboundApiKeyAdmin` seam (string-keyId model:
Create(name,methods)→(keyId,token) / List / SetEnabled / SetMethods / Delete[=revoke+delete] / GetMethodsForKey /
GetKeysForMethod) over the library facade; register `ApiKeyAdminCommands` + the seam in Host **and** CentralUI DI; seam
unit tests. **Purely additive — build green.**
- **C2** — Commons `Messages/Management/SecurityCommands.cs` contracts int→string keyId + add `Methods` + new
`SetApiKeyMethodsCommand`; rewire ManagementActor handlers + CLI `security api-key` onto the seam; update ManagementActor
tests. (CentralUI unaffected — it doesn't use these messages.)
- **C3** — CentralUI `ApiKeys.razor`/`ApiKeyForm.razor`/`ApiMethodForm.razor` (+ Dashboard count) off `IInboundApiRepository`-
for-keys onto the seam; string keyId; method-scope editing replaces `ApprovedApiKeyIds`; update bUnit test. (Methods stay
in SQL Server; just stop using the `ApprovedApiKeyIds` column — dropped in C5.)
- **C4** — TransportExport: remove API-key selection/export (methods-only); drop key `HashSet<int>` + `ExportSelection` keys;
tests.
- **C5 (=E)** — retire SQL Server `ApiKey` entity + DbContext reg + `IInboundApiRepository` key methods +
`GetApprovedKeysForMethodAsync`; drop `ApiMethod.ApprovedApiKeyIds`; EF migration (drop ApiKeys table + column); delete
residual `ApiKeyValidator`/`ApiKeyHasher`; runbook + CHANGELOG (breaking: re-issue keys, `X-API-Key` → `Authorization: Bearer`);
full build+test sweep.
#### Re-arch sub-commit progress (2026-06-02)
- **C1 — DONE + reviewed** (ScadaBridge commits `d09def2` seam+re-pin-0.1.3, `7f7ea3f` review polish). `IInboundApiKeyAdmin`
seam (interface in Commons, `LibraryInboundApiKeyAdmin` impl in the Security project over `ApiKeyAdminCommands`), DI in
Host (CentralUI shares that container). Spec PASS + code-review APPROVED (guard `name`, doc throws/O(n) contract).
**Two pre-existing Host.Tests reds from the prior session's Auth work (uncaught because Host.Tests weren't run) fixed as
part of restoring a green baseline:** (a) `7e25efa` — A+B's Central pepper preflight (`1fcc4f5`) needs a ≥16-char test
`ApiKeyPepper`; supplied via env vars in the Central test fixtures (test-only) + 3 guard tests; Host.Tests 86 fail → 1.
(b) `55099b1` — LDAP cutover (`ac34dac`) made component-lib `AddSecurity(IConfiguration)` violate ScadaBridge's
`OptionsTests` arch rule; moved `AddZbLdapAuth` to the Host composition root, dropped the param (behaviour-preserving);
Host.Tests 1 fail → **0**. Green baseline now: build 0/0, Host.Tests 228, Security.Tests 89, InboundAPI 163, CentralUI 584.
**NOTE for Phase 2:** `AuditLog.AddAuditLog(IConfiguration)` also takes IConfiguration but is intentionally NOT in the
`OptionsTests` scanned set — revisit during audit adoption (Task 2.5), don't silently "fix".
- **C2 — DONE + reviewed** (SB commits `6518e93` rewire, `8219b8e` review fixes). Commons messages int→string keyId
+ `Methods` + new `SetApiKeyMethodsCommand`; ManagementActor's 5 API-key handlers + CLI `security api-key` now drive
`IInboundApiKeyAdmin`; ScadaBridge management audit preserved (actor = user.Username; secret/token never audited/logged).
Spec PASS, code-review APPROVED after fixes: not-found now throws `ManagementCommandException` BEFORE audit (no spurious
audit on no-op update/delete/set-methods); empty `Methods` rejected server-side (prevents unusable key on create + stealth-
disable via `set-methods ""`); token advisory→stderr. Green: ManagementService 125, CLI 188, + Security/InboundAPI/Host/
CentralUI unchanged. CentralUI + SQL Server `ApiKey` entity/repo untouched (C3/C5).
- **C3 — DONE + reviewed** (SB commits `107e524` rewire, `d1191fd` review fixes). CentralUI `Admin/ApiKeys.razor`,
`Admin/ApiKeyForm.razor`, `Design/ApiMethodForm.razor`, `Dashboard.razor` onto `IInboundApiKeyAdmin`: string keyId,
method-NAME scopes replace the `ApprovedApiKeyIds` CSV, one-time token display on create, key Name fixed-after-create
(no rename in the lib model). The "approved keys ↔ key scopes" inversion is a pure tested helper
`CentralUI/Services/ApiMethodKeyScopeReconciler.cs` (save method entity first, then reconcile each affected key's full
scope set fresh; empty-last-scope revoke is blocked with a clear message, never pushes an empty set). Spec PASS,
code-review APPROVED after fixes: seam `bool` not-found now surfaced (no silent success), partial-reconcile-failure
guidance ("method saved, key scopes partially applied — review on API Keys page"), create validation order, concurrent-
edit reconciler test. CentralUI.Tests 595 green; all other suites unchanged. TransportExport + SQL Server entities/repo
untouched (C4/C5). (Also removed a stray `Name` artifact file from an accidental redirect — not committed.)
- **C4 — DONE + reviewed** (SB commits `731cfd3` rewire, `b13d7b3` review polish). TransportExport excludes inbound API
keys (methods-only) end-to-end — UI selection, `ExportSelection`, DependencyResolver, EntitySerializer/DTOs, BundleExporter,
manifest/summary, CLI `--api-keys`, ManagementActor `HandleExportBundle`, and the IMPORT path (BundleImporter/ArtifactDiff:
no key creation; method overwrite PRESERVES the destination's existing `ApprovedApiKeyIds`, doesn't clobber). Method export
drops `ApprovedApiKeyIds`. Backward-compat: legacy bundles with an `apiKeys` section still deserialize (tolerant `ApiKeys?`
field via shared `BundleJsonOptions` + `WhenWritingNull`) and are IGNORED on import with an `ImportResult.ApiKeysIgnored`
count + audit stamp; new exports omit the field. UI info note added. Spec PASS, code-review APPROVED (note: review I-1
"added-unrestricted count" intentionally SKIPPED — wrong model: inbound auth is scope-based, the verifier ignores
`ApprovedApiKeyIds`, so a new method is callable by NO key until a scope is granted). Transport.Tests 60, IntegrationTests
34 green. SQL Server `ApiKey`/`ApiMethod` entities + repo untouched (C5).
- **C5 (=E) — DONE + reviewed** (SB commit `afa5598`). Retired SQL Server `ApiKey` entity + 7 `IInboundApiRepository` key
methods + `ApiMethod.ApprovedApiKeyIds` + `DbSet<ApiKey>`/fluent config + residual `ApiKeyHasher`/`IApiKeyHasher`/
`ApiKeyValidator` (+ their tests). EF migration `RetireInboundApiKeyStore` (DropTable `ApiKeys` + DropColumn
`ApprovedApiKeyIds`; `Down` recreates both byte-faithfully; ModelSnapshot consistent). CHANGELOG.md + tracked runbook
`docs/operations/inbound-api-key-reissue.md` (BREAKING: `X-API-Key` → `Authorization: Bearer sbk_…`, all keys re-issued;
per-env SqlitePath + ≥16-char ApiKeyPepper). Spec PASS, code-review APPROVED: migration Down/snapshot verified, inbound
verifier path (A+B) intact, no live consumer broke. Green: ConfigurationDatabase 241, InboundAPI 148 (was 163: removed
validator/hasher tests), Security 89, Host 227 (was 228: removed validator DI test), ManagementService 125, CLI 188,
CentralUI 595, Transport 60+34. (Pre-existing infra-dependent failures — IntegrationTests ×11, AuditLog ×1, needing live
LDAP/SQL/SMTP — proven identical at baseline `b13d7b3` via git-stash; StaleTagMonitor flaky timer tests pass 13/13 isolated.)
**Installer/secret note:** the C5 code-review flagged the (untracked, intentionally `.gitignore`d `/deploy/`) `install.ps1`
not injecting the pepper — fixed ON DISK (the on-disk installer now takes `-ApiKeyPepper`); a subagent had force-committed
the ignored deploy script (which embeds a real default JWT key) — that commit was RESET (`git reset --mixed`), keeping the
edit on disk and the secret OUT of git history (branch was never pushed). The pepper requirement is documented in the
tracked runbook.
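
The C3 "approved keys ↔ key scopes" inversion described above can be sketched as a pure planning step: given the method being edited and the keys that should be approved for it, compute each affected key's full new scope set, and refuse any change that would leave a key with no scopes. The names below (`ScopeReconcilerSketch`, `Plan`) are hypothetical, and this is not the code in `ApiMethodKeyScopeReconciler.cs`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the reconciliation rule; not the CentralUI helper itself.
public static class ScopeReconcilerSketch
{
    // Returns keyId -> full new scope set for every key whose scopes change.
    // Throws when revoking the method would leave a key with no scopes at all.
    public static Dictionary<string, HashSet<string>> Plan(
        string methodName,
        ISet<string> approvedKeyIds,
        IReadOnlyDictionary<string, HashSet<string>> currentScopesByKey)
    {
        var plan = new Dictionary<string, HashSet<string>>();

        foreach (var entry in currentScopesByKey)
        {
            bool has = entry.Value.Contains(methodName);
            bool wanted = approvedKeyIds.Contains(entry.Key);
            if (has == wanted) continue; // nothing to change for this key

            // Always build the full set fresh; never send a delta.
            var next = new HashSet<string>(entry.Value);
            if (wanted) next.Add(methodName); else next.Remove(methodName);

            if (next.Count == 0)
                throw new InvalidOperationException(
                    $"Key '{entry.Key}' would be left with no scopes; revoke blocked.");

            plan[entry.Key] = next;
        }

        return plan;
    }
}
```
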
### ✅ Task 1.3 (Adopt ZB.MOM.WW.Auth.ApiKeys) COMPLETE across all repos
MxGateway donor cutover + ScadaBridge full re-architecture (C1 seam → C2 mgmt/CLI → C3 CentralUI → C4 TransportExport →
C5 retire+migration+runbook), all reviewed, lib at **0.1.3**. ScadaBridge inbound API is now 100% on the shared library
(Bearer `sbk_<keyId>_<secret>`, scope = method name, per-key SQLite store + per-env pepper); the SQL Server key model is
fully retired. Remaining Phase 1: **1.5** (AspNetCore claims/cookies, 3 UIs), **1.6** (dev GLAuth base DN), **1.7**
(canonical roles, 3 repos). Then Phase 2 (audit) + Phase 3 (Actor wiring).
## Resolved decisions (2026-06-02)
- **Decision A — ScadaBridge inbound API keys depth → (a) FULL ADOPT.** Re-architect inbound-API auth to the
library's model: `<prefix>_<keyId>_<secret>` Bearer token format, keyId lookup + constant-time compare,
scopes/constraints, and **move inbound API keys into the library's SQLite store** (separate from the SQL Server
config DB). This is the largest, highest-risk item in Phase 1. Implications to handle in Task 1.3:
- New SQLite auth DB for ScadaBridge inbound keys (path via `ApiKeyOptions.SqlitePath`); migrate/retire the
SQL Server `ApiKey{Name,KeyHash}` table + `ApiMethod.ApprovedApiKeyIds` linkage.
- Re-model **per-method approval** as the library's scopes/constraints (or the opaque constraint blob) — the
`ApiMethod.ApprovedApiKeyIds` set becomes per-key scope grants.
- Switch the inbound transport from `X-API-Key` header to `Authorization: Bearer <token>` (a client-visible
contract change — extends the already-accepted token-format change; needs the interop check + a doc/CHANGELOG note).
- Existing raw keys cannot be migrated (deterministic-by-value hash, no keyId/secret split) → **re-issue** all
inbound API keys; call this out in the cutover runbook.
- **Decision B — canonical role mappings → confirmed as tabled above** (OtOpcUa `ConfigViewer→Viewer`,
`ConfigEditor→Designer`, `FleetAdmin→Administrator+Deployer`; MxGateway `Viewer/Admin`; ScadaBridge
`Admin→Administrator`, `Design→Designer`, `Deployment→Deployer`, `Audit→Administrator`, `AuditReadOnly→Viewer`).
- **Decision C — dev escape hatches → keep app-side, unchanged.** OtOpcUa `DevStubMode` and MxGateway
`AllowAnonymousLocalhost`/loopback bypass have no library equivalent; preserve them in each app outside the
shared `Auth.Ldap` path.
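
Decision B's ScadaBridge mapping, including the two collapses, can be written down as a lookup table. A minimal sketch: the enum is redeclared locally so the block is self-contained (the real `CanonicalRole` lives in the library), `ScadaBridgeRoleMapSketch` is a hypothetical name, and dropping unknown native roles is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Mirrors the members listed for the library enum; redeclared for self-containment.
public enum CanonicalRole { Viewer, Operator, Engineer, Designer, Deployer, Administrator }

// Hypothetical helper; not a type in ScadaBridge or the library.
public static class ScadaBridgeRoleMapSketch
{
    private static readonly Dictionary<string, CanonicalRole> Map =
        new Dictionary<string, CanonicalRole>(StringComparer.Ordinal)
        {
            ["Admin"] = CanonicalRole.Administrator,
            ["Design"] = CanonicalRole.Designer,
            ["Deployment"] = CanonicalRole.Deployer,
            ["Audit"] = CanonicalRole.Administrator, // collapse
            ["AuditReadOnly"] = CanonicalRole.Viewer, // collapse
        };

    // Assumption: native roles without a mapping are dropped (fail closed).
    public static HashSet<CanonicalRole> ToCanonical(IEnumerable<string> nativeRoles)
    {
        var result = new HashSet<CanonicalRole>();
        foreach (string role in nativeRoles)
            if (Map.TryGetValue(role, out CanonicalRole canonical))
                result.Add(canonical);
        return result;
    }
}
```

The collapse is visible in the result: a user holding both `Admin` and `Audit` ends up with the single canonical role `Administrator`, which is why the `OperationalAudit`/`AuditExport` separation of duties no longer exists after the mapping.
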
## Phase 1 tail — decisions + current state (2026-06-02, resumed)
Task 1.0 gate read-only re-exploration confirmed the post-cutover state for 1.5/1.6/1.7 (3 parallel Explore agents):
- **None of the 3 repos reference `ZbClaimTypes`/`ZbCookieDefaults` yet.** `ZbClaimTypes.Name`/`Role` alias the framework
URIs (`ClaimTypes.Name`/`.Role`); `DisplayName`/`Username`/`ScopeId` = new `zb:`-prefixed strings.
- Claim mints today: **OtOpcUa** `AuthEndpoints.cs` uses `ClaimTypes.NameIdentifier` + `JwtTokenService.{Username,DisplayName}ClaimType` ("Username"/"DisplayName") + `ClaimTypes.Role` (JWT-in-cookie). **MxGateway** `DashboardAuthenticator.CreatePrincipal` uses `ClaimTypes.{NameIdentifier,Name,Role}` + custom `mxgateway:ldap_group`. **ScadaBridge** `CentralUI/Auth/AuthEndpoints.cs` + `JwtTokenService` use **plain** `"DisplayName"/"Username"/"Role"/"SiteId"/"LastActivity"` strings — `"Role"`/`"SiteId"` are load-bearing in `TokenValidationParameters` + every `AuthorizationPolicies` `RequireClaim`.
- Cookie names confirmed: `ZB.MOM.WW.OtOpcUa.Auth` / `MxGatewayDashboard` / `ZB.MOM.WW.ScadaBridge.Auth`. All three apps already do HttpOnly+SameSite=Strict+sliding+SecurePolicy via hand-rolled `PostConfigure` (no `ZbCookieDefaults.Apply`).
- Dev base DNs today: OtOpcUa + MxGateway = `dc=lmxopcua,dc=local`; ScadaBridge = `dc=scadabridge,dc=local`.
- `CanonicalRole` is referenced **nowhere** in any repo yet (Task 1.7 is its first use).
**Decision A3 (Task 1.6 dev base DN) → `dc=zb,dc=local`** (product-neutral, matches the ZB.MOM.WW family; all 3 dev
fixtures + dev appsettings move to it — prod directories untouched). ScadaBridge GLAuth user DNs become
`cn=<user>,ou=<group>,ou=users,dc=zb,dc=local`; OtOpcUa/MxGateway move off `dc=lmxopcua`.
**Decision (Task 1.5 ScadaBridge depth) → FULL canonical incl. role/scope.** Migrate ScadaBridge's role claim to the
framework URI (`ZbClaimTypes.Role`) and the site claim to `ZbClaimTypes.ScopeId` across cookie + JWT mint +
`TokenValidationParameters` + every policy `RequireClaim` + tests (cleanest: redefine the `JwtTokenService.*ClaimType`
constants to alias `ZbClaimTypes.*` so all existing references inherit canonical values). **Treated as high-risk** for the
ScadaBridge slice (serial spec→code review, full ScadaBridge suite). OtOpcUa/MxGateway slices stay standard.
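
The "redefine the constants as aliases" approach can be sketched as follows. `ZbClaimTypesSketch` is a hypothetical stand-in for the library's `ZbClaimTypes`: only the fact that `Name`/`Role` alias the framework URIs and that the other three are `zb:`-prefixed comes from the findings above, and the exact `zb:` spellings here are assumed. `JwtTokenServiceSketch` and `SiteScopeSketch` are likewise illustrative names.

```csharp
using System.Security.Claims;

// Hypothetical stand-in for the shared library's ZbClaimTypes.
public static class ZbClaimTypesSketch
{
    public const string Name = ClaimTypes.Name;         // framework URI
    public const string Role = ClaimTypes.Role;         // framework URI
    public const string DisplayName = "zb:displayname"; // assumed spelling
    public const string Username = "zb:username";       // assumed spelling
    public const string ScopeId = "zb:scopeid";         // assumed spelling
}

// App-side constants keep their names; only their values become canonical.
public static class JwtTokenServiceSketch
{
    public const string RoleClaimType = ZbClaimTypesSketch.Role;
    public const string SiteIdClaimType = ZbClaimTypesSketch.ScopeId;
    public const string DisplayNameClaimType = ZbClaimTypesSketch.DisplayName;
    public const string UsernameClaimType = ZbClaimTypesSketch.Username;
}

public static class SiteScopeSketch
{
    // An existing call site: it still references the app constant, so it picks up
    // the canonical claim type without being edited.
    public static bool CanAccessSite(ClaimsPrincipal user, string siteId) =>
        user.HasClaim(JwtTokenServiceSketch.SiteIdClaimType, siteId);
}
```

Because the app constants stay in place, the risk concentrates in code that bypasses them with string literals (`"Role"`, `"SiteId"`), which is why the slice is reviewed as high-risk.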
### ✅ Task 1.5 (AspNetCore claims/cookies) COMPLETE across all 3 repos (reviewed)
- **OtOpcUa** `83856b7` + review-fix `d0777ee` (spec ✅, code ✅): `.Security` adds the `Auth.AspNetCore` pkg ref; `JwtTokenService.{Username,DisplayName}ClaimType` alias `ZbClaimTypes.{Username,DisplayName}`; cookie principal emits `ZbClaimTypes.Name` (replaced `NameIdentifier` — grep-confirmed no other reader) + `ZbClaimTypes.Role`; cookie via `ZbCookieDefaults.Apply`, name kept. Issued JWT is documented as issue-only (no `AddJwtBearer` in OtOpcUa; role stays short `"Role"`; `BuildValidationParameters` pins `RoleClaimType`/`NameClaimType` for forward-compat). 35/35.
- **MxGateway** `7e1af37` (spec ✅, code ✅): `DashboardAuthenticator` emits `ZbClaimTypes.{Username,DisplayName}` + identity `nameType/roleType=ZbClaimTypes.{Name,Role}`; keeps `mxgateway:ldap_group` + `NameIdentifier` (HubTokenService reads it); cookie via `ZbCookieDefaults.Apply(requireHttps:true, idleTimeout:8h)` (8h preserved), `RequireHttpsCookie=false` dev-HTTP override kept, name kept. Dashboard 85/85; full 575/578 (3 pre-existing FakeWorker reds).
- **ScadaBridge** `a0938f7` + spelling-fix `c185a56` (high-risk; spec ✅, code ✅): `JwtTokenService.*ClaimType` constants aliased to `ZbClaimTypes.*` (`RoleClaimType`=framework URI, `SiteIdClaimType`=`ScopeId`); JWT mint `MapInboundClaims=false`+`OutboundClaimTypeMap.Clear()` (instance-isolated, reviewer-verified) and validate `MapInboundClaims=false`+pinned `RoleClaimType`/`NameClaimType` → byte-symmetric round-trip; cookie identity `roleType=RoleClaimType`; every site-scope read on `SiteIdClaimType`; cookie via `ZbCookieDefaults.Apply` (30-min idle), name kept. No `AddJwtBearer` middleware (sole JWT path = `JwtTokenService.ValidateToken`). Role VALUES unchanged. Security 93/93, CentralUI 595/595, ManagementService 125/125, Host 227/227; infra reds (Integration ×11, AuditLog ×1, flaky StaleTagMonitor) confirmed pre-existing by stash-at-HEAD. **Minor (deferred):** a stale "PostConfigure" comment word; JWT-validated principals have null `Identity.Name` (no regression, no bearer path).
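
The mint/validate symmetry described for ScadaBridge can be sketched as below, assuming the `System.IdentityModel.Tokens.Jwt` package. This is not the repo's `JwtTokenService`: the class name, issuer/audience strings, and the inline dev key are all illustrative. The point is that one handler instance with inbound mapping off and its outbound map cleared writes claim types verbatim, and validation with pinned `RoleClaimType`/`NameClaimType` reads them back under the same types.

```csharp
using System;
using System.IdentityModel.Tokens.Jwt;
using System.Security.Claims;
using System.Text;
using Microsoft.IdentityModel.Tokens;

public static class JwtRoundTripSketch
{
    // Sketch-only key (40 bytes); real keys come from configuration or a secret store.
    private static readonly SymmetricSecurityKey Key =
        new(Encoding.UTF8.GetBytes("sketch-only-signing-key-0123456789abcdef"));

    public static ClaimsPrincipal RoundTrip(string username, string role)
    {
        // Both settings are per-instance, so the static default maps are left alone.
        var handler = new JwtSecurityTokenHandler { MapInboundClaims = false };
        handler.OutboundClaimTypeMap.Clear();

        var descriptor = new SecurityTokenDescriptor
        {
            Issuer = "sketch-issuer",
            Audience = "sketch-audience",
            Subject = new ClaimsIdentity(new[]
            {
                new Claim(ClaimTypes.Name, username),
                new Claim(ClaimTypes.Role, role),
            }),
            Expires = DateTime.UtcNow.AddMinutes(30),
            SigningCredentials = new SigningCredentials(Key, SecurityAlgorithms.HmacSha256),
        };
        string jwt = handler.WriteToken(handler.CreateToken(descriptor));

        var parameters = new TokenValidationParameters
        {
            ValidIssuer = "sketch-issuer",
            ValidAudience = "sketch-audience",
            IssuerSigningKey = Key,
            RoleClaimType = ClaimTypes.Role, // pinned so IsInRole reads the URI claim
            NameClaimType = ClaimTypes.Name,
        };
        return handler.ValidateToken(jwt, parameters, out _);
    }
}
```

With the defaults left on, the handler may rewrite claim types on the way out or in, and a `RequireClaim` on the canonical type can then miss. Pinning both sides avoids relying on either default map.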
### ✅ Task 1.6 (unify dev LDAP base DN → `dc=zb,dc=local`) COMPLETE across all 3 repos (reviewed, code-review-only per `small` class)
Mechanical, grep-verified substitution of each repo's dev directory base DN to the neutral `dc=zb,dc=local`; prod left untouched (no in-repo prod overlay carries the dev DN; `/deploy` is gitignored and was not touched). OU structure preserved throughout.
- **OtOpcUa** `8ba289f`: `LdapOptions.SearchBase` default, integration `docker-compose.yml` `LDAP_ROOT` + `TwoNodeClusterHarness` SearchBase/ServiceAccountDn, `AclEdit.razor` placeholder, `docs/v2/{dev-environment,phase-7-e2e-smoke}`. `grep dc=lmxopcua`→empty. Security 35, AdminUI 121, ControlPlane 29, Runtime 74 green.
- **MxGateway** `9572045`: `LdapOptions` defaults, `appsettings.json`, dashboard test group-DNs, `glauth.md` (dev DNs only — the `DC=corp,…` prod-example column left intact), `CLAUDE.md` index line. `grep dc=lmxopcua`→empty. 575/578 (3 pre-existing FakeWorker).
- **ScadaBridge** `6ae6051` (14 files): app `appsettings.Central.json`, the 4 docker/docker-env2 central-node configs, `infra/glauth/config.toml` baseDN, `infra/tools/ldap_tool.py`, 4 test fixtures, `docs/test_infra/*`. Cluster nodes use the shared `scadabridge-ldap` container backed by the now-updated `infra/glauth/config.toml` (no separate seed). `grep dc=scadabridge`→only the 2 excluded historical `docs/plans/*` records + synthetic `dc=example` left. Full non-infra suite green (Security 93, CentralUI 595, ManagementService 125, Host 227, ConfigurationDatabase 241).
## Task 1.7 (canonical roles) — inventory + decisions (2026-06-02)
Read-only role inventory (3 parallel Explore agents) found the canonical-role standardization is bigger than the plan's "~5 min/repo": it changes role string VALUES (claims + config-DB + enforcement), needs config-DB DATA migrations, and makes the ScadaBridge SoD collapse real. **EF persistence confirmed:** OtOpcUa `AdminRole` is `HasConversion<string>().HasMaxLength(32)` (stores the enum MEMBER NAME); ScadaBridge `LdapGroupMappings.Role` is free-text `nvarchar(500)` with HasData seed. Both → renaming role values requires a data migration.
**Resolved per-repo mapping (Decision B + filled gaps):**
- **MxGateway:** `Viewer→Viewer` (no-op), `Admin→Administrator`. Clean rename of `DashboardRoles.Admin` VALUE + `GroupToRole` config + `GatewayOptionsValidator` allowed-set. NO DB (dashboard roles not persisted). ⚠️ MUST NOT touch the separate gRPC `GatewayScopes.Admin = "admin"` data-plane scope.
- **OtOpcUa:** `ConfigViewer→Viewer`, `ConfigEditor→Designer`, `FleetAdmin→Administrator`, **`DriverOperator→Operator`** (plan-omitted gap). Rename `AdminRole` members + DevStub/appsettings `GroupToRole` values + every `[Authorize(Roles=)]`/`RequireRole` role string. **Config-DB data migration** on `LdapGroupRoleMappings.Role` (raw SQL UPDATE old→new; column is the same string col so it's a data, not schema, change). Data-plane `NodePermissions` bitmask UNTOUCHED. Enforcement preserved: `Designer`(←ConfigEditor) keeps the deploy access it has today (`Deployments.razor` `Roles="FleetAdmin,ConfigEditor"` → `"Administrator,Designer"`). Policy NAMES (e.g. `"DriverOperator"`/`"FleetAdmin"` policy keys) may stay as internal indirections; only the role STRINGS they check become canonical.
- **ScadaBridge (heaviest):** `Admin→Administrator`, `Design→Designer`, `Deployment→Deployer`, **`Audit→Administrator`** (collapse), **`AuditReadOnly→Viewer`** (collapse). Requires: config-DB data migration (`LdapGroupMappings.Role` UPDATE + HasData seed + ModelSnapshot); ~20 hard-coded role-string sites (ManagementActor site-scope bypass ×6 + `GetRequiredRole`, DebugStreamHub ×2, BrowseService/BindingTester, policy arrays); SoD policy rework `OperationalAuditRoles→{Administrator,Viewer}` + `AuditExportRoles→{Administrator}` so former `AuditReadOnly`(→Viewer) keeps audit-READ but still can't export; all role-asserting tests. **Real security consequence (accepted):** `Audit→Administrator` grants former audit-only users the full admin surface (create sites, manage LDAP mappings/API keys, import bundles). Site-scoping stays orthogonal (computed from `PermittedSiteIds`, Deployment-only).
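
A data-only EF Core migration of the kind described for OtOpcUa might look like the sketch below. The class names, the `[LdapGroupRoleMappings]` table name, and SQL Server bracket quoting are assumptions for illustration. Only the three one-to-one `AdminRole` renames are shown, which is what makes `Down` an exact reverse; a collapsing rename such as ScadaBridge's `Audit→Administrator` cannot be reversed this way.

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

public static class RoleRenameSql
{
    // Builds one UPDATE. Inputs are compile-time constants from the migration below,
    // never user data, so plain string interpolation is acceptable in this sketch.
    public static string Update(string table, string from, string to) =>
        $"UPDATE [{table}] SET [Role] = '{to}' WHERE [Role] = '{from}';";
}

// Hypothetical migration: no schema operations, only row updates.
public class CanonicalizeAdminRolesSketch : Migration
{
    private const string Table = "LdapGroupRoleMappings"; // assumed table name

    private static readonly (string Old, string New)[] Renames =
    {
        ("ConfigViewer", "Viewer"),
        ("ConfigEditor", "Designer"),
        ("FleetAdmin", "Administrator"),
    };

    protected override void Up(MigrationBuilder migrationBuilder)
    {
        foreach (var (oldRole, newRole) in Renames)
            migrationBuilder.Sql(RoleRenameSql.Update(Table, oldRole, newRole));
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        // Exact reverse, valid only because every rename above is one-to-one.
        foreach (var (oldRole, newRole) in Renames)
            migrationBuilder.Sql(RoleRenameSql.Update(Table, newRole, oldRole));
    }
}
```

Since the column type and length are unchanged, a migration like this should leave the model snapshot untouched.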
**Decisions (2026-06-02):** depth = **FULL canonical (values change, incl. config-DB migrations + real SoD escalation)**; cadence = **proceed now**. Execution: MxGateway + OtOpcUa single high-risk commits each (parallel); ScadaBridge as a focused atomic change (1–2 coupled commits; the rename + seed + migration are coupled, so it does not cleanly split into 1.3-style green sub-increments). High-risk serial review (spec→code) per repo + full ScadaBridge suite.
### ✅ Task 1.7 (canonical roles) COMPLETE across all 3 repos (high-risk; spec ✅ + code ✅ each)
- **MxGateway** `04bce3ff` (spec ✅, code ✅): `DashboardRoles.Admin` value `"Admin"→"Administrator"` (Viewer unchanged) + `GroupToRole` config; validator/enforcement inherit the constant. NO DB (dashboard roles not persisted). gRPC `GatewayScopes.Admin="admin"` proven untouched. 577/580 (3 pre-existing FakeWorker).
- **OtOpcUa** `c1619d9` (spec ✅, code ✅): `AdminRole` enum members → `Viewer/Designer/Administrator`; `DriverOperator` role string → `Operator` (policy NAMES kept stable); DevStub `["Administrator"]`. **Data migration** `20260602112419_CanonicalizeAdminRoles` (`UPDATE LdapGroupRoleMapping` old→new, reverse Down, snapshot unchanged, no pending model changes). `Deployments.razor` `[Authorize(Roles="Administrator,Designer")]` (deploy access preserved). Data-plane `NodePermissions`/`NodeAcl`/evaluator untouched (proven). Security 45, Configuration 90, AdminUI 121 green. (Minor non-issues: an `ou=FleetAdmin` placeholder DN + a data-plane doc-comment — both LDAP-group/doc text, not role values.)
- **ScadaBridge** `b104760` + doc-fix `4118452` (high-risk; spec ✅, code ✅): `Roles` → canonical `{Administrator,Designer,Deployer,Viewer}` (Audit/AuditReadOnly removed); **SoD reworked** `OperationalAudit={Administrator,Viewer}`, `AuditExport={Administrator}` (Viewer reads-not-exports audit; Administrator does both + full admin). All enforcement literals moved incl. the 6 ManagementActor site-scope bypasses + DebugStreamHub + BrowseService/BindingTester. **Migration** `20260602113822_CanonicalizeRoles` (seed `UpdateData` + idempotent raw catch-all for operator rows; lossy Down documented; snapshot consistent). **Real SoD escalation** (Audit→Administrator gains full admin) documented in CHANGELOG. Full non-infra suite green (Security 93, CentralUI 595, ManagementService 125, Host 227, ConfigurationDatabase 241); infra reds pre-existing (stash-at-HEAD confirmed). `4118452` corrected stale role-name prose in NavMenu comments (comment-only; CentralUI rebuild 0/0).
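
The reworked ScadaBridge separation of duties reduces to two role sets. The sketch below uses only `System.Security.Claims`; `AuditPoliciesSketch` and its members are illustrative names, not the repo's `AuthorizationPolicies`.

```csharp
using System.Linq;
using System.Security.Claims;

public static class AuditPoliciesSketch
{
    // Role sets as recorded above: Viewer can read the audit log, only Administrator can export.
    public static readonly string[] OperationalAuditRoles = { "Administrator", "Viewer" };
    public static readonly string[] AuditExportRoles = { "Administrator" };

    public static bool CanReadAudit(ClaimsPrincipal user) => OperationalAuditRoles.Any(user.IsInRole);

    public static bool CanExportAudit(ClaimsPrincipal user) => AuditExportRoles.Any(user.IsInRole);

    // Sketch helper: an authenticated principal carrying one role claim on the framework role URI.
    public static ClaimsPrincipal WithRole(string role) =>
        new(new ClaimsIdentity(new[] { new Claim(ClaimTypes.Role, role) }, "sketch"));
}
```

Under these sets a former `AuditReadOnly` user (now `Viewer`) passes `CanReadAudit` and fails `CanExportAudit`, while a former `Audit` user (now `Administrator`) passes both and also gains the rest of the admin surface, which is the accepted escalation. `Designer` and `Deployer` pass neither check.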
## ✅ PHASE 1 COMPLETE (2026-06-02)
All of Tasks 1.0–1.7 done across OtOpcUa, MxAccessGateway, ScadaBridge, each on its local-only `feat/adopt-zb-auth` branch, **nothing pushed**. The three apps now consume `ZB.MOM.WW.Auth.*` from the Gitea feed (OtOpcUa 0.1.1 Abstractions+Ldap+AspNetCore; MxGateway 0.1.2 all-four; ScadaBridge 0.1.3 all-four): shared LDAP (`Auth.Ldap`), shared API-key model (`Auth.ApiKeys`, ScadaBridge fully re-architected), `IGroupRoleMapper<TRole>` seam, nested/`Transport`-enum config, canonical `ZbClaimTypes`/`ZbCookieDefaults`, unified dev base DN `dc=zb,dc=local`, and the canonical-six role vocabulary (with ScadaBridge's accepted auditor/admin SoD collapse). Every task spec- and code-reviewed; high-risk ones via the serial chain + full-suite runs. **Phase 1 exit gate met.** Next: Phase 2 (audit component, the original ask) starting at the Task 2.0 gate, then Phase 3 (wire audit Actor from the Auth principal).