diff --git a/docs/security.md b/docs/security.md index a6946e20..86ec9106 100644 --- a/docs/security.md +++ b/docs/security.md @@ -4,12 +4,15 @@ > Paths + project names moved: `OtOpcUa.Server/Security/` → `OtOpcUa.Security/` > (`Ldap/`, `Jwt/`, `Endpoints/AuthEndpoints.cs`), `OtOpcUa.Admin` is gone (its > auth + role-grant pages live in `OtOpcUa.AdminUI`), and Admin auth policies -> register in `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth` rather than in a -> separate Admin process. The v2 `Security:Jwt` section adds JWT bearer auth -> alongside the existing cookie scheme (`AddJwtBearer` wired via -> `IPostConfigureOptions` in `OtOpcUa.Security`). DataProtection -> keys persist to the shared `ConfigDb.DataProtectionKeys` table so cookies -> survive failover between admin-role nodes. +> register from `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth` +> (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`) rather +> than in a separate Admin process. The Admin UI uses a **single Cookie +> authentication scheme** — there is no `AddJwtBearer` pipeline. The +> `Security:Jwt` section configures `JwtTokenService`, which mints a JWT at the +> `/auth/token` endpoint for **external** consumers (OPC UA clients / automation +> scripts); the cookie itself stores the `ClaimsPrincipal` directly. DataProtection +> keys persist to the shared Config DB (`PersistKeysToDbContext`) +> so cookies survive failover between admin-role nodes. > > See `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §5 for the v2 > auth + DataProtection rationale. @@ -18,8 +21,8 @@ OtOpcUa has four independent security concerns. This document covers all four: 1. **Transport security** — OPC UA secure channel (signing, encryption, X.509 trust). 2. **OPC UA authentication** — Anonymous / UserName / X.509 session identities; UserName tokens authenticated by LDAP bind. -3. **Data-plane authorization** — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `PermissionTrie` against the Config DB `NodeAcl` tree. -4. **Control-plane authorization** — who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`ConfigViewer` / `ConfigEditor` / `FleetAdmin`) claim from `LdapGroupRoleMapping`. +3. **Data-plane authorization** — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `TriePermissionEvaluator` over a `PermissionTrie` built from the Config DB `NodeAcl` tree. +4. **Control-plane authorization** — who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`Viewer` / `Designer` / `Administrator`) claim resolved from `LdapGroupRoleMapping`. Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap `appsettings.json`. Data-plane ACLs and Admin role grants live in the Config DB. @@ -33,42 +36,43 @@ The OtOpcUa Server supports configurable OPC UA transport security profiles that There are two distinct layers of security in OPC UA: -- **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUaServer:SecurityProfile` setting controls. +- **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUa:EnabledSecurityProfiles` setting controls. - **UserName token encryption** -- protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode. UserName authentication therefore works on `None` endpoints too — the credentials themselves are always encrypted. A secure transport profile adds protection against message-level tampering and eavesdropping of data payloads. ### Supported security profiles -The server supports seven transport security profiles: +The profiles are the members of the `OpcUaSecurityProfile` enum (`src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs`). The server ships **three** baseline profiles; the config value is the bare enum-member name (no hyphens, no underscores): -| Profile Name | Security Policy | Message Security Mode | Description | -|-----------------------------------|----------------------------|-----------------------|--------------------------------------------------| -| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. | -| `Basic256Sha256-Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. | -| `Basic256Sha256-SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. | -| `Aes128_Sha256_RsaOaep-Sign` | Aes128_Sha256_RsaOaep | Sign | Modern profile with AES-128 encryption and SHA-256 signing. | -| `Aes128_Sha256_RsaOaep-SignAndEncrypt` | Aes128_Sha256_RsaOaep | SignAndEncrypt | Modern profile with AES-128 encryption. Recommended for production. | -| `Aes256_Sha256_RsaPss-Sign` | Aes256_Sha256_RsaPss | Sign | Strongest profile with AES-256 and RSA-PSS signatures. | -| `Aes256_Sha256_RsaPss-SignAndEncrypt` | Aes256_Sha256_RsaPss | SignAndEncrypt | Strongest profile. Recommended for high-security deployments. | +| Enum member | Security Policy | Message Security Mode | Description | +|---------------------------------|------------------|-----------------------|--------------------------------------------------| +| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. | +| `Basic256Sha256Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. | +| `Basic256Sha256SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. | -The server exposes a separate endpoint for each configured profile, and clients select the one they prefer during connection. +`BuildSecurityPolicies` (`OpcUaApplicationHost.cs`) maps each configured profile to an SDK `ServerSecurityPolicy`. The server exposes a separate endpoint per configured profile and clients select the one they prefer at session open. The enum's XML doc notes that Aes128/Aes256 variants can be added later by extending the enum + `BuildSecurityPolicies` — the wiring is profile-agnostic — but they are **not implemented today**. There is no `SecurityProfileResolver` class. + +> **Config value form.** The enum binds by member name, so a profile string with hyphens (e.g. `Basic256Sha256-Sign`) does **not** bind — use the exact enum-member spelling above. If `EnabledSecurityProfiles` is empty, the server falls back to a single `None` endpoint (logged, very visible) so it still has a listening endpoint. ### Configuration -Transport security is configured in the `OpcUaServer` section of the Server process's bootstrap `appsettings.json`: +Transport security is configured in the `OpcUa` section of the Host process's bootstrap `appsettings.json` (bound to `OpcUaApplicationHostOptions`): ```json { - "OpcUaServer": { - "EndpointUrl": "opc.tcp://0.0.0.0:4840/OtOpcUa", - "ApplicationName": "OtOpcUa Server", + "OpcUa": { + "ApplicationName": "OtOpcUa", "ApplicationUri": "urn:node-a:OtOpcUa", + "PublicHostname": "0.0.0.0", + "OpcUaPort": 4840, "PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki", "AutoAcceptUntrustedClientCertificates": false, - "SecurityProfile": "Basic256Sha256-SignAndEncrypt" + "EnabledSecurityProfiles": [ "Basic256Sha256Sign", "Basic256Sha256SignAndEncrypt" ] } } ``` +`EnabledSecurityProfiles` is a **list** — the server publishes one endpoint per entry. The default (when the key is omitted) is all three baseline profiles (`None`, `Basic256Sha256Sign`, `Basic256Sha256SignAndEncrypt`); production deployments typically drop `None`. The list must contain at least one entry (`OpcUaApplicationHostOptionsValidator` enforces `MinCount(…, 1)`). + The server certificate is auto-generated on first start if none exists in `PkiStoreRoot/own/`. Always generated even for `None`-only deployments because UserName token encryption depends on it. ### PKI directory layout @@ -91,13 +95,13 @@ When a client connects using a secure profile (`Sign` or `SignAndEncrypt`), the 4. If not found and `AutoAcceptUntrustedClientCertificates` is `true`, the certificate is automatically copied to `trusted/` and the connection proceeds. 5. If not found and `AutoAcceptUntrustedClientCertificates` is `false`, the certificate is copied to `rejected/` and the connection is refused. -The Admin UI `Certificates.razor` page uses `CertTrustService` (singleton reading `CertTrustOptions` for the Server's `PkiStoreRoot`) to promote rejected client certs to trusted without operators having to file-copy manually. +The Admin UI `Certificates.razor` page (`src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor`) lists the contents of each PKI sub-store (own / trusted / issuer / rejected) by reading the `OpcUa:PkiStoreRoot` path from configuration. It is currently a **read-only viewer** — promoting a rejected cert to trusted is still a file move (copy the `.der` from `rejected/` to `trusted/certs/`); the SDK trust list reloads on the next handshake. ### Production hardening - Set `AutoAcceptUntrustedClientCertificates = false`. -- Drop `None` from the profile set. -- Use the Admin UI to promote trusted client certs rather than the auto-accept fallback. +- Drop `None` from `EnabledSecurityProfiles`. +- Promote trusted client certs by moving the `.der` from `rejected/` to `trusted/certs/` rather than relying on the auto-accept fallback. (The Admin UI Certificates page shows what is in each store.) - Periodically audit the `rejected/` directory; an unexpected entry is often a misconfigured client or a probe attempt. --- @@ -108,59 +112,55 @@ The Server accepts three OPC UA identity-token types: | Token | Handler | Notes | |---|---|---| -| Anonymous | `IUserAuthenticator.AuthenticateAsync(username: "", password: "")` | Refused in strict mode unless explicit anonymous grants exist; allowed in lax mode for backward compatibility. | -| UserName/Password | `LdapOpcUaUserAuthenticator` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs`, backed by `LdapAuthService` at `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs`) | LDAP bind + group lookup; resolved `LdapGroups` flow into the session's identity bearer (`ILdapGroupsBearer`). | -| X.509 Certificate | Stack-level acceptance + role mapping via CN | X.509 identity carries `AuthenticatedUser` + read roles; finer-grain authorization happens through the data-plane ACLs. | +| Anonymous | No `IOpcUaUserAuthenticator` call — the SDK admits anonymous sessions at the channel. | Data-plane authorization (below) still default-denies any node a session has no ACL grant for. | +| UserName/Password | `LdapOpcUaUserAuthenticator.AuthenticateUserNameAsync` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs`, implements `IOpcUaUserAuthenticator`), backed by the app `ILdapAuthService` — `OtOpcUaLdapAuthService` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/OtOpcUaLdapAuthService.cs`). | LDAP bind + group lookup. The returned LDAP groups are mapped to roles via `IGroupRoleMapper` (`OtOpcUaGroupRoleMapper`) and attached to the OPC UA session identity for the downstream ACL evaluator. | +| X.509 Certificate | Stack-level acceptance during the secure-channel handshake. | The certificate must be trusted (see PKI trust flow); finer-grain authorization happens through the data-plane ACLs. | -### LDAP bind flow (`LdapUserAuthenticator`) +When no authenticator is supplied, `OpcUaApplicationHost` falls back to `NullOpcUaUserAuthenticator`; the Host wires the real `LdapOpcUaUserAuthenticator` as a singleton in `Program.cs`. -`Program.cs` in the Server registers the authenticator based on `OpcUaServer:Ldap`: +### LDAP bind flow (`OtOpcUaLdapAuthService`) -```csharp -builder.Services.AddSingleton(sp => ldapOptions.Enabled - ? new LdapUserAuthenticator(ldapOptions, sp.GetRequiredService>()) - : new DenyAllUserAuthenticator()); -``` +LDAP is configured under the `Security:Ldap` section (bound to `LdapOptions`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapOptions.cs`, `SectionName = "Security:Ldap"`). The app authenticator is `OtOpcUaLdapAuthService` — a thin wrapper around the shared `ZB.MOM.WW.Auth.Ldap` directory client that adds two app-only concerns the shared library deliberately does not model: the `Enabled` master switch and `DevStubMode`. The same `ILdapAuthService` instance serves **both** the Admin UI cookie login (`/auth/login`) and the OPC UA UserName path (via `LdapOpcUaUserAuthenticator`), so operators use one credential across both planes. -`LdapUserAuthenticator`: +`OtOpcUaLdapAuthService.AuthenticateAsync`: -1. Refuses to bind over plain-LDAP unless `AllowInsecureLdap = true` (dev/test only). -2. Connects to `Server:Port`, optionally upgrades to TLS (`UseTls = true`, port 636 for AD). -3. Binds as the service account; searches `SearchBase` for `UserNameAttribute = username`. -4. Rebinds as the resolved user DN with the supplied password (the actual credential check). -5. Reads `GroupAttribute` (default `memberOf`) and strips the leading `CN=` so operators configure friendly group names in `GroupToRole`. -6. Returns a `UserAuthResult` carrying the validated username + the set of LDAP groups. The set flows through to the session identity via `ILdapGroupsBearer.LdapGroups`. +1. If `Enabled = false`, denies outright — no bind, no DevStub bypass (the master switch wins). +2. If `DevStubMode = true`, accepts any non-empty credentials and grants the `Administrator` role **without any network bind** (dev only — must be `false` in production). +3. Refuses to bind over a plaintext transport (`Transport = None`) unless `AllowInsecure = true` (dev/test only). This is enforced at login, not at startup. +4. Delegates the real path to the shared `ZB.MOM.WW.Auth.Ldap` client: it binds (search-then-bind via `ServiceAccountDn`, or direct-bind `cn={user},{SearchBase}` when no service account is set), verifies the password, and reads the user's group memberships. +5. Returns an `LdapAuthResult` carrying the validated username + the **groups** (never roles). Failure codes are folded into opaque user-facing error strings so a probe cannot distinguish "unknown user" from "wrong password". -Configuration example (Active Directory production): +**Group → role mapping happens downstream**, not in the auth service: `LdapOpcUaUserAuthenticator` resolves `IGroupRoleMapper` (`OtOpcUaGroupRoleMapper`) per call and unions its output with any pre-resolved roles (the DevStub `Administrator` grant). The roles are attached to the OPC UA session identity for the ACL evaluator. A mapper fault (e.g. a Config DB outage) falls back to the pre-resolved baseline rather than denying an otherwise-authenticated session. + +`Transport` replaces the former `UseTls` bool: `Ldaps` (implicit TLS), `StartTls` (upgrade), or `None` (plaintext, requires `AllowInsecure`). Configuration example (Active Directory production): ```json { - "OpcUaServer": { + "Security": { "Ldap": { "Enabled": true, + "DevStubMode": false, "Server": "dc01.corp.example.com", "Port": 636, - "UseTls": true, - "AllowInsecureLdap": false, + "Transport": "Ldaps", + "AllowInsecure": false, "SearchBase": "DC=corp,DC=example,DC=com", "ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com", "ServiceAccountPassword": "", "GroupAttribute": "memberOf", + "DisplayNameAttribute": "cn", "UserNameAttribute": "sAMAccountName", "GroupToRole": { - "OPCUA-Operators": "WriteOperate", - "OPCUA-Engineers": "WriteConfigure", - "OPCUA-Tuners": "WriteTune", - "OPCUA-AlarmAck": "AlarmAck" + "OPCUA-Designers": "Designer", + "OPCUA-Admins": "Administrator", + "OPCUA-Operators": "Operator" } } } } ``` -`UserNameAttribute: "sAMAccountName"` is the critical AD override — the default `uid` is not populated on AD user entries. Use `userPrincipalName` instead if operators log in with `user@corp.example.com` form. Nested group membership is not expanded — assign users directly to the role-mapped groups, or pre-flatten in AD. - -The same options bind the Admin's `LdapAuthService` (cookie auth / login form) so operators authenticate with a single credential across both processes. +`GroupToRole` maps LDAP group names → Admin roles (case-insensitive); a user gets every role whose source group is in their membership. The values are the canonical control-plane role strings (`Viewer` / `Designer` / `Administrator`, plus the appsettings-only `Operator` for the `DriverOperator` policy). `UserNameAttribute: "sAMAccountName"` is the critical AD override — the GLAuth dev default is `cn`, which is not how AD users are looked up; use `userPrincipalName` instead if operators log in with `user@corp.example.com` form. `LdapOptionsValidator` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/Configuration/LdapOptionsValidator.cs`) fails startup when `Transport = None` and `AllowInsecure = false` on a real-LDAP (non-DevStub) config. --- @@ -172,20 +172,27 @@ Per decision #129 the model is **additive-only — no explicit Deny**. Grants at ### Hierarchy -ACLs are evaluated against the UNS path: +ACLs are evaluated against the node's scope path. `NodeScope` (`src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/NodeScope.cs`) carries a `Kind` that selects between two hierarchy shapes: ``` -ClusterId → Namespace → UnsArea → UnsLine → Equipment → Tag +Equipment (UNS) kind: Cluster → Namespace → UnsArea → UnsLine → Equipment → Tag +SystemPlatform (Galaxy) kind: Cluster → Namespace → FolderSegment(s) → Tag ``` +On the Galaxy/SystemPlatform path each folder segment takes one trie level, so a deeply-nested Galaxy folder reaches the same depth as a full UNS path. Unset mid-path levels leave the corresponding id `null` and the evaluator walks only as far as the scope goes. + Each level can carry `NodeAcl` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs`) that grant a permission bundle to a set of `LdapGroups`. ### Permission flags +`NodePermissions` (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodePermissions.cs`), stored as an `int` bitmask in `NodeAcl.PermissionFlags`: + ```csharp [Flags] -public enum NodePermissions : uint +public enum NodePermissions : int { + None = 0, + Browse = 1 << 0, Read = 1 << 1, Subscribe = 1 << 2, @@ -215,20 +222,20 @@ The three Write tiers map to Galaxy's v1 `SecurityClassification` — `FreeAcces | Class | Role | |---|---| | `PermissionTrie` | Cluster-scoped trie; each node carries `(GroupId → NodePermissions)` grants. | -| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass. | -| `PermissionTrieCache` | Per-cluster memoised trie; invalidated via `AclChangeNotifier` when the Admin publishes a draft that touches ACLs. | -| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)` — walks from the root to the leaf for the supplied `NodeScope`, unions grants along the path, compares required permission to the union. | +| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass and installs it into the cache. | +| `PermissionTrieCache` | Process-singleton cache keyed on `(ClusterId, GenerationId)`. Generation-sealed: `Install(trie)` adds a new generation + advances the "current" pointer; older generations are retained (in-flight requests still resolve) and GC'd by `Prune`. `Invalidate(clusterId)` drops every cached trie for a cluster. There is **no** `AclChangeNotifier` — a publish installs a new generation rather than signalling an invalidation. | +| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)`. Walks the cluster trie for the supplied `NodeScope`, unions grants along the path, and returns an `AuthorizationDecision`. Evaluates against the **session's bound generation** (`session.AuthGenerationId`), not just "current", so a grant added/removed in a newer generation cannot take effect mid-session. | -`NodeScope` carries `(ClusterId, NamespaceId, AreaId, LineId, EquipmentId, TagId)`; any suffix may be null — a tag-level ACL is more specific than an area-level ACL but both contribute via union. +`NodeScope` is described above (Equipment-kind vs SystemPlatform-kind). The evaluator unions the matched grants along the path — a tag-level ACL and an area-level ACL both contribute. ### Dispatch gate — `IPermissionEvaluator` -`IPermissionEvaluator.Authorize(session, operation, scope)` (default impl `TriePermissionEvaluator` at `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs`) bridges the OPC UA stack's `ISystemContext.UserIdentity` to the trie. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call. A non-allow decision short-circuits the dispatch with `BadUserAccessDenied`. +`IPermissionEvaluator.Authorize(UserAuthorizationState session, OpcUaOperation operation, NodeScope scope)` (default impl `TriePermissionEvaluator` at `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs`) returns an `AuthorizationDecision`. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call; a `NotGranted` decision denies the operation. Key properties: - **Driver-agnostic.** No driver-level code participates in authorization decisions. Drivers report `SecurityClassification` as metadata on tag discovery; everything else flows through the evaluator. -- **Fail-open-during-transition.** `StrictMode = false` (default during ACL rollouts) lets sessions without resolved LDAP groups proceed; flip `Authorization:StrictMode = true` in production once ACLs are populated. +- **Strictly fail-closed (default-deny).** Every guard path returns `NotGranted` — a stale session (past the staleness ceiling, decision #152), a cluster mismatch between session and scope, a missing trie, a pruned bound generation, or simply no matching grant. There is no `StrictMode` / fail-open mode; absence of a grant is always a deny. - **Evaluator stays pure.** `TriePermissionEvaluator` has no OPC UA stack dependency — it's tested directly from xUnit. ### Full model @@ -241,24 +248,25 @@ See [`docs/v2/acl-design.md`](v2/acl-design.md) for the complete design: trie in Control-plane authorization governs **the Admin UI** — who can view fleet config, edit drafts, publish generations, manage cluster nodes + credentials. -Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a ConfigEditor may not have any data-plane grants at all. +Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a `Designer` may not have any data-plane grants at all. ### Roles -The `AdminRole` enum (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs`) defines: +The `AdminRole` enum (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs`) defines three roles. Task 1.7 standardized the member names on the canonical `ZB.MOM.WW.Auth` `CanonicalRole` vocabulary (`ConfigViewer → Viewer`, `ConfigEditor → Designer`, `FleetAdmin → Administrator`); a data migration (`CanonicalizeAdminRoles`) rewrote existing rows. This was a rename, not a permission change. | Role | Capabilities | |---|---| -| `ConfigViewer` | Read-only access to drafts, generations, audit log, fleet status. | -| `ConfigEditor` | ConfigViewer plus draft editing (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. | -| `FleetAdmin` | ConfigEditor plus publish, cluster/node CRUD, credential management, role-grant management. Also satisfies the `DriverOperator` authorization policy. | -| `DriverOperator` | May issue **Reconnect** and **Restart** commands against live driver instances from the Admin UI `DriverStatusPanel`. Gated by the `DriverOperator` named policy in `AddAuthorization` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`). Map an LDAP group via `GroupToRole`, e.g. `"ot-driver-operator": "DriverOperator"`. | +| `Viewer` | Read-only access to drafts, generations, audit log, fleet status. (Was `ConfigViewer`.) | +| `Designer` | Viewer plus draft authoring (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. (Was `ConfigEditor`.) | +| `Administrator` | Designer plus publish, cluster/node CRUD, credential management, role-grant management. Satisfies both the `FleetAdmin` and `DriverOperator` authorization policies. (Was `FleetAdmin`.) | -In v2 the authentication + authorization stack is wired centrally by `AddOtOpcUaAuth` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`) and Razor pages gate inline with the role names, e.g. `@attribute [Authorize(Roles = "FleetAdmin,ConfigEditor")]` on `Deployments.razor`. Nav-menu sections hide via ``. +`DriverOperator` is an **authorization policy name** (kept stable), not an `AdminRole` member. It gates **Reconnect** / **Restart** commands against live driver instances from the Admin UI `DriverStatusPanel` and requires the canonical role `Operator` or `Administrator` (`policy.RequireRole("Operator", "Administrator")` in `AddAuthorization`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`). `Operator` is an appsettings-only string role (not an `AdminRole` member); map an LDAP group to it via `GroupToRole`, e.g. `"ot-driver-operator": "Operator"`. The `FleetAdmin` policy requires the `Administrator` role. + +In v2 the authentication + authorization stack is wired centrally by `AddOtOpcUaAuth` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`), which also installs a `FallbackPolicy` that requires an authenticated user. Razor pages gate inline with the canonical role names, e.g. `@attribute [Authorize(Roles = "Administrator,Designer")]`. Nav-menu sections hide via ``. ### Role grant source -Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`) — the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) cluster scope for multi-site fleets. The `RoleGrants.razor` page lets FleetAdmins edit these mappings without leaving the UI. +Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`) — the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) one cluster for multi-site fleets (a system-wide row, `IsSystemWide = true`, stacks additively with cluster-scoped rows). The `RoleGrants.razor` page lets `Administrator`s edit these mappings without leaving the UI. --- @@ -266,9 +274,9 @@ Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW. Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by `CapabilityInvoker` in `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/`. A driver-capability call made **outside** the invoker bypasses resilience entirely — which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking. -`OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires as a compile-time **warning** when an `async`/`Task`-returning method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync` / `AlarmSurfaceInvoker.*`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule. +`OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires with category `OtOpcUa.Resilience` and default severity **Warning** (per `AnalyzerReleases.Shipped.md`) when a method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider` — all in `ZB.MOM.WW.OtOpcUa.Core.Abstractions`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync`. `AlarmSurfaceInvoker` is **not** a wrapper home — its own implementation is covered transitively because it routes through the inner `CapabilityInvoker.ExecuteAsync`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule. -Five xUnit-v3 + Shouldly tests at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests` cover the common fail/pass shapes + the sibling-pattern regression guard. +The xunit.v3 + Shouldly suite at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests/UnwrappedCapabilityCallAnalyzerTests.cs` covers the common fail/pass shapes + the sibling-pattern regression guard. The rule is intentionally scoped to async surfaces — pure in-memory accessors like `IHostConnectivityProbe.GetHostStatuses()` return synchronously and do not require the invoker wrap. @@ -276,8 +284,8 @@ The rule is intentionally scoped to async surfaces — pure in-memory accessors ## Audit Logging -- **Server**: Serilog `AUDIT:` prefix on every authentication success/failure, certificate validation result, write access denial. Written alongside the regular rolling file sink. -- **Admin**: `AuditLogService` writes `ConfigAuditLog` rows to the Config DB for every publish, rollback, cluster-node CRUD, credential rotation. Visible in the Audit page for operators with `ConfigViewer` or above. +- **Server**: authentication, certificate-validation, and write-denial events are logged through the regular Serilog rolling file sink. +- **Admin**: `AuditWriterActor` (`src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`) writes `ConfigAuditLog` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs`) to the Config DB for publish, rollback, cluster-node CRUD, and credential rotation. Visible on the cluster Audit page (`ClusterAudit.razor`) for operators with `Viewer` or above. --- @@ -285,16 +293,16 @@ The rule is intentionally scoped to async surfaces — pure in-memory accessors ### Certificate trust failure -Check `{PkiStoreRoot}/rejected/` for the client's cert. Promote via Admin UI Certificates page, or copy the `.der` file manually to `trusted/`. +Check `{PkiStoreRoot}/rejected/` for the client's cert. Copy the `.der` file to `trusted/certs/`; the SDK trust list reloads on the next handshake. The Admin UI Certificates page shows what is in each store but does not move certs. ### LDAP users can connect but fail authorization -Verify (a) `OpcUaServer:Ldap:GroupAttribute` returns groups in the form `CN=MyGroup,…` (OtOpcUa strips the `CN=` for matching), (b) a `NodeAcl` grant exists at any level of the node's UNS path that unions to the required permission, (c) `Authorization:StrictMode` is correctly set for the deployment stage. +Verify (a) `Security:Ldap:GroupAttribute` (default `memberOf`) returns the user's groups, (b) `Security:Ldap:GroupToRole` maps those groups to the expected roles, and (c) a `NodeAcl` grant exists at some level of the node's scope path that unions to the required permission. The data-plane evaluator is strictly default-deny — there is no fail-open mode to fall back on. ### LDAP bind rejected as "insecure" -Set `UseTls = true` + `Port = 636`, or temporarily flip `AllowInsecureLdap = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement. +Set `Security:Ldap:Transport = "Ldaps"` (or `"StartTls"`) with the matching port (636 for AD `Ldaps`), or temporarily set `Security:Ldap:AllowInsecure = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement. -### `AuthorizationGate` denies every call after a publish +### Stale ACL trie after a publish -`AclChangeNotifier` invalidates the `PermissionTrieCache` on publish; a stuck cache is usually a missed notification. Restart the Server as a quick mitigation and file a bug — the design is to stay fresh without restarts. +A publish installs a **new generation** into `PermissionTrieCache` via `PermissionTrieBuilder` rather than signalling an invalidation; the evaluator binds each session to a generation. If grants appear stale, confirm the new generation was installed (publish completed) and that sessions re-resolved their auth state — a session past its staleness ceiling fails closed and must re-authenticate. As a last resort `PermissionTrieCache.Invalidate(clusterId)` drops a cluster's cached tries.