docs(audit): security.md — accuracy pass (profiles, LDAP, ACL, analyzer)

STRUCTURAL (links-report.md):
- Repointed missing src/.../Security/Ldap/LdapAuthService.cs -> the real
  OtOpcUaLdapAuthService.cs (Ldap/OtOpcUaLdapAuthService.cs implements
  ILdapAuthService). Class was reorganized as a wrapper over shared
  ZB.MOM.WW.Auth.Ldap. check_links now clean for docs/security.md.

CODE-REALITY — transport profiles (OpcUaApplicationHost.cs:15-23,59-64,374-409):
- Only THREE profiles exist: None, Basic256Sha256Sign,
  Basic256Sha256SignAndEncrypt (NO hyphens, NO underscores). Removed the four
  fabricated Aes128/Aes256 rows. Config binds by enum-member name; hyphenated
  form does NOT bind. Documented this + the empty-list fallback to None.
- Config section is OpcUa (not OpcUaServer); key is the LIST
  EnabledSecurityProfiles (not singular SecurityProfile). Program.cs:120 binds
  'OpcUa'; Certificates.razor:80 reads OpcUa:PkiStoreRoot.
- No SecurityProfileResolver class exists — stated so explicitly.

CODE-REALITY — LDAP (LdapOptions.cs:21, OtOpcUaLdapAuthService.cs):
- Section is Security:Ldap (LdapOptions.SectionName), not OpcUaServer:Ldap.
- Authenticator is OtOpcUaLdapAuthService (wrapper) + LdapOpcUaUserAuthenticator
  (IOpcUaUserAuthenticator.AuthenticateUserNameAsync), not bespoke
  LdapUserAuthenticator/IUserAuthenticator.
- UseTls bool -> Transport enum (Ldaps/StartTls/None); AllowInsecureLdap ->
  AllowInsecure. Added Enabled master switch + DevStubMode.
- Group->role mapping is downstream via IGroupRoleMapper<string>
  (OtOpcUaGroupRoleMapper), NOT in the auth service. ILdapGroupsBearer and
  DenyAllUserAuthenticator do not exist (fallback is NullOpcUaUserAuthenticator).
- GroupToRole values corrected to canonical roles (Viewer/Designer/
  Administrator/Operator).

CODE-REALITY — ACL trie (TriePermissionEvaluator.cs, PermissionTrieCache.cs,
NodeScope.cs, NodePermissions.cs):
- NodePermissions backing type is int (not uint); lives in Configuration/Enums.
- Authorize(UserAuthorizationState, OpcUaOperation, NodeScope) returns
  AuthorizationDecision.
- Evaluator is strictly fail-CLOSED. Removed the fabricated
  'fail-open-during-transition' + Authorization:StrictMode key (no StrictMode
  anywhere in source).
- Cache: generation-sealed Install/Invalidate/Prune. AclChangeNotifier does
  NOT exist — removed.
- Added the SystemPlatform (Galaxy) scope hierarchy variant.

CODE-REALITY — control plane (AdminRole.cs, ServiceCollectionExtensions.cs:
113-131):
- AdminRole members are Viewer/Designer/Administrator (Task 1.7 rename from
  ConfigViewer/ConfigEditor/FleetAdmin). DriverOperator/FleetAdmin are POLICY
  names; DriverOperator requires roles Operator|Administrator.

CODE-REALITY — analyzer (UnwrappedCapabilityCallAnalyzer.cs:99-103,
AnalyzerReleases.Shipped.md):
- Confirmed category OtOpcUa.Resilience + severity Warning (already correct).
  Corrected 'Five tests' (suite has 26 cases) and AlarmSurfaceInvoker
  wrapper-home wording.

OTHER FIXES:
- v2 header: removed false AddJwtBearer/IPostConfigureOptions<JwtBearerOptions>
  claim — auth is Cookie-only; JWT is mint-only via /auth/token for external
  consumers (JwtTokenService.cs:25-48).
- Certificates.razor is a read-only viewer; removed fabricated
  CertTrustService/CertTrustOptions promote claim.
- Audit: writer is AuditWriterActor (not AuditLogService); softened the
  unverifiable server-side 'AUDIT:' Serilog-prefix claim.
This commit is contained in:
Joseph Doherty
2026-06-03 16:26:00 -04:00
parent 1b6dedc142
commit b9bdfee189
+90 -82
View File
@@ -4,12 +4,15 @@
> Paths + project names moved: `OtOpcUa.Server/Security/` → `OtOpcUa.Security/` > Paths + project names moved: `OtOpcUa.Server/Security/` → `OtOpcUa.Security/`
> (`Ldap/`, `Jwt/`, `Endpoints/AuthEndpoints.cs`), `OtOpcUa.Admin` is gone (its > (`Ldap/`, `Jwt/`, `Endpoints/AuthEndpoints.cs`), `OtOpcUa.Admin` is gone (its
> auth + role-grant pages live in `OtOpcUa.AdminUI`), and Admin auth policies > auth + role-grant pages live in `OtOpcUa.AdminUI`), and Admin auth policies
> register in `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth` rather than in a > register from `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth`
> separate Admin process. The v2 `Security:Jwt` section adds JWT bearer auth > (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`) rather
> alongside the existing cookie scheme (`AddJwtBearer` wired via > than in a separate Admin process. The Admin UI uses a **single Cookie
> `IPostConfigureOptions<JwtBearerOptions>` in `OtOpcUa.Security`). DataProtection > authentication scheme** — there is no `AddJwtBearer` pipeline. The
> keys persist to the shared `ConfigDb.DataProtectionKeys` table so cookies > `Security:Jwt` section configures `JwtTokenService`, which mints a JWT at the
> survive failover between admin-role nodes. > `/auth/token` endpoint for **external** consumers (OPC UA clients / automation
> scripts); the cookie itself stores the `ClaimsPrincipal` directly. DataProtection
> keys persist to the shared Config DB (`PersistKeysToDbContext<OtOpcUaConfigDbContext>`)
> so cookies survive failover between admin-role nodes.
> >
> See `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §5 for the v2 > See `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §5 for the v2
> auth + DataProtection rationale. > auth + DataProtection rationale.
@@ -18,8 +21,8 @@ OtOpcUa has four independent security concerns. This document covers all four:
1. **Transport security** — OPC UA secure channel (signing, encryption, X.509 trust). 1. **Transport security** — OPC UA secure channel (signing, encryption, X.509 trust).
2. **OPC UA authentication** — Anonymous / UserName / X.509 session identities; UserName tokens authenticated by LDAP bind. 2. **OPC UA authentication** — Anonymous / UserName / X.509 session identities; UserName tokens authenticated by LDAP bind.
3. **Data-plane authorization** — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `PermissionTrie` against the Config DB `NodeAcl` tree. 3. **Data-plane authorization** — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `TriePermissionEvaluator` over a `PermissionTrie` built from the Config DB `NodeAcl` tree.
4. **Control-plane authorization** — who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`ConfigViewer` / `ConfigEditor` / `FleetAdmin`) claim from `LdapGroupRoleMapping`. 4. **Control-plane authorization** — who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`Viewer` / `Designer` / `Administrator`) claim resolved from `LdapGroupRoleMapping`.
Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap `appsettings.json`. Data-plane ACLs and Admin role grants live in the Config DB. Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap `appsettings.json`. Data-plane ACLs and Admin role grants live in the Config DB.
@@ -33,42 +36,43 @@ The OtOpcUa Server supports configurable OPC UA transport security profiles that
There are two distinct layers of security in OPC UA: There are two distinct layers of security in OPC UA:
- **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUaServer:SecurityProfile` setting controls. - **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUa:EnabledSecurityProfiles` setting controls.
- **UserName token encryption** -- protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode. UserName authentication therefore works on `None` endpoints too — the credentials themselves are always encrypted. A secure transport profile adds protection against message-level tampering and eavesdropping of data payloads. - **UserName token encryption** -- protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode. UserName authentication therefore works on `None` endpoints too — the credentials themselves are always encrypted. A secure transport profile adds protection against message-level tampering and eavesdropping of data payloads.
### Supported security profiles ### Supported security profiles
The server supports seven transport security profiles: The profiles are the members of the `OpcUaSecurityProfile` enum (`src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs`). The server ships **three** baseline profiles; the config value is the bare enum-member name (no hyphens, no underscores):
| Profile Name | Security Policy | Message Security Mode | Description | | Enum member | Security Policy | Message Security Mode | Description |
|-----------------------------------|----------------------------|-----------------------|--------------------------------------------------| |---------------------------------|------------------|-----------------------|--------------------------------------------------|
| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. | | `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. |
| `Basic256Sha256-Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. | | `Basic256Sha256Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. |
| `Basic256Sha256-SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. | | `Basic256Sha256SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. |
| `Aes128_Sha256_RsaOaep-Sign` | Aes128_Sha256_RsaOaep | Sign | Modern profile with AES-128 encryption and SHA-256 signing. |
| `Aes128_Sha256_RsaOaep-SignAndEncrypt` | Aes128_Sha256_RsaOaep | SignAndEncrypt | Modern profile with AES-128 encryption. Recommended for production. |
| `Aes256_Sha256_RsaPss-Sign` | Aes256_Sha256_RsaPss | Sign | Strongest profile with AES-256 and RSA-PSS signatures. |
| `Aes256_Sha256_RsaPss-SignAndEncrypt` | Aes256_Sha256_RsaPss | SignAndEncrypt | Strongest profile. Recommended for high-security deployments. |
The server exposes a separate endpoint for each configured profile, and clients select the one they prefer during connection. `BuildSecurityPolicies` (`OpcUaApplicationHost.cs`) maps each configured profile to an SDK `ServerSecurityPolicy`. The server exposes a separate endpoint per configured profile and clients select the one they prefer at session open. The enum's XML doc notes that Aes128/Aes256 variants can be added later by extending the enum + `BuildSecurityPolicies` — the wiring is profile-agnostic — but they are **not implemented today**. There is no `SecurityProfileResolver` class.
> **Config value form.** The enum binds by member name, so a profile string with hyphens (e.g. `Basic256Sha256-Sign`) does **not** bind — use the exact enum-member spelling above. If `EnabledSecurityProfiles` is empty, the server falls back to a single `None` endpoint (logged, very visible) so it still has a listening endpoint.
### Configuration ### Configuration
Transport security is configured in the `OpcUaServer` section of the Server process's bootstrap `appsettings.json`: Transport security is configured in the `OpcUa` section of the Host process's bootstrap `appsettings.json` (bound to `OpcUaApplicationHostOptions`):
```json ```json
{ {
"OpcUaServer": { "OpcUa": {
"EndpointUrl": "opc.tcp://0.0.0.0:4840/OtOpcUa", "ApplicationName": "OtOpcUa",
"ApplicationName": "OtOpcUa Server",
"ApplicationUri": "urn:node-a:OtOpcUa", "ApplicationUri": "urn:node-a:OtOpcUa",
"PublicHostname": "0.0.0.0",
"OpcUaPort": 4840,
"PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki", "PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki",
"AutoAcceptUntrustedClientCertificates": false, "AutoAcceptUntrustedClientCertificates": false,
"SecurityProfile": "Basic256Sha256-SignAndEncrypt" "EnabledSecurityProfiles": [ "Basic256Sha256Sign", "Basic256Sha256SignAndEncrypt" ]
} }
} }
``` ```
`EnabledSecurityProfiles` is a **list** — the server publishes one endpoint per entry. The default (when the key is omitted) is all three baseline profiles (`None`, `Basic256Sha256Sign`, `Basic256Sha256SignAndEncrypt`); production deployments typically drop `None`. The list must contain at least one entry (`OpcUaApplicationHostOptionsValidator` enforces `MinCount(…, 1)`).
The server certificate is auto-generated on first start if none exists in `PkiStoreRoot/own/`. Always generated even for `None`-only deployments because UserName token encryption depends on it. The server certificate is auto-generated on first start if none exists in `PkiStoreRoot/own/`. Always generated even for `None`-only deployments because UserName token encryption depends on it.
### PKI directory layout ### PKI directory layout
@@ -91,13 +95,13 @@ When a client connects using a secure profile (`Sign` or `SignAndEncrypt`), the
4. If not found and `AutoAcceptUntrustedClientCertificates` is `true`, the certificate is automatically copied to `trusted/` and the connection proceeds. 4. If not found and `AutoAcceptUntrustedClientCertificates` is `true`, the certificate is automatically copied to `trusted/` and the connection proceeds.
5. If not found and `AutoAcceptUntrustedClientCertificates` is `false`, the certificate is copied to `rejected/` and the connection is refused. 5. If not found and `AutoAcceptUntrustedClientCertificates` is `false`, the certificate is copied to `rejected/` and the connection is refused.
The Admin UI `Certificates.razor` page uses `CertTrustService` (singleton reading `CertTrustOptions` for the Server's `PkiStoreRoot`) to promote rejected client certs to trusted without operators having to file-copy manually. The Admin UI `Certificates.razor` page (`src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor`) lists the contents of each PKI sub-store (own / trusted / issuer / rejected) by reading the `OpcUa:PkiStoreRoot` path from configuration. It is currently a **read-only viewer** promoting a rejected cert to trusted is still a file move (copy the `.der` from `rejected/` to `trusted/certs/`); the SDK trust list reloads on the next handshake.
### Production hardening ### Production hardening
- Set `AutoAcceptUntrustedClientCertificates = false`. - Set `AutoAcceptUntrustedClientCertificates = false`.
- Drop `None` from the profile set. - Drop `None` from `EnabledSecurityProfiles`.
- Use the Admin UI to promote trusted client certs rather than the auto-accept fallback. - Promote trusted client certs by moving the `.der` from `rejected/` to `trusted/certs/` rather than relying on the auto-accept fallback. (The Admin UI Certificates page shows what is in each store.)
- Periodically audit the `rejected/` directory; an unexpected entry is often a misconfigured client or a probe attempt. - Periodically audit the `rejected/` directory; an unexpected entry is often a misconfigured client or a probe attempt.
--- ---
@@ -108,59 +112,55 @@ The Server accepts three OPC UA identity-token types:
| Token | Handler | Notes | | Token | Handler | Notes |
|---|---|---| |---|---|---|
| Anonymous | `IUserAuthenticator.AuthenticateAsync(username: "", password: "")` | Refused in strict mode unless explicit anonymous grants exist; allowed in lax mode for backward compatibility. | | Anonymous | No `IOpcUaUserAuthenticator` call — the SDK admits anonymous sessions at the channel. | Data-plane authorization (below) still default-denies any node a session has no ACL grant for. |
| UserName/Password | `LdapOpcUaUserAuthenticator` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs`, backed by `LdapAuthService` at `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs`) | LDAP bind + group lookup; resolved `LdapGroups` flow into the session's identity bearer (`ILdapGroupsBearer`). | | UserName/Password | `LdapOpcUaUserAuthenticator.AuthenticateUserNameAsync` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs`, implements `IOpcUaUserAuthenticator`), backed by the app `ILdapAuthService` `OtOpcUaLdapAuthService` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/OtOpcUaLdapAuthService.cs`). | LDAP bind + group lookup. The returned LDAP groups are mapped to roles via `IGroupRoleMapper<string>` (`OtOpcUaGroupRoleMapper`) and attached to the OPC UA session identity for the downstream ACL evaluator. |
| X.509 Certificate | Stack-level acceptance + role mapping via CN | X.509 identity carries `AuthenticatedUser` + read roles; finer-grain authorization happens through the data-plane ACLs. | | X.509 Certificate | Stack-level acceptance during the secure-channel handshake. | The certificate must be trusted (see PKI trust flow); finer-grain authorization happens through the data-plane ACLs. |
### LDAP bind flow (`LdapUserAuthenticator`) When no authenticator is supplied, `OpcUaApplicationHost` falls back to `NullOpcUaUserAuthenticator`; the Host wires the real `LdapOpcUaUserAuthenticator` as a singleton in `Program.cs`.
`Program.cs` in the Server registers the authenticator based on `OpcUaServer:Ldap`: ### LDAP bind flow (`OtOpcUaLdapAuthService`)
```csharp LDAP is configured under the `Security:Ldap` section (bound to `LdapOptions`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapOptions.cs`, `SectionName = "Security:Ldap"`). The app authenticator is `OtOpcUaLdapAuthService` — a thin wrapper around the shared `ZB.MOM.WW.Auth.Ldap` directory client that adds two app-only concerns the shared library deliberately does not model: the `Enabled` master switch and `DevStubMode`. The same `ILdapAuthService` instance serves **both** the Admin UI cookie login (`/auth/login`) and the OPC UA UserName path (via `LdapOpcUaUserAuthenticator`), so operators use one credential across both planes.
builder.Services.AddSingleton<IUserAuthenticator>(sp => ldapOptions.Enabled
? new LdapUserAuthenticator(ldapOptions, sp.GetRequiredService<ILogger<LdapUserAuthenticator>>())
: new DenyAllUserAuthenticator());
```
`LdapUserAuthenticator`: `OtOpcUaLdapAuthService.AuthenticateAsync`:
1. Refuses to bind over plain-LDAP unless `AllowInsecureLdap = true` (dev/test only). 1. If `Enabled = false`, denies outright — no bind, no DevStub bypass (the master switch wins).
2. Connects to `Server:Port`, optionally upgrades to TLS (`UseTls = true`, port 636 for AD). 2. If `DevStubMode = true`, accepts any non-empty credentials and grants the `Administrator` role **without any network bind** (dev only — must be `false` in production).
3. Binds as the service account; searches `SearchBase` for `UserNameAttribute = username`. 3. Refuses to bind over a plaintext transport (`Transport = None`) unless `AllowInsecure = true` (dev/test only). This is enforced at login, not at startup.
4. Rebinds as the resolved user DN with the supplied password (the actual credential check). 4. Delegates the real path to the shared `ZB.MOM.WW.Auth.Ldap` client: it binds (search-then-bind via `ServiceAccountDn`, or direct-bind `cn={user},{SearchBase}` when no service account is set), verifies the password, and reads the user's group memberships.
5. Reads `GroupAttribute` (default `memberOf`) and strips the leading `CN=` so operators configure friendly group names in `GroupToRole`. 5. Returns an `LdapAuthResult` carrying the validated username + the **groups** (never roles). Failure codes are folded into opaque user-facing error strings so a probe cannot distinguish "unknown user" from "wrong password".
6. Returns a `UserAuthResult` carrying the validated username + the set of LDAP groups. The set flows through to the session identity via `ILdapGroupsBearer.LdapGroups`.
Configuration example (Active Directory production): **Group → role mapping happens downstream**, not in the auth service: `LdapOpcUaUserAuthenticator` resolves `IGroupRoleMapper<string>` (`OtOpcUaGroupRoleMapper`) per call and unions its output with any pre-resolved roles (the DevStub `Administrator` grant). The roles are attached to the OPC UA session identity for the ACL evaluator. A mapper fault (e.g. a Config DB outage) falls back to the pre-resolved baseline rather than denying an otherwise-authenticated session.
`Transport` replaces the former `UseTls` bool: `Ldaps` (implicit TLS), `StartTls` (upgrade), or `None` (plaintext, requires `AllowInsecure`). Configuration example (Active Directory production):
```json ```json
{ {
"OpcUaServer": { "Security": {
"Ldap": { "Ldap": {
"Enabled": true, "Enabled": true,
"DevStubMode": false,
"Server": "dc01.corp.example.com", "Server": "dc01.corp.example.com",
"Port": 636, "Port": 636,
"UseTls": true, "Transport": "Ldaps",
"AllowInsecureLdap": false, "AllowInsecure": false,
"SearchBase": "DC=corp,DC=example,DC=com", "SearchBase": "DC=corp,DC=example,DC=com",
"ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com", "ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com",
"ServiceAccountPassword": "<from your secret store>", "ServiceAccountPassword": "<from your secret store>",
"GroupAttribute": "memberOf", "GroupAttribute": "memberOf",
"DisplayNameAttribute": "cn",
"UserNameAttribute": "sAMAccountName", "UserNameAttribute": "sAMAccountName",
"GroupToRole": { "GroupToRole": {
"OPCUA-Operators": "WriteOperate", "OPCUA-Designers": "Designer",
"OPCUA-Engineers": "WriteConfigure", "OPCUA-Admins": "Administrator",
"OPCUA-Tuners": "WriteTune", "OPCUA-Operators": "Operator"
"OPCUA-AlarmAck": "AlarmAck"
} }
} }
} }
} }
``` ```
`UserNameAttribute: "sAMAccountName"` is the critical AD override — the default `uid` is not populated on AD user entries. Use `userPrincipalName` instead if operators log in with `user@corp.example.com` form. Nested group membership is not expanded — assign users directly to the role-mapped groups, or pre-flatten in AD. `GroupToRole` maps LDAP group names → Admin roles (case-insensitive); a user gets every role whose source group is in their membership. The values are the canonical control-plane role strings (`Viewer` / `Designer` / `Administrator`, plus the appsettings-only `Operator` for the `DriverOperator` policy). `UserNameAttribute: "sAMAccountName"` is the critical AD override — the GLAuth dev default is `cn`, which is not how AD users are looked up; use `userPrincipalName` instead if operators log in with `user@corp.example.com` form. `LdapOptionsValidator` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/Configuration/LdapOptionsValidator.cs`) fails startup when `Transport = None` and `AllowInsecure = false` on a real-LDAP (non-DevStub) config.
The same options bind the Admin's `LdapAuthService` (cookie auth / login form) so operators authenticate with a single credential across both processes.
--- ---
@@ -172,20 +172,27 @@ Per decision #129 the model is **additive-only — no explicit Deny**. Grants at
### Hierarchy ### Hierarchy
ACLs are evaluated against the UNS path: ACLs are evaluated against the node's scope path. `NodeScope` (`src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/NodeScope.cs`) carries a `Kind` that selects between two hierarchy shapes:
``` ```
ClusterId → Namespace → UnsArea → UnsLine → Equipment → Tag Equipment (UNS) kind: Cluster → Namespace → UnsArea → UnsLine → Equipment → Tag
SystemPlatform (Galaxy) kind: Cluster → Namespace → FolderSegment(s) → Tag
``` ```
On the Galaxy/SystemPlatform path each folder segment takes one trie level, so a deeply-nested Galaxy folder reaches the same depth as a full UNS path. Unset mid-path levels leave the corresponding id `null` and the evaluator walks only as far as the scope goes.
Each level can carry `NodeAcl` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs`) that grant a permission bundle to a set of `LdapGroups`. Each level can carry `NodeAcl` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs`) that grant a permission bundle to a set of `LdapGroups`.
### Permission flags ### Permission flags
`NodePermissions` (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodePermissions.cs`), stored as an `int` bitmask in `NodeAcl.PermissionFlags`:
```csharp ```csharp
[Flags] [Flags]
public enum NodePermissions : uint public enum NodePermissions : int
{ {
None = 0,
Browse = 1 << 0, Browse = 1 << 0,
Read = 1 << 1, Read = 1 << 1,
Subscribe = 1 << 2, Subscribe = 1 << 2,
@@ -215,20 +222,20 @@ The three Write tiers map to Galaxy's v1 `SecurityClassification` — `FreeAcces
| Class | Role | | Class | Role |
|---|---| |---|---|
| `PermissionTrie` | Cluster-scoped trie; each node carries `(GroupId → NodePermissions)` grants. | | `PermissionTrie` | Cluster-scoped trie; each node carries `(GroupId → NodePermissions)` grants. |
| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass. | | `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass and installs it into the cache. |
| `PermissionTrieCache` | Per-cluster memoised trie; invalidated via `AclChangeNotifier` when the Admin publishes a draft that touches ACLs. | | `PermissionTrieCache` | Process-singleton cache keyed on `(ClusterId, GenerationId)`. Generation-sealed: `Install(trie)` adds a new generation + advances the "current" pointer; older generations are retained (in-flight requests still resolve) and GC'd by `Prune`. `Invalidate(clusterId)` drops every cached trie for a cluster. There is **no** `AclChangeNotifier` — a publish installs a new generation rather than signalling an invalidation. |
| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)` — walks from the root to the leaf for the supplied `NodeScope`, unions grants along the path, compares required permission to the union. | | `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)`. Walks the cluster trie for the supplied `NodeScope`, unions grants along the path, and returns an `AuthorizationDecision`. Evaluates against the **session's bound generation** (`session.AuthGenerationId`), not just "current", so a grant added/removed in a newer generation cannot take effect mid-session. |
`NodeScope` carries `(ClusterId, NamespaceId, AreaId, LineId, EquipmentId, TagId)`; any suffix may be null — a tag-level ACL is more specific than an area-level ACL but both contribute via union. `NodeScope` is described above (Equipment-kind vs SystemPlatform-kind). The evaluator unions the matched grants along the path — a tag-level ACL and an area-level ACL both contribute.
### Dispatch gate — `IPermissionEvaluator` ### Dispatch gate — `IPermissionEvaluator`
`IPermissionEvaluator.Authorize(session, operation, scope)` (default impl `TriePermissionEvaluator` at `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs`) bridges the OPC UA stack's `ISystemContext.UserIdentity` to the trie. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call. A non-allow decision short-circuits the dispatch with `BadUserAccessDenied`. `IPermissionEvaluator.Authorize(UserAuthorizationState session, OpcUaOperation operation, NodeScope scope)` (default impl `TriePermissionEvaluator` at `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs`) returns an `AuthorizationDecision`. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call; a `NotGranted` decision denies the operation.
Key properties: Key properties:
- **Driver-agnostic.** No driver-level code participates in authorization decisions. Drivers report `SecurityClassification` as metadata on tag discovery; everything else flows through the evaluator. - **Driver-agnostic.** No driver-level code participates in authorization decisions. Drivers report `SecurityClassification` as metadata on tag discovery; everything else flows through the evaluator.
- **Fail-open-during-transition.** `StrictMode = false` (default during ACL rollouts) lets sessions without resolved LDAP groups proceed; flip `Authorization:StrictMode = true` in production once ACLs are populated. - **Strictly fail-closed (default-deny).** Every guard path returns `NotGranted` — a stale session (past the staleness ceiling, decision #152), a cluster mismatch between session and scope, a missing trie, a pruned bound generation, or simply no matching grant. There is no `StrictMode` / fail-open mode; absence of a grant is always a deny.
- **Evaluator stays pure.** `TriePermissionEvaluator` has no OPC UA stack dependency — it's tested directly from xUnit. - **Evaluator stays pure.** `TriePermissionEvaluator` has no OPC UA stack dependency — it's tested directly from xUnit.
### Full model ### Full model
@@ -241,24 +248,25 @@ See [`docs/v2/acl-design.md`](v2/acl-design.md) for the complete design: trie in
Control-plane authorization governs **the Admin UI** — who can view fleet config, edit drafts, publish generations, manage cluster nodes + credentials. Control-plane authorization governs **the Admin UI** — who can view fleet config, edit drafts, publish generations, manage cluster nodes + credentials.
Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a ConfigEditor may not have any data-plane grants at all. Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a `Designer` may not have any data-plane grants at all.
### Roles ### Roles
The `AdminRole` enum (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs`) defines: The `AdminRole` enum (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs`) defines three roles. Task 1.7 standardized the member names on the canonical `ZB.MOM.WW.Auth` `CanonicalRole` vocabulary (`ConfigViewer → Viewer`, `ConfigEditor → Designer`, `FleetAdmin → Administrator`); a data migration (`CanonicalizeAdminRoles`) rewrote existing rows. This was a rename, not a permission change.
| Role | Capabilities | | Role | Capabilities |
|---|---| |---|---|
| `ConfigViewer` | Read-only access to drafts, generations, audit log, fleet status. | | `Viewer` | Read-only access to drafts, generations, audit log, fleet status. (Was `ConfigViewer`.) |
| `ConfigEditor` | ConfigViewer plus draft editing (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. | | `Designer` | Viewer plus draft authoring (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. (Was `ConfigEditor`.) |
| `FleetAdmin` | ConfigEditor plus publish, cluster/node CRUD, credential management, role-grant management. Also satisfies the `DriverOperator` authorization policy. | | `Administrator` | Designer plus publish, cluster/node CRUD, credential management, role-grant management. Satisfies both the `FleetAdmin` and `DriverOperator` authorization policies. (Was `FleetAdmin`.) |
| `DriverOperator` | May issue **Reconnect** and **Restart** commands against live driver instances from the Admin UI `DriverStatusPanel`. Gated by the `DriverOperator` named policy in `AddAuthorization` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`). Map an LDAP group via `GroupToRole`, e.g. `"ot-driver-operator": "DriverOperator"`. |
In v2 the authentication + authorization stack is wired centrally by `AddOtOpcUaAuth` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`) and Razor pages gate inline with the role names, e.g. `@attribute [Authorize(Roles = "FleetAdmin,ConfigEditor")]` on `Deployments.razor`. Nav-menu sections hide via `<AuthorizeView>`. `DriverOperator` is an **authorization policy name** (kept stable), not an `AdminRole` member. It gates **Reconnect** / **Restart** commands against live driver instances from the Admin UI `DriverStatusPanel` and requires the canonical role `Operator` or `Administrator` (`policy.RequireRole("Operator", "Administrator")` in `AddAuthorization`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`). `Operator` is an appsettings-only string role (not an `AdminRole` member); map an LDAP group to it via `GroupToRole`, e.g. `"ot-driver-operator": "Operator"`. The `FleetAdmin` policy requires the `Administrator` role.
In v2 the authentication + authorization stack is wired centrally by `AddOtOpcUaAuth` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`), which also installs a `FallbackPolicy` that requires an authenticated user. Razor pages gate inline with the canonical role names, e.g. `@attribute [Authorize(Roles = "Administrator,Designer")]`. Nav-menu sections hide via `<AuthorizeView>`.
### Role grant source ### Role grant source
Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`) — the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) cluster scope for multi-site fleets. The `RoleGrants.razor` page lets FleetAdmins edit these mappings without leaving the UI. Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`) — the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) one cluster for multi-site fleets (a system-wide row, `IsSystemWide = true`, stacks additively with cluster-scoped rows). The `RoleGrants.razor` page lets `Administrator`s edit these mappings without leaving the UI.
--- ---
@@ -266,9 +274,9 @@ Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.
Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by `CapabilityInvoker` in `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/`. A driver-capability call made **outside** the invoker bypasses resilience entirely — which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking. Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by `CapabilityInvoker` in `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/`. A driver-capability call made **outside** the invoker bypasses resilience entirely — which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking.
`OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires as a compile-time **warning** when an `async`/`Task`-returning method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync` / `AlarmSurfaceInvoker.*`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule. `OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires with category `OtOpcUa.Resilience` and default severity **Warning** (per `AnalyzerReleases.Shipped.md`) when a method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider` — all in `ZB.MOM.WW.OtOpcUa.Core.Abstractions`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync`. `AlarmSurfaceInvoker` is **not** a wrapper home — its own implementation is covered transitively because it routes through the inner `CapabilityInvoker.ExecuteAsync`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule.
Five xUnit-v3 + Shouldly tests at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests` cover the common fail/pass shapes + the sibling-pattern regression guard. The xunit.v3 + Shouldly suite at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests/UnwrappedCapabilityCallAnalyzerTests.cs` covers the common fail/pass shapes + the sibling-pattern regression guard.
The rule is intentionally scoped to async surfaces — pure in-memory accessors like `IHostConnectivityProbe.GetHostStatuses()` return synchronously and do not require the invoker wrap. The rule is intentionally scoped to async surfaces — pure in-memory accessors like `IHostConnectivityProbe.GetHostStatuses()` return synchronously and do not require the invoker wrap.
@@ -276,8 +284,8 @@ The rule is intentionally scoped to async surfaces — pure in-memory accessors
## Audit Logging ## Audit Logging
- **Server**: Serilog `AUDIT:` prefix on every authentication success/failure, certificate validation result, write access denial. Written alongside the regular rolling file sink. - **Server**: authentication, certificate-validation, and write-denial events are logged through the regular Serilog rolling file sink.
- **Admin**: `AuditLogService` writes `ConfigAuditLog` rows to the Config DB for every publish, rollback, cluster-node CRUD, credential rotation. Visible in the Audit page for operators with `ConfigViewer` or above. - **Admin**: `AuditWriterActor` (`src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`) writes `ConfigAuditLog` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs`) to the Config DB for publish, rollback, cluster-node CRUD, and credential rotation. Visible on the cluster Audit page (`ClusterAudit.razor`) for operators with `Viewer` or above.
--- ---
@@ -285,16 +293,16 @@ The rule is intentionally scoped to async surfaces — pure in-memory accessors
### Certificate trust failure ### Certificate trust failure
Check `{PkiStoreRoot}/rejected/` for the client's cert. Promote via Admin UI Certificates page, or copy the `.der` file manually to `trusted/`. Check `{PkiStoreRoot}/rejected/` for the client's cert. Copy the `.der` file to `trusted/certs/`; the SDK trust list reloads on the next handshake. The Admin UI Certificates page shows what is in each store but does not move certs.
### LDAP users can connect but fail authorization ### LDAP users can connect but fail authorization
Verify (a) `OpcUaServer:Ldap:GroupAttribute` returns groups in the form `CN=MyGroup,…` (OtOpcUa strips the `CN=` for matching), (b) a `NodeAcl` grant exists at any level of the node's UNS path that unions to the required permission, (c) `Authorization:StrictMode` is correctly set for the deployment stage. Verify (a) `Security:Ldap:GroupAttribute` (default `memberOf`) returns the user's groups, (b) `Security:Ldap:GroupToRole` maps those groups to the expected roles, and (c) a `NodeAcl` grant exists at some level of the node's scope path that unions to the required permission. The data-plane evaluator is strictly default-deny — there is no fail-open mode to fall back on.
### LDAP bind rejected as "insecure" ### LDAP bind rejected as "insecure"
Set `UseTls = true` + `Port = 636`, or temporarily flip `AllowInsecureLdap = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement. Set `Security:Ldap:Transport = "Ldaps"` (or `"StartTls"`) with the matching port (636 for AD `Ldaps`), or temporarily set `Security:Ldap:AllowInsecure = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement.
### `AuthorizationGate` denies every call after a publish ### Stale ACL trie after a publish
`AclChangeNotifier` invalidates the `PermissionTrieCache` on publish; a stuck cache is usually a missed notification. Restart the Server as a quick mitigation and file a bug — the design is to stay fresh without restarts. A publish installs a **new generation** into `PermissionTrieCache` via `PermissionTrieBuilder` rather than signalling an invalidation; the evaluator binds each session to a generation. If grants appear stale, confirm the new generation was installed (publish completed) and that sessions re-resolved their auth state — a session past its staleness ceiling fails closed and must re-authenticate. As a last resort `PermissionTrieCache.Invalidate(clusterId)` drops a cluster's cached tries.