docs(audit): security.md — accuracy pass (profiles, LDAP, ACL, analyzer)

STRUCTURAL (links-report.md):
- Repointed missing src/.../Security/Ldap/LdapAuthService.cs -> the real
  OtOpcUaLdapAuthService.cs (Ldap/OtOpcUaLdapAuthService.cs implements
  ILdapAuthService). Class was reorganized as a wrapper over shared
  ZB.MOM.WW.Auth.Ldap. check_links now clean for docs/security.md.

CODE-REALITY — transport profiles (OpcUaApplicationHost.cs:15-23,59-64,374-409):
- Only THREE profiles exist: None, Basic256Sha256Sign,
  Basic256Sha256SignAndEncrypt (NO hyphens, NO underscores). Removed the four
  fabricated Aes128/Aes256 rows. Config binds by enum-member name; hyphenated
  form does NOT bind. Documented this + the empty-list fallback to None.
- Config section is OpcUa (not OpcUaServer); key is the LIST
  EnabledSecurityProfiles (not singular SecurityProfile). Program.cs:120 binds
  'OpcUa'; Certificates.razor:80 reads OpcUa:PkiStoreRoot.
- No SecurityProfileResolver class exists — stated so explicitly.

CODE-REALITY — LDAP (LdapOptions.cs:21, OtOpcUaLdapAuthService.cs):
- Section is Security:Ldap (LdapOptions.SectionName), not OpcUaServer:Ldap.
- Authenticator is OtOpcUaLdapAuthService (wrapper) + LdapOpcUaUserAuthenticator
  (IOpcUaUserAuthenticator.AuthenticateUserNameAsync), not bespoke
  LdapUserAuthenticator/IUserAuthenticator.
- UseTls bool -> Transport enum (Ldaps/StartTls/None); AllowInsecureLdap ->
  AllowInsecure. Added Enabled master switch + DevStubMode.
- Group->role mapping is downstream via IGroupRoleMapper<string>
  (OtOpcUaGroupRoleMapper), NOT in the auth service. ILdapGroupsBearer and
  DenyAllUserAuthenticator do not exist (fallback is NullOpcUaUserAuthenticator).
- GroupToRole values corrected to canonical roles (Viewer/Designer/
  Administrator/Operator).

CODE-REALITY — ACL trie (TriePermissionEvaluator.cs, PermissionTrieCache.cs,
NodeScope.cs, NodePermissions.cs):
- NodePermissions backing type is int (not uint); lives in Configuration/Enums.
- Authorize(UserAuthorizationState, OpcUaOperation, NodeScope) returns
  AuthorizationDecision.
- Evaluator is strictly fail-CLOSED. Removed the fabricated
  'fail-open-during-transition' + Authorization:StrictMode key (no StrictMode
  anywhere in source).
- Cache: generation-sealed Install/Invalidate/Prune. AclChangeNotifier does
  NOT exist — removed.
- Added the SystemPlatform (Galaxy) scope hierarchy variant.

CODE-REALITY — control plane (AdminRole.cs, ServiceCollectionExtensions.cs:
113-131):
- AdminRole members are Viewer/Designer/Administrator (Task 1.7 rename from
  ConfigViewer/ConfigEditor/FleetAdmin). DriverOperator/FleetAdmin are POLICY
  names; DriverOperator requires roles Operator|Administrator.

CODE-REALITY — analyzer (UnwrappedCapabilityCallAnalyzer.cs:99-103,
AnalyzerReleases.Shipped.md):
- Confirmed category OtOpcUa.Resilience + severity Warning (already correct).
  Corrected 'Five tests' (suite has 26 cases) and AlarmSurfaceInvoker
  wrapper-home wording.

OTHER FIXES:
- v2 header: removed false AddJwtBearer/IPostConfigureOptions<JwtBearerOptions>
  claim — auth is Cookie-only; JWT is mint-only via /auth/token for external
  consumers (JwtTokenService.cs:25-48).
- Certificates.razor is a read-only viewer; removed fabricated
  CertTrustService/CertTrustOptions promote claim.
- Audit: writer is AuditWriterActor (not AuditLogService); softened the
  unverifiable server-side 'AUDIT:' Serilog-prefix claim.
Joseph Doherty
2026-06-03 16:26:00 -04:00
parent 1b6dedc142
commit b9bdfee189
+90 -82
@@ -4,12 +4,15 @@
> Paths + project names moved: `OtOpcUa.Server/Security/` → `OtOpcUa.Security/`
> (`Ldap/`, `Jwt/`, `Endpoints/AuthEndpoints.cs`), `OtOpcUa.Admin` is gone (its
> auth + role-grant pages live in `OtOpcUa.AdminUI`), and Admin auth policies
> register in `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth` rather than in a
> separate Admin process. The v2 `Security:Jwt` section adds JWT bearer auth
> alongside the existing cookie scheme (`AddJwtBearer` wired via
> `IPostConfigureOptions<JwtBearerOptions>` in `OtOpcUa.Security`). DataProtection
> keys persist to the shared `ConfigDb.DataProtectionKeys` table so cookies
> survive failover between admin-role nodes.
> register from `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth`
> (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`) rather
> than in a separate Admin process. The Admin UI uses a **single Cookie
> authentication scheme** — there is no `AddJwtBearer` pipeline. The
> `Security:Jwt` section configures `JwtTokenService`, which mints a JWT at the
> `/auth/token` endpoint for **external** consumers (OPC UA clients / automation
> scripts); the cookie itself stores the `ClaimsPrincipal` directly. DataProtection
> keys persist to the shared Config DB (`PersistKeysToDbContext<OtOpcUaConfigDbContext>`)
> so cookies survive failover between admin-role nodes.
>
> See `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §5 for the v2
> auth + DataProtection rationale.
@@ -18,8 +21,8 @@ OtOpcUa has four independent security concerns. This document covers all four:
1. **Transport security** — OPC UA secure channel (signing, encryption, X.509 trust).
2. **OPC UA authentication** — Anonymous / UserName / X.509 session identities; UserName tokens authenticated by LDAP bind.
3. **Data-plane authorization** — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `PermissionTrie` against the Config DB `NodeAcl` tree.
4. **Control-plane authorization** — who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`ConfigViewer` / `ConfigEditor` / `FleetAdmin`) claim from `LdapGroupRoleMapping`.
3. **Data-plane authorization** — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `TriePermissionEvaluator` over a `PermissionTrie` built from the Config DB `NodeAcl` tree.
4. **Control-plane authorization** — who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`Viewer` / `Designer` / `Administrator`) claim resolved from `LdapGroupRoleMapping`.
Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap `appsettings.json`. Data-plane ACLs and Admin role grants live in the Config DB.
@@ -33,42 +36,43 @@ The OtOpcUa Server supports configurable OPC UA transport security profiles that
There are two distinct layers of security in OPC UA:
- **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUaServer:SecurityProfile` setting controls.
- **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUa:EnabledSecurityProfiles` setting controls.
- **UserName token encryption** -- protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode. UserName authentication therefore works on `None` endpoints too — the credentials themselves are always encrypted. A secure transport profile adds protection against message-level tampering and eavesdropping of data payloads.
### Supported security profiles
The server supports seven transport security profiles:
The profiles are the members of the `OpcUaSecurityProfile` enum (`src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs`). The server ships **three** baseline profiles; the config value is the bare enum-member name (no hyphens, no underscores):
| Profile Name | Security Policy | Message Security Mode | Description |
|-----------------------------------|----------------------------|-----------------------|--------------------------------------------------|
| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. |
| `Basic256Sha256-Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. |
| `Basic256Sha256-SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. |
| `Aes128_Sha256_RsaOaep-Sign` | Aes128_Sha256_RsaOaep | Sign | Modern profile with AES-128 encryption and SHA-256 signing. |
| `Aes128_Sha256_RsaOaep-SignAndEncrypt` | Aes128_Sha256_RsaOaep | SignAndEncrypt | Modern profile with AES-128 encryption. Recommended for production. |
| `Aes256_Sha256_RsaPss-Sign` | Aes256_Sha256_RsaPss | Sign | Strongest profile with AES-256 and RSA-PSS signatures. |
| `Aes256_Sha256_RsaPss-SignAndEncrypt` | Aes256_Sha256_RsaPss | SignAndEncrypt | Strongest profile. Recommended for high-security deployments. |
| Enum member | Security Policy | Message Security Mode | Description |
|---------------------------------|------------------|-----------------------|--------------------------------------------------|
| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. |
| `Basic256Sha256Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. |
| `Basic256Sha256SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. |
The server exposes a separate endpoint for each configured profile, and clients select the one they prefer during connection.
`BuildSecurityPolicies` (`OpcUaApplicationHost.cs`) maps each configured profile to an SDK `ServerSecurityPolicy`. The server exposes a separate endpoint per configured profile and clients select the one they prefer at session open. The enum's XML doc notes that Aes128/Aes256 variants can be added later by extending the enum + `BuildSecurityPolicies` — the wiring is profile-agnostic — but they are **not implemented today**. There is no `SecurityProfileResolver` class.
> **Config value form.** The enum binds by member name, so a profile string with hyphens (e.g. `Basic256Sha256-Sign`) does **not** bind — use the exact enum-member spelling above. If `EnabledSecurityProfiles` is empty, the server falls back to a single `None` endpoint (logged, very visible) so it still has a listening endpoint.
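
The binding rule above can be illustrated with a minimal sketch. It assumes member-name parsing comparable to `Enum.TryParse`. The `ProfileBindingSketch` class and both of its methods are hypothetical names introduced for illustration and are not part of the OtOpcUa source. The enum is redeclared only to keep the snippet self-contained.

```csharp
using System;
using System.Collections.Generic;

// Redeclared for illustration; the real enum lives in OpcUaApplicationHost.cs.
public enum OpcUaSecurityProfile { None, Basic256Sha256Sign, Basic256Sha256SignAndEncrypt }

// Hypothetical helper, not part of the OtOpcUa source.
public static class ProfileBindingSketch
{
    // True only when the string is an exact enum-member name.
    // Hyphenated forms such as "Basic256Sha256-Sign" fail to parse;
    // the IsDefined check on the string also rejects numeric input like "1".
    public static bool TryBind(string value, out OpcUaSecurityProfile profile) =>
        Enum.TryParse(value, ignoreCase: false, out profile)
        && Enum.IsDefined(typeof(OpcUaSecurityProfile), value);

    // Mirrors the documented fallback: an empty list yields a single None endpoint.
    public static IReadOnlyList<OpcUaSecurityProfile> EffectiveProfiles(
        IReadOnlyList<OpcUaSecurityProfile> configured) =>
        configured.Count == 0 ? new[] { OpcUaSecurityProfile.None } : configured;
}
```

The sketch uses case-sensitive matching for strictness; whether the real configuration binder is case-sensitive is not stated here, so treat that detail as an assumption.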
### Configuration
Transport security is configured in the `OpcUaServer` section of the Server process's bootstrap `appsettings.json`:
Transport security is configured in the `OpcUa` section of the Host process's bootstrap `appsettings.json` (bound to `OpcUaApplicationHostOptions`):
```json
{
"OpcUaServer": {
"EndpointUrl": "opc.tcp://0.0.0.0:4840/OtOpcUa",
"ApplicationName": "OtOpcUa Server",
"OpcUa": {
"ApplicationName": "OtOpcUa",
"ApplicationUri": "urn:node-a:OtOpcUa",
"PublicHostname": "0.0.0.0",
"OpcUaPort": 4840,
"PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki",
"AutoAcceptUntrustedClientCertificates": false,
"SecurityProfile": "Basic256Sha256-SignAndEncrypt"
"EnabledSecurityProfiles": [ "Basic256Sha256Sign", "Basic256Sha256SignAndEncrypt" ]
}
}
```
`EnabledSecurityProfiles` is a **list** — the server publishes one endpoint per entry. The default (when the key is omitted) is all three baseline profiles (`None`, `Basic256Sha256Sign`, `Basic256Sha256SignAndEncrypt`); production deployments typically drop `None`. The list must contain at least one entry (`OpcUaApplicationHostOptionsValidator` enforces `MinCount(…, 1)`).
The server certificate is auto-generated on first start if none exists in `PkiStoreRoot/own/`. It is always generated, even for `None`-only deployments, because UserName token encryption depends on it.
### PKI directory layout
@@ -91,13 +95,13 @@ When a client connects using a secure profile (`Sign` or `SignAndEncrypt`), the
4. If not found and `AutoAcceptUntrustedClientCertificates` is `true`, the certificate is automatically copied to `trusted/` and the connection proceeds.
5. If not found and `AutoAcceptUntrustedClientCertificates` is `false`, the certificate is copied to `rejected/` and the connection is refused.
The Admin UI `Certificates.razor` page uses `CertTrustService` (singleton reading `CertTrustOptions` for the Server's `PkiStoreRoot`) to promote rejected client certs to trusted without operators having to file-copy manually.
The Admin UI `Certificates.razor` page (`src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor`) lists the contents of each PKI sub-store (own / trusted / issuer / rejected) by reading the `OpcUa:PkiStoreRoot` path from configuration. It is currently a **read-only viewer**: promoting a rejected cert to trusted is still a file move (copy the `.der` from `rejected/` to `trusted/certs/`); the SDK trust list reloads on the next handshake.
### Production hardening
- Set `AutoAcceptUntrustedClientCertificates = false`.
- Drop `None` from the profile set.
- Use the Admin UI to promote trusted client certs rather than the auto-accept fallback.
- Drop `None` from `EnabledSecurityProfiles`.
- Promote trusted client certs by moving the `.der` from `rejected/` to `trusted/certs/` rather than relying on the auto-accept fallback. (The Admin UI Certificates page shows what is in each store.)
- Periodically audit the `rejected/` directory; an unexpected entry is often a misconfigured client or a probe attempt.
---
@@ -108,59 +112,55 @@ The Server accepts three OPC UA identity-token types:
| Token | Handler | Notes |
|---|---|---|
| Anonymous | `IUserAuthenticator.AuthenticateAsync(username: "", password: "")` | Refused in strict mode unless explicit anonymous grants exist; allowed in lax mode for backward compatibility. |
| UserName/Password | `LdapOpcUaUserAuthenticator` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs`, backed by `LdapAuthService` at `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs`) | LDAP bind + group lookup; resolved `LdapGroups` flow into the session's identity bearer (`ILdapGroupsBearer`). |
| X.509 Certificate | Stack-level acceptance + role mapping via CN | X.509 identity carries `AuthenticatedUser` + read roles; finer-grain authorization happens through the data-plane ACLs. |
| Anonymous | No `IOpcUaUserAuthenticator` call — the SDK admits anonymous sessions at the channel. | Data-plane authorization (below) still default-denies any node a session has no ACL grant for. |
| UserName/Password | `LdapOpcUaUserAuthenticator.AuthenticateUserNameAsync` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs`, implements `IOpcUaUserAuthenticator`), backed by the app `ILdapAuthService` `OtOpcUaLdapAuthService` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/OtOpcUaLdapAuthService.cs`). | LDAP bind + group lookup. The returned LDAP groups are mapped to roles via `IGroupRoleMapper<string>` (`OtOpcUaGroupRoleMapper`) and attached to the OPC UA session identity for the downstream ACL evaluator. |
| X.509 Certificate | Stack-level acceptance during the secure-channel handshake. | The certificate must be trusted (see PKI trust flow); finer-grain authorization happens through the data-plane ACLs. |
### LDAP bind flow (`LdapUserAuthenticator`)
When no authenticator is supplied, `OpcUaApplicationHost` falls back to `NullOpcUaUserAuthenticator`; the Host wires the real `LdapOpcUaUserAuthenticator` as a singleton in `Program.cs`.
`Program.cs` in the Server registers the authenticator based on `OpcUaServer:Ldap`:
### LDAP bind flow (`OtOpcUaLdapAuthService`)
```csharp
builder.Services.AddSingleton<IUserAuthenticator>(sp => ldapOptions.Enabled
? new LdapUserAuthenticator(ldapOptions, sp.GetRequiredService<ILogger<LdapUserAuthenticator>>())
: new DenyAllUserAuthenticator());
```
LDAP is configured under the `Security:Ldap` section (bound to `LdapOptions`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapOptions.cs`, `SectionName = "Security:Ldap"`). The app authenticator is `OtOpcUaLdapAuthService` — a thin wrapper around the shared `ZB.MOM.WW.Auth.Ldap` directory client that adds two app-only concerns the shared library deliberately does not model: the `Enabled` master switch and `DevStubMode`. The same `ILdapAuthService` instance serves **both** the Admin UI cookie login (`/auth/login`) and the OPC UA UserName path (via `LdapOpcUaUserAuthenticator`), so operators use one credential across both planes.
`LdapUserAuthenticator`:
`OtOpcUaLdapAuthService.AuthenticateAsync`:
1. Refuses to bind over plain-LDAP unless `AllowInsecureLdap = true` (dev/test only).
2. Connects to `Server:Port`, optionally upgrades to TLS (`UseTls = true`, port 636 for AD).
3. Binds as the service account; searches `SearchBase` for `UserNameAttribute = username`.
4. Rebinds as the resolved user DN with the supplied password (the actual credential check).
5. Reads `GroupAttribute` (default `memberOf`) and strips the leading `CN=` so operators configure friendly group names in `GroupToRole`.
6. Returns a `UserAuthResult` carrying the validated username + the set of LDAP groups. The set flows through to the session identity via `ILdapGroupsBearer.LdapGroups`.
1. If `Enabled = false`, denies outright — no bind, no DevStub bypass (the master switch wins).
2. If `DevStubMode = true`, accepts any non-empty credentials and grants the `Administrator` role **without any network bind** (dev only — must be `false` in production).
3. Refuses to bind over a plaintext transport (`Transport = None`) unless `AllowInsecure = true` (dev/test only). This is enforced at login, not at startup.
4. Delegates the real path to the shared `ZB.MOM.WW.Auth.Ldap` client: it binds (search-then-bind via `ServiceAccountDn`, or direct-bind `cn={user},{SearchBase}` when no service account is set), verifies the password, and reads the user's group memberships.
5. Returns an `LdapAuthResult` carrying the validated username + the **groups** (never roles). Failure codes are folded into opaque user-facing error strings so a probe cannot distinguish "unknown user" from "wrong password".
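
The ordering of the first four steps can be sketched as a pure decision function. This is an illustration only: `LdapGateOptions`, `GateOutcome`, and `LdapGateSketch` are hypothetical names, not the real `LdapOptions` or `OtOpcUaLdapAuthService` API, and denying empty credentials in DevStub mode is an assumption inferred from "accepts any non-empty credentials".

```csharp
using System;

// Hypothetical stand-ins for the Transport enum and the relevant LdapOptions fields.
public enum LdapTransport { None, StartTls, Ldaps }

public sealed record LdapGateOptions(
    bool Enabled, bool DevStubMode, LdapTransport Transport, bool AllowInsecure);

public enum GateOutcome { Denied, DevStubAccepted, InsecureTransportRefused, DelegateToDirectory }

public static class LdapGateSketch
{
    public static GateOutcome Decide(LdapGateOptions options, string username, string password)
    {
        // Step 1: the master switch wins over everything, including DevStubMode.
        if (!options.Enabled) return GateOutcome.Denied;

        // Step 2: DevStub accepts any non-empty credentials without a network bind.
        if (options.DevStubMode)
        {
            bool hasCredentials = !string.IsNullOrEmpty(username) && !string.IsNullOrEmpty(password);
            return hasCredentials ? GateOutcome.DevStubAccepted : GateOutcome.Denied;
        }

        // Step 3: plaintext transport is refused unless explicitly allowed.
        if (options.Transport == LdapTransport.None && !options.AllowInsecure)
            return GateOutcome.InsecureTransportRefused;

        // Step 4: everything else goes to the shared directory client.
        return GateOutcome.DelegateToDirectory;
    }
}
```

Keeping the gate free of I/O makes the precedence (master switch, then DevStub, then transport guard) easy to unit-test in isolation.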
Configuration example (Active Directory production):
**Group → role mapping happens downstream**, not in the auth service: `LdapOpcUaUserAuthenticator` resolves `IGroupRoleMapper<string>` (`OtOpcUaGroupRoleMapper`) per call and unions its output with any pre-resolved roles (the DevStub `Administrator` grant). The roles are attached to the OPC UA session identity for the ACL evaluator. A mapper fault (e.g. a Config DB outage) falls back to the pre-resolved baseline rather than denying an otherwise-authenticated session.
`Transport` replaces the former `UseTls` bool: `Ldaps` (implicit TLS), `StartTls` (upgrade), or `None` (plaintext, requires `AllowInsecure`). Configuration example (Active Directory production):
```json
{
"OpcUaServer": {
"Security": {
"Ldap": {
"Enabled": true,
"DevStubMode": false,
"Server": "dc01.corp.example.com",
"Port": 636,
"UseTls": true,
"AllowInsecureLdap": false,
"Transport": "Ldaps",
"AllowInsecure": false,
"SearchBase": "DC=corp,DC=example,DC=com",
"ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com",
"ServiceAccountPassword": "<from your secret store>",
"GroupAttribute": "memberOf",
"DisplayNameAttribute": "cn",
"UserNameAttribute": "sAMAccountName",
"GroupToRole": {
"OPCUA-Operators": "WriteOperate",
"OPCUA-Engineers": "WriteConfigure",
"OPCUA-Tuners": "WriteTune",
"OPCUA-AlarmAck": "AlarmAck"
"OPCUA-Designers": "Designer",
"OPCUA-Admins": "Administrator",
"OPCUA-Operators": "Operator"
}
}
}
}
```
`UserNameAttribute: "sAMAccountName"` is the critical AD override — the default `uid` is not populated on AD user entries. Use `userPrincipalName` instead if operators log in with `user@corp.example.com` form. Nested group membership is not expanded — assign users directly to the role-mapped groups, or pre-flatten in AD.
The same options bind the Admin's `LdapAuthService` (cookie auth / login form) so operators authenticate with a single credential across both processes.
`GroupToRole` maps LDAP group names → Admin roles (case-insensitive); a user gets every role whose source group is in their membership. The values are the canonical control-plane role strings (`Viewer` / `Designer` / `Administrator`, plus the appsettings-only `Operator` for the `DriverOperator` policy). `UserNameAttribute: "sAMAccountName"` is the critical AD override — the GLAuth dev default is `cn`, which is not how AD users are looked up; use `userPrincipalName` instead if operators log in with `user@corp.example.com` form. `LdapOptionsValidator` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/Configuration/LdapOptionsValidator.cs`) fails startup when `Transport = None` and `AllowInsecure = false` on a real-LDAP (non-DevStub) config.
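
The `GroupToRole` semantics described above (case-insensitive group match, one role per matching group, roles unioned) can be sketched as follows. `GroupRoleMappingSketch` and `MapRoles` are hypothetical names for illustration; the real mapping is performed by `OtOpcUaGroupRoleMapper`, whose internals are not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not the OtOpcUaGroupRoleMapper implementation.
public static class GroupRoleMappingSketch
{
    public static HashSet<string> MapRoles(
        IDictionary<string, string> groupToRole, IEnumerable<string> userGroups)
    {
        // Re-key the configured map so group lookups ignore case.
        var lookup = new Dictionary<string, string>(groupToRole, StringComparer.OrdinalIgnoreCase);
        var roles = new HashSet<string>(StringComparer.Ordinal);

        foreach (var group in userGroups)
        {
            // A user receives every role whose source group is in their membership.
            if (lookup.TryGetValue(group, out var role))
                roles.Add(role);
        }

        return roles; // Empty when no group maps to a role.
    }
}
```

With the example configuration, a user in `opcua-admins` and `OPCUA-Operators` would receive both `Administrator` and `Operator`, while a user whose groups match no key receives no roles.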
---
@@ -172,20 +172,27 @@ Per decision #129 the model is **additive-only — no explicit Deny**. Grants at
### Hierarchy
ACLs are evaluated against the UNS path:
ACLs are evaluated against the node's scope path. `NodeScope` (`src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/NodeScope.cs`) carries a `Kind` that selects between two hierarchy shapes:
```
ClusterId → Namespace → UnsArea → UnsLine → Equipment → Tag
Equipment (UNS) kind: Cluster → Namespace → UnsArea → UnsLine → Equipment → Tag
SystemPlatform (Galaxy) kind: Cluster → Namespace → FolderSegment(s) → Tag
```
On the Galaxy/SystemPlatform path each folder segment takes one trie level, so a deeply-nested Galaxy folder reaches the same depth as a full UNS path. Unset mid-path levels leave the corresponding id `null` and the evaluator walks only as far as the scope goes.
Each level can carry `NodeAcl` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs`) that grant a permission bundle to a set of `LdapGroups`.
### Permission flags
`NodePermissions` (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodePermissions.cs`), stored as an `int` bitmask in `NodeAcl.PermissionFlags`:
```csharp
[Flags]
public enum NodePermissions : uint
public enum NodePermissions : int
{
None = 0,
Browse = 1 << 0,
Read = 1 << 1,
Subscribe = 1 << 2,
@@ -215,20 +222,20 @@ The three Write tiers map to Galaxy's v1 `SecurityClassification` — `FreeAcces
| Class | Role |
|---|---|
| `PermissionTrie` | Cluster-scoped trie; each node carries `(GroupId → NodePermissions)` grants. |
| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass. |
| `PermissionTrieCache` | Per-cluster memoised trie; invalidated via `AclChangeNotifier` when the Admin publishes a draft that touches ACLs. |
| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)` — walks from the root to the leaf for the supplied `NodeScope`, unions grants along the path, compares required permission to the union. |
| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass and installs it into the cache. |
| `PermissionTrieCache` | Process-singleton cache keyed on `(ClusterId, GenerationId)`. Generation-sealed: `Install(trie)` adds a new generation + advances the "current" pointer; older generations are retained (in-flight requests still resolve) and GC'd by `Prune`. `Invalidate(clusterId)` drops every cached trie for a cluster. There is **no** `AclChangeNotifier` — a publish installs a new generation rather than signalling an invalidation. |
| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)`. Walks the cluster trie for the supplied `NodeScope`, unions grants along the path, and returns an `AuthorizationDecision`. Evaluates against the **session's bound generation** (`session.AuthGenerationId`), not just "current", so a grant added/removed in a newer generation cannot take effect mid-session. |
`NodeScope` carries `(ClusterId, NamespaceId, AreaId, LineId, EquipmentId, TagId)`; any suffix may be null — a tag-level ACL is more specific than an area-level ACL but both contribute via union.
`NodeScope` is described above (Equipment-kind vs SystemPlatform-kind). The evaluator unions the matched grants along the path — a tag-level ACL and an area-level ACL both contribute.
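
A minimal sketch of the additive union along a scope path, under stated assumptions: `TrieNodeSketch` and `TrieEvaluationSketch` are hypothetical types, the path is a flat list of segment ids, and only the first three permission flags are reproduced. The real `TriePermissionEvaluator` also checks session staleness, cluster match, and the bound generation, none of which are modeled here.

```csharp
using System;
using System.Collections.Generic;

// Subset of the flags shown in the document, redeclared for a self-contained example.
[Flags]
public enum NodePermissions : int { None = 0, Browse = 1 << 0, Read = 1 << 1, Subscribe = 1 << 2 }

// Hypothetical trie node: per-group grants plus child levels keyed by segment id.
public sealed class TrieNodeSketch
{
    public Dictionary<string, NodePermissions> Grants { get; } =
        new Dictionary<string, NodePermissions>(StringComparer.Ordinal);
    public Dictionary<string, TrieNodeSketch> Children { get; } =
        new Dictionary<string, TrieNodeSketch>(StringComparer.Ordinal);
}

public static class TrieEvaluationSketch
{
    public static bool IsGranted(
        TrieNodeSketch root, IReadOnlyList<string> path,
        ISet<string> groups, NodePermissions required)
    {
        var union = Collect(root, groups);
        var node = root;
        foreach (var segment in path)
        {
            // Stop at the deepest level that exists; ancestor grants still count.
            if (!node.Children.TryGetValue(segment, out var next)) break;
            node = next;
            union |= Collect(node, groups);
        }
        // Default-deny: every required bit must be present in the union.
        return required != NodePermissions.None && (union & required) == required;
    }

    private static NodePermissions Collect(TrieNodeSketch node, ISet<string> groups)
    {
        var result = NodePermissions.None;
        foreach (var grant in node.Grants)
            if (groups.Contains(grant.Key)) result |= grant.Value;
        return result;
    }
}
```

Because grants are only ever OR-ed together, a deeper level can add permissions but never remove them, which matches the additive-only model with no explicit Deny.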
### Dispatch gate — `IPermissionEvaluator`
`IPermissionEvaluator.Authorize(session, operation, scope)` (default impl `TriePermissionEvaluator` at `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs`) bridges the OPC UA stack's `ISystemContext.UserIdentity` to the trie. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call. A non-allow decision short-circuits the dispatch with `BadUserAccessDenied`.
`IPermissionEvaluator.Authorize(UserAuthorizationState session, OpcUaOperation operation, NodeScope scope)` (default impl `TriePermissionEvaluator` at `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs`) returns an `AuthorizationDecision`. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call; a `NotGranted` decision denies the operation.
Key properties:
- **Driver-agnostic.** No driver-level code participates in authorization decisions. Drivers report `SecurityClassification` as metadata on tag discovery; everything else flows through the evaluator.
- **Fail-open-during-transition.** `StrictMode = false` (default during ACL rollouts) lets sessions without resolved LDAP groups proceed; flip `Authorization:StrictMode = true` in production once ACLs are populated.
- **Strictly fail-closed (default-deny).** Every guard path returns `NotGranted` — a stale session (past the staleness ceiling, decision #152), a cluster mismatch between session and scope, a missing trie, a pruned bound generation, or simply no matching grant. There is no `StrictMode` / fail-open mode; absence of a grant is always a deny.
- **Evaluator stays pure.** `TriePermissionEvaluator` has no OPC UA stack dependency — it's tested directly from xUnit.
### Full model
@@ -241,24 +248,25 @@ See [`docs/v2/acl-design.md`](v2/acl-design.md) for the complete design: trie in
Control-plane authorization governs **the Admin UI** — who can view fleet config, edit drafts, publish generations, manage cluster nodes + credentials.
Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a ConfigEditor may not have any data-plane grants at all.
Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a `Designer` may not have any data-plane grants at all.
### Roles
The `AdminRole` enum (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs`) defines:
The `AdminRole` enum (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs`) defines three roles. Task 1.7 standardized the member names on the canonical `ZB.MOM.WW.Auth` `CanonicalRole` vocabulary (`ConfigViewer → Viewer`, `ConfigEditor → Designer`, `FleetAdmin → Administrator`); a data migration (`CanonicalizeAdminRoles`) rewrote existing rows. This was a rename, not a permission change.
| Role | Capabilities |
|---|---|
| `ConfigViewer` | Read-only access to drafts, generations, audit log, fleet status. |
| `ConfigEditor` | ConfigViewer plus draft editing (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. |
| `FleetAdmin` | ConfigEditor plus publish, cluster/node CRUD, credential management, role-grant management. Also satisfies the `DriverOperator` authorization policy. |
| `DriverOperator` | May issue **Reconnect** and **Restart** commands against live driver instances from the Admin UI `DriverStatusPanel`. Gated by the `DriverOperator` named policy in `AddAuthorization` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`). Map an LDAP group via `GroupToRole`, e.g. `"ot-driver-operator": "DriverOperator"`. |
| `Viewer` | Read-only access to drafts, generations, audit log, fleet status. (Was `ConfigViewer`.) |
| `Designer` | Viewer plus draft authoring (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. (Was `ConfigEditor`.) |
| `Administrator` | Designer plus publish, cluster/node CRUD, credential management, role-grant management. Satisfies both the `FleetAdmin` and `DriverOperator` authorization policies. (Was `FleetAdmin`.) |
In v2 the authentication + authorization stack is wired centrally by `AddOtOpcUaAuth` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`) and Razor pages gate inline with the role names, e.g. `@attribute [Authorize(Roles = "FleetAdmin,ConfigEditor")]` on `Deployments.razor`. Nav-menu sections hide via `<AuthorizeView>`.
`DriverOperator` is an **authorization policy name** (kept stable), not an `AdminRole` member. It gates **Reconnect** / **Restart** commands against live driver instances from the Admin UI `DriverStatusPanel` and requires the canonical role `Operator` or `Administrator` (`policy.RequireRole("Operator", "Administrator")` in `AddAuthorization`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`). `Operator` is an appsettings-only string role (not an `AdminRole` member); map an LDAP group to it via `GroupToRole`, e.g. `"ot-driver-operator": "Operator"`. The `FleetAdmin` policy requires the `Administrator` role.
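
The role requirement behind the `DriverOperator` policy can be approximated with a plain `ClaimsPrincipal` check. This is a sketch, not the actual policy registration: `DriverOperatorPolicySketch` and its methods are hypothetical, and it assumes roles are carried as standard `ClaimTypes.Role` claims on the cookie principal.

```csharp
using System.Linq;
using System.Security.Claims;

// Hypothetical helper approximating policy.RequireRole("Operator", "Administrator").
public static class DriverOperatorPolicySketch
{
    // Builds an authenticated principal carrying the given roles as role claims.
    public static ClaimsPrincipal WithRoles(params string[] roles)
    {
        var claims = roles.Select(role => new Claim(ClaimTypes.Role, role));
        return new ClaimsPrincipal(new ClaimsIdentity(claims, "Cookies"));
    }

    // Either role is sufficient; role values are compared case-sensitively.
    public static bool Satisfies(ClaimsPrincipal user) =>
        user.IsInRole("Operator") || user.IsInRole("Administrator");
}
```

A principal holding only `Designer` or `Viewer` fails the check, which is why `Operator` has to be mapped explicitly through `GroupToRole` for users who are not administrators.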
In v2 the authentication + authorization stack is wired centrally by `AddOtOpcUaAuth` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`), which also installs a `FallbackPolicy` that requires an authenticated user. Razor pages gate inline with the canonical role names, e.g. `@attribute [Authorize(Roles = "Administrator,Designer")]`. Nav-menu sections hide via `<AuthorizeView>`.
### Role grant source
Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`) — the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) cluster scope for multi-site fleets. The `RoleGrants.razor` page lets FleetAdmins edit these mappings without leaving the UI.
Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`) — the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) one cluster for multi-site fleets (a system-wide row, `IsSystemWide = true`, stacks additively with cluster-scoped rows). The `RoleGrants.razor` page lets `Administrator`s edit these mappings without leaving the UI.
---
@@ -266,9 +274,9 @@ Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.
Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by `CapabilityInvoker` in `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/`. A driver-capability call made **outside** the invoker bypasses resilience entirely — which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking.
`OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires as a compile-time **warning** when an `async`/`Task`-returning method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync` / `AlarmSurfaceInvoker.*`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule.
`OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires with category `OtOpcUa.Resilience` and default severity **Warning** (per `AnalyzerReleases.Shipped.md`) when a method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider` — all in `ZB.MOM.WW.OtOpcUa.Core.Abstractions`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync`. `AlarmSurfaceInvoker` is **not** a wrapper home — its own implementation is covered transitively because it routes through the inner `CapabilityInvoker.ExecuteAsync`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule.
Five xUnit-v3 + Shouldly tests at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests` cover the common fail/pass shapes + the sibling-pattern regression guard.
The xunit.v3 + Shouldly suite at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests/UnwrappedCapabilityCallAnalyzerTests.cs` covers the common fail/pass shapes + the sibling-pattern regression guard.
The rule is intentionally scoped to async surfaces — pure in-memory accessors like `IHostConnectivityProbe.GetHostStatuses()` return synchronously and do not require the invoker wrap.
@@ -276,8 +284,8 @@ The rule is intentionally scoped to async surfaces — pure in-memory accessors
## Audit Logging
- **Server**: Serilog `AUDIT:` prefix on every authentication success/failure, certificate validation result, write access denial. Written alongside the regular rolling file sink.
- **Admin**: `AuditLogService` writes `ConfigAuditLog` rows to the Config DB for every publish, rollback, cluster-node CRUD, credential rotation. Visible in the Audit page for operators with `ConfigViewer` or above.
- **Server**: authentication, certificate-validation, and write-denial events are logged through the regular Serilog rolling file sink.
- **Admin**: `AuditWriterActor` (`src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`) writes `ConfigAuditLog` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs`) to the Config DB for publish, rollback, cluster-node CRUD, and credential rotation. Visible on the cluster Audit page (`ClusterAudit.razor`) for operators with `Viewer` or above.
---
@@ -285,16 +293,16 @@ The rule is intentionally scoped to async surfaces — pure in-memory accessors
### Certificate trust failure
Check `{PkiStoreRoot}/rejected/` for the client's cert. Promote via Admin UI Certificates page, or copy the `.der` file manually to `trusted/`.
Check `{PkiStoreRoot}/rejected/` for the client's cert. Copy the `.der` file to `trusted/certs/`; the SDK trust list reloads on the next handshake. The Admin UI Certificates page shows what is in each store but does not move certs.
### LDAP users can connect but fail authorization
Verify (a) `OpcUaServer:Ldap:GroupAttribute` returns groups in the form `CN=MyGroup,…` (OtOpcUa strips the `CN=` for matching), (b) a `NodeAcl` grant exists at any level of the node's UNS path that unions to the required permission, (c) `Authorization:StrictMode` is correctly set for the deployment stage.
Verify (a) `Security:Ldap:GroupAttribute` (default `memberOf`) returns the user's groups, (b) `Security:Ldap:GroupToRole` maps those groups to the expected roles, and (c) a `NodeAcl` grant exists at some level of the node's scope path that unions to the required permission. The data-plane evaluator is strictly default-deny — there is no fail-open mode to fall back on.
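As a rough illustration of checks (a) and (b), the relevant part of the configuration might look like the fragment below. This is a sketch under assumptions: the group names (`OtOpcUa-Viewers`, `OtOpcUa-Operators`) are hypothetical placeholders, and the dictionary shape of `GroupToRole` (group name to role name) is assumed rather than confirmed here. The role values are the canonical roles named in this change.

```json
{
  "Security": {
    "Ldap": {
      "GroupAttribute": "memberOf",
      "GroupToRole": {
        "OtOpcUa-Viewers": "Viewer",
        "OtOpcUa-Operators": "Operator"
      }
    }
  }
}
```

If a user's directory groups do not appear as keys here, no role is mapped and authorization fails even though the bind succeeded.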
### LDAP bind rejected as "insecure"
Set `UseTls = true` + `Port = 636`, or temporarily flip `AllowInsecureLdap = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement.
Set `Security:Ldap:Transport = "Ldaps"` (or `"StartTls"`) with the matching port (636 for AD `Ldaps`), or temporarily set `Security:Ldap:AllowInsecure = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement.
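A minimal sketch of the secure setting described above, written as an `appsettings.json` fragment. The `Transport`, `AllowInsecure`, and `Enabled` keys come from this change; the `Port` key name is an assumption carried over from the earlier wording and should be checked against `LdapOptions`.

```json
{
  "Security": {
    "Ldap": {
      "Enabled": true,
      "Transport": "Ldaps",
      "Port": 636,
      "AllowInsecure": false
    }
  }
}
```

For a dev-only directory without TLS, the equivalent would presumably be `"Transport": "None"` together with `"AllowInsecure": true`; that combination should not reach production.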
### `AuthorizationGate` denies every call after a publish
### Stale ACL trie after a publish
`AclChangeNotifier` invalidates the `PermissionTrieCache` on publish; a stuck cache is usually a missed notification. Restart the Server as a quick mitigation and file a bug — the design is to stay fresh without restarts.
A publish installs a **new generation** into `PermissionTrieCache` via `PermissionTrieBuilder` rather than signalling an invalidation; the evaluator binds each session to a generation. If grants appear stale, confirm the new generation was installed (publish completed) and that sessions re-resolved their auth state — a session past its staleness ceiling fails closed and must re-authenticate. As a last resort `PermissionTrieCache.Invalidate(clusterId)` drops a cluster's cached tries.