# Security

OtOpcUa has four independent security concerns. This document covers all four:

1. **Transport security** -- OPC UA secure channel (signing, encryption, X.509 trust).
2. **OPC UA authentication** -- Anonymous / UserName / X.509 session identities; UserName tokens authenticated by LDAP bind.
3. **Data-plane authorization** -- who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `PermissionTrie` against the Config DB `NodeAcl` tree.
4. **Control-plane authorization** -- who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`ConfigViewer` / `ConfigEditor` / `FleetAdmin`) claim from `LdapGroupRoleMapping`.

Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap `appsettings.json`. Data-plane ACLs and Admin role grants live in the Config DB.

---

## Transport Security

### Overview

The OtOpcUa Server supports configurable OPC UA transport security profiles that control how data is protected on the wire between OPC UA clients and the server. There are two distinct layers of security in OPC UA:

- **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUaServer:SecurityProfile` setting controls.
- **UserName token encryption** -- protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode. UserName authentication therefore works on `None` endpoints too: the credentials themselves are always encrypted. A secure transport profile adds protection against message-level tampering and eavesdropping of data payloads.

### Supported security profiles

The server supports seven transport security profiles:

| Profile Name | Security Policy | Message Security Mode | Description |
|---|---|---|---|
| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. |
| `Basic256Sha256-Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. |
| `Basic256Sha256-SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. |
| `Aes128_Sha256_RsaOaep-Sign` | Aes128_Sha256_RsaOaep | Sign | Modern policy (AES-128, SHA-256). Messages are signed but not encrypted. |
| `Aes128_Sha256_RsaOaep-SignAndEncrypt` | Aes128_Sha256_RsaOaep | SignAndEncrypt | Modern profile with AES-128 encryption. Recommended for production. |
| `Aes256_Sha256_RsaPss-Sign` | Aes256_Sha256_RsaPss | Sign | Strongest policy (AES-256, RSA-PSS signatures). Messages are signed but not encrypted. |
| `Aes256_Sha256_RsaPss-SignAndEncrypt` | Aes256_Sha256_RsaPss | SignAndEncrypt | Strongest profile. Recommended for high-security deployments. |

The server exposes a separate endpoint for each configured profile, and clients select the one they prefer during connection.

### Configuration

Transport security is configured in the `OpcUaServer` section of the Server process's bootstrap `appsettings.json`:

```json
{
  "OpcUaServer": {
    "EndpointUrl": "opc.tcp://0.0.0.0:4840/OtOpcUa",
    "ApplicationName": "OtOpcUa Server",
    "ApplicationUri": "urn:node-a:OtOpcUa",
    "PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki",
    "AutoAcceptUntrustedClientCertificates": false,
    "SecurityProfile": "Basic256Sha256-SignAndEncrypt"
  }
}
```

The server certificate is auto-generated on first start if none exists in `PkiStoreRoot/own/`.
It is always generated, even for `None`-only deployments, because UserName token encryption depends on it.

### PKI directory layout

```
{PkiStoreRoot}/
  own/        Server's own application certificate and private key
  issuer/     CA certificates that issued trusted client certificates
  trusted/    Explicitly trusted client (peer) certificates
  rejected/   Certificates that were presented but not trusted
```

### Certificate trust flow

When a client connects using a secure profile (`Sign` or `SignAndEncrypt`), the following trust evaluation occurs:

1. The client presents its application certificate during the secure channel handshake.
2. The server checks whether the certificate exists in the `trusted/` store.
3. If found, the connection proceeds.
4. If not found and `AutoAcceptUntrustedClientCertificates` is `true`, the certificate is automatically copied to `trusted/` and the connection proceeds.
5. If not found and `AutoAcceptUntrustedClientCertificates` is `false`, the certificate is copied to `rejected/` and the connection is refused.

The Admin UI `Certificates.razor` page uses `CertTrustService` (singleton reading `CertTrustOptions` for the Server's `PkiStoreRoot`) to promote rejected client certs to trusted without operators having to file-copy manually.

### Production hardening

- Set `AutoAcceptUntrustedClientCertificates = false`.
- Drop `None` from the profile set.
- Use the Admin UI to promote trusted client certs rather than the auto-accept fallback.
- Periodically audit the `rejected/` directory; an unexpected entry is often a misconfigured client or a probe attempt.

---

## OPC UA Authentication

The Server accepts three OPC UA identity-token types:

| Token | Handler | Notes |
|---|---|---|
| Anonymous | `IUserAuthenticator.AuthenticateAsync(username: "", password: "")` | Refused in strict mode unless explicit anonymous grants exist; allowed in lax mode for backward compatibility. |
| UserName/Password | `LdapUserAuthenticator` (`src/ZB.MOM.WW.OtOpcUa.Server/Security/LdapUserAuthenticator.cs`) | LDAP bind + group lookup; resolved `LdapGroups` flow into the session's identity bearer (`ILdapGroupsBearer`). |
| X.509 Certificate | Stack-level acceptance + role mapping via CN | X.509 identity carries `AuthenticatedUser` + read roles; finer-grain authorization happens through the data-plane ACLs. |

### LDAP bind flow (`LdapUserAuthenticator`)

`Program.cs` in the Server registers the authenticator based on `OpcUaServer:Ldap`:

```csharp
// Assumed type arguments: IUserAuthenticator and ILogger<LdapUserAuthenticator>.
builder.Services.AddSingleton<IUserAuthenticator>(sp => ldapOptions.Enabled
    ? new LdapUserAuthenticator(ldapOptions, sp.GetRequiredService<ILogger<LdapUserAuthenticator>>())
    : new DenyAllUserAuthenticator());
```

`LdapUserAuthenticator`:

1. Refuses to bind over plain-LDAP unless `AllowInsecureLdap = true` (dev/test only).
2. Connects to `Server:Port`, optionally upgrades to TLS (`UseTls = true`, port 636 for AD).
3. Binds as the service account; searches `SearchBase` for `UserNameAttribute = username`.
4. Rebinds as the resolved user DN with the supplied password (the actual credential check).
5. Reads `GroupAttribute` (default `memberOf`) and strips the leading `CN=` so operators configure friendly group names in `GroupToRole`.
6. Returns a `UserAuthResult` carrying the validated username + the set of LDAP groups. The set flows through to the session identity via `ILdapGroupsBearer.LdapGroups`.

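
Steps 5 and 6 can be pictured with a small sketch. This is not the `LdapUserAuthenticator` source: `LdapGroupSketch`, `FriendlyName`, and `ResolveRoles` are hypothetical names, and the sketch assumes that `memberOf` values are full DNs whose first component is the group's `CN` and that CN values contain no escaped commas.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the real authenticator code.
public static class LdapGroupSketch
{
    // Returns the value of the leading CN= component of a group DN,
    // e.g. "CN=OPCUA-Operators,OU=Groups,DC=corp" -> "OPCUA-Operators".
    // Assumption: no escaped commas inside the CN value.
    public static string FriendlyName(string groupDn)
    {
        var firstRdn = groupDn.Split(',')[0].Trim();
        return firstRdn.StartsWith("CN=", StringComparison.OrdinalIgnoreCase)
            ? firstRdn.Substring(3)
            : firstRdn;
    }

    // Maps raw memberOf values to roles through a GroupToRole-style dictionary.
    // Groups without a mapping contribute nothing.
    public static ISet<string> ResolveRoles(
        IEnumerable<string> memberOf,
        IReadOnlyDictionary<string, string> groupToRole)
    {
        var roles = new HashSet<string>(StringComparer.Ordinal);
        foreach (var dn in memberOf)
        {
            if (groupToRole.TryGetValue(FriendlyName(dn), out var role))
            {
                roles.Add(role);
            }
        }
        return roles;
    }
}
```

Whether group names match case-sensitively is decided by the comparer of the dictionary passed in; the sketch takes no position on what the real server does.
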
Configuration example (Active Directory production):

```json
{
  "OpcUaServer": {
    "Ldap": {
      "Enabled": true,
      "Server": "dc01.corp.example.com",
      "Port": 636,
      "UseTls": true,
      "AllowInsecureLdap": false,
      "SearchBase": "DC=corp,DC=example,DC=com",
      "ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com",
      "ServiceAccountPassword": "",
      "GroupAttribute": "memberOf",
      "UserNameAttribute": "sAMAccountName",
      "GroupToRole": {
        "OPCUA-Operators": "WriteOperate",
        "OPCUA-Engineers": "WriteConfigure",
        "OPCUA-Tuners": "WriteTune",
        "OPCUA-AlarmAck": "AlarmAck"
      }
    }
  }
}
```

`UserNameAttribute: "sAMAccountName"` is the critical AD override: the default `uid` is not populated on AD user entries. Use `userPrincipalName` instead if operators log in with `user@corp.example.com` form.

Nested group membership is not expanded; assign users directly to the role-mapped groups, or pre-flatten in AD.

The same options bind the Admin's `LdapAuthService` (cookie auth / login form) so operators authenticate with a single credential across both processes.

---

## Data-Plane Authorization

Data-plane authorization is the check run on every OPC UA operation against an OtOpcUa endpoint: *can this authenticated user Browse / Read / Subscribe / Write / HistoryRead / AckAlarm / Call on this specific node?*

Per decision #129 the model is **additive-only, with no explicit Deny**. Grants at each hierarchy level union; absence of a grant is the default-deny.

### Hierarchy

ACLs are evaluated against the UNS path:

```
ClusterId → Namespace → UnsArea → UnsLine → Equipment → Tag
```

Each level can carry `NodeAcl` rows (`src/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs`) that grant a permission bundle to a set of `LdapGroups`.

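
The additive-only rule can be illustrated with a minimal sketch. It is a linear scan over grant rows, not the trie-based evaluator, and every name in it (`SketchPermissions`, `SketchGrant`, `AdditiveAclSketch`) is hypothetical. It only shows how grants found on any prefix of a node's path are unioned and how the absence of a grant results in a denial.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, reduced flag set used only by this sketch.
[Flags]
public enum SketchPermissions : uint
{
    None = 0,
    Browse = 1 << 0,
    Read = 1 << 1,
    WriteOperate = 1 << 4,
}

// One grant row: a path prefix, an LDAP group, and the permissions granted there.
public sealed record SketchGrant(string[] Path, string Group, SketchPermissions Permissions);

public static class AdditiveAclSketch
{
    // Unions every grant that applies to one of the user's groups and sits on
    // a prefix of the node path, then checks the union covers the request.
    public static bool IsAllowed(
        IEnumerable<SketchGrant> grants,
        string[] nodePath,
        ISet<string> userGroups,
        SketchPermissions required)
    {
        var effective = SketchPermissions.None;
        foreach (var grant in grants)
        {
            if (userGroups.Contains(grant.Group) && IsPrefix(grant.Path, nodePath))
            {
                effective |= grant.Permissions;
            }
        }
        // No Deny rows exist: a missing bit simply means "not granted".
        return (effective & required) == required;
    }

    private static bool IsPrefix(string[] prefix, string[] path)
    {
        if (prefix.Length > path.Length) return false;
        for (var i = 0; i < prefix.Length; i++)
        {
            if (!string.Equals(prefix[i], path[i], StringComparison.Ordinal)) return false;
        }
        return true;
    }
}
```

With an area-level grant of `Browse | Read` and a tag-level grant of `WriteOperate`, a write to that tag is allowed because both grants contribute, while a write to a sibling tag is refused.
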
### Permission flags

```csharp
[Flags]
public enum NodePermissions : uint
{
    Browse           = 1 << 0,
    Read             = 1 << 1,
    Subscribe        = 1 << 2,
    HistoryRead      = 1 << 3,
    WriteOperate     = 1 << 4,
    WriteTune        = 1 << 5,
    WriteConfigure   = 1 << 6,
    AlarmRead        = 1 << 7,
    AlarmAcknowledge = 1 << 8,
    AlarmConfirm     = 1 << 9,
    AlarmShelve      = 1 << 10,
    MethodCall       = 1 << 11,

    ReadOnly = Browse | Read | Subscribe | HistoryRead | AlarmRead,
    Operator = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm,
    Engineer = Operator | WriteTune | AlarmShelve,
    Admin    = Engineer | WriteConfigure | MethodCall,
}
```

The three Write tiers map to Galaxy's v1 `SecurityClassification`: `FreeAccess`/`Operate` → `WriteOperate`, `Tune` → `WriteTune`, `Configure` → `WriteConfigure`. `SecuredWrite` / `VerifiedWrite` / `ViewOnly` classifications remain read-only from OPC UA regardless of grant.

### Evaluator -- `PermissionTrie`

`src/ZB.MOM.WW.OtOpcUa.Core/Authorization/`:

| Class | Role |
|---|---|
| `PermissionTrie` | Cluster-scoped trie; each node carries `(GroupId → NodePermissions)` grants. |
| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass. |
| `PermissionTrieCache` | Per-cluster memoised trie; invalidated via `AclChangeNotifier` when the Admin publishes a draft that touches ACLs. |
| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)`: walks from the root to the leaf for the supplied `NodeScope`, unions grants along the path, compares required permission to the union. |

`NodeScope` carries `(ClusterId, NamespaceId, AreaId, LineId, EquipmentId, TagId)`; any suffix may be null. A tag-level ACL is more specific than an area-level ACL but both contribute via union.

### Dispatch gate -- `AuthorizationGate`

`src/ZB.MOM.WW.OtOpcUa.Server/Security/AuthorizationGate.cs` bridges the OPC UA stack's `ISystemContext.UserIdentity` to the evaluator.
`DriverNodeManager` holds exactly one reference to it and calls `IsAllowed(identity, OpcUaOperation.*, NodeScope)` on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call path. A false return short-circuits the dispatch with `BadUserAccessDenied`.

Key properties:

- **Driver-agnostic.** No driver-level code participates in authorization decisions. Drivers report `SecurityClassification` as metadata on tag discovery; everything else flows through `AuthorizationGate`.
- **Fail-open-during-transition.** `StrictMode = false` (default during ACL rollouts) lets sessions without resolved LDAP groups proceed; flip `Authorization:StrictMode = true` in production once ACLs are populated.
- **Evaluator stays pure.** `TriePermissionEvaluator` has no OPC UA stack dependency, so it is tested directly from xUnit.

### Probe-this-permission (Admin UI)

`PermissionProbeService` (`src/ZB.MOM.WW.OtOpcUa.Admin/Services/PermissionProbeService.cs`) lets an operator ask "if a user with groups X, Y, Z asked to do operation O on node N, would it succeed?" The answer is rendered in the AclsTab "Probe" dialog. It uses the same evaluator and the same trie, so the Admin UI answer and the live Server answer cannot disagree.

### Full model

See [`docs/v2/acl-design.md`](v2/acl-design.md) for the complete design: trie invalidation, flag semantics, per-path override rules, and the reasoning behind additive-only (no Deny).

---

## Control-Plane Authorization

Control-plane authorization governs **the Admin UI**: who can view fleet config, edit drafts, publish generations, manage cluster nodes + credentials.

Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a ConfigEditor may not have any data-plane grants at all.

### Roles

`src/ZB.MOM.WW.OtOpcUa.Admin/Services/AdminRoles.cs`:

| Role | Capabilities |
|---|---|
| `ConfigViewer` | Read-only access to drafts, generations, audit log, fleet status. |
| `ConfigEditor` | ConfigViewer plus draft editing (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. |
| `FleetAdmin` | ConfigEditor plus publish, cluster/node CRUD, credential management, role-grant management. |

Policies registered in Admin `Program.cs`:

```csharp
builder.Services.AddAuthorizationBuilder()
    .AddPolicy("CanEdit", p => p.RequireRole(AdminRoles.ConfigEditor, AdminRoles.FleetAdmin))
    .AddPolicy("CanPublish", p => p.RequireRole(AdminRoles.FleetAdmin));
```

Razor pages and API endpoints gate with `[Authorize(Policy = "CanEdit")]` / `"CanPublish"`; nav-menu sections hide via ``.

### Role grant source

Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`), the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) cluster scope for multi-site fleets. The `RoleGrants.razor` page lets FleetAdmins edit these mappings without leaving the UI.

---

## OTOPCUA0001 Analyzer -- Compile-Time Guard

Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by `CapabilityInvoker` in `src/ZB.MOM.WW.OtOpcUa.Core/Resilience/`. A driver-capability call made **outside** the invoker bypasses resilience entirely, which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking.

`OTOPCUA0001` (Roslyn analyzer at `src/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires as a compile-time **warning** when an `async`/`Task`-returning method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync` / `AlarmSurfaceInvoker.*`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument; a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule.

Five xUnit-v3 + Shouldly tests at `tests/ZB.MOM.WW.OtOpcUa.Analyzers.Tests` cover the common fail/pass shapes + the sibling-pattern regression guard.

The rule is intentionally scoped to async surfaces: pure in-memory accessors like `IHostConnectivityProbe.GetHostStatuses()` return synchronously and do not require the invoker wrap.

---

## Audit Logging

- **Server**: Serilog `AUDIT:` prefix on every authentication success/failure, certificate validation result, write access denial. Written alongside the regular rolling file sink.
- **Admin**: `AuditLogService` writes `ConfigAuditLog` rows to the Config DB for every publish, rollback, cluster-node CRUD, credential rotation. Visible in the Audit page for operators with `ConfigViewer` or above.

---

## Troubleshooting

### Certificate trust failure

Check `{PkiStoreRoot}/rejected/` for the client's cert. Promote via Admin UI Certificates page, or copy the `.der` file manually to `trusted/`.

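
For the manual route, a minimal sketch of the file move is shown here. It assumes the flat `rejected/` and `trusted/` directories from the PKI layout in this document; `CertPromotionSketch` and `Promote` are hypothetical names, not part of `CertTrustService`, and the sketch says nothing about when the Server picks up the change.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not the Admin UI's CertTrustService.
public static class CertPromotionSketch
{
    // Moves one certificate file from {pkiStoreRoot}/rejected to
    // {pkiStoreRoot}/trusted and returns the new path.
    public static string Promote(string pkiStoreRoot, string fileName)
    {
        // Reject anything that is not a bare file name (no directory parts).
        if (Path.GetFileName(fileName) != fileName)
        {
            throw new ArgumentException("Expected a bare file name.", nameof(fileName));
        }

        var source = Path.Combine(pkiStoreRoot, "rejected", fileName);
        if (!File.Exists(source))
        {
            throw new FileNotFoundException("Certificate not found in rejected/.", source);
        }

        var trustedDir = Path.Combine(pkiStoreRoot, "trusted");
        Directory.CreateDirectory(trustedDir); // no-op when it already exists
        var target = Path.Combine(trustedDir, fileName);
        File.Move(source, target, overwrite: true);
        return target;
    }
}
```

Inspect the certificate's subject and thumbprint before promoting it, since an entry in `rejected/` may come from a probe rather than a legitimate client.
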
### LDAP users can connect but fail authorization

Verify (a) `OpcUaServer:Ldap:GroupAttribute` returns groups in the form `CN=MyGroup,…` (OtOpcUa strips the `CN=` for matching), (b) a `NodeAcl` grant exists at any level of the node's UNS path that unions to the required permission, (c) `Authorization:StrictMode` is correctly set for the deployment stage.

### LDAP bind rejected as "insecure"

Set `UseTls = true` + `Port = 636`, or temporarily flip `AllowInsecureLdap = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement.

### `AuthorizationGate` denies every call after a publish

`AclChangeNotifier` invalidates the `PermissionTrieCache` on publish; a stuck cache is usually a missed notification. Restart the Server as a quick mitigation and file a bug; the design is to stay fresh without restarts.
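
To see why a missed notification leaves stale answers in place, consider a minimal memoised cache with explicit invalidation. This is a sketch under assumptions, not the `PermissionTrieCache` source: `MemoisedPerClusterCache<T>` and its members are hypothetical, and the build delegate stands in for whatever constructs the trie.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical per-cluster memoised cache; not the real PermissionTrieCache.
public sealed class MemoisedPerClusterCache<T>
{
    private readonly ConcurrentDictionary<string, Lazy<T>> _entries = new();
    private readonly Func<string, T> _build;

    public MemoisedPerClusterCache(Func<string, T> build) => _build = build;

    // Builds the value on first use and returns the memoised value afterwards.
    public T Get(string clusterId) =>
        _entries.GetOrAdd(clusterId, id => new Lazy<T>(() => _build(id))).Value;

    // Drops the memoised value; the next Get rebuilds it from current data.
    public void Invalidate(string clusterId) => _entries.TryRemove(clusterId, out _);
}
```

Until `Invalidate` runs for a cluster, `Get` keeps returning the value built before the publish, which matches the symptom described above.
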