# Security

OtOpcUa has four independent security concerns. This document covers all four:

1. **Transport security**: OPC UA secure channel (signing, encryption, X.509 trust).
2. **OPC UA authentication**: Anonymous / UserName / X.509 session identities; UserName tokens are authenticated by LDAP bind.
3. **Data-plane authorization**: who can browse, read, subscribe, write, and acknowledge alarms on which nodes. Evaluated by `PermissionTrie` against the Config DB `NodeAcl` tree.
4. **Control-plane authorization**: who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`ConfigViewer` / `ConfigEditor` / `FleetAdmin`) claim from `LdapGroupRoleMapping`.

Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap `appsettings.json`. Data-plane ACLs and Admin role grants live in the Config DB.

---
## Transport Security

### Overview

The OtOpcUa Server supports configurable OPC UA transport security profiles that control how data is protected on the wire between OPC UA clients and the server.

There are two distinct layers of security in OPC UA:

- **Transport security**: secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUaServer:SecurityProfile` setting controls.
- **UserName token encryption**: protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode, so UserName authentication also works on `None` endpoints: the credentials themselves are always encrypted. A secure transport profile adds protection against tampering with, and eavesdropping on, the data payloads.
### Supported security profiles

The server supports seven transport security profiles:

| Profile Name | Security Policy | Message Security Mode | Description |
|---|---|---|---|
| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. |
| `Basic256Sha256-Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering, but data is visible on the wire. |
| `Basic256Sha256-SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. |
| `Aes128_Sha256_RsaOaep-Sign` | Aes128_Sha256_RsaOaep | Sign | Modern policy (AES-128, SHA-256). Messages are signed but not encrypted. |
| `Aes128_Sha256_RsaOaep-SignAndEncrypt` | Aes128_Sha256_RsaOaep | SignAndEncrypt | Modern policy with AES-128 encryption. Recommended for production. |
| `Aes256_Sha256_RsaPss-Sign` | Aes256_Sha256_RsaPss | Sign | Strongest policy (AES-256, RSA-PSS signatures). Messages are signed but not encrypted. |
| `Aes256_Sha256_RsaPss-SignAndEncrypt` | Aes256_Sha256_RsaPss | SignAndEncrypt | Strongest profile. Recommended for high-security deployments. |

The server exposes a separate endpoint for each configured profile, and clients select the one they prefer when connecting.
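Each profile name in the table above is a security policy and a message security mode joined by a hyphen. As a minimal sketch of that naming convention (the `SecurityProfileName` helper is hypothetical and not part of OtOpcUa; how the Server actually parses the setting is not shown in this document):

```csharp
using System;

// Hypothetical helper: splits a profile name from the table above into
// its security policy and message security mode.
public static class SecurityProfileName
{
    public static (string Policy, string Mode) Parse(string profile)
    {
        if (profile == "None")
            return ("None", "None");

        // Policy names contain underscores but no hyphen, so the last
        // hyphen separates the policy from the mode.
        int split = profile.LastIndexOf('-');
        if (split <= 0 || split == profile.Length - 1)
            throw new FormatException($"Unrecognised security profile '{profile}'.");

        string policy = profile.Substring(0, split);
        string mode = profile.Substring(split + 1);
        if (mode != "Sign" && mode != "SignAndEncrypt")
            throw new FormatException($"Unrecognised message security mode '{mode}'.");

        return (policy, mode);
    }
}
```

For example, `SecurityProfileName.Parse("Aes256_Sha256_RsaPss-SignAndEncrypt")` yields the policy `Aes256_Sha256_RsaPss` and the mode `SignAndEncrypt`.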
### Configuration

Transport security is configured in the `OpcUaServer` section of the Server process's bootstrap `appsettings.json`:

```json
{
  "OpcUaServer": {
    "EndpointUrl": "opc.tcp://0.0.0.0:4840/OtOpcUa",
    "ApplicationName": "OtOpcUa Server",
    "ApplicationUri": "urn:node-a:OtOpcUa",
    "PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki",
    "AutoAcceptUntrustedClientCertificates": false,
    "SecurityProfile": "Basic256Sha256-SignAndEncrypt"
  }
}
```

The server certificate is auto-generated on first start if none exists in `PkiStoreRoot/own/`. It is always generated, even for `None`-only deployments, because UserName token encryption depends on it.
### PKI directory layout

```
{PkiStoreRoot}/
  own/        Server's own application certificate and private key
  issuer/     CA certificates that issued trusted client certificates
  trusted/    Explicitly trusted client (peer) certificates
  rejected/   Certificates that were presented but not trusted
```
### Certificate trust flow

When a client connects using a secure profile (`Sign` or `SignAndEncrypt`), the following trust evaluation occurs:

1. The client presents its application certificate during the secure channel handshake.
2. The server checks whether the certificate exists in the `trusted/` store.
3. If found, the connection proceeds.
4. If not found and `AutoAcceptUntrustedClientCertificates` is `true`, the certificate is automatically copied to `trusted/` and the connection proceeds.
5. If not found and `AutoAcceptUntrustedClientCertificates` is `false`, the certificate is copied to `rejected/` and the connection is refused.

The Admin UI `Certificates.razor` page uses `CertTrustService` (a singleton that reads `CertTrustOptions` for the Server's `PkiStoreRoot`) to promote rejected client certs to trusted, so operators do not have to copy files manually.
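The five steps above reduce to a small decision function. The following is only a sketch of that logic: in the running Server the check is performed by the OPC UA stack against the PKI directories, and the `TrustDecision` enum, the `CertificateTrustFlow` class, and the in-memory thumbprint set are assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical outcome type for the sketch.
public enum TrustDecision { Accept, AcceptAndTrust, Reject }

public static class CertificateTrustFlow
{
    // trustedThumbprints stands in for the contents of the trusted/ store.
    public static TrustDecision Evaluate(
        string clientThumbprint,
        ISet<string> trustedThumbprints,
        bool autoAcceptUntrusted)
    {
        // Steps 2-3: a certificate already in trusted/ is accepted.
        if (trustedThumbprints.Contains(clientThumbprint))
            return TrustDecision.Accept;

        // Step 4: auto-accept adds the certificate to trusted/ and proceeds.
        if (autoAcceptUntrusted)
        {
            trustedThumbprints.Add(clientThumbprint);
            return TrustDecision.AcceptAndTrust;
        }

        // Step 5: the certificate goes to rejected/ and the connection is refused.
        return TrustDecision.Reject;
    }
}
```

Note that with auto-accept enabled, the first connection from any client permanently adds it to the trusted set, which is why the setting is only appropriate outside production.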
### Production hardening

- Set `AutoAcceptUntrustedClientCertificates = false`.
- Drop `None` from the profile set.
- Use the Admin UI to promote trusted client certs rather than the auto-accept fallback.
- Periodically audit the `rejected/` directory; an unexpected entry is often a misconfigured client or a probe attempt.

---
## OPC UA Authentication

The Server accepts three OPC UA identity-token types:

| Token | Handler | Notes |
|---|---|---|
| Anonymous | `IUserAuthenticator.AuthenticateAsync(username: "", password: "")` | Refused in strict mode unless explicit anonymous grants exist; allowed in lax mode for backward compatibility. |
| UserName/Password | `LdapUserAuthenticator` (`src/Server/ZB.MOM.WW.OtOpcUa.Server/Security/LdapUserAuthenticator.cs`) | LDAP bind + group lookup; resolved `LdapGroups` flow into the session's identity bearer (`ILdapGroupsBearer`). |
| X.509 Certificate | Stack-level acceptance + role mapping via CN | X.509 identity carries `AuthenticatedUser` + read roles; finer-grain authorization happens through the data-plane ACLs. |
### LDAP bind flow (`LdapUserAuthenticator`)

`Program.cs` in the Server registers the authenticator based on `OpcUaServer:Ldap`:

```csharp
builder.Services.AddSingleton<IUserAuthenticator>(sp => ldapOptions.Enabled
    ? new LdapUserAuthenticator(ldapOptions, sp.GetRequiredService<ILogger<LdapUserAuthenticator>>())
    : new DenyAllUserAuthenticator());
```

`LdapUserAuthenticator`:

1. Refuses to bind over plain LDAP unless `AllowInsecureLdap = true` (dev/test only).
2. Connects to `Server:Port`, optionally upgrades to TLS (`UseTls = true`, port 636 for AD).
3. Binds as the service account; searches `SearchBase` for `UserNameAttribute = username`.
4. Rebinds as the resolved user DN with the supplied password (the actual credential check).
5. Reads `GroupAttribute` (default `memberOf`) and strips the leading `CN=` so operators configure friendly group names in `GroupToRole`.
6. Returns a `UserAuthResult` carrying the validated username + the set of LDAP groups. The set flows through to the session identity via `ILdapGroupsBearer.LdapGroups`.
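Step 5 above turns a `memberOf` value into a friendly group name. A minimal sketch of one way to do that follows; the `LdapGroupName` helper is hypothetical, and the choice to cut at the first comma (and to ignore escaped commas inside the CN) is an assumption, since the document only states that the leading `CN=` is stripped.

```csharp
using System;

// Hypothetical helper: extracts the leading CN value from a distinguished name.
public static class LdapGroupName
{
    public static string FromDistinguishedName(string dn)
    {
        const string prefix = "CN=";
        if (!dn.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            return dn; // not a CN-led DN; pass it through unchanged

        // Assumption: the CN value ends at the first comma (escaped commas
        // such as "\," are not handled in this sketch).
        int end = dn.IndexOf(',', prefix.Length);
        return end < 0
            ? dn.Substring(prefix.Length)
            : dn.Substring(prefix.Length, end - prefix.Length);
    }
}

// "CN=OPCUA-Operators,OU=Groups,DC=corp,DC=example,DC=com" becomes
// "OPCUA-Operators", which can then be looked up in GroupToRole.
```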

Configuration example (Active Directory production):

```json
{
  "OpcUaServer": {
    "Ldap": {
      "Enabled": true,
      "Server": "dc01.corp.example.com",
      "Port": 636,
      "UseTls": true,
      "AllowInsecureLdap": false,
      "SearchBase": "DC=corp,DC=example,DC=com",
      "ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com",
      "ServiceAccountPassword": "<from your secret store>",
      "GroupAttribute": "memberOf",
      "UserNameAttribute": "sAMAccountName",
      "GroupToRole": {
        "OPCUA-Operators": "WriteOperate",
        "OPCUA-Engineers": "WriteConfigure",
        "OPCUA-Tuners": "WriteTune",
        "OPCUA-AlarmAck": "AlarmAck"
      }
    }
  }
}
```

`UserNameAttribute: "sAMAccountName"` is the critical AD override: the default `uid` is not populated on AD user entries. Use `userPrincipalName` instead if operators log in with the `user@corp.example.com` form. Nested group membership is not expanded, so assign users directly to the role-mapped groups or pre-flatten in AD.

The same options bind the Admin's `LdapAuthService` (cookie auth / login form) so operators authenticate with a single credential across both processes.

---
## Data-Plane Authorization

Data-plane authorization is the check run on every OPC UA operation against an OtOpcUa endpoint: *can this authenticated user Browse / Read / Subscribe / Write / HistoryRead / AckAlarm / Call on this specific node?*

Per decision #129 the model is **additive-only, with no explicit Deny**. Grants at each hierarchy level are unioned; the absence of a grant means deny by default.

### Hierarchy

ACLs are evaluated against the UNS path:

```
ClusterId → Namespace → UnsArea → UnsLine → Equipment → Tag
```

Each level can carry `NodeAcl` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs`) that grant a permission bundle to a set of `LdapGroups`.
### Permission flags

```csharp
[Flags]
public enum NodePermissions : uint
{
    Browse = 1 << 0,
    Read = 1 << 1,
    Subscribe = 1 << 2,
    HistoryRead = 1 << 3,
    WriteOperate = 1 << 4,
    WriteTune = 1 << 5,
    WriteConfigure = 1 << 6,
    AlarmRead = 1 << 7,
    AlarmAcknowledge = 1 << 8,
    AlarmConfirm = 1 << 9,
    AlarmShelve = 1 << 10,
    MethodCall = 1 << 11,

    ReadOnly = Browse | Read | Subscribe | HistoryRead | AlarmRead,
    Operator = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm,
    Engineer = Operator | WriteTune | AlarmShelve,
    Admin = Engineer | WriteConfigure | MethodCall,
}
```

The three Write tiers map to Galaxy's v1 `SecurityClassification`: `FreeAccess`/`Operate` → `WriteOperate`, `Tune` → `WriteTune`, `Configure` → `WriteConfigure`. `SecuredWrite` / `VerifiedWrite` / `ViewOnly` classifications remain read-only from OPC UA regardless of grant.
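Because the bundles are plain bitwise unions, checking a required permission against a granted set is a single mask comparison. The sketch below repeats the enum so that it compiles on its own; the `PermissionCheck` helper is hypothetical and only illustrates the flag arithmetic, not the Server's evaluator.

```csharp
using System;

[Flags]
public enum NodePermissions : uint
{
    Browse = 1 << 0, Read = 1 << 1, Subscribe = 1 << 2, HistoryRead = 1 << 3,
    WriteOperate = 1 << 4, WriteTune = 1 << 5, WriteConfigure = 1 << 6,
    AlarmRead = 1 << 7, AlarmAcknowledge = 1 << 8, AlarmConfirm = 1 << 9,
    AlarmShelve = 1 << 10, MethodCall = 1 << 11,

    ReadOnly = Browse | Read | Subscribe | HistoryRead | AlarmRead,
    Operator = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm,
    Engineer = Operator | WriteTune | AlarmShelve,
    Admin = Engineer | WriteConfigure | MethodCall,
}

// Hypothetical helper for illustration.
public static class PermissionCheck
{
    // True when every bit in `required` is present in `granted`.
    public static bool Allows(NodePermissions granted, NodePermissions required) =>
        (granted & required) == required;
}
```

With these definitions, `PermissionCheck.Allows(NodePermissions.Operator, NodePermissions.WriteOperate)` is true, while `PermissionCheck.Allows(NodePermissions.Operator, NodePermissions.WriteTune)` is false because `WriteTune` first appears in the `Engineer` bundle.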
### Evaluator: `PermissionTrie`

`src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/`:

| Class | Role |
|---|---|
| `PermissionTrie` | Cluster-scoped trie; each node carries `(GroupId → NodePermissions)` grants. |
| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass. |
| `PermissionTrieCache` | Per-cluster memoised trie; invalidated via `AclChangeNotifier` when the Admin publishes a draft that touches ACLs. |
| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)`: walks from the root to the leaf for the supplied `NodeScope`, unions grants along the path, and compares the required permission to the union. |

`NodeScope` carries `(ClusterId, NamespaceId, AreaId, LineId, EquipmentId, TagId)`; any suffix may be null. A tag-level ACL is more specific than an area-level ACL, but both contribute via union.
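The root-to-leaf walk with unioned grants can be sketched as follows. This is not the `PermissionTrie` implementation: the `AclTrieNode` and `AclTrieWalk` types, the string path segments, and the abbreviated `Perm` enum are all assumptions chosen to keep the example short and self-contained.

```csharp
using System;
using System.Collections.Generic;

// Abbreviated stand-in for NodePermissions.
[Flags]
public enum Perm : uint { None = 0, Browse = 1, Read = 2, WriteOperate = 16 }

// Hypothetical trie node: grants per LDAP group plus child levels.
public sealed class AclTrieNode
{
    public Dictionary<string, Perm> Grants { get; } = new Dictionary<string, Perm>();
    public Dictionary<string, AclTrieNode> Children { get; } = new Dictionary<string, AclTrieNode>();
}

public static class AclTrieWalk
{
    // Unions every grant held by any of the user's groups on the way down.
    public static Perm EffectivePermissions(
        AclTrieNode root, IReadOnlyList<string> path, IReadOnlyCollection<string> groups)
    {
        Perm effective = GrantsAt(root, groups);
        AclTrieNode node = root;
        foreach (string segment in path)
        {
            if (!node.Children.TryGetValue(segment, out var child))
                break; // no ACL rows exist below this level
            node = child;
            effective |= GrantsAt(node, groups);
        }
        return effective;
    }

    private static Perm GrantsAt(AclTrieNode node, IReadOnlyCollection<string> groups)
    {
        Perm result = Perm.None;
        foreach (string group in groups)
            if (node.Grants.TryGetValue(group, out Perm granted))
                result |= granted;
        return result;
    }
}
```

Because the model has no Deny, the walk never needs to subtract anything: a grant found at any level stays in the result, and a path with no matching rows simply returns `Perm.None`.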
### Dispatch gate: `AuthorizationGate`

`src/Server/ZB.MOM.WW.OtOpcUa.Server/Security/AuthorizationGate.cs` bridges the OPC UA stack's `ISystemContext.UserIdentity` to the evaluator. `DriverNodeManager` holds exactly one reference to it and calls `IsAllowed(identity, OpcUaOperation.*, NodeScope)` on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, and Call path. A false return short-circuits the dispatch with `BadUserAccessDenied`.

Key properties:

- **Driver-agnostic.** No driver-level code participates in authorization decisions. Drivers report `SecurityClassification` as metadata on tag discovery; everything else flows through `AuthorizationGate`.
- **Fail-open-during-transition.** `StrictMode = false` (the default during ACL rollouts) lets sessions without resolved LDAP groups proceed; flip `Authorization:StrictMode = true` in production once ACLs are populated.
- **Evaluator stays pure.** `TriePermissionEvaluator` has no OPC UA stack dependency, so it is tested directly from xUnit.
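The fail-open behaviour can be illustrated with a small sketch. It assumes that strict mode denies a session with no resolved LDAP groups and that lax mode lets it through without consulting the evaluator; the document states only the lax half explicitly, and the `GateSketch` type and its delegate parameter are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the strict/lax decision in front of the evaluator.
public static class GateSketch
{
    public static bool IsAllowed(
        IReadOnlyCollection<string> ldapGroups,
        bool strictMode,
        Func<IReadOnlyCollection<string>, bool> evaluate)
    {
        // Sessions without resolved groups: lax mode fails open,
        // strict mode (assumed here) fails closed.
        if (ldapGroups.Count == 0)
            return !strictMode;

        // Otherwise the evaluator's answer is final.
        return evaluate(ldapGroups);
    }
}
```

A `false` result here corresponds to the dispatch being short-circuited with `BadUserAccessDenied`.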
### Probe-this-permission (Admin UI)

`PermissionProbeService` (`src/Server/ZB.MOM.WW.OtOpcUa.Admin/Services/PermissionProbeService.cs`) lets an operator ask "if a user with groups X, Y, Z asked to do operation O on node N, would it succeed?" The answer is rendered in the AclsTab "Probe" dialog. It uses the same evaluator and the same trie, so the Admin UI answer and the live Server answer cannot disagree.

### Full model

See [`docs/v2/acl-design.md`](v2/acl-design.md) for the complete design: trie invalidation, flag semantics, per-path override rules, and the reasoning behind additive-only (no Deny).

---
## Control-Plane Authorization

Control-plane authorization governs **the Admin UI**: who can view fleet config, edit drafts, publish generations, and manage cluster nodes and credentials.

Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely, a ConfigEditor may not have any data-plane grants at all.

### Roles

`src/Server/ZB.MOM.WW.OtOpcUa.Admin/Services/AdminRoles.cs`:

| Role | Capabilities |
|---|---|
| `ConfigViewer` | Read-only access to drafts, generations, audit log, fleet status. |
| `ConfigEditor` | ConfigViewer plus draft editing (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. |
| `FleetAdmin` | ConfigEditor plus publish, cluster/node CRUD, credential management, role-grant management. |

Policies registered in Admin `Program.cs`:

```csharp
builder.Services.AddAuthorizationBuilder()
    .AddPolicy("CanEdit", p => p.RequireRole(AdminRoles.ConfigEditor, AdminRoles.FleetAdmin))
    .AddPolicy("CanPublish", p => p.RequireRole(AdminRoles.FleetAdmin));
```

Razor pages and API endpoints gate with `[Authorize(Policy = "CanEdit")]` / `"CanPublish"`; nav-menu sections hide via `<AuthorizeView>`.
### Role grant source

Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`). This is the same pattern as the data-plane `NodeAcl`, but scoped to Admin roles plus (optionally) cluster scope for multi-site fleets. The `RoleGrants.razor` page lets FleetAdmins edit these mappings without leaving the UI.

---
## OTOPCUA0001 Analyzer: Compile-Time Guard

Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by `CapabilityInvoker` in `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/`. A driver-capability call made **outside** the invoker bypasses resilience entirely, which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking.

`OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires as a compile-time **warning** when an `async`/`Task`-returning method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync` / `AlarmSurfaceInvoker.*`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies that the call lives transitively inside that invocation's anonymous-function argument. A sibling pattern (make the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule.
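The difference between a flagged and an accepted call shape can be sketched as below. The types here are stand-ins written so the example compiles on its own: the real `IReadable` and `CapabilityInvoker` have members and signatures that this document does not show, so `ReadAsync`, the `ExecuteAsync` signature, and `ConstantReadable` are all assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for a guarded capability interface (assumed member).
public interface IReadable
{
    Task<double> ReadAsync(string tag, CancellationToken ct);
}

// Stand-in invoker with an assumed signature; the real one applies
// retry, timeout, circuit-breaker, and bulkhead policies around the call.
public sealed class CapabilityInvoker
{
    public Task<T> ExecuteAsync<T>(Func<CancellationToken, Task<T>> call, CancellationToken ct)
        => call(ct);
}

// Trivial driver used only to exercise the sketch.
public sealed class ConstantReadable : IReadable
{
    public Task<double> ReadAsync(string tag, CancellationToken ct) => Task.FromResult(42.0);
}

public static class CallShapes
{
    // Shape the analyzer is described as flagging: the capability call
    // is made directly, outside any invoker lambda.
    public static Task<double> Unwrapped(IReadable driver, CancellationToken ct)
        => driver.ReadAsync("Line1.Speed", ct);

    // Shape the analyzer is described as accepting: the capability call
    // lives inside the lambda passed to the invoker.
    public static Task<double> Wrapped(CapabilityInvoker invoker, IReadable driver, CancellationToken ct)
        => invoker.ExecuteAsync(token => driver.ReadAsync("Line1.Speed", token), ct);
}
```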

Five xUnit-v3 + Shouldly tests at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests` cover the common fail/pass shapes and the sibling-pattern regression guard.

The rule is intentionally scoped to async surfaces. Pure in-memory accessors like `IHostConnectivityProbe.GetHostStatuses()` return synchronously and do not require the invoker wrap.

---
## Audit Logging

- **Server**: Serilog `AUDIT:` prefix on every authentication success/failure, certificate validation result, and write access denial. Written alongside the regular rolling file sink.
- **Admin**: `AuditLogService` writes `ConfigAuditLog` rows to the Config DB for every publish, rollback, cluster-node CRUD, and credential rotation. Visible in the Audit page for operators with `ConfigViewer` or above.

---
## Troubleshooting

### Certificate trust failure

Check `{PkiStoreRoot}/rejected/` for the client's cert. Promote it via the Admin UI Certificates page, or copy the `.der` file manually to `trusted/`.

### LDAP users can connect but fail authorization

Verify that (a) `OpcUaServer:Ldap:GroupAttribute` returns groups in the form `CN=MyGroup,…` (OtOpcUa strips the `CN=` for matching), (b) a `NodeAcl` grant exists at some level of the node's UNS path so that the union includes the required permission, and (c) `Authorization:StrictMode` is set correctly for the deployment stage.

### LDAP bind rejected as "insecure"

Set `UseTls = true` and `Port = 636`, or temporarily flip `AllowInsecureLdap = true` in dev. Production Active Directory increasingly refuses plain-LDAP binds under LDAP-signing enforcement.

### `AuthorizationGate` denies every call after a publish

`AclChangeNotifier` invalidates the `PermissionTrieCache` on publish; a stuck cache is usually a missed notification. Restart the Server as a quick mitigation and file a bug, because the design is to stay fresh without restarts.