ad7f9e731f
v2-ci / build (push) Failing after 50s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
A thin gateway over the admin-operations cluster singleton so CI/scripts can trigger a deployment without the Blazor button. Forwards to the same IAdminOperationsClient. StartDeploymentAsync; mounted on admin-role nodes. Auth is a fixed-time X-Api-Key check against Security:DeployApiKey (orthogonal to the cookie-only web auth); AllowAnonymous so the auth fallback doesn't 401 it, self-disabling (503) until the key is set. Outcome->status: 202/200/409/422. Unit tests for the key check + outcome mapping; HTTP E2E (real auth + real deploy via the 2-node harness). Documented in docs/security.md.
324 lines
26 KiB
Markdown
324 lines
26 KiB
Markdown
# Security
|
|
|
|
> **v2 status (2026-05-26).** The four security concerns below are unchanged in v2.
|
|
> Paths + project names moved: `OtOpcUa.Server/Security/` → `OtOpcUa.Security/`
|
|
> (`Ldap/`, `Jwt/`, `Endpoints/AuthEndpoints.cs`), `OtOpcUa.Admin` is gone (its
|
|
> auth + role-grant pages live in `OtOpcUa.AdminUI`), and Admin auth policies
|
|
> register from `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth`
|
|
> (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`) rather
|
|
> than in a separate Admin process. The Admin UI uses a **single Cookie
|
|
> authentication scheme** — there is no `AddJwtBearer` pipeline. The
|
|
> `Security:Jwt` section configures `JwtTokenService`, which mints a JWT at the
|
|
> `/auth/token` endpoint for **external** consumers (OPC UA clients / automation
|
|
> scripts); the cookie itself stores the `ClaimsPrincipal` directly. DataProtection
|
|
> keys persist to the shared Config DB (`PersistKeysToDbContext<OtOpcUaConfigDbContext>`)
|
|
> so cookies survive failover between admin-role nodes.
|
|
>
|
|
> See `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §5 for the v2
|
|
> auth + DataProtection rationale.
|
|
|
|
OtOpcUa has four independent security concerns. This document covers all four:
|
|
|
|
1. **Transport security** — OPC UA secure channel (signing, encryption, X.509 trust).
|
|
2. **OPC UA authentication** — Anonymous / UserName / X.509 session identities; UserName tokens authenticated by LDAP bind.
|
|
3. **Data-plane authorization** — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by `TriePermissionEvaluator` over a `PermissionTrie` built from the Config DB `NodeAcl` tree.
|
|
4. **Control-plane authorization** — who can view or edit fleet configuration in the Admin UI. Gated by the `AdminRole` (`Viewer` / `Designer` / `Administrator`) claim resolved from `LdapGroupRoleMapping`.
|
|
|
|
Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap `appsettings.json`. Data-plane ACLs and Admin role grants live in the Config DB.
|
|
|
|
---
|
|
|
|
## Transport Security
|
|
|
|
### Overview
|
|
|
|
The OtOpcUa Server supports configurable OPC UA transport security profiles that control how data is protected on the wire between OPC UA clients and the server.
|
|
|
|
There are two distinct layers of security in OPC UA:
|
|
|
|
- **Transport security** -- secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the `OpcUa:EnabledSecurityProfiles` setting controls.
|
|
- **UserName token encryption** -- protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode. UserName authentication therefore works on `None` endpoints too — the credentials themselves are always encrypted. A secure transport profile adds protection against message-level tampering and eavesdropping of data payloads.
|
|
|
|
### Supported security profiles
|
|
|
|
The profiles are the members of the `OpcUaSecurityProfile` enum (`src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs`). The server ships **three** baseline profiles; the config value is the bare enum-member name (no hyphens, no underscores):
|
|
|
|
| Enum member | Security Policy | Message Security Mode | Description |
|
|
|---------------------------------|------------------|-----------------------|--------------------------------------------------|
|
|
| `None` | None | None | No signing or encryption. Suitable for development and isolated networks only. |
|
|
| `Basic256Sha256Sign` | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. |
|
|
| `Basic256Sha256SignAndEncrypt` | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. |
|
|
|
|
`BuildSecurityPolicies` (`OpcUaApplicationHost.cs`) maps each configured profile to an SDK `ServerSecurityPolicy`. The server exposes a separate endpoint per configured profile and clients select the one they prefer at session open. The enum's XML doc notes that Aes128/Aes256 variants can be added later by extending the enum + `BuildSecurityPolicies` — the wiring is profile-agnostic — but they are **not implemented today**. There is no `SecurityProfileResolver` class.
|
|
|
|
> **Config value form.** The enum binds by member name, so a profile string with hyphens (e.g. `Basic256Sha256-Sign`) does **not** bind — use the exact enum-member spelling above. If `EnabledSecurityProfiles` is empty, the server falls back to a single `None` endpoint (logged, very visible) so it still has a listening endpoint.
|
|
|
|
### Configuration
|
|
|
|
Transport security is configured in the `OpcUa` section of the Host process's bootstrap `appsettings.json` (bound to `OpcUaApplicationHostOptions`):
|
|
|
|
```json
|
|
{
|
|
"OpcUa": {
|
|
"ApplicationName": "OtOpcUa",
|
|
"ApplicationUri": "urn:node-a:OtOpcUa",
|
|
"PublicHostname": "0.0.0.0",
|
|
"OpcUaPort": 4840,
|
|
"PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki",
|
|
"AutoAcceptUntrustedClientCertificates": false,
|
|
"EnabledSecurityProfiles": [ "Basic256Sha256Sign", "Basic256Sha256SignAndEncrypt" ]
|
|
}
|
|
}
|
|
```
|
|
|
|
`EnabledSecurityProfiles` is a **list** — the server publishes one endpoint per entry. The default (when the key is omitted) is all three baseline profiles (`None`, `Basic256Sha256Sign`, `Basic256Sha256SignAndEncrypt`); production deployments typically drop `None`. The list must contain at least one entry (`OpcUaApplicationHostOptionsValidator` enforces `MinCount(…, 1)`).
|
|
|
|
The server certificate is auto-generated on first start if none exists in `PkiStoreRoot/own/`. Always generated even for `None`-only deployments because UserName token encryption depends on it.
|
|
|
|
### PKI directory layout
|
|
|
|
```
|
|
{PkiStoreRoot}/
|
|
own/ Server's own application certificate and private key
|
|
issuer/ CA certificates that issued trusted client certificates
|
|
trusted/ Explicitly trusted client (peer) certificates
|
|
rejected/ Certificates that were presented but not trusted
|
|
```
|
|
|
|
### Certificate trust flow
|
|
|
|
When a client connects using a secure profile (`Sign` or `SignAndEncrypt`), the following trust evaluation occurs:
|
|
|
|
1. The client presents its application certificate during the secure channel handshake.
|
|
2. The server checks whether the certificate exists in the `trusted/` store.
|
|
3. If found, the connection proceeds.
|
|
4. If not found and `AutoAcceptUntrustedClientCertificates` is `true`, the certificate is automatically copied to `trusted/` and the connection proceeds.
|
|
5. If not found and `AutoAcceptUntrustedClientCertificates` is `false`, the certificate is copied to `rejected/` and the connection is refused.
|
|
|
|
The Admin UI `Certificates.razor` page (`src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor`) lists the contents of each PKI sub-store (own / trusted / issuer / rejected) by reading the `OpcUa:PkiStoreRoot` path from configuration. It is currently a **read-only viewer** — promoting a rejected cert to trusted is still a file move (copy the `.der` from `rejected/` to `trusted/certs/`); the SDK trust list reloads on the next handshake.
|
|
|
|
### Production hardening
|
|
|
|
- Set `AutoAcceptUntrustedClientCertificates = false`.
|
|
- Drop `None` from `EnabledSecurityProfiles`.
|
|
- Promote trusted client certs by moving the `.der` from `rejected/` to `trusted/certs/` rather than relying on the auto-accept fallback. (The Admin UI Certificates page shows what is in each store.)
|
|
- Periodically audit the `rejected/` directory; an unexpected entry is often a misconfigured client or a probe attempt.
|
|
|
|
---
|
|
|
|
## OPC UA Authentication
|
|
|
|
The Server accepts three OPC UA identity-token types:
|
|
|
|
| Token | Handler | Notes |
|
|
|---|---|---|
|
|
| Anonymous | No `IOpcUaUserAuthenticator` call — the SDK admits anonymous sessions at the channel. | Data-plane authorization (below) still default-denies any node a session has no ACL grant for. |
|
|
| UserName/Password | `LdapOpcUaUserAuthenticator.AuthenticateUserNameAsync` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs`, implements `IOpcUaUserAuthenticator`), backed by the app `ILdapAuthService` — `OtOpcUaLdapAuthService` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/OtOpcUaLdapAuthService.cs`). | LDAP bind + group lookup. The returned LDAP groups are mapped to roles via `IGroupRoleMapper<string>` (`OtOpcUaGroupRoleMapper`) and attached to the OPC UA session identity for the downstream ACL evaluator. |
|
|
| X.509 Certificate | Stack-level acceptance during the secure-channel handshake. | The certificate must be trusted (see PKI trust flow); finer-grain authorization happens through the data-plane ACLs. |
|
|
|
|
When no authenticator is supplied, `OpcUaApplicationHost` falls back to `NullOpcUaUserAuthenticator`; the Host wires the real `LdapOpcUaUserAuthenticator` as a singleton in `Program.cs`.
|
|
|
|
### LDAP bind flow (`OtOpcUaLdapAuthService`)
|
|
|
|
LDAP is configured under the `Security:Ldap` section (bound to `LdapOptions`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapOptions.cs`, `SectionName = "Security:Ldap"`). The app authenticator is `OtOpcUaLdapAuthService` — a thin wrapper around the shared `ZB.MOM.WW.Auth.Ldap` directory client that adds two app-only concerns the shared library deliberately does not model: the `Enabled` master switch and `DevStubMode`. The same `ILdapAuthService` instance serves **both** the Admin UI cookie login (`/auth/login`) and the OPC UA UserName path (via `LdapOpcUaUserAuthenticator`), so operators use one credential across both planes.
|
|
|
|
`OtOpcUaLdapAuthService.AuthenticateAsync`:
|
|
|
|
1. If `Enabled = false`, denies outright — no bind, no DevStub bypass (the master switch wins).
|
|
2. If `DevStubMode = true`, accepts any non-empty credentials and grants the `Administrator` role **without any network bind** (dev only — must be `false` in production).
|
|
3. Refuses to bind over a plaintext transport (`Transport = None`) unless `AllowInsecure = true` (dev/test only). This is enforced at login, not at startup.
|
|
4. Delegates the real path to the shared `ZB.MOM.WW.Auth.Ldap` client: it binds (search-then-bind via `ServiceAccountDn`, or direct-bind `cn={user},{SearchBase}` when no service account is set), verifies the password, and reads the user's group memberships.
|
|
5. Returns an `LdapAuthResult` carrying the validated username + the **groups** (never roles). Failure codes are folded into opaque user-facing error strings so a probe cannot distinguish "unknown user" from "wrong password".
|
|
|
|
**Group → role mapping happens downstream**, not in the auth service: `LdapOpcUaUserAuthenticator` resolves `IGroupRoleMapper<string>` (`OtOpcUaGroupRoleMapper`) per call and unions its output with any pre-resolved roles (the DevStub `Administrator` grant). The roles are attached to the OPC UA session identity for the ACL evaluator. A mapper fault (e.g. a Config DB outage) falls back to the pre-resolved baseline rather than denying an otherwise-authenticated session.
|
|
|
|
`Transport` replaces the former `UseTls` bool: `Ldaps` (implicit TLS), `StartTls` (upgrade), or `None` (plaintext, requires `AllowInsecure`). Configuration example (Active Directory production):
|
|
|
|
```json
|
|
{
|
|
"Security": {
|
|
"Ldap": {
|
|
"Enabled": true,
|
|
"DevStubMode": false,
|
|
"Server": "dc01.corp.example.com",
|
|
"Port": 636,
|
|
"Transport": "Ldaps",
|
|
"AllowInsecure": false,
|
|
"SearchBase": "DC=corp,DC=example,DC=com",
|
|
"ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com",
|
|
"ServiceAccountPassword": "<from your secret store>",
|
|
"GroupAttribute": "memberOf",
|
|
"DisplayNameAttribute": "cn",
|
|
"UserNameAttribute": "sAMAccountName",
|
|
"GroupToRole": {
|
|
"OPCUA-Designers": "Designer",
|
|
"OPCUA-Admins": "Administrator",
|
|
"OPCUA-Operators": "Operator"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
`GroupToRole` maps LDAP group names → Admin roles (case-insensitive); a user gets every role whose source group is in their membership. The values are the canonical control-plane role strings (`Viewer` / `Designer` / `Administrator`, plus the appsettings-only `Operator` for the `DriverOperator` policy). `UserNameAttribute: "sAMAccountName"` is the critical AD override — the GLAuth dev default is `cn`, which is not how AD users are looked up; use `userPrincipalName` instead if operators log in with `user@corp.example.com` form. `LdapOptionsValidator` (`src/Server/ZB.MOM.WW.OtOpcUa.Host/Configuration/LdapOptionsValidator.cs`) fails startup when `Transport = None` and `AllowInsecure = false` on a real-LDAP (non-DevStub) config.
|
|
|
|
---
|
|
|
|
## Data-Plane Authorization
|
|
|
|
Data-plane authorization is the check run on every OPC UA operation against an OtOpcUa endpoint: *can this authenticated user Browse / Read / Subscribe / Write / HistoryRead / AckAlarm / Call on this specific node?*
|
|
|
|
Per decision #129 the model is **additive-only — no explicit Deny**. Grants at each hierarchy level union; absence of a grant is the default-deny.
|
|
|
|
### Hierarchy
|
|
|
|
ACLs are evaluated against the node's scope path. `NodeScope` (`src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/NodeScope.cs`) carries a `Kind` that selects between two hierarchy shapes:
|
|
|
|
```
|
|
Equipment (UNS) kind: Cluster → Namespace → UnsArea → UnsLine → Equipment → Tag
|
|
SystemPlatform (Galaxy) kind: Cluster → Namespace → FolderSegment(s) → Tag
|
|
```
|
|
|
|
On the Galaxy/SystemPlatform path each folder segment takes one trie level, so a deeply-nested Galaxy folder reaches the same depth as a full UNS path. Unset mid-path levels leave the corresponding id `null` and the evaluator walks only as far as the scope goes.
|
|
|
|
Each level can carry `NodeAcl` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs`) that grant a permission bundle to a set of `LdapGroups`.
|
|
|
|
### Permission flags
|
|
|
|
`NodePermissions` (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodePermissions.cs`), stored as an `int` bitmask in `NodeAcl.PermissionFlags`:
|
|
|
|
```csharp
|
|
[Flags]
|
|
public enum NodePermissions : int
|
|
{
|
|
None = 0,
|
|
|
|
Browse = 1 << 0,
|
|
Read = 1 << 1,
|
|
Subscribe = 1 << 2,
|
|
HistoryRead = 1 << 3,
|
|
WriteOperate = 1 << 4,
|
|
WriteTune = 1 << 5,
|
|
WriteConfigure = 1 << 6,
|
|
AlarmRead = 1 << 7,
|
|
AlarmAcknowledge = 1 << 8,
|
|
AlarmConfirm = 1 << 9,
|
|
AlarmShelve = 1 << 10,
|
|
MethodCall = 1 << 11,
|
|
|
|
ReadOnly = Browse | Read | Subscribe | HistoryRead | AlarmRead,
|
|
Operator = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm,
|
|
Engineer = Operator | WriteTune | AlarmShelve,
|
|
Admin = Engineer | WriteConfigure | MethodCall,
|
|
}
|
|
```
|
|
|
|
The three Write tiers map to Galaxy's v1 `SecurityClassification` — `FreeAccess`/`Operate` → `WriteOperate`, `Tune` → `WriteTune`, `Configure` → `WriteConfigure`. `SecuredWrite` / `VerifiedWrite` / `ViewOnly` classifications remain read-only from OPC UA regardless of grant.
|
|
|
|
### Evaluator — `PermissionTrie`
|
|
|
|
`src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/`:
|
|
|
|
| Class | Role |
|
|
|---|---|
|
|
| `PermissionTrie` | Cluster-scoped trie; each node carries `(GroupId → NodePermissions)` grants. |
|
|
| `PermissionTrieBuilder` | Builds a trie from the current `NodeAcl` rows in one pass and installs it into the cache. |
|
|
| `PermissionTrieCache` | Process-singleton cache keyed on `(ClusterId, GenerationId)`. Generation-sealed: `Install(trie)` adds a new generation + advances the "current" pointer; older generations are retained (in-flight requests still resolve) and GC'd by `Prune`. `Invalidate(clusterId)` drops every cached trie for a cluster. There is **no** `AclChangeNotifier` — a publish installs a new generation rather than signalling an invalidation. |
|
|
| `TriePermissionEvaluator` | Implements `IPermissionEvaluator.Authorize(session, operation, scope)`. Walks the cluster trie for the supplied `NodeScope`, unions grants along the path, and returns an `AuthorizationDecision`. Evaluates against the **session's bound generation** (`session.AuthGenerationId`), not just "current", so a grant added/removed in a newer generation cannot take effect mid-session. |
|
|
|
|
`NodeScope` is described above (Equipment-kind vs SystemPlatform-kind). The evaluator unions the matched grants along the path — a tag-level ACL and an area-level ACL both contribute.
|
|
|
|
### Dispatch gate — `IPermissionEvaluator`
|
|
|
|
`IPermissionEvaluator.Authorize(UserAuthorizationState session, OpcUaOperation operation, NodeScope scope)` (default impl `TriePermissionEvaluator` at `src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs`) returns an `AuthorizationDecision`. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call; a `NotGranted` decision denies the operation.
|
|
|
|
Key properties:
|
|
|
|
- **Driver-agnostic.** No driver-level code participates in authorization decisions. Drivers report `SecurityClassification` as metadata on tag discovery; everything else flows through the evaluator.
|
|
- **Strictly fail-closed (default-deny).** Every guard path returns `NotGranted` — a stale session (past the staleness ceiling, decision #152), a cluster mismatch between session and scope, a missing trie, a pruned bound generation, or simply no matching grant. There is no `StrictMode` / fail-open mode; absence of a grant is always a deny.
|
|
- **Evaluator stays pure.** `TriePermissionEvaluator` has no OPC UA stack dependency — it's tested directly from xUnit.
|
|
|
|
### Full model
|
|
|
|
See [`docs/v2/acl-design.md`](v2/acl-design.md) for the complete design: trie invalidation, flag semantics, per-path override rules, and the reasoning behind additive-only (no Deny).
|
|
|
|
---
|
|
|
|
## Control-Plane Authorization
|
|
|
|
Control-plane authorization governs **the Admin UI** — who can view fleet config, edit drafts, publish generations, manage cluster nodes + credentials.
|
|
|
|
Per decision #150 control-plane roles are **deliberately independent of data-plane ACLs**. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a `Designer` may not have any data-plane grants at all.
|
|
|
|
### Roles
|
|
|
|
The `AdminRole` enum (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs`) defines three roles. Task 1.7 standardized the member names on the canonical `ZB.MOM.WW.Auth` `CanonicalRole` vocabulary (`ConfigViewer → Viewer`, `ConfigEditor → Designer`, `FleetAdmin → Administrator`); a data migration (`CanonicalizeAdminRoles`) rewrote existing rows. This was a rename, not a permission change.
|
|
|
|
| Role | Capabilities |
|
|
|---|---|
|
|
| `Viewer` | Read-only access to drafts, generations, audit log, fleet status. (Was `ConfigViewer`.) |
|
|
| `Designer` | Viewer plus draft authoring (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. (Was `ConfigEditor`.) |
|
|
| `Administrator` | Designer plus publish, cluster/node CRUD, credential management, role-grant management. Satisfies both the `FleetAdmin` and `DriverOperator` authorization policies. (Was `FleetAdmin`.) |
|
|
|
|
`DriverOperator` is an **authorization policy name** (kept stable), not an `AdminRole` member. It gates **Reconnect** / **Restart** commands against live driver instances from the Admin UI `DriverStatusPanel` and requires the canonical role `Operator` or `Administrator` (`policy.RequireRole("Operator", "Administrator")` in `AddAuthorization`, `src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`). `Operator` is an appsettings-only string role (not an `AdminRole` member); map an LDAP group to it via `GroupToRole`, e.g. `"ot-driver-operator": "Operator"`. The `FleetAdmin` policy requires the `Administrator` role.
|
|
|
|
In v2 the authentication + authorization stack is wired centrally by `AddOtOpcUaAuth` (`src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`), which also installs a `FallbackPolicy` that requires an authenticated user. Razor pages gate inline with the canonical role names, e.g. `@attribute [Authorize(Roles = "Administrator,Designer")]`. Nav-menu sections hide via `<AuthorizeView>`.
|
|
|
|
### Role grant source
|
|
|
|
Admin reads `LdapGroupRoleMapping` rows from the Config DB (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs`) — the same pattern as the data-plane `NodeAcl` but scoped to Admin roles + (optionally) one cluster for multi-site fleets (a system-wide row, `IsSystemWide = true`, stacks additively with cluster-scoped rows). The `RoleGrants.razor` page lets `Administrator`s edit these mappings without leaving the UI.
|
|
|
|
### Headless deploy API (`POST /api/deployments`)
|
|
|
|
For CI / scripts that need to trigger a deployment without driving the Blazor "Deploy current configuration" button, admin-role nodes expose `POST /api/deployments` (`DeployApiEndpoints`, `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Api/DeployApiEndpoints.cs`). It forwards to the same `IAdminOperationsClient.StartDeploymentAsync` the button calls.
|
|
|
|
Auth is a **single configured secret** checked from the `X-Api-Key` header in fixed time — deliberately orthogonal to the cookie-only web auth (`OPC UA Authentication` above) so automation needs no LDAP login round-trip. The endpoint is `AllowAnonymous` so the `FallbackPolicy` doesn't 401 it, and enforces the key itself. **It self-disables (503) until `Security:DeployApiKey` is set**, so it is never open by default.
|
|
|
|
```bash
|
|
curl -X POST https://<admin-host>/api/deployments \
|
|
-H 'X-Api-Key: <Security:DeployApiKey>' \
|
|
-H 'Content-Type: application/json' \
|
|
-d '{"createdBy":"ci-bot"}'
|
|
```
|
|
|
|
Responses: `202 Accepted` (`{ outcome, deploymentId, revisionHash }`) when a deployment was sealed, `200` for `NoChanges`, `409` when another deployment is in flight, `422` when rejected, `401` for a missing/wrong key, `503` when unconfigured. Set the secret via `Security:DeployApiKey` (env `Security__DeployApiKey`) on admin nodes only; treat it like any deploy credential (rotate, keep out of source).
|
|
|
|
---
|
|
|
|
## OTOPCUA0001 Analyzer — Compile-Time Guard
|
|
|
|
Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by `CapabilityInvoker` in `src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/`. A driver-capability call made **outside** the invoker bypasses resilience entirely — which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking.
|
|
|
|
`OTOPCUA0001` (Roslyn analyzer at `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs`) fires with category `OtOpcUa.Resilience` and default severity **Warning** (per `AnalyzerReleases.Shipped.md`) when a method on one of the seven guarded capability interfaces (`IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider` — all in `ZB.MOM.WW.OtOpcUa.Core.Abstractions`) is invoked **outside** a lambda passed to `CapabilityInvoker.ExecuteAsync` / `ExecuteWriteAsync`. `AlarmSurfaceInvoker` is **not** a wrapper home — its own implementation is covered transitively because it routes through the inner `CapabilityInvoker.ExecuteAsync`. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke `ExecuteAsync` on something unrelated nearby) does not satisfy the rule.
|
|
|
|
The xunit.v3 + Shouldly suite at `tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests/UnwrappedCapabilityCallAnalyzerTests.cs` covers the common fail/pass shapes + the sibling-pattern regression guard.
|
|
|
|
The rule is intentionally scoped to async surfaces — pure in-memory accessors like `IHostConnectivityProbe.GetHostStatuses()` return synchronously and do not require the invoker wrap.
|
|
|
|
---
|
|
|
|
## Audit Logging
|
|
|
|
- **Server**: authentication, certificate-validation, and write-denial events are logged through the regular Serilog rolling file sink.
|
|
- **Admin**: `AuditWriterActor` (`src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`) writes `ConfigAuditLog` rows (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs`) to the Config DB for publish, rollback, cluster-node CRUD, and credential rotation. Visible on the cluster Audit page (`ClusterAudit.razor`) for operators with `Viewer` or above.
|
|
|
|
---
|
|
|
|
## Troubleshooting
|
|
|
|
### Certificate trust failure
|
|
|
|
Check `{PkiStoreRoot}/rejected/` for the client's cert. Copy the `.der` file to `trusted/certs/`; the SDK trust list reloads on the next handshake. The Admin UI Certificates page shows what is in each store but does not move certs.
|
|
|
|
### LDAP users can connect but fail authorization
|
|
|
|
Verify (a) `Security:Ldap:GroupAttribute` (default `memberOf`) returns the user's groups, (b) `Security:Ldap:GroupToRole` maps those groups to the expected roles, and (c) a `NodeAcl` grant exists at some level of the node's scope path that unions to the required permission. The data-plane evaluator is strictly default-deny — there is no fail-open mode to fall back on.
|
|
|
|
### LDAP bind rejected as "insecure"
|
|
|
|
Set `Security:Ldap:Transport = "Ldaps"` (or `"StartTls"`) with the matching port (636 for AD `Ldaps`), or temporarily set `Security:Ldap:AllowInsecure = true` in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement.
|
|
|
|
### Stale ACL trie after a publish
|
|
|
|
A publish installs a **new generation** into `PermissionTrieCache` via `PermissionTrieBuilder` rather than signalling an invalidation; the evaluator binds each session to a generation. If grants appear stale, confirm the new generation was installed (publish completed) and that sessions re-resolved their auth state — a session past its staleness ceiling fails closed and must re-authenticate. As a last resort `PermissionTrieCache.Invalidate(clusterId)` drops a cluster's cached tries.
|