lmxopcua/docs/security.md
Joseph Doherty ad7f9e731f
feat(admin): headless POST /api/deployments REST endpoint (API-key gated)
A thin gateway over the admin-operations cluster singleton so CI/scripts can trigger a
deployment without the Blazor button. Forwards to the same IAdminOperationsClient.
StartDeploymentAsync; mounted on admin-role nodes. Auth is a fixed-time X-Api-Key check
against Security:DeployApiKey (orthogonal to the cookie-only web auth); AllowAnonymous so the
auth fallback doesn't 401 it, self-disabling (503) until the key is set. Outcome->status:
202/200/409/422. Unit tests for the key check + outcome mapping; HTTP E2E (real auth + real
deploy via the 2-node harness). Documented in docs/security.md.
2026-06-06 15:54:51 -04:00

Security

v2 status (2026-05-26). The four security concerns below are unchanged in v2. Paths and project names moved: OtOpcUa.Server/Security/ → OtOpcUa.Security/ (Ldap/, Jwt/, Endpoints/AuthEndpoints.cs); OtOpcUa.Admin is gone (its auth and role-grant pages live in OtOpcUa.AdminUI); and Admin auth policies register from OtOpcUa.Host/Program.cs via AddOtOpcUaAuth (src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs) rather than in a separate Admin process. The Admin UI uses a single Cookie authentication scheme; there is no AddJwtBearer pipeline. The Security:Jwt section configures JwtTokenService, which mints a JWT at the /auth/token endpoint for external consumers (OPC UA clients / automation scripts); the cookie itself stores the ClaimsPrincipal directly. DataProtection keys persist to the shared Config DB (PersistKeysToDbContext<OtOpcUaConfigDbContext>) so cookies survive failover between admin-role nodes.

See docs/plans/2026-05-26-akka-hosting-alignment-design.md §5 for the v2 auth + DataProtection rationale.

OtOpcUa has four independent security concerns. This document covers all four:

  1. Transport security — OPC UA secure channel (signing, encryption, X.509 trust).
  2. OPC UA authentication — Anonymous / UserName / X.509 session identities; UserName tokens authenticated by LDAP bind.
  3. Data-plane authorization — who can browse, read, subscribe, write, acknowledge alarms on which nodes. Evaluated by TriePermissionEvaluator over a PermissionTrie built from the Config DB NodeAcl tree.
  4. Control-plane authorization — who can view or edit fleet configuration in the Admin UI. Gated by the AdminRole (Viewer / Designer / Administrator) claim resolved from LdapGroupRoleMapping.

Transport security and OPC UA authentication are per-node concerns configured in the Server's bootstrap appsettings.json. Data-plane ACLs and Admin role grants live in the Config DB.


Transport Security

Overview

The OtOpcUa Server supports configurable OPC UA transport security profiles that control how data is protected on the wire between OPC UA clients and the server.

There are two distinct layers of security in OPC UA:

  • Transport security: secures the communication channel itself using TLS-style certificate exchange, message signing, and encryption. This is what the OpcUa:EnabledSecurityProfiles setting controls.
  • UserName token encryption: protects user credentials (username/password) sent during session activation. The OPC UA stack encrypts UserName tokens using the server's application certificate regardless of the transport security mode, so UserName authentication works on None endpoints too; the credentials themselves are always encrypted. A secure transport profile adds protection against tampering with, and eavesdropping on, data payloads.

Supported security profiles

The profiles are the members of the OpcUaSecurityProfile enum (src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs). The server ships three baseline profiles; the config value is the bare enum-member name (no hyphens, no underscores):

| Enum member | Security Policy | Message Security Mode | Description |
| --- | --- | --- | --- |
| None | None | None | No signing or encryption. Suitable for development and isolated networks only. |
| Basic256Sha256Sign | Basic256Sha256 | Sign | Messages are signed but not encrypted. Protects against tampering but data is visible on the wire. |
| Basic256Sha256SignAndEncrypt | Basic256Sha256 | SignAndEncrypt | Messages are both signed and encrypted. Full protection against tampering and eavesdropping. |

BuildSecurityPolicies (OpcUaApplicationHost.cs) maps each configured profile to an SDK ServerSecurityPolicy. The server exposes a separate endpoint per configured profile and clients select the one they prefer at session open. The enum's XML doc notes that Aes128/Aes256 variants can be added later by extending the enum + BuildSecurityPolicies — the wiring is profile-agnostic — but they are not implemented today. There is no SecurityProfileResolver class.

Config value form. The enum binds by member name, so a profile string with hyphens (e.g. Basic256Sha256-Sign) does not bind — use the exact enum-member spelling above. If EnabledSecurityProfiles is empty, the server falls back to a single None endpoint (logged, very visible) so it still has a listening endpoint.
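The binding rule can be illustrated with plain Enum.TryParse. This is a minimal sketch, assuming the configuration binder resolves enum values by member name in the same way; the enum below is a local copy of the three documented members rather than the real type from OpcUaApplicationHost.cs, and the case handling (ignoreCase: true) is an assumption.

```csharp
using System;

string[] configured = { "Basic256Sha256Sign", "Basic256Sha256-Sign", "None" };

foreach (var value in configured)
{
    // TryParse succeeds only for a member name (or a numeric value);
    // a hyphenated spelling matches neither.
    bool ok = Enum.TryParse<OpcUaSecurityProfile>(value, ignoreCase: true, out var profile);
    Console.WriteLine(ok ? $"{value} -> {profile}" : $"{value} -> does not bind");
}

// Local copy of the three documented members (assumption: stands in for the real enum).
enum OpcUaSecurityProfile
{
    None,
    Basic256Sha256Sign,
    Basic256Sha256SignAndEncrypt,
}
```

Running a check like this against a candidate appsettings value is a cheap way to catch a hyphenated profile string before the server starts.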

Configuration

Transport security is configured in the OpcUa section of the Host process's bootstrap appsettings.json (bound to OpcUaApplicationHostOptions):

{
  "OpcUa": {
    "ApplicationName": "OtOpcUa",
    "ApplicationUri": "urn:node-a:OtOpcUa",
    "PublicHostname": "0.0.0.0",
    "OpcUaPort": 4840,
    "PkiStoreRoot": "C:/ProgramData/OtOpcUa/pki",
    "AutoAcceptUntrustedClientCertificates": false,
    "EnabledSecurityProfiles": [ "Basic256Sha256Sign", "Basic256Sha256SignAndEncrypt" ]
  }
}

EnabledSecurityProfiles is a list — the server publishes one endpoint per entry. The default (when the key is omitted) is all three baseline profiles (None, Basic256Sha256Sign, Basic256Sha256SignAndEncrypt); production deployments typically drop None. The list must contain at least one entry (OpcUaApplicationHostOptionsValidator enforces MinCount(…, 1)).

The server certificate is auto-generated on first start if none exists in PkiStoreRoot/own/. It is always generated, even for None-only deployments, because UserName token encryption depends on it.

PKI directory layout

{PkiStoreRoot}/
  own/        Server's own application certificate and private key
  issuer/     CA certificates that issued trusted client certificates
  trusted/    Explicitly trusted client (peer) certificates
  rejected/   Certificates that were presented but not trusted

Certificate trust flow

When a client connects using a secure profile (Sign or SignAndEncrypt), the following trust evaluation occurs:

  1. The client presents its application certificate during the secure channel handshake.
  2. The server checks whether the certificate exists in the trusted/ store.
  3. If found, the connection proceeds.
  4. If not found and AutoAcceptUntrustedClientCertificates is true, the certificate is automatically copied to trusted/ and the connection proceeds.
  5. If not found and AutoAcceptUntrustedClientCertificates is false, the certificate is copied to rejected/ and the connection is refused.
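The decision in steps 2 to 5 can be sketched as a small function. This is an illustration only: the two in-memory sets are hypothetical stand-ins for the trusted/ and rejected/ directories, and the real evaluation is performed by the OPC UA SDK, not by code like this.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-ins for the trusted/ and rejected/ stores (hypothetical).
var trusted = new HashSet<string> { "THUMB-A" };
var rejected = new HashSet<string>();

// Mirrors steps 2 to 5: a trusted cert wins, otherwise the auto-accept switch decides.
bool Evaluate(string thumbprint, bool autoAccept)
{
    if (trusted.Contains(thumbprint)) return true;   // step 3: already trusted
    if (autoAccept)
    {
        trusted.Add(thumbprint);                     // step 4: copied to trusted/
        return true;
    }
    rejected.Add(thumbprint);                        // step 5: copied to rejected/
    return false;
}

Console.WriteLine(Evaluate("THUMB-A", autoAccept: false)); // known certificate
Console.WriteLine(Evaluate("THUMB-B", autoAccept: false)); // unknown, refused
Console.WriteLine(Evaluate("THUMB-C", autoAccept: true));  // unknown, auto-trusted
```

Note that the auto-accept branch mutates the trusted set, which is why a certificate accepted that way stays trusted after the switch is turned off again.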

The Admin UI Certificates.razor page (src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Certificates.razor) lists the contents of each PKI sub-store (own / trusted / issuer / rejected) by reading the OpcUa:PkiStoreRoot path from configuration. It is currently a read-only viewer — promoting a rejected cert to trusted is still a file move (copy the .der from rejected/ to trusted/certs/); the SDK trust list reloads on the next handshake.

Production hardening

  • Set AutoAcceptUntrustedClientCertificates = false.
  • Drop None from EnabledSecurityProfiles.
  • Promote trusted client certs by moving the .der from rejected/ to trusted/certs/ rather than relying on the auto-accept fallback. (The Admin UI Certificates page shows what is in each store.)
  • Periodically audit the rejected/ directory; an unexpected entry is often a misconfigured client or a probe attempt.

OPC UA Authentication

The Server accepts three OPC UA identity-token types:

| Token | Handler | Notes |
| --- | --- | --- |
| Anonymous | No IOpcUaUserAuthenticator call; the SDK admits anonymous sessions at the channel. | Data-plane authorization (below) still default-denies any node a session has no ACL grant for. |
| UserName/Password | LdapOpcUaUserAuthenticator.AuthenticateUserNameAsync (src/Server/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LdapOpcUaUserAuthenticator.cs, implements IOpcUaUserAuthenticator), backed by the app's ILdapAuthService implementation, OtOpcUaLdapAuthService (src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/OtOpcUaLdapAuthService.cs). | LDAP bind + group lookup. The returned LDAP groups are mapped to roles via IGroupRoleMapper<string> (OtOpcUaGroupRoleMapper) and attached to the OPC UA session identity for the downstream ACL evaluator. |
| X.509 Certificate | Stack-level acceptance during the secure-channel handshake. | The certificate must be trusted (see PKI trust flow); finer-grain authorization happens through the data-plane ACLs. |

When no authenticator is supplied, OpcUaApplicationHost falls back to NullOpcUaUserAuthenticator; the Host wires the real LdapOpcUaUserAuthenticator as a singleton in Program.cs.

LDAP bind flow (OtOpcUaLdapAuthService)

LDAP is configured under the Security:Ldap section (bound to LdapOptions, src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapOptions.cs, SectionName = "Security:Ldap"). The app authenticator is OtOpcUaLdapAuthService — a thin wrapper around the shared ZB.MOM.WW.Auth.Ldap directory client that adds two app-only concerns the shared library deliberately does not model: the Enabled master switch and DevStubMode. The same ILdapAuthService instance serves both the Admin UI cookie login (/auth/login) and the OPC UA UserName path (via LdapOpcUaUserAuthenticator), so operators use one credential across both planes.

OtOpcUaLdapAuthService.AuthenticateAsync:

  1. If Enabled = false, denies outright — no bind, no DevStub bypass (the master switch wins).
  2. If DevStubMode = true, accepts any non-empty credentials and grants the Administrator role without any network bind (dev only — must be false in production).
  3. Refuses to bind over a plaintext transport (Transport = None) unless AllowInsecure = true (dev/test only). This is enforced at login, not at startup.
  4. Delegates the real path to the shared ZB.MOM.WW.Auth.Ldap client: it binds (search-then-bind via ServiceAccountDn, or direct-bind cn={user},{SearchBase} when no service account is set), verifies the password, and reads the user's group memberships.
  5. Returns an LdapAuthResult carrying the validated username + the groups (never roles). Failure codes are folded into opaque user-facing error strings so a probe cannot distinguish "unknown user" from "wrong password".
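The guard order in steps 1 to 4 can be sketched as follows. Every type here is hypothetical (LdapSettings, AuthResult and the realBind delegate are stand-ins); only the setting names Enabled, DevStubMode, Transport and AllowInsecure come from the Security:Ldap section. The point is the ordering, not the actual OtOpcUaLdapAuthService code.

```csharp
using System;

// Guard order from the steps above; realBind stands in for the shared directory client.
static AuthResult Authenticate(LdapSettings o, string user, string password,
                               Func<string, string, AuthResult> realBind)
{
    if (!o.Enabled)
        return AuthResult.Deny("disabled");                       // step 1: master switch wins
    if (o.DevStubMode)                                            // step 2: dev-only bypass, no bind
        return user.Length > 0 && password.Length > 0
            ? new AuthResult(true, user, new[] { "Administrator" }, "")
            : AuthResult.Deny("invalid credentials");
    if (o.Transport == LdapTransport.None && !o.AllowInsecure)
        return AuthResult.Deny("insecure transport");             // step 3: no plaintext bind
    return realBind(user, password);                              // step 4: real bind + group read
}

var settings = new LdapSettings(Enabled: true, DevStubMode: false,
                                Transport: LdapTransport.None, AllowInsecure: false);
var result = Authenticate(settings, "alice", "secret",
                          (u, p) => new AuthResult(true, u, Array.Empty<string>(), ""));
Console.WriteLine($"{result.Success} {result.Error}");

// Hypothetical shapes; only the four setting names come from Security:Ldap.
enum LdapTransport { None, StartTls, Ldaps }
record LdapSettings(bool Enabled, bool DevStubMode, LdapTransport Transport, bool AllowInsecure);
record AuthResult(bool Success, string User, string[] PreResolvedRoles, string Error)
{
    public static AuthResult Deny(string error) => new(false, "", Array.Empty<string>(), error);
}
```

Because the Enabled check comes first, turning the master switch off also disables the DevStub bypass, which matches step 1.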

Group → role mapping happens downstream, not in the auth service: LdapOpcUaUserAuthenticator resolves IGroupRoleMapper<string> (OtOpcUaGroupRoleMapper) per call and unions its output with any pre-resolved roles (the DevStub Administrator grant). The roles are attached to the OPC UA session identity for the ACL evaluator. A mapper fault (e.g. a Config DB outage) falls back to the pre-resolved baseline rather than denying an otherwise-authenticated session.
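A minimal sketch of the union-with-fallback behaviour described above. ResolveRoles and the mapper delegate are hypothetical stand-ins for the logic inside LdapOpcUaUserAuthenticator and IGroupRoleMapper<string>; the real signatures are not shown in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Unions the mapper output with pre-resolved roles and keeps the baseline
// if the mapper throws (for example during a Config DB outage).
static IReadOnlySet<string> ResolveRoles(
    IEnumerable<string> groups,
    IEnumerable<string> preResolved,
    Func<IEnumerable<string>, IEnumerable<string>> mapper)
{
    var roles = new HashSet<string>(preResolved, StringComparer.OrdinalIgnoreCase);
    try
    {
        // Materialize first so a fault mid-enumeration adds nothing partial.
        var mapped = mapper(groups).ToList();
        roles.UnionWith(mapped);
    }
    catch (Exception)
    {
        // Mapper fault: fall back to the baseline rather than denying the session.
    }
    return roles;
}

var mappedOk = ResolveRoles(new[] { "OPCUA-Designers" }, Array.Empty<string>(),
                            g => new[] { "Designer" });
var faulted = ResolveRoles(new[] { "OPCUA-Designers" }, new[] { "Administrator" },
                           _ => throw new InvalidOperationException("db down"));

Console.WriteLine(string.Join(",", mappedOk));
Console.WriteLine(string.Join(",", faulted));
```

The trade-off is availability over strictness: an authenticated session keeps whatever baseline it had, and the data-plane evaluator still default-denies anything without a grant.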

Transport replaces the former UseTls bool: Ldaps (implicit TLS), StartTls (upgrade), or None (plaintext, requires AllowInsecure). Configuration example (Active Directory production):

{
  "Security": {
    "Ldap": {
      "Enabled": true,
      "DevStubMode": false,
      "Server": "dc01.corp.example.com",
      "Port": 636,
      "Transport": "Ldaps",
      "AllowInsecure": false,
      "SearchBase": "DC=corp,DC=example,DC=com",
      "ServiceAccountDn": "CN=OtOpcUaSvc,OU=Service Accounts,DC=corp,DC=example,DC=com",
      "ServiceAccountPassword": "<from your secret store>",
      "GroupAttribute": "memberOf",
      "DisplayNameAttribute": "cn",
      "UserNameAttribute": "sAMAccountName",
      "GroupToRole": {
        "OPCUA-Designers": "Designer",
        "OPCUA-Admins": "Administrator",
        "OPCUA-Operators": "Operator"
      }
    }
  }
}

GroupToRole maps LDAP group names → Admin roles (case-insensitive); a user gets every role whose source group is in their membership. The values are the canonical control-plane role strings (Viewer / Designer / Administrator, plus the appsettings-only Operator for the DriverOperator policy). UserNameAttribute: "sAMAccountName" is the critical AD override — the GLAuth dev default is cn, which is not how AD users are looked up; use userPrincipalName instead if operators log in with user@corp.example.com form. LdapOptionsValidator (src/Server/ZB.MOM.WW.OtOpcUa.Host/Configuration/LdapOptionsValidator.cs) fails startup when Transport = None and AllowInsecure = false on a real-LDAP (non-DevStub) config.
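The case-insensitive lookup can be sketched with a dictionary built from the GroupToRole entries in the example config above. This is an illustration under two assumptions: that matching is ordinal-ignore-case, and that group names arrive as short names (Active Directory memberOf values are full DNs, so the real mapper presumably normalizes them first).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// GroupToRole entries copied from the example config.
var groupToRole = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["OPCUA-Designers"] = "Designer",
    ["OPCUA-Admins"] = "Administrator",
    ["OPCUA-Operators"] = "Operator",
};

// A user gets every role whose source group appears in their memberships.
string[] RolesFor(IEnumerable<string> groups) =>
    groups.Where(g => groupToRole.ContainsKey(g))
          .Select(g => groupToRole[g])
          .Distinct()
          .OrderBy(r => r, StringComparer.Ordinal)
          .ToArray();

var roles = RolesFor(new[] { "opcua-admins", "OPCUA-Operators", "Domain Users" });
Console.WriteLine(string.Join(", ", roles)); // Administrator, Operator
```

Groups with no entry (here "Domain Users") simply contribute nothing, so an unmapped user ends up with an empty role set.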


Data-Plane Authorization

Data-plane authorization is the check run on every OPC UA operation against an OtOpcUa endpoint: can this authenticated user Browse / Read / Subscribe / Write / HistoryRead / AckAlarm / Call on this specific node?

Per decision #129 the model is additive-only — no explicit Deny. Grants at each hierarchy level union; absence of a grant is the default-deny.

Hierarchy

ACLs are evaluated against the node's scope path. NodeScope (src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/NodeScope.cs) carries a Kind that selects between two hierarchy shapes:

Equipment (UNS) kind:        Cluster → Namespace → UnsArea → UnsLine → Equipment → Tag
SystemPlatform (Galaxy) kind: Cluster → Namespace → FolderSegment(s) → Tag

On the Galaxy/SystemPlatform path each folder segment takes one trie level, so a deeply-nested Galaxy folder reaches the same depth as a full UNS path. Unset mid-path levels leave the corresponding id null and the evaluator walks only as far as the scope goes.

Each level can carry NodeAcl rows (src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeAcl.cs) that grant a permission bundle to a set of LdapGroups.

Permission flags

NodePermissions (src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodePermissions.cs), stored as an int bitmask in NodeAcl.PermissionFlags:

[Flags]
public enum NodePermissions : int
{
    None = 0,

    Browse            = 1 << 0,
    Read              = 1 << 1,
    Subscribe         = 1 << 2,
    HistoryRead       = 1 << 3,
    WriteOperate      = 1 << 4,
    WriteTune         = 1 << 5,
    WriteConfigure    = 1 << 6,
    AlarmRead         = 1 << 7,
    AlarmAcknowledge  = 1 << 8,
    AlarmConfirm      = 1 << 9,
    AlarmShelve       = 1 << 10,
    MethodCall        = 1 << 11,

    ReadOnly  = Browse | Read | Subscribe | HistoryRead | AlarmRead,
    Operator  = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm,
    Engineer  = Operator | WriteTune | AlarmShelve,
    Admin     = Engineer | WriteConfigure | MethodCall,
}

The three Write tiers map to Galaxy's v1 SecurityClassification: FreeAccess/Operate → WriteOperate, Tune → WriteTune, Configure → WriteConfigure. SecuredWrite / VerifiedWrite / ViewOnly classifications remain read-only from OPC UA regardless of grant.
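A short sketch of how the bitmask behaves in practice. The enum is copied from the listing above; the Allows helper is hypothetical and only illustrates the usual "all required bits present" test, not the evaluator's actual code.

```csharp
using System;

// A required permission is satisfied only when every requested bit is granted.
static bool Allows(NodePermissions granted, NodePermissions required) =>
    (granted & required) == required;

// Additive-only model: grants from different hierarchy levels combine with bitwise OR.
var areaGrant = NodePermissions.ReadOnly;
var tagGrant = NodePermissions.WriteOperate;
var effective = areaGrant | tagGrant;

Console.WriteLine(Allows(effective, NodePermissions.Read));          // True
Console.WriteLine(Allows(effective, NodePermissions.WriteOperate));  // True
Console.WriteLine(Allows(effective, NodePermissions.WriteTune));     // False
Console.WriteLine((int)NodePermissions.Admin);                       // 4095

[Flags]
enum NodePermissions : int
{
    None = 0,
    Browse = 1 << 0, Read = 1 << 1, Subscribe = 1 << 2, HistoryRead = 1 << 3,
    WriteOperate = 1 << 4, WriteTune = 1 << 5, WriteConfigure = 1 << 6,
    AlarmRead = 1 << 7, AlarmAcknowledge = 1 << 8, AlarmConfirm = 1 << 9,
    AlarmShelve = 1 << 10, MethodCall = 1 << 11,

    ReadOnly = Browse | Read | Subscribe | HistoryRead | AlarmRead,
    Operator = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm,
    Engineer = Operator | WriteTune | AlarmShelve,
    Admin = Engineer | WriteConfigure | MethodCall,
}
```

The Admin bundle sets all twelve defined bits, which is why its integer value is 4095; a NodeAcl.PermissionFlags value above that would carry bits the enum does not define.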

Evaluator — PermissionTrie

src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/:

| Class | Role |
| --- | --- |
| PermissionTrie | Cluster-scoped trie; each node carries (GroupId → NodePermissions) grants. |
| PermissionTrieBuilder | Builds a trie from the current NodeAcl rows in one pass and installs it into the cache. |
| PermissionTrieCache | Process-singleton cache keyed on (ClusterId, GenerationId). Generation-sealed: Install(trie) adds a new generation and advances the "current" pointer; older generations are retained (in-flight requests still resolve) and GC'd by Prune. Invalidate(clusterId) drops every cached trie for a cluster. There is no AclChangeNotifier: a publish installs a new generation rather than signalling an invalidation. |
| TriePermissionEvaluator | Implements IPermissionEvaluator.Authorize(session, operation, scope). Walks the cluster trie for the supplied NodeScope, unions grants along the path, and returns an AuthorizationDecision. Evaluates against the session's bound generation (session.AuthGenerationId), not just "current", so a grant added/removed in a newer generation cannot take effect mid-session. |

NodeScope is described above (Equipment-kind vs SystemPlatform-kind). The evaluator unions the matched grants along the path — a tag-level ACL and an area-level ACL both contribute.
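The path walk can be sketched with a toy trie. TrieNode and Effective are hypothetical and far simpler than PermissionTrie; permissions are plain int bitmasks using the NodePermissions values (Read = 2, WriteOperate = 16), and group ids are bare strings.

```csharp
using System;
using System.Collections.Generic;

var trie = new TrieNode();
// Area-level Read grant and tag-level WriteOperate grant for the same group.
trie.Child("ns1").Child("area1").Grants["operators"] = 2;
trie.Child("ns1").Child("area1").Child("line1").Child("eq1").Child("tag1").Grants["operators"] = 16;

// Walks the scope path from the root and ORs together every grant held by the user's groups.
static int Effective(TrieNode root, IReadOnlyList<string> path, ISet<string> groups)
{
    int acc = 0;
    var node = root;
    Collect(node);
    foreach (var segment in path)
    {
        if (!node.Children.TryGetValue(segment, out var next)) break; // no deeper ACL rows
        node = next;
        Collect(node);
    }
    return acc;

    void Collect(TrieNode n)
    {
        foreach (var (groupId, flags) in n.Grants)
            if (groups.Contains(groupId)) acc |= flags;
    }
}

var tagPath = new[] { "ns1", "area1", "line1", "eq1", "tag1" };
Console.WriteLine(Effective(trie, tagPath, new HashSet<string> { "operators" })); // 18
Console.WriteLine(Effective(trie, tagPath, new HashSet<string> { "guests" }));    // 0

// Hypothetical minimal node: per-group grants plus child segments.
class TrieNode
{
    public Dictionary<string, int> Grants { get; } = new();
    public Dictionary<string, TrieNode> Children { get; } = new();

    public TrieNode Child(string segment)
    {
        if (!Children.TryGetValue(segment, out var child))
            Children[segment] = child = new TrieNode();
        return child;
    }
}
```

A result of 0 corresponds to default-deny: with no Deny entries in the model, the walk can only ever add bits, never remove them.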

Dispatch gate — IPermissionEvaluator

IPermissionEvaluator.Authorize(UserAuthorizationState session, OpcUaOperation operation, NodeScope scope) (default impl TriePermissionEvaluator at src/Core/ZB.MOM.WW.OtOpcUa.Core/Authorization/TriePermissionEvaluator.cs) returns an AuthorizationDecision. The dispatch path calls it on every Read, Write, HistoryRead, Browse, Subscribe, AckAlarm, Call; a NotGranted decision denies the operation.

Key properties:

  • Driver-agnostic. No driver-level code participates in authorization decisions. Drivers report SecurityClassification as metadata on tag discovery; everything else flows through the evaluator.
  • Strictly fail-closed (default-deny). Every guard path returns NotGranted — a stale session (past the staleness ceiling, decision #152), a cluster mismatch between session and scope, a missing trie, a pruned bound generation, or simply no matching grant. There is no StrictMode / fail-open mode; absence of a grant is always a deny.
  • Evaluator stays pure. TriePermissionEvaluator has no OPC UA stack dependency — it's tested directly from xUnit.
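The fail-closed guard paths listed above can be sketched as a chain of early returns. Everything here is hypothetical: the Session record, the string standing in for a built trie, and the hasGrant delegate are illustrative stand-ins, and the real evaluator returns an AuthorizationDecision rather than a bool.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache keyed on (ClusterId, GenerationId); the string stands in for a built trie.
var cache = new Dictionary<(string Cluster, long Generation), string>
{
    [("site-a", 7)] = "trie-gen-7",
    [("site-a", 8)] = "trie-gen-8",
};

// Every guard returns false (NotGranted); a matching grant is the only path to true.
bool Authorize(Session s, string scopeCluster, DateTime now, Func<string, bool> hasGrant)
{
    if (now - s.ResolvedAt > s.StalenessCeiling) return false;   // stale session
    if (s.Cluster != scopeCluster) return false;                 // cluster mismatch
    if (!cache.TryGetValue((s.Cluster, s.AuthGenerationId), out var boundTrie))
        return false;                                            // missing trie or pruned generation
    return hasGrant(boundTrie);                                  // no matching grant is also a deny
}

var t0 = new DateTime(2026, 1, 1, 12, 0, 0, DateTimeKind.Utc);
var session = new Session("site-a", AuthGenerationId: 7, ResolvedAt: t0,
                          StalenessCeiling: TimeSpan.FromMinutes(5));

// The session resolves against its bound generation (7), not the newest one (8).
Console.WriteLine(Authorize(session, "site-a", t0.AddMinutes(1), t => t == "trie-gen-7"));
// Past the staleness ceiling the same request is denied.
Console.WriteLine(Authorize(session, "site-a", t0.AddMinutes(10), t => true));

record Session(string Cluster, long AuthGenerationId, DateTime ResolvedAt, TimeSpan StalenessCeiling);
```

Since there is no fail-open branch, removing a generation from the cache denies every session still bound to it until it re-resolves its auth state.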

Full model

See docs/v2/acl-design.md for the complete design: trie invalidation, flag semantics, per-path override rules, and the reasoning behind additive-only (no Deny).


Control-Plane Authorization

Control-plane authorization governs the Admin UI — who can view fleet config, edit drafts, publish generations, manage cluster nodes + credentials.

Per decision #150 control-plane roles are deliberately independent of data-plane ACLs. An operator who can read every OPC UA tag in production may not be allowed to edit cluster config; conversely a Designer may not have any data-plane grants at all.

Roles

The AdminRole enum (src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/AdminRole.cs) defines three roles. Task 1.7 standardized the member names on the canonical ZB.MOM.WW.Auth CanonicalRole vocabulary (ConfigViewer → Viewer, ConfigEditor → Designer, FleetAdmin → Administrator); a data migration (CanonicalizeAdminRoles) rewrote existing rows. This was a rename, not a permission change.

| Role | Capabilities |
| --- | --- |
| Viewer | Read-only access to drafts, generations, audit log, fleet status. (Was ConfigViewer.) |
| Designer | Viewer plus draft authoring (UNS, equipment, tags, ACLs, driver instances, reservations, CSV imports). Cannot publish. (Was ConfigEditor.) |
| Administrator | Designer plus publish, cluster/node CRUD, credential management, role-grant management. Satisfies both the FleetAdmin and DriverOperator authorization policies. (Was FleetAdmin.) |

DriverOperator is an authorization policy name (kept stable), not an AdminRole member. It gates Reconnect / Restart commands against live driver instances from the Admin UI DriverStatusPanel and requires the canonical role Operator or Administrator (policy.RequireRole("Operator", "Administrator") in AddAuthorization, src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs). Operator is an appsettings-only string role (not an AdminRole member); map an LDAP group to it via GroupToRole, e.g. "ot-driver-operator": "Operator". The FleetAdmin policy requires the Administrator role.
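The any-of semantics of the two policies can be checked with plain ClaimsPrincipal.IsInRole, without the ASP.NET Core pipeline. This is a sketch: WithRoles and the two Satisfies helpers are hypothetical, and the real policies are registered through AddAuthorization as described above.

```csharp
using System;
using System.Linq;
using System.Security.Claims;

// Builds an authenticated principal carrying the given role claims (hypothetical helper).
static ClaimsPrincipal WithRoles(params string[] roles) =>
    new ClaimsPrincipal(new ClaimsIdentity(
        roles.Select(r => new Claim(ClaimTypes.Role, r)), "Cookies"));

// RequireRole with several names passes when the user holds any one of them.
static bool SatisfiesDriverOperator(ClaimsPrincipal user) =>
    user.IsInRole("Operator") || user.IsInRole("Administrator");

static bool SatisfiesFleetAdmin(ClaimsPrincipal user) => user.IsInRole("Administrator");

var designer = WithRoles("Designer");
var driverOp = WithRoles("Operator");
var admin = WithRoles("Administrator");

Console.WriteLine(SatisfiesDriverOperator(designer)); // False
Console.WriteLine(SatisfiesDriverOperator(driverOp)); // True
Console.WriteLine(SatisfiesFleetAdmin(driverOp));     // False
Console.WriteLine(SatisfiesFleetAdmin(admin));        // True
```

A Designer therefore cannot issue Reconnect / Restart commands unless one of their LDAP groups is also mapped to Operator or Administrator.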

In v2 the authentication + authorization stack is wired centrally by AddOtOpcUaAuth (src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs), which also installs a FallbackPolicy that requires an authenticated user. Razor pages gate inline with the canonical role names, e.g. @attribute [Authorize(Roles = "Administrator,Designer")]. Nav-menu sections hide via <AuthorizeView>.

Role grant source

Admin reads LdapGroupRoleMapping rows from the Config DB (src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/LdapGroupRoleMapping.cs) — the same pattern as the data-plane NodeAcl but scoped to Admin roles + (optionally) one cluster for multi-site fleets (a system-wide row, IsSystemWide = true, stacks additively with cluster-scoped rows). The RoleGrants.razor page lets Administrators edit these mappings without leaving the UI.

Headless deploy API (POST /api/deployments)

For CI / scripts that need to trigger a deployment without driving the Blazor "Deploy current configuration" button, admin-role nodes expose POST /api/deployments (DeployApiEndpoints, src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Api/DeployApiEndpoints.cs). It forwards to the same IAdminOperationsClient.StartDeploymentAsync the button calls.

Auth is a single configured secret checked from the X-Api-Key header in fixed time — deliberately orthogonal to the cookie-only web auth (OPC UA Authentication above) so automation needs no LDAP login round-trip. The endpoint is AllowAnonymous so the FallbackPolicy doesn't 401 it, and enforces the key itself. It self-disables (503) until Security:DeployApiKey is set, so it is never open by default.
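A minimal sketch of a fixed-time key gate with the self-disable behaviour described above. CheckApiKey is hypothetical, not the DeployApiEndpoints code; hashing both values before the comparison is an assumption added here so that the compared buffers always have equal length.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Returns the HTTP status the gate would produce, or 0 when the request may proceed.
static int CheckApiKey(string? configuredKey, string? presentedKey)
{
    if (string.IsNullOrEmpty(configuredKey)) return 503;   // self-disabled until the key is set
    if (string.IsNullOrEmpty(presentedKey)) return 401;    // missing header

    // Hash both sides so the buffers have equal length, then compare in fixed time.
    byte[] expected = SHA256.HashData(Encoding.UTF8.GetBytes(configuredKey));
    byte[] actual = SHA256.HashData(Encoding.UTF8.GetBytes(presentedKey));
    return CryptographicOperations.FixedTimeEquals(expected, actual) ? 0 : 401;
}

Console.WriteLine(CheckApiKey(null, "abc"));         // 503
Console.WriteLine(CheckApiKey("s3cret", null));      // 401
Console.WriteLine(CheckApiKey("s3cret", "wrong"));   // 401
Console.WriteLine(CheckApiKey("s3cret", "s3cret"));  // 0
```

Checking the configured key first means an unconfigured node answers 503 even to a request that carries a key, so the endpoint is never accidentally open.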

curl -X POST https://<admin-host>/api/deployments \
     -H 'X-Api-Key: <Security:DeployApiKey>' \
     -H 'Content-Type: application/json' \
     -d '{"createdBy":"ci-bot"}'

Responses: 202 Accepted ({ outcome, deploymentId, revisionHash }) when a deployment was sealed, 200 for NoChanges, 409 when another deployment is in flight, 422 when rejected, 401 for a missing/wrong key, 503 when unconfigured. Set the secret via Security:DeployApiKey (env Security__DeployApiKey) on admin nodes only; treat it like any deploy credential (rotate, keep out of source).
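The outcome-to-status mapping can be written as a single switch expression. The DeployOutcome enum below is hypothetical: only NoChanges is named in this document, and the other member names are placeholders for "sealed", "another deployment in flight" and "rejected".

```csharp
using System;

// Maps a deployment outcome to the HTTP status listed above.
static int ToStatusCode(DeployOutcome outcome) => outcome switch
{
    DeployOutcome.Sealed => 202,     // deployment sealed and accepted
    DeployOutcome.NoChanges => 200,  // nothing to deploy
    DeployOutcome.InFlight => 409,   // another deployment is running
    DeployOutcome.Rejected => 422,   // request understood but refused
    _ => throw new ArgumentOutOfRangeException(nameof(outcome)),
};

foreach (var outcome in Enum.GetValues<DeployOutcome>())
    Console.WriteLine($"{outcome} -> {ToStatusCode(outcome)}");

// Hypothetical outcome names, apart from NoChanges.
enum DeployOutcome { Sealed, NoChanges, InFlight, Rejected }
```

The 401 and 503 responses are not part of this mapping because they are produced by the key check before any deployment is attempted.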


OTOPCUA0001 Analyzer — Compile-Time Guard

Per-capability resilience (retry, timeout, circuit-breaker, bulkhead) is applied by CapabilityInvoker in src/Core/ZB.MOM.WW.OtOpcUa.Core/Resilience/. A driver-capability call made outside the invoker bypasses resilience entirely — which in production looks like inconsistent timeouts, un-wrapped retries, and unbounded blocking.

OTOPCUA0001 (Roslyn analyzer at src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs) fires with category OtOpcUa.Resilience and default severity Warning (per AnalyzerReleases.Shipped.md) when a method on one of the seven guarded capability interfaces (IReadable, IWritable, ITagDiscovery, ISubscribable, IHostConnectivityProbe, IAlarmSource, IHistoryProvider — all in ZB.MOM.WW.OtOpcUa.Core.Abstractions) is invoked outside a lambda passed to CapabilityInvoker.ExecuteAsync / ExecuteWriteAsync. AlarmSurfaceInvoker is not a wrapper home — its own implementation is covered transitively because it routes through the inner CapabilityInvoker.ExecuteAsync. The analyzer walks up the syntax tree from the call site, finds any enclosing invoker invocation, and verifies the call lives transitively inside that invocation's anonymous-function argument — a sibling pattern (do the call, then invoke ExecuteAsync on something unrelated nearby) does not satisfy the rule.
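The two call shapes the rule distinguishes look roughly like this. The types are simplified stand-ins with hypothetical signatures (the real IReadable and CapabilityInvoker.ExecuteAsync take different parameters); the sketch only shows where the capability call sits relative to the invoker lambda.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var invoker = new CapabilityInvoker();
IReadable driver = new FakeDriver();

// Shape the rule is meant to accept: the capability call sits inside the lambda passed to ExecuteAsync.
int wrapped = await invoker.ExecuteAsync(ct => driver.ReadAsync("tag1", ct), CancellationToken.None);

// Shape the rule is meant to flag: a direct call with no enclosing invoker lambda.
int unwrapped = await driver.ReadAsync("tag1", CancellationToken.None);

Console.WriteLine($"{wrapped} {unwrapped}");

// Simplified stand-ins; the real interface and invoker have different signatures.
interface IReadable
{
    Task<int> ReadAsync(string tag, CancellationToken ct);
}

sealed class FakeDriver : IReadable
{
    public Task<int> ReadAsync(string tag, CancellationToken ct) => Task.FromResult(42);
}

sealed class CapabilityInvoker
{
    // A real invoker would apply retry, timeout, circuit-breaker and bulkhead here.
    public Task<T> ExecuteAsync<T>(Func<CancellationToken, Task<T>> call, CancellationToken ct) => call(ct);
}
```

Both calls return the same value in this toy; the difference the analyzer cares about is that only the first would pick up the resilience policies at runtime.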

The xunit.v3 + Shouldly suite at tests/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers.Tests/UnwrappedCapabilityCallAnalyzerTests.cs covers the common fail/pass shapes + the sibling-pattern regression guard.

The rule is intentionally scoped to async surfaces — pure in-memory accessors like IHostConnectivityProbe.GetHostStatuses() return synchronously and do not require the invoker wrap.


Audit Logging

  • Server: authentication, certificate-validation, and write-denial events are logged through the regular Serilog rolling file sink.
  • Admin: AuditWriterActor (src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs) writes ConfigAuditLog rows (src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs) to the Config DB for publish, rollback, cluster-node CRUD, and credential rotation. Visible on the cluster Audit page (ClusterAudit.razor) for operators with Viewer or above.

Troubleshooting

Certificate trust failure

Check {PkiStoreRoot}/rejected/ for the client's cert. Copy the .der file to trusted/certs/; the SDK trust list reloads on the next handshake. The Admin UI Certificates page shows what is in each store but does not move certs.

LDAP users can connect but fail authorization

Verify (a) Security:Ldap:GroupAttribute (default memberOf) returns the user's groups, (b) Security:Ldap:GroupToRole maps those groups to the expected roles, and (c) a NodeAcl grant exists at some level of the node's scope path that unions to the required permission. The data-plane evaluator is strictly default-deny — there is no fail-open mode to fall back on.

LDAP bind rejected as "insecure"

Set Security:Ldap:Transport = "Ldaps" (or "StartTls") with the matching port (636 for AD Ldaps), or temporarily set Security:Ldap:AllowInsecure = true in dev. Production Active Directory increasingly refuses plain-LDAP bind under LDAP-signing enforcement.

Stale ACL trie after a publish

A publish installs a new generation into PermissionTrieCache via PermissionTrieBuilder rather than signalling an invalidation; the evaluator binds each session to a generation. If grants appear stale, confirm the new generation was installed (publish completed) and that sessions re-resolved their auth state — a session past its staleness ceiling fails closed and must re-authenticate. As a last resort PermissionTrieCache.Invalidate(clusterId) drops a cluster's cached tries.