Phase 6.2 Stream C wiring — AuthorizationBootstrap + OpcUaApplicationHost.SetAuthorization

Closes task #133 — the "authz gate is inert in production" blocker
surfaced during task #123. Before this commit, every ACL check on the
six dispatch surfaces (Read, Write, HistoryRead, Browse,
CreateMonitoredItems, Call) short-circuited to allow because Program.cs
constructed OpcUaApplicationHost without passing authzGate or
scopeResolver.

New pieces:

- `AuthorizationOptions` — bound to `Node:Authorization` in
  appsettings.json. `Enabled` (default false) is the master switch;
  `StrictMode` (default false) controls the anonymous / no-LDAP-groups
  fallback behaviour.
- `AuthorizationBootstrap` — singleton service that loads `NodeAcl`
  rows for the published generation, builds a `PermissionTrieCache` +
  `AuthorizationGate`, and merges every registered driver's
  `EquipmentNamespaceContent` through `ScopePathIndexBuilder` into one
  full-path `NodeScopeResolver`. Returns `(null, null)` when disabled
  or when no generation is Published yet.
- `DriverEquipmentContentRegistry.Snapshot()` — new method returning a
  defensive copy of the driver → content map so the bootstrap can
  iterate without holding the lock.
- `OpcUaApplicationHost.SetAuthorization(gate, resolver)` — late-bind
  method matching the existing `SetPhase7Sources` pattern. Must run
  before `StartAsync`; rejects post-start rebinding with
  InvalidOperationException.
- `OpcUaServerService.ExecuteAsync` calls `AuthorizationBootstrap.BuildAsync`
  after `PopulateEquipmentContentAsync` and before `applicationHost.StartAsync`,
  in the same window that `SetPhase7Sources` runs.
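
The late-bind guard described for `SetAuthorization` can be sketched as below. This is a minimal, self-contained model and not the shipped class: `HostSketch`, `GateStub`, `ResolverStub` and the `_started` flag are hypothetical stand-ins (the real host checks whether its server instance has been created rather than a boolean).

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for AuthorizationGate and NodeScopeResolver.
public sealed record GateStub(bool StrictMode);
public sealed record ResolverStub(string ClusterId);

public sealed class HostSketch
{
    private GateStub? _gate;
    private ResolverStub? _resolver;
    private bool _started; // assumption: the real host tests its server field instead

    public GateStub? Gate => _gate;
    public ResolverStub? Resolver => _resolver;

    // Late-bind: only legal before Start(). Passing null leaves that part inert.
    public void SetAuthorization(GateStub? gate, ResolverStub? resolver)
    {
        if (_started)
            throw new InvalidOperationException(
                "Authorization must be set before Start; node managers already captured the old values.");
        _gate = gate;
        _resolver = resolver;
    }

    // After this point the fields are treated as captured and rebinding is rejected.
    public void Start() => _started = true;
}
```

Throwing on a post-start call, rather than silently accepting it, makes a wiring-order mistake visible at startup instead of leaving a gate that looks configured but is never consulted.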

Behaviour change:

- Default (Enabled=false): no behaviour change — the gate stays null,
  all six dispatch surfaces run unchanged. Safe for any existing
  deployment on upgrade.
- Enabled=true with StrictMode=false: identities carrying LDAP groups
  are evaluated against the trie; anonymous / no-groups identities
  pass through (v1 legacy-client compatibility).
- Enabled=true with StrictMode=true: everything evaluates. Anonymous
  or no-groups identities are denied.
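
The three modes above reduce to a small decision table. The sketch below only models that table; `GateOutcome` and `AuthzModes.Decide` are hypothetical names, and the real `AuthorizationGate` evaluates the permission trie rather than returning an enum.

```csharp
// Hypothetical outcome labels for the behaviour matrix in this commit message.
public enum GateOutcome
{
    Bypass,       // Enabled=false: gate is null, dispatch runs unchanged
    PassThrough,  // lax mode: anonymous / no-groups identity is allowed
    EvaluateTrie, // identity carries LDAP groups: ACL trie decides
    Deny          // strict mode: anonymous / no-groups identity is rejected
}

public static class AuthzModes
{
    // Simplified model: in strict mode the real gate still "evaluates"
    // group-less identities; here that is collapsed to an immediate Deny.
    public static GateOutcome Decide(bool enabled, bool strictMode, bool hasLdapGroups)
    {
        if (!enabled) return GateOutcome.Bypass;
        if (hasLdapGroups) return GateOutcome.EvaluateTrie;
        return strictMode ? GateOutcome.Deny : GateOutcome.PassThrough;
    }
}
```

Note that `StrictMode` only matters for identities without LDAP groups; identities with groups take the trie path in both modes.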

Follow-up not covered here: rebind the gate+resolver on generation
refresh (the `GenerationRefreshHostedService` that shipped earlier in
this session). Today the gate only reflects the bootstrap generation
— operators publishing new ACL changes need a process restart to see
them. Matches the current driver-hot-reload limitation and is tracked
in the existing 6.3 follow-up bullet.

Docs: v2-release-readiness.md Phase 6.2 Stream C.12 bullet flipped to
Closed with operator-facing config pointer (`Node:Authorization:Enabled`).
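
For operators, a minimal sketch of the matching `appsettings.json` fragment, assuming the usual .NET mapping of colon-separated keys to nested JSON objects. Only the two keys introduced by this commit are shown; any other settings in the `Node` section are omitted here, and the values are illustrative (enforcement on, legacy pass-through kept).

```json
{
  "Node": {
    "Authorization": {
      "Enabled": true,
      "StrictMode": false
    }
  }
}
```

Leaving the `Authorization` object out entirely is equivalent to both values being `false`, which is the upgrade-safe default.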

283/283 Server.Tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-04-24 15:35:46 -04:00
parent 1be0fb5a29
commit fb6dd3478d
8 changed files with 202 additions and 3 deletions

@@ -38,7 +38,7 @@ Remaining Stream C surfaces (hardening, not release-blocking):
- ~~CreateMonitoredItems + TransferSubscriptions gating with per-item `(AuthGenerationId, MembershipVersion)` stamp so revoked grants surface `BadUserAccessDenied` within one publish cycle (decision #153).~~ **Partial, 2026-04-24.** `DriverNodeManager.CreateMonitoredItems` override pre-gates each request and pre-populates `BadUserAccessDenied` into the errors slot for denied items (the base stack honours pre-set errors and skips those items). Decision #153's per-item `(AuthGenerationId, MembershipVersion)` stamp for detecting mid-subscription revocation is still to ship — needs subscription-layer plumbing. TransferSubscriptions not yet wired (same pattern).
- ~~Alarm Acknowledge / Confirm / Shelve gating.~~ **Partial, 2026-04-24.** Acknowledge + Confirm map to dedicated `OpcUaOperation.AlarmAcknowledge` / `AlarmConfirm` via `MapCallOperation`; Shelve falls through to generic `OpcUaOperation.Call` (needs per-instance method NodeId resolution to distinguish — follow-up).
- ~~Call (method invocation) gating.~~ **Closed 2026-04-24.** `DriverNodeManager.Call` override pre-gates each `CallMethodRequest` via `GateCallMethodRequests`. Denied calls return `BadUserAccessDenied` without running the method. Alarm methods map to alarm-specific operation kinds; everything else gates as generic `Call`.
- ~~Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.~~ **Partial, 2026-04-24.** `ScopePathIndexBuilder` + indexed-mode `NodeScopeResolver` exist and are unit-tested — index keys driver-side full-ref → full `Cluster → Namespace → UnsArea → UnsLine → Equipment → Tag` scope. **Critical follow-up (task #133):** Program.cs does not yet construct either the gate or the resolver — all six dispatch-layer gates (Read, Write, HistoryRead, Browse, CreateMonitoredItems, Call) are currently inert in production. Wiring is required before GA.
- ~~Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.~~ **Closed 2026-04-24.** `AuthorizationBootstrap` now loads `NodeAcl` rows for the current generation into a `PermissionTrieCache`, builds the gate, and merges every registered driver's `EquipmentNamespaceContent` into a full-path `NodeScopeResolver` index. `OpcUaServerService` calls the bootstrap after the equipment registry is populated, before `OpcUaApplicationHost.StartAsync`. Disabled by default — operators flip `Node:Authorization:Enabled=true` to enforce, `StrictMode=true` to reject anonymous/no-groups identities.
- 3-user integration matrix covering every operation × allow/deny.
### ~~Config fallback — Phase 6.1 Stream D wiring~~ (task #136 — **CLOSED** 2026-04-19, PR #96)

@@ -1,3 +1,5 @@
using ZB.MOM.WW.OtOpcUa.Server.Security;
namespace ZB.MOM.WW.OtOpcUa.Server;
/// <summary>
@@ -20,4 +22,7 @@ public sealed class NodeOptions
/// <summary>Path to the LiteDB local cache file.</summary>
public string LocalCachePath { get; init; } = "config_cache.db";
/// <summary>Phase 6.2 authorization pipeline config. Disabled by default.</summary>
public AuthorizationOptions Authorization { get; init; } = new();
}

@@ -44,4 +44,17 @@ public sealed class DriverEquipmentContentRegistry
{
get { lock (_lock) { return _content.Count; } }
}
/// <summary>
/// Snapshot the current driver → content map. Returns a copy so callers can iterate
/// without holding the lock. Used at authorization bootstrap to merge all namespaces
/// into a single <see cref="Security.NodeScopeResolver"/> path index.
/// </summary>
public IReadOnlyDictionary<string, EquipmentNamespaceContent> Snapshot()
{
lock (_lock)
{
return new Dictionary<string, EquipmentNamespaceContent>(_content, StringComparer.OrdinalIgnoreCase);
}
}
}

@@ -24,8 +24,8 @@ public sealed class OpcUaApplicationHost : IAsyncDisposable
private readonly DriverHost _driverHost;
private readonly IUserAuthenticator _authenticator;
private readonly DriverResiliencePipelineBuilder _pipelineBuilder;
-private readonly AuthorizationGate? _authzGate;
-private readonly NodeScopeResolver? _scopeResolver;
+private AuthorizationGate? _authzGate;
+private NodeScopeResolver? _scopeResolver;
private readonly StaleConfigFlag? _staleConfigFlag;
private readonly Func<string, ZB.MOM.WW.OtOpcUa.Core.Abstractions.DriverTier>? _tierLookup;
private readonly Func<string, string?>? _resilienceConfigLookup;
@@ -95,6 +95,23 @@ public sealed class OpcUaApplicationHost : IAsyncDisposable
_scriptedAlarmReadable = scriptedAlarmReadable;
}
/// <summary>
/// Late-bind the Phase 6.2 authorization gate + node-scope resolver. Must be called
/// BEFORE <see cref="StartAsync"/> — once the OPC UA server starts the
/// <see cref="OtOpcUaServer"/> + per-namespace <see cref="DriverNodeManager"/>s
/// capture these fields and later rebinding has no effect on already-materialized
/// managers. Call with <c>null</c> for either parameter to leave the corresponding
/// pipeline inert.
/// </summary>
public void SetAuthorization(AuthorizationGate? gate, NodeScopeResolver? resolver)
{
if (_server is not null)
throw new InvalidOperationException(
"Authorization must be set before OpcUaApplicationHost.StartAsync; the OtOpcUaServer + DriverNodeManagers have already captured the previous values.");
_authzGate = gate;
_scopeResolver = resolver;
}
/// <summary>
/// Builds the <see cref="ApplicationConfiguration"/>, validates/creates the application
/// certificate, constructs + starts the <see cref="OtOpcUaServer"/>, then drives

@@ -4,6 +4,7 @@ using Microsoft.Extensions.Logging;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
using ZB.MOM.WW.OtOpcUa.Server.Phase7;
using ZB.MOM.WW.OtOpcUa.Server.Security;
namespace ZB.MOM.WW.OtOpcUa.Server;
@@ -20,6 +21,7 @@ public sealed class OpcUaServerService(
DriverEquipmentContentRegistry equipmentContentRegistry,
DriverInstanceBootstrapper driverBootstrapper,
Phase7Composer phase7Composer,
AuthorizationBootstrap authorizationBootstrap,
IServiceScopeFactory scopeFactory,
ILogger<OpcUaServerService> logger) : BackgroundService
{
@@ -55,6 +57,15 @@ public sealed class OpcUaServerService(
// No-op when the generation has no virtual tags or scripted alarms.
var phase7 = await phase7Composer.PrepareAsync(gen, stoppingToken);
applicationHost.SetPhase7Sources(phase7.VirtualReadable, phase7.ScriptedAlarmReadable);
// Phase 6.2 Stream C wiring — build the AuthorizationGate + NodeScopeResolver
// from the published generation's NodeAcl rows and the populated equipment
// registry. No-op when Node:Authorization:Enabled=false. Must run before
// StartAsync: OtOpcUaServer + DriverNodeManager construction captures the
// field values on the application host.
var (authzGate, scopeResolver) = await authorizationBootstrap
.BuildAsync(gen, stoppingToken).ConfigureAwait(false);
applicationHost.SetAuthorization(authzGate, scopeResolver);
}
await applicationHost.StartAsync(stoppingToken);

@@ -123,6 +123,11 @@ builder.Services.AddSingleton<DriverInstanceBootstrapper>();
// added to OpcUaApplicationHost's ctor seam.
builder.Services.AddSingleton<DriverEquipmentContentRegistry>();
builder.Services.AddScoped<EquipmentNamespaceContentLoader>();
// Phase 6.2 Stream C wiring — constructs AuthorizationGate + NodeScopeResolver from the
// published generation's NodeAcl rows + per-driver EquipmentNamespaceContent. Gated by
// NodeOptions.Authorization.Enabled (default false) so existing deployments don't flip
// to ACL enforcement accidentally on upgrade.
builder.Services.AddSingleton<AuthorizationBootstrap>();
builder.Services.AddSingleton<OpcUaApplicationHost>(sp =>
{

@@ -0,0 +1,115 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Core.Authorization;
using ZB.MOM.WW.OtOpcUa.Core.OpcUa;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Server.Security;
/// <summary>
/// Bootstraps the Phase 6.2 authorization pipeline for the running Server. Loads
/// <c>NodeAcl</c> rows for the current generation into a
/// <see cref="PermissionTrieCache"/>, constructs an <see cref="AuthorizationGate"/>,
/// and merges per-namespace <see cref="EquipmentNamespaceContent"/> into a single
/// full-path index for <see cref="NodeScopeResolver"/>.
/// </summary>
/// <remarks>
/// <para>
/// Called by <c>OpcUaServerService.ExecuteAsync</c> after the
/// <see cref="DriverEquipmentContentRegistry"/> has been populated but before
/// <c>OpcUaApplicationHost.StartAsync</c> runs — that's the window where the
/// config-DB state is known + the OPC UA server hasn't yet captured the gate
/// references.
/// </para>
/// <para>
/// <see cref="AuthorizationOptions.Enabled"/> gates the whole flow. When
/// <c>false</c> (default), <see cref="BuildAsync"/> returns <c>(null, null)</c>
/// and the dispatch layer short-circuits every ACL check — identical to
/// pre-Phase-6.2.
/// </para>
/// </remarks>
public sealed class AuthorizationBootstrap(
IDbContextFactory<OtOpcUaConfigDbContext> dbFactory,
DriverEquipmentContentRegistry equipmentContentRegistry,
NodeOptions nodeOptions,
ILogger<AuthorizationBootstrap> logger)
{
/// <summary>
/// Build a gate + resolver pair for the supplied <paramref name="generationId"/>.
/// Returns <c>(null, null)</c> when authorization is disabled via
/// <see cref="AuthorizationOptions.Enabled"/> or when the generation couldn't be
/// fetched — in that case the dispatch layer runs without ACL enforcement (same
/// behaviour the Server had before Phase 6.2 Stream C landed).
/// </summary>
public async Task<(AuthorizationGate?, NodeScopeResolver?)> BuildAsync(
long? generationId, CancellationToken cancellationToken)
{
if (!nodeOptions.Authorization.Enabled)
{
logger.LogInformation(
"Authorization disabled (Node:Authorization:Enabled=false); all ACL gates remain inert");
return (null, null);
}
if (generationId is not long gen)
{
logger.LogWarning(
"Authorization enabled but no Published generation available — ACL enforcement skipped until next publish");
return (null, null);
}
var gate = await BuildGateAsync(gen, cancellationToken).ConfigureAwait(false);
var resolver = BuildResolver();
logger.LogInformation(
"Authorization pipeline bootstrapped — generation {Gen}, strictMode={Strict}",
gen, nodeOptions.Authorization.StrictMode);
return (gate, resolver);
}
/// <summary>
/// Load every <see cref="Configuration.Entities.NodeAcl"/> row for
/// <paramref name="generationId"/> scoped to this node's cluster, build a
/// <see cref="PermissionTrieCache"/>, construct an <see cref="AuthorizationGate"/>.
/// </summary>
private async Task<AuthorizationGate> BuildGateAsync(long generationId, CancellationToken cancellationToken)
{
await using var ctx = await dbFactory.CreateDbContextAsync(cancellationToken).ConfigureAwait(false);
var rows = await ctx.NodeAcls
.AsNoTracking()
.Where(a => a.ClusterId == nodeOptions.ClusterId && a.GenerationId == generationId)
.ToListAsync(cancellationToken)
.ConfigureAwait(false);
var cache = new PermissionTrieCache();
cache.Install(PermissionTrieBuilder.Build(nodeOptions.ClusterId, generationId, rows));
var evaluator = new TriePermissionEvaluator(cache);
return new AuthorizationGate(evaluator, strictMode: nodeOptions.Authorization.StrictMode);
}
/// <summary>
/// Merge each registered driver's <see cref="EquipmentNamespaceContent"/> into a single
/// full-path index. Tag rows that cross-reference missing Equipment / Line / Area are
/// silently skipped (the cluster-only fallback handles them). Duplicate TagConfig
/// across namespaces is a config error — <see cref="ScopePathIndexBuilder"/> throws
/// on collision; we let that bubble so bootstrap fails fast.
/// </summary>
private NodeScopeResolver BuildResolver()
{
var merged = new Dictionary<string, NodeScope>(StringComparer.Ordinal);
foreach (var kv in equipmentContentRegistry.Snapshot())
{
// Namespace id isn't carried on EquipmentNamespaceContent directly — driverId
// serves as the namespace-stable key for ACL scope resolution.
var perNamespace = ScopePathIndexBuilder.Build(nodeOptions.ClusterId, kv.Key, kv.Value);
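// Indexer assignment: a key already merged from an earlier driver is overwritten
// here (last writer wins); this loop does not itself throw on a cross-driver duplicate.
foreach (var entry in perNamespace)
merged[entry.Key] = entry.Value;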
}
return merged.Count == 0
? new NodeScopeResolver(nodeOptions.ClusterId)
: new NodeScopeResolver(nodeOptions.ClusterId, merged);
}
}

@@ -0,0 +1,33 @@
namespace ZB.MOM.WW.OtOpcUa.Server.Security;
/// <summary>
/// Configuration for the Phase 6.2 authorization pipeline. Bound from the
/// <c>Node:Authorization</c> section of <c>appsettings.json</c>. Defaults ship disabled
/// so upgrading from pre-Phase-6.2 doesn't accidentally start denying reads the day a
/// new build lands — operators opt in explicitly once their <c>NodeAcl</c> rows are
/// populated.
/// </summary>
/// <remarks>
/// <para>
/// <see cref="Enabled"/> is the master switch. When <c>false</c> (default),
/// the OPC UA application host constructs with
/// <c>authzGate: null, scopeResolver: null</c>; all six dispatch-layer gates
/// (Read, Write, HistoryRead, Browse, CreateMonitoredItems, Call) short-circuit
/// to pass — identical behaviour to pre-Phase-6.2.
/// </para>
/// <para>
/// When <c>true</c>, <see cref="StrictMode"/> picks between two failure modes:
/// <c>false</c> (default) grants anonymous / no-LDAP-groups identities a pass-
/// through so v1-style legacy clients keep working; <c>true</c> denies them.
/// Production deployments should flip to <c>StrictMode = true</c> once every
/// client has been validated against the new identity flow.
/// </para>
/// </remarks>
public sealed class AuthorizationOptions
{
/// <summary>Master switch. False = gate is inert; true = gate is wired into dispatch.</summary>
public bool Enabled { get; init; }
/// <summary>False = anonymous / no-groups identities pass; true = they're denied.</summary>
public bool StrictMode { get; init; }
}