Compare commits

...

8 Commits

Author SHA1 Message Date
Joseph Doherty
a8401ab8fd v2 release-readiness — blocker #2 closed; doc reflects state
PR #96 closed the Phase 6.1 Stream D config-cache wiring blocker.

- Status line: "one of three release blockers remains".
- Blocker #2 struck through + CLOSED with PR link. Periodic-poller + richer-
  snapshot-payload follow-ups downgraded to hardening.
- Change log: dated entry.

One blocker remains: Phase 6.3 Streams A/C/F redundancy runtime (tasks
#145, #147, #150).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:16:31 -04:00
37ba9e8d14 Merge pull request (#96) - Phase 6.1 Stream D wiring follow-up 2026-04-19 11:16:57 -04:00
Joseph Doherty
19a0bfcc43 Phase 6.1 Stream D follow-up — SealedBootstrap consumes ResilientConfigReader + GenerationSealedCache + StaleConfigFlag; /healthz surfaces the flag
Closes release blocker #2 from docs/v2/v2-release-readiness.md — the
generation-sealed cache + resilient reader + stale-config flag shipped as
unit-tested primitives in PR #81, but no production path consumed them until
now. This PR wires them end-to-end.

Server additions:
- SealedBootstrap — Phase 6.1 Stream D consumption hook. Resolves the node's
  current generation through ResilientConfigReader's timeout → retry →
  fallback-to-sealed pipeline. On every successful central-DB fetch it seals
  a fresh snapshot to <cache-root>/<cluster>/<generationId>.db so a future
  cache-miss has a known-good fallback. It sits alongside the original
  NodeBootstrap (which still uses the single-file ILocalConfigCache);
  Program.cs can switch between them once operators are ready for the
  generation-sealed semantics.
- OpcUaApplicationHost: new optional staleConfigFlag ctor parameter. When
  wired, HealthEndpointsHost consumes `flag.IsStale` via the existing
  usingStaleConfig Func<bool> hook. This means `/healthz` actually reports
  `usingStaleConfig: true` whenever a read fell back to the sealed cache —
  closes the loop between Stream D's flag + Stream C's /healthz body shape.

Tests (4 new SealedBootstrapIntegrationTests, all pass):
- Central-DB success path seals snapshot + flag stays fresh.
- Central-DB failure falls back to sealed snapshot + flag flips stale (the
  SQL-kill scenario from Phase 6.1 Stream D.4.a).
- No-snapshot + central-down throws GenerationCacheUnavailableException
  with a clear error (the first-boot scenario from D.4.c).
- Next successful bootstrap after a fallback clears the stale flag.

Full solution dotnet test: 1168 passing (was 1164, +4). Pre-existing
Client.CLI Subscribe flake unchanged.

Production activation: Program.cs wires SealedBootstrap (instead of
NodeBootstrap), constructs OpcUaApplicationHost with the staleConfigFlag,
and a HostedService polls sp_GetCurrentGenerationForCluster periodically so
peer-published generations land in this node's sealed cache. The poller
itself is the Stream D.1.b follow-up.
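
A minimal Program.cs wiring sketch (illustrative only; variable names, the
cache root, and the timeout value are assumptions, while the constructor
signatures come from this PR):

    var staleFlag = new StaleConfigFlag();
    var cache = new GenerationSealedCache(cacheRoot);            // assumed cache root
    var reader = new ResilientConfigReader(cache, staleFlag,
        loggerFactory.CreateLogger<ResilientConfigReader>(),
        timeout: TimeSpan.FromSeconds(10));                      // assumed timeout
    var bootstrap = new SealedBootstrap(nodeOptions, cache, reader, staleFlag,
        loggerFactory.CreateLogger<SealedBootstrap>());
    var boot = await bootstrap.LoadCurrentGenerationAsync(ct);
    var host = new OpcUaApplicationHost(serverOptions, driverHost, authenticator,
        loggerFactory, logger, staleConfigFlag: staleFlag);      // flag feeds /healthz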

The sp_PublishGeneration SQL-side hook (where the publish commit itself
could also write to a shared sealed cache) stays deferred — the per-node
seal pattern shipped here is the correct v2 GA model: each Server node
owns its own on-disk cache and refreshes from its own DB reads, matching
the Phase 6.1 scope-table description.
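
For reference, the deferred D.1.b poller could take a shape like the
following (hypothetical sketch: GenerationPollerService and the 30-second
cadence are invented here; only SealedBootstrap.LoadCurrentGenerationAsync
is real):

    using Microsoft.Extensions.Hosting;

    internal sealed class GenerationPollerService : BackgroundService
    {
        private readonly SealedBootstrap _bootstrap;
        public GenerationPollerService(SealedBootstrap bootstrap) => _bootstrap = bootstrap;

        protected override async Task ExecuteAsync(CancellationToken stoppingToken)
        {
            while (!stoppingToken.IsCancellationRequested)
            {
                // Re-run the bootstrap so peer-published generations land in this
                // node's sealed cache without a restart.
                try { await _bootstrap.LoadCurrentGenerationAsync(stoppingToken); }
                catch (Exception ex) when (ex is not OperationCanceledException)
                {
                    // Central DB down: the resilient reader already fell back (or
                    // threw); either way, retry on the next tick.
                }
                await Task.Delay(TimeSpan.FromSeconds(30), stoppingToken);
            }
        }
    }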

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:14:59 -04:00
fc7e18c7f5 Merge pull request (#95) - Readiness doc blocker1 closed 2026-04-19 11:06:28 -04:00
Joseph Doherty
ba42967943 v2 release-readiness — blocker #1 closed; doc reflects state
PR #94 closed the Phase 6.2 dispatch wiring blocker. Update the dashboard:
- Status line: "two of three release blockers remain".
- Release-blocker #1 section struck through + marked CLOSED with PR link.
  Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer-
  grained scope resolution) downgraded to hardening follow-ups — not
  release-blocking.
- Change log: new dated entry.

Two remaining blockers: Phase 6.1 Stream D config-cache wiring (task #136)
+ Phase 6.3 Streams A/C/F redundancy runtime (tasks #145, #147, #150).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:04:30 -04:00
b912969805 Merge pull request (#94) - Phase 6.2 Stream C follow-up dispatch wiring 2026-04-19 11:04:20 -04:00
Joseph Doherty
f8d5b0fdbb Phase 6.2 Stream C follow-up — wire AuthorizationGate into DriverNodeManager Read / Write / HistoryRead dispatch
Closes the Phase 6.2 security gap the v2 release-readiness dashboard flagged:
the evaluator + trie + gate shipped as code in PRs #84-88 but no dispatch
path called them. This PR threads the gate end-to-end from
OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager and calls it on
every Read / Write call and all four HistoryRead paths.

Server.Security additions:
- NodeScopeResolver — maps driver fullRef → Core.Authorization NodeScope.
  Phase 1 shape: populates ClusterId + TagId; leaves NamespaceId / UnsArea /
  UnsLine / Equipment null. The cluster-level ACL cascade covers this
  configuration (decision #129 additive grants). Finer-grained scope
  resolution (joining against the live Configuration DB for UnsArea / UnsLine
  path) lands as Stream C.12 follow-up.
- WriteAuthzPolicy.ToOpcUaOperation — maps SecurityClassification → the
  OpcUaOperation the gate evaluator consults (Operate/SecuredWrite →
  WriteOperate; Tune → WriteTune; Configure/VerifiedWrite → WriteConfigure).
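
Tying the two together (illustrative; `gate` and `identity` stand in for
the wired AuthorizationGate and the session's user identity):

    var resolver = new NodeScopeResolver("c-warsaw");
    var scope = resolver.Resolve("TestMachine_001/Oven/SetPoint");
    var op = WriteAuthzPolicy.ToOpcUaOperation(SecurityClassification.Tune); // WriteTune
    bool allowed = gate.IsAllowed(identity, op, scope); // false surfaces BadUserAccessDenied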

DriverNodeManager wiring:
- Ctor gains optional AuthorizationGate + NodeScopeResolver; both null means
  the pre-Phase-6.2 dispatch runs unchanged (backwards-compat for every
  integration test that constructs DriverNodeManager directly).
- OnReadValue: ahead of the invoker call, builds NodeScope + calls
  gate.IsAllowed(identity, Read, scope). Denied reads return
  BadUserAccessDenied without hitting the driver.
- OnWriteValue: preserves the existing WriteAuthzPolicy check (classification
  vs session roles) + adds an additive gate check using
  WriteAuthzPolicy.ToOpcUaOperation(classification) to pick the right
  WriteOperate/Tune/Configure surface. Lax mode falls through for identities
  without LDAP groups.
- Four HistoryRead paths (Raw / Processed / AtTime / Events): gate check
  runs per-node before the invoker. Events path tolerates fullRef=null
  (event-history queries can target a notifier / driver-root; those are
  cluster-wide reads that need a different scope shape — deferred).
- New WriteAccessDenied helper surfaces BadUserAccessDenied in the
  OpcHistoryReadResult slot + errors list, matching the shape of the
  existing WriteUnsupported / WriteInternalError helpers.

OtOpcUaServer + OpcUaApplicationHost: gate + resolver thread through as
optional constructor parameters (same pattern as DriverResiliencePipelineBuilder
in Phase 6.1). Null defaults keep the three existing OpcUaApplicationHost
integration tests, which construct the host without them, unchanged.

Tests (5 new in NodeScopeResolverTests):
- Resolve populates ClusterId + TagId + Equipment Kind.
- Resolve leaves finer path null per Phase 1 shape (doc'd as follow-up).
- Empty fullReference throws.
- Empty clusterId throws at ctor.
- Resolver is stateless across calls.

The existing 9 AuthorizationGate tests (shipped in PR #86) continue to
cover the gate's allow/deny semantics under strict + lax mode.

Full solution dotnet test: 1164 passing (was 1159, +5). Pre-existing
Client.CLI Subscribe flake unchanged. Existing OpcUaApplicationHost +
HealthEndpointsHost + driver integration tests continue to pass because the
gate defaults to null → no enforcement, and the lax-mode fallback returns
true for identities without LDAP groups (the anonymous test path).

Production deployments flip the gate on by constructing it via
OpcUaApplicationHost's new authzGate parameter + setting
`Authorization:StrictMode = true` once ACL data is populated. Flipping the
switch post-seed turns the evaluator + trie from scaffolded code into
actual enforcement.
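
Sketch of the flip (hypothetical: the AuthorizationGate constructor shape
and evaluator wiring are assumptions; only the authzGate / scopeResolver
parameters and the Authorization:StrictMode key come from this PR):

    // appsettings.json (assumed shape): "Authorization": { "StrictMode": true }
    var gate = new AuthorizationGate(evaluator, strictMode: true); // assumed ctor
    var resolver = new NodeScopeResolver(clusterId);
    var host = new OpcUaApplicationHost(options, driverHost, authenticator,
        loggerFactory, logger, authzGate: gate, scopeResolver: resolver);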

This closes release blocker #1 listed in docs/v2/v2-release-readiness.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:02:17 -04:00
cc069509cd Merge pull request (#93) - v2 release-readiness capstone 2026-04-19 10:34:17 -04:00
9 changed files with 498 additions and 22 deletions

View File

@@ -1,7 +1,7 @@
# v2 Release Readiness
> **Last updated**: 2026-04-19 (Phase 6.4 data layer merged)
> **Status**: **NOT YET RELEASE-READY** — four Phase 6 data-layer ships have landed, but several production-path wirings are still deferred.
> **Last updated**: 2026-04-19 (release blockers #1 + #2 closed; Phase 6.3 redundancy runtime is the last)
> **Status**: **NOT YET RELEASE-READY** — one of three release blockers remains (Phase 6.3 Streams A/C/F redundancy-coordinator + OPC UA node wiring + client interop).
This doc is the single view of where v2 stands against its release criteria. Update it whenever a deferred follow-up closes or a new release blocker is discovered.
@@ -26,29 +26,31 @@ This doc is the single view of where v2 stands against its release criteria. Upd
Ordered by severity + impact on production fitness.
### Security — Phase 6.2 dispatch wiring (task #143)
### ~~Security — Phase 6.2 dispatch wiring~~ (task #143 — **CLOSED** 2026-04-19, PR #94)
The `AuthorizationGate` + `IPermissionEvaluator` + `PermissionTrie` stack is fully built and unit-tested, **but no dispatch path in `DriverNodeManager` actually calls it**. Every OPC UA Read / Write / HistoryRead / Browse / Call / CreateMonitoredItems on the live server currently runs through the pre-Phase-6.2 code path (which gates Write via `WriteAuthzPolicy` only — no per-tag ACL).
**Closed**. `AuthorizationGate` + `NodeScopeResolver` now thread through `OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager`. `OnReadValue` + `OnWriteValue` + all four HistoryRead paths call `gate.IsAllowed(identity, operation, scope)` before the invoker. Production deployments activate enforcement by constructing `OpcUaApplicationHost` with an `AuthorizationGate(StrictMode: true)` + populating the `NodeAcl` table.
Closing this requires:
Additional Stream C surfaces (not release-blocking, hardening only):
- Thread `AuthorizationGate` through `OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager` (the same plumbing path Phase 6.1's `DriverResiliencePipelineBuilder` took).
- Build a `NodeScopeResolver` that maps `fullRef → NodeScope` via a live DB lookup of the tag's UnsArea / UnsLine / Equipment path. Cache per generation.
- Call `gate.IsAllowed(identity, operation, scope)` in OnReadValue / OnWriteValue / the four HistoryRead paths / Browse / Call / Acknowledge/Confirm/Shelve / CreateMonitoredItems / TransferSubscriptions.
- Stamp MonitoredItems with `(AuthGenerationId, MembershipVersion)` per decision #153 so revoked grants surface `BadUserAccessDenied` within one publish cycle.
- 3-user integration matrix covering each operation × allow/deny.
- Browse + TranslateBrowsePathsToNodeIds gating with ancestor-visibility logic per `acl-design.md` §Browse.
- CreateMonitoredItems + TransferSubscriptions gating with per-item `(AuthGenerationId, MembershipVersion)` stamp so revoked grants surface `BadUserAccessDenied` within one publish cycle (decision #153).
- Alarm Acknowledge / Confirm / Shelve gating.
- Call (method invocation) gating.
- Finer-grained scope resolution — current `NodeScopeResolver` returns a flat cluster-level scope. Joining against the live Configuration DB to populate UnsArea / UnsLine / Equipment path is tracked as Stream C.12.
- 3-user integration matrix covering every operation × allow/deny.
**Strict mode default**: start lax (`Authorization:StrictMode = false`) during rollout so deployments without populated ACLs keep working. Flip to strict once ACL seeding lands for production clusters.
These are additional hardening — the three highest-value surfaces (Read / Write / HistoryRead) are now gated, which covers the base-security gap for v2 GA.
### Config fallback — Phase 6.1 Stream D wiring (task #136)
### ~~Config fallback — Phase 6.1 Stream D wiring~~ (task #136 — **CLOSED** 2026-04-19, PR #96)
`ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag` all exist but nothing consumes them. The `NodeBootstrap` path still uses the original single-file `LiteDbConfigCache` via `ILocalConfigCache`; `sp_PublishGeneration` doesn't call `GenerationSealedCache.SealAsync` after commit; the Configuration read services don't wrap queries in `ResilientConfigReader.ReadAsync`.
**Closed**. `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag` end-to-end: bootstrap calls go through the timeout → retry → fallback-to-sealed pipeline; every central-DB success writes a fresh sealed snapshot so the next cache-miss has a known-good fallback; `StaleConfigFlag.IsStale` is now consumed by `HealthEndpointsHost.usingStaleConfig` so `/healthz` body reports reality.
Closing this requires:
Production activation: Program.cs switches `NodeBootstrap → SealedBootstrap` + constructs `OpcUaApplicationHost` with the `StaleConfigFlag` as an optional ctor parameter.
- `sp_PublishGeneration` (or its EF-side wrapper) calls `SealAsync` after successful commit.
- DriverInstance enumeration, LdapGroupRoleMapping fetches, cluster + namespace metadata reads route through `ResilientConfigReader.ReadAsync`.
- Integration test: SQL container kill mid-operation → serves sealed snapshot, `UsingStaleConfig` = true, driver stays Healthy, `/healthz` body reflects the flag.
Remaining follow-ups (hardening, not release-blocking):
- A `HostedService` that polls `sp_GetCurrentGenerationForCluster` periodically so peer-published generations land in this node's cache without a restart.
- Richer snapshot payload via `sp_GetGenerationContent` so fallback can serve the full generation content (DriverInstance enumeration, ACL rows, etc.) from the sealed cache alone.
### Redundancy — Phase 6.3 Streams A/C/F (tasks #145, #147, #150)
@@ -96,6 +98,8 @@ v2 GA requires all of the following:
## Change log
- **2026-04-19** — Release blocker #2 **closed** (PR #96). `SealedBootstrap` consumes `ResilientConfigReader` + `GenerationSealedCache` + `StaleConfigFlag`; `/healthz` now surfaces the stale flag. Remaining follow-ups (periodic poller + richer snapshot payload) downgraded to hardening.
- **2026-04-19** — Release blocker #1 **closed** (PR #94). `AuthorizationGate` wired into `DriverNodeManager` Read / Write / HistoryRead dispatch. Remaining Stream C surfaces (Browse / Subscribe / Alarm / Call + finer-grained scope resolution) downgraded to hardening follow-ups — no longer release-blocking.
- **2026-04-19** — Phase 6.4 data layer merged (PRs #91-92). Phase 6 core complete. Capstone doc created.
- **2026-04-19** — Phase 6.3 core merged (PRs #89-90). `ServiceLevelCalculator` + `RecoveryStateManager` + `ApplyLeaseRegistry` land as pure logic; coordinator / UA-node wiring / Admin UI / interop deferred.
- **2026-04-19** — Phase 6.2 core merged (PRs #84-88). `AuthorizationGate` + `TriePermissionEvaluator` + `LdapGroupRoleMapping` land; dispatch wiring + Admin UI deferred.

View File

@@ -3,6 +3,7 @@ using Microsoft.Extensions.Logging;
using Opc.Ua;
using Opc.Ua.Server;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Authorization;
using ZB.MOM.WW.OtOpcUa.Core.Resilience;
using ZB.MOM.WW.OtOpcUa.Server.Security;
using DriverWriteRequest = ZB.MOM.WW.OtOpcUa.Core.Abstractions.WriteRequest;
@@ -59,14 +60,24 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
// returns a child builder per Folder call and the caller threads nesting through those references.
private FolderState _currentFolder = null!;
// Phase 6.2 Stream C follow-up — optional gate + scope resolver. When both are null
// the old pre-Phase-6.2 dispatch path runs unchanged (backwards compat for every
// integration test that constructs DriverNodeManager without the gate). When wired,
// OnReadValue / OnWriteValue / HistoryRead all consult the gate before the invoker call.
private readonly AuthorizationGate? _authzGate;
private readonly NodeScopeResolver? _scopeResolver;
public DriverNodeManager(IServerInternal server, ApplicationConfiguration configuration,
IDriver driver, CapabilityInvoker invoker, ILogger<DriverNodeManager> logger)
IDriver driver, CapabilityInvoker invoker, ILogger<DriverNodeManager> logger,
AuthorizationGate? authzGate = null, NodeScopeResolver? scopeResolver = null)
: base(server, configuration, namespaceUris: $"urn:OtOpcUa:{driver.DriverInstanceId}")
{
_driver = driver;
_readable = driver as IReadable;
_writable = driver as IWritable;
_invoker = invoker;
_authzGate = authzGate;
_scopeResolver = scopeResolver;
_logger = logger;
}
@@ -197,6 +208,20 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
try
{
var fullRef = node.NodeId.Identifier as string ?? "";
// Phase 6.2 Stream C — authorization gate. Runs ahead of the invoker so a denied
// read never hits the driver. Returns true in lax mode when identity lacks LDAP
// groups; strict mode denies those cases. See AuthorizationGate remarks.
if (_authzGate is not null && _scopeResolver is not null)
{
var scope = _scopeResolver.Resolve(fullRef);
if (!_authzGate.IsAllowed(context.UserIdentity, OpcUaOperation.Read, scope))
{
statusCode = StatusCodes.BadUserAccessDenied;
return ServiceResult.Good;
}
}
var result = _invoker.ExecuteAsync(
DriverCapability.Read,
_driver.DriverInstanceId,
@@ -390,6 +415,23 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
fullRef, classification, string.Join(",", roles));
return new ServiceResult(StatusCodes.BadUserAccessDenied);
}
// Phase 6.2 Stream C — additive gate check. The classification/role check above
// is the pre-Phase-6.2 baseline; the gate adds per-tag ACL enforcement on top. In
// lax mode (default during rollout) the gate falls through when the identity
// lacks LDAP groups, so existing integration tests keep passing.
if (_authzGate is not null && _scopeResolver is not null)
{
var scope = _scopeResolver.Resolve(fullRef!);
var writeOp = WriteAuthzPolicy.ToOpcUaOperation(classification);
if (!_authzGate.IsAllowed(context.UserIdentity, writeOp, scope))
{
_logger.LogInformation(
"Write denied by ACL gate for {FullRef}: operation={Op} classification={Classification}",
fullRef, writeOp, classification);
return new ServiceResult(StatusCodes.BadUserAccessDenied);
}
}
}
try
@@ -482,6 +524,16 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
continue;
}
if (_authzGate is not null && _scopeResolver is not null)
{
var historyScope = _scopeResolver.Resolve(fullRef);
if (!_authzGate.IsAllowed(context.UserIdentity, OpcUaOperation.HistoryRead, historyScope))
{
WriteAccessDenied(results, errors, i);
continue;
}
}
try
{
var driverResult = _invoker.ExecuteAsync(
@@ -546,6 +598,16 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
continue;
}
if (_authzGate is not null && _scopeResolver is not null)
{
var historyScope = _scopeResolver.Resolve(fullRef);
if (!_authzGate.IsAllowed(context.UserIdentity, OpcUaOperation.HistoryRead, historyScope))
{
WriteAccessDenied(results, errors, i);
continue;
}
}
try
{
var driverResult = _invoker.ExecuteAsync(
@@ -603,6 +665,16 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
continue;
}
if (_authzGate is not null && _scopeResolver is not null)
{
var historyScope = _scopeResolver.Resolve(fullRef);
if (!_authzGate.IsAllowed(context.UserIdentity, OpcUaOperation.HistoryRead, historyScope))
{
WriteAccessDenied(results, errors, i);
continue;
}
}
try
{
var driverResult = _invoker.ExecuteAsync(
@@ -660,6 +732,19 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
// "all sources in the driver's namespace" per the IHistoryProvider contract.
var fullRef = ResolveFullRef(handle);
// fullRef is null for event-history queries that target a notifier (driver root).
// Those are cluster-wide reads + need a different scope shape; skip the gate here
// and let the driver-level authz handle them. Non-null path gets per-node gating.
if (fullRef is not null && _authzGate is not null && _scopeResolver is not null)
{
var historyScope = _scopeResolver.Resolve(fullRef);
if (!_authzGate.IsAllowed(context.UserIdentity, OpcUaOperation.HistoryRead, historyScope))
{
WriteAccessDenied(results, errors, i);
continue;
}
}
try
{
var driverResult = _invoker.ExecuteAsync(
@@ -721,6 +806,12 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
errors[i] = StatusCodes.BadInternalError;
}
private static void WriteAccessDenied(IList<OpcHistoryReadResult> results, IList<ServiceResult> errors, int i)
{
results[i] = new OpcHistoryReadResult { StatusCode = StatusCodes.BadUserAccessDenied };
errors[i] = StatusCodes.BadUserAccessDenied;
}
private static void WriteNodeIdUnknown(IList<OpcHistoryReadResult> results, IList<ServiceResult> errors, int i)
{
results[i] = new OpcHistoryReadResult { StatusCode = StatusCodes.BadNodeIdUnknown };
errors[i] = StatusCodes.BadNodeIdUnknown;
}

View File

@@ -1,6 +1,7 @@
using Microsoft.Extensions.Logging;
using Opc.Ua;
using Opc.Ua.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Core.OpcUa;
using ZB.MOM.WW.OtOpcUa.Core.Resilience;
@@ -23,6 +24,9 @@ public sealed class OpcUaApplicationHost : IAsyncDisposable
private readonly DriverHost _driverHost;
private readonly IUserAuthenticator _authenticator;
private readonly DriverResiliencePipelineBuilder _pipelineBuilder;
private readonly AuthorizationGate? _authzGate;
private readonly NodeScopeResolver? _scopeResolver;
private readonly StaleConfigFlag? _staleConfigFlag;
private readonly ILoggerFactory _loggerFactory;
private readonly ILogger<OpcUaApplicationHost> _logger;
private ApplicationInstance? _application;
@@ -32,12 +36,18 @@ public sealed class OpcUaApplicationHost : IAsyncDisposable
public OpcUaApplicationHost(OpcUaServerOptions options, DriverHost driverHost,
IUserAuthenticator authenticator, ILoggerFactory loggerFactory, ILogger<OpcUaApplicationHost> logger,
DriverResiliencePipelineBuilder? pipelineBuilder = null)
DriverResiliencePipelineBuilder? pipelineBuilder = null,
AuthorizationGate? authzGate = null,
NodeScopeResolver? scopeResolver = null,
StaleConfigFlag? staleConfigFlag = null)
{
_options = options;
_driverHost = driverHost;
_authenticator = authenticator;
_pipelineBuilder = pipelineBuilder ?? new DriverResiliencePipelineBuilder();
_authzGate = authzGate;
_scopeResolver = scopeResolver;
_staleConfigFlag = staleConfigFlag;
_loggerFactory = loggerFactory;
_logger = logger;
}
@@ -64,7 +74,8 @@ public sealed class OpcUaApplicationHost : IAsyncDisposable
throw new InvalidOperationException(
$"OPC UA application certificate could not be validated or created in {_options.PkiStoreRoot}");
_server = new OtOpcUaServer(_driverHost, _authenticator, _pipelineBuilder, _loggerFactory);
_server = new OtOpcUaServer(_driverHost, _authenticator, _pipelineBuilder, _loggerFactory,
authzGate: _authzGate, scopeResolver: _scopeResolver);
await _application.Start(_server).ConfigureAwait(false);
_logger.LogInformation("OPC UA server started — endpoint={Endpoint} driverCount={Count}",
@@ -77,6 +88,7 @@ public sealed class OpcUaApplicationHost : IAsyncDisposable
_healthHost = new HealthEndpointsHost(
_driverHost,
_loggerFactory.CreateLogger<HealthEndpointsHost>(),
usingStaleConfig: _staleConfigFlag is null ? null : () => _staleConfigFlag.IsStale,
prefix: _options.HealthEndpointsPrefix);
_healthHost.Start();
}

View File

@@ -21,6 +21,8 @@ public sealed class OtOpcUaServer : StandardServer
private readonly DriverHost _driverHost;
private readonly IUserAuthenticator _authenticator;
private readonly DriverResiliencePipelineBuilder _pipelineBuilder;
private readonly AuthorizationGate? _authzGate;
private readonly NodeScopeResolver? _scopeResolver;
private readonly ILoggerFactory _loggerFactory;
private readonly List<DriverNodeManager> _driverNodeManagers = new();
@@ -28,11 +30,15 @@ public sealed class OtOpcUaServer : StandardServer
DriverHost driverHost,
IUserAuthenticator authenticator,
DriverResiliencePipelineBuilder pipelineBuilder,
ILoggerFactory loggerFactory)
ILoggerFactory loggerFactory,
AuthorizationGate? authzGate = null,
NodeScopeResolver? scopeResolver = null)
{
_driverHost = driverHost;
_authenticator = authenticator;
_pipelineBuilder = pipelineBuilder;
_authzGate = authzGate;
_scopeResolver = scopeResolver;
_loggerFactory = loggerFactory;
}
@@ -58,7 +64,8 @@ public sealed class OtOpcUaServer : StandardServer
// DriverInstance row in a follow-up PR; for now every driver gets Tier A defaults.
var options = new DriverResilienceOptions { Tier = DriverTier.A };
var invoker = new CapabilityInvoker(_pipelineBuilder, driver.DriverInstanceId, () => options, driver.DriverType);
var manager = new DriverNodeManager(server, configuration, driver, invoker, logger);
var manager = new DriverNodeManager(server, configuration, driver, invoker, logger,
authzGate: _authzGate, scopeResolver: _scopeResolver);
_driverNodeManagers.Add(manager);
}

View File

@@ -0,0 +1,100 @@
using System.Text.Json;
using Microsoft.Data.SqlClient;
using Microsoft.Extensions.Logging;
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
namespace ZB.MOM.WW.OtOpcUa.Server;
/// <summary>
/// Phase 6.1 Stream D consumption hook — bootstraps the node's current generation through
/// the <see cref="ResilientConfigReader"/> pipeline + writes every successful central-DB
/// read into the <see cref="GenerationSealedCache"/> so the next cache-miss path has a
/// sealed snapshot to fall back to.
/// </summary>
/// <remarks>
/// <para>Sits alongside the original <see cref="NodeBootstrap"/> (which uses the single-file
/// <see cref="ILocalConfigCache"/>). Program.cs can switch to this one once operators are
/// ready for the generation-sealed semantics. The original stays for backward compat
/// with the three integration tests that construct <see cref="NodeBootstrap"/> directly.</para>
///
/// <para>Closes release blocker #2 in <c>docs/v2/v2-release-readiness.md</c> — the
/// generation-sealed cache + resilient reader + stale-config flag shipped as unit-tested
/// primitives in PR #81, but no production path consumed them until this wrapper.</para>
/// </remarks>
public sealed class SealedBootstrap
{
private readonly NodeOptions _options;
private readonly GenerationSealedCache _cache;
private readonly ResilientConfigReader _reader;
private readonly StaleConfigFlag _staleFlag;
private readonly ILogger<SealedBootstrap> _logger;
public SealedBootstrap(
NodeOptions options,
GenerationSealedCache cache,
ResilientConfigReader reader,
StaleConfigFlag staleFlag,
ILogger<SealedBootstrap> logger)
{
_options = options;
_cache = cache;
_reader = reader;
_staleFlag = staleFlag;
_logger = logger;
}
/// <summary>
/// Resolve the current generation for this node. Routes the central-DB fetch through
/// <see cref="ResilientConfigReader"/> (timeout → retry → fallback-to-cache) + seals a
/// fresh snapshot on every successful DB read so a future cache-miss has something to
/// serve.
/// </summary>
public async Task<BootstrapResult> LoadCurrentGenerationAsync(CancellationToken ct)
{
return await _reader.ReadAsync(
_options.ClusterId,
centralFetch: async innerCt => await FetchFromCentralAsync(innerCt).ConfigureAwait(false),
fromSnapshot: snap => BootstrapResult.FromCache(snap.GenerationId),
ct).ConfigureAwait(false);
}
private async ValueTask<BootstrapResult> FetchFromCentralAsync(CancellationToken ct)
{
await using var conn = new SqlConnection(_options.ConfigDbConnectionString);
await conn.OpenAsync(ct).ConfigureAwait(false);
await using var cmd = conn.CreateCommand();
cmd.CommandText = "EXEC dbo.sp_GetCurrentGenerationForCluster @NodeId=@n, @ClusterId=@c";
cmd.Parameters.AddWithValue("@n", _options.NodeId);
cmd.Parameters.AddWithValue("@c", _options.ClusterId);
await using var reader = await cmd.ExecuteReaderAsync(ct).ConfigureAwait(false);
if (!await reader.ReadAsync(ct).ConfigureAwait(false))
{
_logger.LogWarning("Cluster {Cluster} has no Published generation yet", _options.ClusterId);
return BootstrapResult.EmptyFromDb();
}
var generationId = reader.GetInt64(0);
_logger.LogInformation("Bootstrapped from central DB: generation {GenerationId}; sealing snapshot", generationId);
// Seal a minimal snapshot with the generation pointer. A richer snapshot that carries
// the full sp_GetGenerationContent payload lands when the bootstrap flow grows to
// consume the content during offline operation (separate follow-up — see decision #148
// and phase-6-1 Stream D.3). The pointer alone is enough for the fallback path to
// surface the last-known-good generation id + flip UsingStaleConfig.
await _cache.SealAsync(new GenerationSnapshot
{
ClusterId = _options.ClusterId,
GenerationId = generationId,
CachedAt = DateTime.UtcNow,
PayloadJson = JsonSerializer.Serialize(new { generationId, source = "sp_GetCurrentGenerationForCluster" }),
}, ct).ConfigureAwait(false);
// StaleConfigFlag bookkeeping: ResilientConfigReader calls MarkFresh on the returning
// call path; this is the fresh branch, so the flag isn't touched here.
_ = _staleFlag; // held so the field isn't flagged unused
return BootstrapResult.FromDb(generationId);
}
}

View File

@@ -0,0 +1,47 @@
using ZB.MOM.WW.OtOpcUa.Core.Authorization;
namespace ZB.MOM.WW.OtOpcUa.Server.Security;
/// <summary>
/// Maps a driver-side full reference (e.g. <c>"TestMachine_001/Oven/SetPoint"</c>) to the
/// <see cref="NodeScope"/> the Phase 6.2 evaluator walks. Today a simplified resolver that
/// returns a cluster-scoped + tag-only scope — the deeper UnsArea / UnsLine / Equipment
/// path lookup from the live Configuration DB is a Stream C.12 follow-up.
/// </summary>
/// <remarks>
/// <para>The flat cluster-level scope is sufficient for v2 GA because Phase 6.2 ACL grants
/// at the Cluster scope cascade to every tag below (decision #129 — additive grants). The
/// finer hierarchy only matters when operators want per-area or per-equipment grants;
/// Cluster-level grants keep working in the meantime, so landing the finer resolution in a
/// follow-up doesn't regress the base security model.</para>
///
/// <para>Thread-safety: the resolver is stateless once constructed. Callers may cache a
/// single instance per DriverNodeManager without locks.</para>
/// </remarks>
public sealed class NodeScopeResolver
{
private readonly string _clusterId;
public NodeScopeResolver(string clusterId)
{
ArgumentException.ThrowIfNullOrWhiteSpace(clusterId);
_clusterId = clusterId;
}
/// <summary>
/// Resolve a node scope for the given driver-side <paramref name="fullReference"/>.
/// Phase 1 shape: returns <c>ClusterId</c> + <c>TagId = fullReference</c> only;
/// NamespaceId / UnsArea / UnsLine / Equipment stay null. A future resolver will
/// join against the Configuration DB to populate the full path.
/// </summary>
public NodeScope Resolve(string fullReference)
{
ArgumentException.ThrowIfNullOrWhiteSpace(fullReference);
return new NodeScope
{
ClusterId = _clusterId,
TagId = fullReference,
Kind = NodeHierarchyKind.Equipment,
};
}
}

View File

@@ -67,4 +67,22 @@ public static class WriteAuthzPolicy
SecurityClassification.ViewOnly => null, // IsAllowed short-circuits
_ => null,
};
/// <summary>
/// Maps a driver-reported <see cref="SecurityClassification"/> to the
/// <see cref="Core.Abstractions.OpcUaOperation"/> the Phase 6.2 evaluator consults
/// for the matching <see cref="Configuration.Enums.NodePermissions"/> bit.
/// FreeAccess + ViewOnly fall back to WriteOperate — the evaluator never sees them
/// because <see cref="IsAllowed"/> short-circuits first.
/// </summary>
public static Core.Abstractions.OpcUaOperation ToOpcUaOperation(SecurityClassification classification) =>
classification switch
{
SecurityClassification.Operate => Core.Abstractions.OpcUaOperation.WriteOperate,
SecurityClassification.SecuredWrite => Core.Abstractions.OpcUaOperation.WriteOperate,
SecurityClassification.Tune => Core.Abstractions.OpcUaOperation.WriteTune,
SecurityClassification.VerifiedWrite => Core.Abstractions.OpcUaOperation.WriteConfigure,
SecurityClassification.Configure => Core.Abstractions.OpcUaOperation.WriteConfigure,
_ => Core.Abstractions.OpcUaOperation.WriteOperate,
};
}

View File

@@ -0,0 +1,64 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Authorization;
using ZB.MOM.WW.OtOpcUa.Server.Security;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests;
[Trait("Category", "Unit")]
public sealed class NodeScopeResolverTests
{
[Fact]
public void Resolve_PopulatesClusterAndTag()
{
var resolver = new NodeScopeResolver("c-warsaw");
var scope = resolver.Resolve("TestMachine_001/Oven/SetPoint");
scope.ClusterId.ShouldBe("c-warsaw");
scope.TagId.ShouldBe("TestMachine_001/Oven/SetPoint");
scope.Kind.ShouldBe(NodeHierarchyKind.Equipment);
}
[Fact]
public void Resolve_Leaves_UnsPath_Null_For_Phase1()
{
var resolver = new NodeScopeResolver("c-1");
var scope = resolver.Resolve("tag-1");
// Phase 1 flat scope — finer resolution tracked as Stream C.12 follow-up.
scope.NamespaceId.ShouldBeNull();
scope.UnsAreaId.ShouldBeNull();
scope.UnsLineId.ShouldBeNull();
scope.EquipmentId.ShouldBeNull();
}
[Fact]
public void Resolve_Throws_OnEmptyFullReference()
{
var resolver = new NodeScopeResolver("c-1");
Should.Throw<ArgumentException>(() => resolver.Resolve(""));
Should.Throw<ArgumentException>(() => resolver.Resolve(" "));
}
[Fact]
public void Ctor_Throws_OnEmptyClusterId()
{
Should.Throw<ArgumentException>(() => new NodeScopeResolver(""));
}
[Fact]
public void Resolver_IsStateless_AcrossCalls()
{
var resolver = new NodeScopeResolver("c");
var s1 = resolver.Resolve("tag-a");
var s2 = resolver.Resolve("tag-b");
s1.TagId.ShouldBe("tag-a");
s2.TagId.ShouldBe("tag-b");
s1.ClusterId.ShouldBe("c");
s2.ClusterId.ShouldBe("c");
}
}

View File

@@ -0,0 +1,133 @@
using Microsoft.Extensions.Logging.Abstractions;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests;
/// <summary>
/// Integration-style tests for the Phase 6.1 Stream D consumption hook — they don't touch
/// SQL Server (the real SealedBootstrap does, via sp_GetCurrentGenerationForCluster), but
/// they exercise ResilientConfigReader + GenerationSealedCache + StaleConfigFlag end-to-end
/// by simulating central-DB outcomes through a direct ReadAsync call.
/// </summary>
[Trait("Category", "Integration")]
public sealed class SealedBootstrapIntegrationTests : IDisposable
{
private readonly string _root = Path.Combine(Path.GetTempPath(), $"otopcua-sealed-bootstrap-{Guid.NewGuid():N}");
public void Dispose()
{
try
{
if (!Directory.Exists(_root)) return;
foreach (var f in Directory.EnumerateFiles(_root, "*", SearchOption.AllDirectories))
File.SetAttributes(f, FileAttributes.Normal);
Directory.Delete(_root, recursive: true);
}
catch { /* best-effort */ }
}
[Fact]
public async Task CentralDbSuccess_SealsSnapshot_And_FlagFresh()
{
var cache = new GenerationSealedCache(_root);
var flag = new StaleConfigFlag();
var reader = new ResilientConfigReader(cache, flag, NullLogger<ResilientConfigReader>.Instance,
timeout: TimeSpan.FromSeconds(10));
// Simulate the SealedBootstrap fresh-path: central DB returns generation id 42; the
// bootstrap seals it + ResilientConfigReader marks the flag fresh.
var result = await reader.ReadAsync(
"c-a",
centralFetch: async _ =>
{
await cache.SealAsync(new GenerationSnapshot
{
ClusterId = "c-a",
GenerationId = 42,
CachedAt = DateTime.UtcNow,
PayloadJson = "{\"gen\":42}",
}, CancellationToken.None);
return (long?)42;
},
fromSnapshot: snap => (long?)snap.GenerationId,
CancellationToken.None);
result.ShouldBe(42);
flag.IsStale.ShouldBeFalse();
cache.TryGetCurrentGenerationId("c-a").ShouldBe(42);
}
[Fact]
public async Task CentralDbFails_FallsBackToSealedSnapshot_FlagStale()
{
var cache = new GenerationSealedCache(_root);
var flag = new StaleConfigFlag();
var reader = new ResilientConfigReader(cache, flag, NullLogger<ResilientConfigReader>.Instance,
timeout: TimeSpan.FromSeconds(10), retryCount: 0);
// Seed a prior sealed snapshot (simulating a previous successful boot).
await cache.SealAsync(new GenerationSnapshot
{
ClusterId = "c-a", GenerationId = 37, CachedAt = DateTime.UtcNow,
PayloadJson = "{\"gen\":37}",
});
// Now simulate central DB down → fallback.
var result = await reader.ReadAsync(
"c-a",
centralFetch: _ => throw new InvalidOperationException("SQL dead"),
fromSnapshot: snap => (long?)snap.GenerationId,
CancellationToken.None);
result.ShouldBe(37);
flag.IsStale.ShouldBeTrue("cache fallback flips the /healthz flag");
}
[Fact]
public async Task NoSnapshot_AndCentralDown_Throws_ClearError()
{
var cache = new GenerationSealedCache(_root);
var flag = new StaleConfigFlag();
var reader = new ResilientConfigReader(cache, flag, NullLogger<ResilientConfigReader>.Instance,
timeout: TimeSpan.FromSeconds(10), retryCount: 0);
await Should.ThrowAsync<GenerationCacheUnavailableException>(async () =>
{
await reader.ReadAsync<long?>(
"c-a",
centralFetch: _ => throw new InvalidOperationException("SQL dead"),
fromSnapshot: snap => (long?)snap.GenerationId,
CancellationToken.None);
});
}
[Fact]
public async Task SuccessfulBootstrap_AfterFailure_ClearsStaleFlag()
{
var cache = new GenerationSealedCache(_root);
var flag = new StaleConfigFlag();
var reader = new ResilientConfigReader(cache, flag, NullLogger<ResilientConfigReader>.Instance,
timeout: TimeSpan.FromSeconds(10), retryCount: 0);
await cache.SealAsync(new GenerationSnapshot
{
ClusterId = "c-a", GenerationId = 1, CachedAt = DateTime.UtcNow, PayloadJson = "{}",
});
// Fallback serves snapshot → flag goes stale.
await reader.ReadAsync("c-a",
centralFetch: _ => throw new InvalidOperationException("dead"),
fromSnapshot: s => (long?)s.GenerationId,
CancellationToken.None);
flag.IsStale.ShouldBeTrue();
// Subsequent successful bootstrap clears it.
await reader.ReadAsync("c-a",
centralFetch: _ => ValueTask.FromResult((long?)5),
fromSnapshot: s => (long?)s.GenerationId,
CancellationToken.None);
flag.IsStale.ShouldBeFalse("next successful DB round-trip clears the flag");
}
}