fix(driver-opcuaclient): resolve High code-review findings (Driver.OpcUaClient-001..-005)

Driver.OpcUaClient-001 — ReadAsync/WriteAsync/DiscoverAsync captured the
session before acquiring _gate, so a reconnect that completed while the
operation was blocked on the gate left the wire call bound to a stale,
closed session. All three now re-read Session (and parse NodeIds) inside
the _gate critical section after WaitAsync returns.
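
A minimal sketch of the re-read pattern, not the driver code itself. The
names gate, current and ReadAsync are hypothetical stand-ins for _gate,
Session and the real ReadAsync, and the "wire call" is reduced to returning
the name of the session it would be bound to:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins: `gate` plays the role of _gate, `current` the role of Session.
var gate = new SemaphoreSlim(1, 1);
string? current = "session-A";

// Returns the name of the session the wire call would be bound to.
async Task<string> ReadAsync(CancellationToken ct)
{
    // Fail fast if nothing is connected, but do not keep this reference.
    if (current is null) throw new InvalidOperationException("not initialized");

    await gate.WaitAsync(ct).ConfigureAwait(false);
    try
    {
        // Re-read inside the critical section: a reconnect may have swapped
        // `current` while this call was queued on the gate.
        var session = current;
        return session ?? "no-session";
    }
    finally
    {
        gate.Release();
    }
}

// Simulate a reconnect that completes while a reader is blocked on the gate.
await gate.WaitAsync();
var pending = ReadAsync(CancellationToken.None);
current = "session-B";   // the reconnect swaps the session
gate.Release();
var used = await pending;
Console.WriteLine(used);
```

Capturing the session before WaitAsync would have bound the call to
session-A in this scenario, which is the stale-session case the finding
describes.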

Driver.OpcUaClient-002 — OnReconnectComplete ignored the give-up (null
session) case, permanently wedging the driver with no Faulted signal and
no reconnect loop. The give-up branch now transitions HostState to
Faulted, sets a Faulted DriverHealth with an explanatory message, and
re-arms a fresh SessionReconnectHandler (TryRearmReconnect) against the
last-known session so an always-on gateway self-heals.

Driver.OpcUaClient-003 — BrowseRecursiveAsync discarded browse
continuation points, silently truncating large remote folders.
It now loops on BrowseResult.ContinuationPoint, calling BrowseNextAsync
and appending each page until the continuation point is empty.
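
The paging loop can be sketched against an in-memory stand-in for the
server. Page, Browse and BrowseNext below are hypothetical helpers, not SDK
calls, and the continuation token is simply the next offset encoded as
bytes:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory "server": at most pageSize references per response,
// plus an opaque continuation token (empty when nothing is left).
var children = Enumerable.Range(1, 7).Select(i => $"Tag{i}").ToList();
const int pageSize = 3;

(List<string> Refs, byte[] Continuation) Page(int offset)
{
    var refs = children.Skip(offset).Take(pageSize).ToList();
    var nextOffset = offset + refs.Count;
    var token = nextOffset < children.Count
        ? BitConverter.GetBytes(nextOffset)
        : Array.Empty<byte>();
    return (refs, token);
}

(List<string> Refs, byte[] Continuation) Browse() => Page(0);
(List<string> Refs, byte[] Continuation) BrowseNext(byte[] token) =>
    Page(BitConverter.ToInt32(token, 0));

// First page, then keep pulling until the continuation token comes back empty.
var first = Browse();
var all = new List<string>(first.Refs);
var continuation = first.Continuation;
while (continuation is { Length: > 0 })
{
    var page = BrowseNext(continuation);
    all.AddRange(page.Refs);
    continuation = page.Continuation;
}
Console.WriteLine(string.Join(",", all));
```

Reading only the first page here would return three of the seven
references, which is the silent truncation the finding describes.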

Driver.OpcUaClient-004 — the namespace handling required by
driver-specs.md §8 was absent. Added NamespaceMap (built from
session.NamespaceUris at connect, rebuilt on reconnect). Discovered
NodeIds are now persisted in the server-stable nsu=<uri>;... form, and
reads/writes re-resolve that form against the current session so a
remote namespace-table reorder no longer misaddresses nodes. Added the
TargetNamespaceKind option + UnsMappingTable and ValidateNamespaceKind
startup enforcement.
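
To illustrate why the URI form survives a reorder, here is a sketch that
uses plain string arrays as namespace tables. ToStable and TryResolve are
hypothetical helpers written for this example (the driver delegates to the
SDK's NodeId parsing instead), and the URIs are made up. The naive parse
assumes the URI contains no ';':

```csharp
using System;

// Hypothetical helper: render ns=<index> as nsu=<uri>; namespace 0 stays compact.
string ToStable(string[] table, int nsIndex, string identifier) =>
    nsIndex == 0 ? identifier : $"nsu={table[nsIndex]};{identifier}";

// Hypothetical helper: re-bind an nsu= reference against the *current* table.
bool TryResolve(string[] table, string reference, out string resolved)
{
    resolved = "";
    if (!reference.StartsWith("nsu=", StringComparison.Ordinal))
    {
        resolved = reference;   // plain ns=N;... or i=... form, used as-is
        return true;
    }
    var sep = reference.IndexOf(';');
    if (sep < 0) return false;                      // malformed: no identifier part
    var uri = reference.Substring(4, sep - 4);
    var index = Array.IndexOf(table, uri);
    if (index < 0) return false;                    // URI unknown to this session
    resolved = $"ns={index};{reference.Substring(sep + 1)}";
    return true;
}

// Made-up namespace tables: the server swaps indices 1 and 2 across a restart.
var atConnect = new[] { "http://opcfoundation.org/UA/", "urn:plant:line1", "urn:plant:line2" };
var afterRestart = new[] { "http://opcfoundation.org/UA/", "urn:plant:line2", "urn:plant:line1" };

var stable = ToStable(atConnect, 1, "s=Pump.Speed");
var ok = TryResolve(afterRestart, stable, out var rebound);
Console.WriteLine($"{stable} -> {rebound}");
```

A stored ns=1;s=Pump.Speed would address urn:plant:line2 after the restart;
the nsu= form resolves to index 2, where urn:plant:line1 now lives.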

Driver.OpcUaClient-005 — OnKeepAlive read/wrote _reconnectHandler
without a lock, racing the SDK keep-alive timer thread and leaking
handlers. The check-and-set in OnKeepAlive, the take-and-clear in
ShutdownAsync, and the dispose/re-arm in OnReconnectComplete now all
run inside the _probeLock critical section.
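
A sketch of the locked check-and-set, with hypothetical stand-ins: handler
plays the role of _reconnectHandler, probeLock the role of _probeLock, and
constructed counts how many handlers were created when many callbacks fire
at once:

```csharp
using System;
using System.Threading.Tasks;

var probeLock = new object();
object? handler = null;   // stand-in for _reconnectHandler
var constructed = 0;      // number of handlers created

void OnKeepAlive()
{
    lock (probeLock)
    {
        // Check and set form one atomic step, so only the first caller arms a handler.
        if (handler is not null) return;
        handler = new object();
        constructed++;
    }
    // The real driver starts the reconnect here, outside the lock.
}

// Simulate a burst of keep-alive callbacks arriving on different threads.
Parallel.For(0, 64, _ => OnKeepAlive());
Console.WriteLine(constructed);
```

Without the lock, two callbacks can both observe null and each construct a
handler, and the second assignment orphans the first.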

Adds OpcUaClientNamespaceTests (11 xUnit + Shouldly regression tests)
covering ValidateNamespaceKind and the NamespaceMap stable encoding.
Reconnect/browse wire paths remain fixture-gated per finding -015.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-22 06:38:20 -04:00
parent 090d2a4b44
commit ebc0511c72
7 changed files with 638 additions and 99 deletions

@@ -0,0 +1,138 @@
using Opc.Ua;
using Opc.Ua.Client;
namespace ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient;
/// <summary>
/// Bidirectional namespace map for the OPC UA Client (gateway) driver, built at connect
/// time from <c>session.NamespaceUris</c> per <c>docs/v2/driver-specs.md</c> §8
/// "Namespace Remapping".
/// </summary>
/// <remarks>
/// <para>
/// The session-relative namespace index embedded in an <c>ns=N;…</c> NodeId string is
/// <b>not</b> stable: the OPC UA spec permits a server to reorder its namespace table
/// across a restart. A driver that stores raw <c>ns=N</c> references and re-parses
/// them verbatim for reads/writes will, after such a reorder, silently address the
/// wrong namespace.
/// </para>
/// <para>
/// This map captures the upstream namespace table as it was at connect time and lets
/// the driver persist NodeIds in a <b>server-stable</b> form — the namespace
/// <i>URI</i> plus the identifier — via <see cref="ToStableReference"/>. At
/// read/write time <see cref="TryResolve"/> re-binds that stable form against the
/// <i>current</i> session's namespace table, so a table reorder is transparently
/// corrected instead of silently misaddressing nodes.
/// </para>
/// <para>
/// Stable references use the form <c>nsu=&lt;uri&gt;;&lt;idType&gt;=&lt;identifier&gt;</c>,
/// which is the standard OPC UA namespace-URI NodeId encoding the SDK already
/// understands. References that are already in plain <c>ns=N;…</c> form (e.g. a
/// hand-entered config tag) still resolve — <see cref="TryResolve"/> falls back to a
/// direct parse against the current session.
/// </para>
/// </remarks>
internal sealed class NamespaceMap
{
// index -> URI and URI -> index, as the upstream server published them at connect time.
private readonly string[] _uris;
private readonly Dictionary<string, ushort> _uriToIndex;
private NamespaceMap(string[] uris)
{
_uris = uris;
_uriToIndex = new Dictionary<string, ushort>(uris.Length, StringComparer.Ordinal);
for (var i = 0; i < uris.Length; i++)
_uriToIndex[uris[i]] = (ushort)i;
}
/// <summary>Snapshot the namespace table from a live session.</summary>
public static NamespaceMap FromSession(ISession session)
{
ArgumentNullException.ThrowIfNull(session);
return FromTable(session.NamespaceUris);
}
/// <summary>
/// Snapshot a <see cref="NamespaceTable"/> directly. Separated from
/// <see cref="FromSession"/> so the URI encoding can be exercised without standing up
/// a live <see cref="ISession"/>.
/// </summary>
public static NamespaceMap FromTable(NamespaceTable namespaceUris)
{
ArgumentNullException.ThrowIfNull(namespaceUris);
return new NamespaceMap(namespaceUris.ToArray());
}
/// <summary>Number of namespaces captured. Index 0 is always the OPC UA core namespace.</summary>
public int Count => _uris.Length;
/// <summary>The namespace URI at the given index, or null if out of range.</summary>
public string? UriForIndex(int index) =>
index >= 0 && index < _uris.Length ? _uris[index] : null;
/// <summary>The index for a namespace URI, or null if the URI is not in the table.</summary>
public ushort? IndexForUri(string uri) =>
_uriToIndex.TryGetValue(uri, out var idx) ? idx : null;
/// <summary>
/// Render a NodeId resolved against this map's session as a <b>server-stable</b>
/// reference string — namespace URI plus identifier, in the SDK's <c>nsu=…</c>
/// encoding. The OPC UA core namespace (index 0) keeps the compact <c>i=…</c>/<c>ns=0</c>
/// form since URI 0 never moves. Used to persist a discovered NodeId into the local
/// address space so it survives a remote namespace-table reorder.
/// </summary>
public string ToStableReference(NodeId nodeId)
{
ArgumentNullException.ThrowIfNull(nodeId);
// Namespace 0 is fixed by the spec — the compact form is already stable.
if (nodeId.NamespaceIndex == 0)
return nodeId.ToString() ?? string.Empty;
var uri = UriForIndex(nodeId.NamespaceIndex);
if (uri is null)
// Namespace index not in the captured table — fall back to the raw form rather
// than throwing; the read/write path will surface BadNodeIdInvalid if it truly
// can't resolve.
return nodeId.ToString() ?? string.Empty;
// nsu=<uri>;<idType>=<identifier> is the standard namespace-URI NodeId encoding.
return $"nsu={uri};{IdentifierPart(nodeId)}";
}
/// <summary>
/// Resolve a reference string — either a stable <c>nsu=…</c> reference produced by
/// <see cref="ToStableReference"/> or a plain <c>ns=N;…</c> NodeId — against the
/// <paramref name="currentSession"/>'s namespace table. A <c>nsu=…</c> reference is
/// re-bound through the current session's URI table so a remote reorder since connect
/// time is corrected. Returns false for empty/malformed input or an unknown URI.
/// </summary>
public static bool TryResolve(ISession currentSession, string reference, out NodeId nodeId)
{
nodeId = NodeId.Null;
if (currentSession is null || string.IsNullOrWhiteSpace(reference)) return false;
try
{
// NodeId.Parse with a NamespaceUriString form (nsu=…) re-maps the URI against
// the supplied context's NamespaceUris — i.e. the *current* session table — so a
// reorder since the reference was captured resolves correctly. Plain ns=N forms
// resolve directly.
nodeId = NodeId.Parse(currentSession.MessageContext, reference);
return !NodeId.IsNull(nodeId);
}
catch
{
return false;
}
}
/// <summary>Render just the identifier portion (<c>s=…</c>, <c>i=…</c>, <c>g=…</c>, <c>b=…</c>) of a NodeId.</summary>
private static string IdentifierPart(NodeId nodeId) => nodeId.IdType switch
{
IdType.Numeric => $"i={nodeId.Identifier}",
IdType.String => $"s={nodeId.Identifier}",
IdType.Guid => $"g={nodeId.Identifier}",
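// A null opaque identifier is rendered as an empty byte string instead of throwing.
IdType.Opaque => $"b={Convert.ToBase64String((nodeId.Identifier as byte[]) ?? Array.Empty<byte>())}",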
_ => $"s={nodeId.Identifier}",
};
}

@@ -71,10 +71,22 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
/// <summary>
/// SDK-provided reconnect handler that owns the retry loop + session-transfer machinery
/// when the session's keep-alive channel reports a bad status. Null outside the
/// reconnecting window; constructed lazily inside the keep-alive handler.
/// reconnecting window; constructed lazily inside the keep-alive handler. Guarded by
/// <see cref="_probeLock"/> — keep-alive callbacks fire from the SDK timer thread and
/// can race a check-then-set if left unsynchronized (Driver.OpcUaClient-005).
/// </summary>
private SessionReconnectHandler? _reconnectHandler;
/// <summary>
/// Bidirectional namespace map built at connect time from <c>session.NamespaceUris</c>.
/// Stored NodeIds embed the server-stable namespace <b>URI</b> rather than the
/// session-relative <c>ns=N</c> index, so a remote-server namespace-table reorder
/// across a restart does not silently re-point stored references at the wrong
/// namespace (driver-specs.md §8 "Namespace Remapping", finding Driver.OpcUaClient-004).
/// Null until <see cref="InitializeAsync"/> returns cleanly.
/// </summary>
private NamespaceMap? _namespaceMap;
public string DriverInstanceId => driverInstanceId;
public string DriverType => "OpcUaClient";
@@ -83,6 +95,11 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
_health = new DriverHealth(DriverState.Initializing, null, null);
try
{
// Enforce the Equipment-vs-SystemPlatform choice at startup per driver-specs.md
// §8 "Namespace Assignment" — a misconfigured remote fails draft validation here,
// not as a runtime surprise.
ValidateNamespaceKind(_options);
var appConfig = await BuildApplicationConfigurationAsync(cancellationToken).ConfigureAwait(false);
var candidates = ResolveEndpointCandidates(_options);
@@ -126,6 +143,12 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
_keepAliveHandler = OnKeepAlive;
session.KeepAlive += _keepAliveHandler;
// Build the bidirectional namespace map from the freshly negotiated session's
// NamespaceUris (driver-specs.md §8 "Namespace Remapping"). Stored NodeIds carry
// the namespace URI, not the session-relative ns=N index, so a remote namespace
// reorder across a restart can't silently misaddress nodes.
_namespaceMap = NamespaceMap.FromSession(session);
Session = session;
_connectedEndpointUrl = connectedUrl;
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
@@ -235,6 +258,41 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
return [opts.EndpointUrl];
}
/// <summary>
/// Enforce the §8 "Namespace Assignment" rule at startup. An <c>Equipment</c>-kind
/// instance gateways raw equipment data and therefore needs a config-driven UNS
/// mapping table (remote nodes don't conform to UNS); a <c>SystemPlatform</c>-kind
/// instance gateways processed data whose hierarchy is preserved verbatim, so a
/// UNS mapping table is meaningless and rejected. Throwing here surfaces the
/// misconfiguration as a draft-validation failure rather than a runtime surprise.
/// </summary>
internal static void ValidateNamespaceKind(OpcUaClientDriverOptions opts)
{
switch (opts.TargetNamespaceKind)
{
case OpcUaTargetNamespaceKind.Equipment:
if (opts.UnsMappingTable is null || opts.UnsMappingTable.Count == 0)
throw new InvalidOperationException(
"OpcUaClient driver configured with TargetNamespaceKind=Equipment but no " +
"UnsMappingTable: §8 requires a config-driven remote-to-UNS mapping table " +
"because remote nodes do not conform to UNS by default. Provide a mapping " +
"table or set TargetNamespaceKind=SystemPlatform if the remote exposes " +
"processed data.");
break;
case OpcUaTargetNamespaceKind.SystemPlatform:
if (opts.UnsMappingTable is { Count: > 0 })
throw new InvalidOperationException(
"OpcUaClient driver configured with TargetNamespaceKind=SystemPlatform but " +
"a UnsMappingTable was supplied: processed data preserves its own hierarchy " +
"and a UNS mapping table is ambiguous here. Clear the mapping table or set " +
"TargetNamespaceKind=Equipment if the remote exposes raw equipment data.");
break;
default:
throw new ArgumentOutOfRangeException(
nameof(opts), opts.TargetNamespaceKind, "Unknown TargetNamespaceKind.");
}
}
/// <summary>
/// Build the user-identity token from the driver options. Split out of
/// <see cref="InitializeAsync"/> so the failover sweep reuses one identity across
@@ -411,10 +469,17 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
// Abort any in-flight reconnect attempts before touching the session — BeginReconnect's
// retry loop holds a reference to the current session and would fight Session.CloseAsync
// if left spinning.
try { _reconnectHandler?.CancelReconnect(); } catch { }
_reconnectHandler?.Dispose();
_reconnectHandler = null;
// if left spinning. Take the handler under _probeLock so a keep-alive callback racing
// through OnKeepAlive can't arm a fresh handler after we've torn this one down
// (Driver.OpcUaClient-005).
SessionReconnectHandler? handlerToCancel;
lock (_probeLock)
{
handlerToCancel = _reconnectHandler;
_reconnectHandler = null;
}
try { handlerToCancel?.CancelReconnect(); } catch { }
handlerToCancel?.Dispose();
if (_keepAliveHandler is not null && Session is not null)
{
@@ -426,6 +491,7 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
catch { /* best-effort */ }
try { Session?.Dispose(); } catch { }
Session = null;
_namespaceMap = null;
_connectedEndpointUrl = null;
TransitionTo(HostState.Unknown);
@@ -441,31 +507,46 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
public async Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
{
var session = RequireSession();
// Make sure a session exists before queuing on the gate, but do NOT bind the wire
// call to this reference — a reconnect can swap Session while we wait on _gate. The
// session actually used is re-read inside the gate (Driver.OpcUaClient-001/-006).
_ = RequireSession();
var results = new DataValueSnapshot[fullReferences.Count];
var now = DateTime.UtcNow;
// Parse NodeIds up-front. Tags whose reference doesn't parse get BadNodeIdInvalid
// and are omitted from the wire request — saves a round-trip against the upstream
// server for a fault the driver can detect locally.
var toSend = new ReadValueIdCollection();
var indexMap = new List<int>(fullReferences.Count); // maps wire-index -> results-index
for (var i = 0; i < fullReferences.Count; i++)
{
if (!TryParseNodeId(session, fullReferences[i], out var nodeId))
{
results[i] = new DataValueSnapshot(null, StatusBadNodeIdInvalid, null, now);
continue;
}
toSend.Add(new ReadValueId { NodeId = nodeId, AttributeId = Attributes.Value });
indexMap.Add(i);
}
if (toSend.Count == 0) return results;
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
// Re-read Session inside the critical section: if a reconnect completed while we
// were blocked on _gate, OnReconnectComplete has already swapped in the new
// session. NodeId parsing is namespace-relative, so it must also use the current
// session's namespace table.
var session = Session;
if (session is null)
{
for (var i = 0; i < fullReferences.Count; i++)
results[i] = new DataValueSnapshot(null, StatusBadCommunicationError, null, now);
return results;
}
// Parse NodeIds against the live session. Tags whose reference doesn't parse get
// BadNodeIdInvalid and are omitted from the wire request — saves a round-trip for
// a fault the driver can detect locally.
var toSend = new ReadValueIdCollection();
var indexMap = new List<int>(fullReferences.Count); // maps wire-index -> results-index
for (var i = 0; i < fullReferences.Count; i++)
{
if (!TryParseNodeId(session, fullReferences[i], out var nodeId))
{
results[i] = new DataValueSnapshot(null, StatusBadNodeIdInvalid, null, now);
continue;
}
toSend.Add(new ReadValueId { NodeId = nodeId, AttributeId = Attributes.Value });
indexMap.Add(i);
}
if (toSend.Count == 0) return results;
try
{
var resp = await session.ReadAsync(
@@ -514,32 +595,45 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
public async Task<IReadOnlyList<WriteResult>> WriteAsync(
IReadOnlyList<Core.Abstractions.WriteRequest> writes, CancellationToken cancellationToken)
{
var session = RequireSession();
// See ReadAsync — the wire call must use the session current inside the gate, not a
// reference captured before WaitAsync (Driver.OpcUaClient-001/-006).
_ = RequireSession();
var results = new WriteResult[writes.Count];
var toSend = new WriteValueCollection();
var indexMap = new List<int>(writes.Count);
for (var i = 0; i < writes.Count; i++)
{
if (!TryParseNodeId(session, writes[i].FullReference, out var nodeId))
{
results[i] = new WriteResult(StatusBadNodeIdInvalid);
continue;
}
toSend.Add(new WriteValue
{
NodeId = nodeId,
AttributeId = Attributes.Value,
Value = new DataValue(new Variant(writes[i].Value)),
});
indexMap.Add(i);
}
if (toSend.Count == 0) return results;
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
var session = Session;
if (session is null)
{
// Writes are non-idempotent (decision #44/#45) — but here the request never
// reached the wire, so BadCommunicationError ("definitely did not happen") is
// the honest code.
for (var i = 0; i < writes.Count; i++)
results[i] = new WriteResult(StatusBadCommunicationError);
return results;
}
var toSend = new WriteValueCollection();
var indexMap = new List<int>(writes.Count);
for (var i = 0; i < writes.Count; i++)
{
if (!TryParseNodeId(session, writes[i].FullReference, out var nodeId))
{
results[i] = new WriteResult(StatusBadNodeIdInvalid);
continue;
}
toSend.Add(new WriteValue
{
NodeId = nodeId,
AttributeId = Attributes.Value,
Value = new DataValue(new Variant(writes[i].Value)),
});
indexMap.Add(i);
}
if (toSend.Count == 0) return results;
try
{
var resp = await session.WriteAsync(
@@ -568,25 +662,26 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
}
/// <summary>
/// Parse a tag's full-reference string as a NodeId. Accepts the standard OPC UA
/// serialized forms (<c>ns=2;s=…</c>, <c>i=2253</c>, <c>ns=4;g=…</c>, <c>ns=3;b=…</c>).
/// Empty + malformed strings return false; the driver surfaces that as
/// <see cref="StatusBadNodeIdInvalid"/> without a wire round-trip.
/// Parse a tag's full-reference string as a NodeId, resolved against the
/// <paramref name="session"/>'s <i>current</i> namespace table. Accepts both the
/// server-stable <c>nsu=&lt;uri&gt;;…</c> form the driver persists (see
/// <see cref="NamespaceMap.ToStableReference"/>) and plain OPC UA serialized forms
/// (<c>ns=2;s=…</c>, <c>i=2253</c>, <c>ns=4;g=…</c>, <c>ns=3;b=…</c>). Resolving the
/// <c>nsu=…</c> form against the current session re-binds it through that session's
/// URI table, so a remote namespace-table reorder across a restart is transparently
/// corrected (driver-specs.md §8). Empty or malformed strings return false; the driver
/// surfaces that as <see cref="StatusBadNodeIdInvalid"/> without a wire round-trip.
/// </summary>
internal static bool TryParseNodeId(ISession session, string fullReference, out NodeId nodeId)
{
nodeId = NodeId.Null;
if (string.IsNullOrWhiteSpace(fullReference)) return false;
try
{
nodeId = NodeId.Parse(session.MessageContext, fullReference);
return !NodeId.IsNull(nodeId);
}
catch
{
return false;
}
}
internal static bool TryParseNodeId(ISession session, string fullReference, out NodeId nodeId) =>
NamespaceMap.TryResolve(session, fullReference, out nodeId);
/// <summary>
/// Render a discovered NodeId in the server-stable form persisted into the local
/// address space. Falls back to the raw serialized NodeId if the namespace map is not
/// yet built (it always is by the time <see cref="DiscoverAsync"/> runs).
/// </summary>
private string StableReference(NodeId nodeId) =>
_namespaceMap?.ToStableReference(nodeId) ?? nodeId.ToString() ?? string.Empty;
private ISession RequireSession() =>
Session ?? throw new InvalidOperationException("OpcUaClientDriver not initialized");
@@ -596,11 +691,10 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
public async Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(builder);
var session = RequireSession();
var root = !string.IsNullOrEmpty(_options.BrowseRoot)
? NodeId.Parse(session.MessageContext, _options.BrowseRoot)
: ObjectIds.ObjectsFolder;
// Confirm a session exists before queuing; the session actually browsed is re-read
// inside the gate so a reconnect mid-wait can't leave us browsing a closed session
// (Driver.OpcUaClient-001/-006).
_ = RequireSession();
var rootFolder = builder.Folder("Remote", "Remote");
var visited = new HashSet<NodeId>();
@@ -610,6 +704,14 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
await _gate.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
var session = Session
?? throw new InvalidOperationException(
"OpcUaClient session was lost before discovery could browse the remote server.");
var root = !string.IsNullOrEmpty(_options.BrowseRoot)
? NodeId.Parse(session.MessageContext, _options.BrowseRoot)
: ObjectIds.ObjectsFolder;
// Pass 1: browse hierarchy + create folders inline, collect variables into a
// pending list. Defers variable registration until attributes are resolved — the
// address-space builder's Variable call is the one-way commit, so doing it only
@@ -664,15 +766,41 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
}
};
BrowseResponse resp;
ReferenceDescriptionCollection refs;
try
{
resp = await session.BrowseAsync(
var resp = await session.BrowseAsync(
requestHeader: null,
view: null,
requestedMaxReferencesPerNode: 0,
nodesToBrowse: browseDescriptions,
ct: ct).ConfigureAwait(false);
if (resp.Results.Count == 0) return;
var result = resp.Results[0];
refs = result.References;
// Follow browse continuation points. OPC UA servers cap the references returned
// per node in a single response; when a folder has more children than the cap,
// BrowseResult.ContinuationPoint is non-empty and the remainder must be pulled
// with BrowseNext. Without this loop a large remote folder is silently truncated
// and discovered tags go missing from the local address space
// (Driver.OpcUaClient-003).
var continuationPoint = result.ContinuationPoint;
while (continuationPoint is { Length: > 0 })
{
var next = await session.BrowseNextAsync(
requestHeader: null,
releaseContinuationPoints: false,
continuationPoints: [continuationPoint],
ct: ct).ConfigureAwait(false);
if (next.Results.Count == 0) break;
var nextResult = next.Results[0];
if (nextResult.References is { Count: > 0 })
refs.AddRange(nextResult.References);
continuationPoint = nextResult.ContinuationPoint;
}
}
catch
{
@@ -682,9 +810,6 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
return;
}
if (resp.Results.Count == 0) return;
var refs = resp.Results[0].References;
foreach (var rf in refs)
{
if (discovered() >= _options.MaxDiscoveredNodes) break;
@@ -787,7 +912,7 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
var historizing = StatusCode.IsGood(histDv.StatusCode) && histDv.Value is bool b && b;
pv.ParentFolder.Variable(pv.BrowseName, pv.DisplayName, new DriverAttributeInfo(
FullName: pv.NodeId.ToString() ?? string.Empty,
FullName: StableReference(pv.NodeId),
DriverDataType: dataType,
IsArray: isArray,
ArrayDim: null,
@@ -799,7 +924,7 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
void RegisterFallback(PendingVariable pv)
{
pv.ParentFolder.Variable(pv.BrowseName, pv.DisplayName, new DriverAttributeInfo(
FullName: pv.NodeId.ToString() ?? string.Empty,
FullName: StableReference(pv.NodeId),
DriverDataType: DriverDataType.Int32,
IsArray: false,
ArrayDim: null,
@@ -1304,15 +1429,25 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
TransitionTo(HostState.Stopped);
// Kick off the SDK's reconnect loop exactly once per drop. The handler handles its
// own retry cadence via ReconnectPeriod; we tear it down in OnReconnectComplete.
if (_reconnectHandler is not null) return;
// Kick off the SDK's reconnect loop exactly once per drop. Keep-alive callbacks fire
// from the SDK keep-alive timer thread and the SDK can fire this handler repeatedly
// while the channel stays down — the check-then-set must be atomic, otherwise two
// callbacks both observe null, both construct a SessionReconnectHandler, and the
// second assignment leaks the first (its retry loop keeps running, unreferenced and
// never disposed). Guard with _probeLock (Driver.OpcUaClient-005).
SessionReconnectHandler handler;
lock (_probeLock)
{
if (_reconnectHandler is not null || _disposed) return;
handler = new SessionReconnectHandler(telemetry: null!,
reconnectAbort: false,
maxReconnectPeriod: (int)TimeSpan.FromMinutes(2).TotalMilliseconds);
_reconnectHandler = handler;
}
_reconnectHandler = new SessionReconnectHandler(telemetry: null!,
reconnectAbort: false,
maxReconnectPeriod: (int)TimeSpan.FromMinutes(2).TotalMilliseconds);
var state = _reconnectHandler.BeginReconnect(
// BeginReconnect is started outside the lock — it does no _reconnectHandler mutation
// and we don't want to hold _probeLock across an SDK call.
handler.BeginReconnect(
sender,
(int)_options.ReconnectPeriod.TotalMilliseconds,
OnReconnectComplete);
@@ -1345,16 +1480,88 @@ public sealed class OpcUaClientDriver(OpcUaClientDriverOptions options, string d
}
Session = newSession;
_reconnectHandler?.Dispose();
_reconnectHandler = null;
// Whether the reconnect actually succeeded depends on whether the session is
// non-null + connected. When it succeeded, flip back to Running so downstream
// consumers see recovery.
// Retire the handler that just finished. Done under _probeLock so this can't race
// OnKeepAlive arming a fresh handler for a subsequent drop (Driver.OpcUaClient-005).
lock (_probeLock)
{
if (ReferenceEquals(_reconnectHandler, handler))
{
_reconnectHandler.Dispose();
_reconnectHandler = null;
}
}
if (newSession is not null)
{
// Reconnect succeeded. Rebuild the namespace map from the *new* session: the
// remote server may have reordered its namespace table across the restart that
// caused the drop (driver-specs.md §8). Stable nsu= references stored in the
// address space re-resolve correctly against this fresh map.
_namespaceMap = NamespaceMap.FromSession(newSession);
TransitionTo(HostState.Running);
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
return;
}
// The reconnect handler gave up — its retry loop exhausted the 2-minute
// maxReconnectPeriod and invoked the callback with a null Session. Without an
// explicit Faulted signal the driver is permanently wedged: no session, no live
// keep-alive to re-trigger OnKeepAlive, and the Core never learns it must offer an
// operator reinitialize (Driver.OpcUaClient-002). Surface Faulted so the Core fans
// out Bad quality and ReinitializeAsync becomes available, and arm a fresh reconnect
// attempt against the last-known session for an always-on gateway rather than
// abandoning recovery entirely.
TransitionTo(HostState.Faulted);
_health = new DriverHealth(
DriverState.Faulted, _health.LastSuccessfulRead,
"OPC UA session reconnect exhausted its retry window without recovering. " +
"The remote server is unreachable; reinitialize the driver once it is back.");
if (oldSession is not null && !_disposed)
TryRearmReconnect(handler, oldSession);
}
/// <summary>
/// Arm a fresh reconnect attempt after a previous handler gave up. The OPC UA Client
/// driver gateways an always-on remote server, so abandoning recovery permanently is
/// the wrong default — a new <see cref="SessionReconnectHandler"/> keeps retrying so
/// the driver self-heals when the remote returns, while the Faulted health set by the
/// caller still lets an operator force a clean reinitialize in the meantime.
/// </summary>
private void TryRearmReconnect(SessionReconnectHandler exhausted, ISession lastSession)
{
SessionReconnectHandler handler;
lock (_probeLock)
{
// Only re-arm if no other handler took over and we aren't shutting down.
if (_disposed || (_reconnectHandler is not null && !ReferenceEquals(_reconnectHandler, exhausted)))
return;
handler = new SessionReconnectHandler(telemetry: null!,
reconnectAbort: false,
maxReconnectPeriod: (int)TimeSpan.FromMinutes(2).TotalMilliseconds);
_reconnectHandler = handler;
}
try
{
handler.BeginReconnect(
lastSession,
(int)_options.ReconnectPeriod.TotalMilliseconds,
OnReconnectComplete);
}
catch
{
// If the SDK refuses to re-arm (e.g. the last session is fully torn down), drop
// the handler so a later operator ReinitializeAsync isn't blocked by a stale one.
lock (_probeLock)
{
if (ReferenceEquals(_reconnectHandler, handler))
{
_reconnectHandler.Dispose();
_reconnectHandler = null;
}
}
}
}

@@ -134,6 +134,55 @@ public sealed class OpcUaClientDriverOptions
/// browse forever.
/// </summary>
public int MaxBrowseDepth { get; init; } = 10;
/// <summary>
/// The namespace kind this driver instance gateways into. Per
/// <c>docs/v2/driver-specs.md</c> §8 "Namespace Assignment", the OPC UA Client driver
/// is the only driver that supports <b>either</b> kind, decided per instance:
/// <list type="bullet">
/// <item><see cref="OpcUaTargetNamespaceKind.Equipment"/> — gatewaying a remote
/// server that exposes raw equipment data; the driver remaps remote browse paths
/// to UNS.</item>
/// <item><see cref="OpcUaTargetNamespaceKind.SystemPlatform"/> — gatewaying a
/// remote server that exposes processed/derived data; hierarchy is preserved via
/// the remote browse path with no UNS conversion.</item>
/// </list>
/// The driver enforces the choice at startup — see
/// <see cref="OpcUaClientDriver.ValidateNamespaceKind"/> — so a misconfiguration fails
/// draft validation rather than surfacing as a runtime surprise.
/// </summary>
public OpcUaTargetNamespaceKind TargetNamespaceKind { get; init; } = OpcUaTargetNamespaceKind.Equipment;
/// <summary>
/// Optional config-driven remote-to-UNS mapping table. Required when
/// <see cref="TargetNamespaceKind"/> is <see cref="OpcUaTargetNamespaceKind.Equipment"/>:
/// §8 mandates that a remote server gatewayed as Equipment carry a mapping table
/// because remote nodes do not conform to UNS by default. Keys are remote browse-path
/// prefixes; values are the UNS Area/Line/Name path the matching subtree maps onto.
/// For <see cref="OpcUaTargetNamespaceKind.SystemPlatform"/> the table must be empty —
/// processed data preserves its own hierarchy and a mapping table would be ambiguous.
/// </summary>
public IReadOnlyDictionary<string, string> UnsMappingTable { get; init; }
= new Dictionary<string, string>();
}
/// <summary>
/// The namespace kind an OPC UA Client driver instance gateways its remote server into.
/// See <c>docs/v2/driver-specs.md</c> §8 "Namespace Assignment".
/// </summary>
public enum OpcUaTargetNamespaceKind
{
/// <summary>
/// Remote server exposes raw equipment data; remote browse paths are remapped to UNS
/// via <see cref="OpcUaClientDriverOptions.UnsMappingTable"/>.
/// </summary>
Equipment,
/// <summary>
/// Remote server exposes processed/derived data; hierarchy is preserved via the
/// remote browse path with no UNS conversion.
/// </summary>
SystemPlatform,
}
/// <summary>OPC UA message security mode.</summary>