fix(data-connection-layer): resolve DataConnectionLayer-014..017 — real logger for OPC UA client, initial-connect failover, accurate subscribe response, per-tag write-batch results

This commit is contained in:
Joseph Doherty
2026-05-17 03:18:24 -04:00
parent 3d3f43229f
commit 14ba5495d1
7 changed files with 408 additions and 66 deletions

View File

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-17 |
| Reviewer | claude-agent |
| Commit reviewed | `39d737e` |
| Open findings | 4 |
| Open findings | 0 |
## Summary
@@ -688,7 +688,7 @@ guard), and it passes against the atomic fix.
|--|--|
| Severity | High |
| Category | Security |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.DataConnectionLayer/Adapters/RealOpcUaClient.cs:325`, `src/ScadaLink.DataConnectionLayer/Adapters/RealOpcUaClient.cs:35-39,79-83` |
**Description**
@@ -724,7 +724,18 @@ the default is `false`.
**Resolution**
_Unresolved._
Resolved 2026-05-17 (commit pending). `RealOpcUaClientFactory` gained an
`ILoggerFactory` constructor parameter and `Create()` now passes
`_loggerFactory.CreateLogger<RealOpcUaClient>()` into every `RealOpcUaClient` it builds;
`DataConnectionFactory` (which already holds an `ILoggerFactory`) now constructs
`RealOpcUaClientFactory(globalOptions, _loggerFactory)`, so the DCL-012 auto-accept-cert
warning reaches a real logger in production instead of being discarded by `NullLogger`.
The parameterless / single-arg factory constructors still default to `NullLoggerFactory`
so existing callers are unaffected. Regression tests
`DCL014_RealOpcUaClientFactory_CreatesClientWithRealLogger` and
`DCL014_DataConnectionFactory_ThreadsLoggerToRealOpcUaClient` fail against the pre-fix
code (the first fails to compile — no logger ctor; the second observes a `NullLogger`)
and pass after.
### DataConnectionLayer-015 — Initial-connect failures never trigger failover; an unreachable primary at startup never tries the backup
@@ -732,7 +743,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:404-417`, `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:419-493` |
**Description**
@@ -769,7 +780,16 @@ backup".
**Resolution**
_Unresolved._
Resolved 2026-05-17 (commit pending). The endpoint-switch failover logic was extracted
from `HandleReconnectResult` into a shared `CountFailureAndMaybeFailover` helper, and
`HandleConnectResult` (the initial-connect handler in the `Connecting` state) now
increments `_consecutiveFailures` and calls that helper on failure — so a primary that
is unreachable at startup fails over to the configured backup after
`FailoverRetryCount` attempts instead of retrying the primary forever. Single-endpoint
connections (no backup) still retry indefinitely. Regression test
`DCL015_PrimaryDownAtStartup_FailsOverToBackup` fails against the pre-fix code (the
backup adapter is never created) and passes after; `DCL015_SingleEndpointDownAtStartup_RetriesIndefinitely_NoFailover`
guards the no-backup path.
### DataConnectionLayer-016 — `HandleSubscribeCompleted` reports `SubscribeTagsResponse` success even on a connection-level subscribe failure
@@ -777,7 +797,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:606,666-672`, `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:232-240` |
**Description**
@@ -814,7 +834,16 @@ Instance Actor can reflect partial success. Add a test asserting the response is
**Resolution**
_Unresolved._
Resolved 2026-05-17 (commit pending). `HandleSubscribeCompleted` now replies
`SubscribeTagsResponse(success: false, error: "connection unavailable — will
re-subscribe on reconnect")` when `connectionLevelFailure` is true — matching the
actor's own assessment as it drives into `Reconnecting` — instead of unconditionally
replying `success: true`. Genuine tag-resolution failures still reply `success: true`
(they are a runtime quality concern tracked via `_unresolvedTags`, with a `Bad`-quality
`TagValueUpdate` already pushed), so that case is not regressed. Regression test
`DCL016_ConnectionLevelSubscribeFailure_RepliesWithUnsuccessfulResponse` fails against
the pre-fix code (it returned `Success: true`) and passes after;
`DCL016_GenuineResolutionFailure_StillRepliesSuccess` guards the resolution-failure path.
### DataConnectionLayer-017 — `WriteBatchAsync` aborts the whole batch on a mid-batch disconnect
@@ -822,7 +851,7 @@ _Unresolved._
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.DataConnectionLayer/Adapters/OpcUaDataConnection.cs:229-237`, `src/ScadaLink.DataConnectionLayer/Adapters/OpcUaDataConnection.cs:218-227` |
**Description**
@@ -856,4 +885,14 @@ complete, consistent result map.
**Resolution**
_Unresolved._
Resolved 2026-05-17 (commit pending). `WriteBatchAsync` now wraps each per-tag
`WriteAsync` call in a try/catch — mirroring the DCL-007 fix for `ReadBatchAsync` — so a
mid-batch fault (the `InvalidOperationException` from `EnsureConnected` on a dropped
connection) is recorded as a failed `WriteResult(false, ex.Message)` for that tag and
the batch returns a complete result map; `OperationCanceledException` is still
propagated so a cancelled batch aborts as a whole. Callers (including
`WriteBatchAndWaitAsync`) now get a consistent per-tag result shape instead of an
unhandled exception. Regression test
`DCL017_WriteBatch_ReturnsPerTagResults_WhenConnectionDropsMidBatch` fails against the
pre-fix code (the batch throws, no map returned) and passes after;
`DCL017_WriteBatch_CancellationAbortsWholeBatch` guards that cancellation still aborts.

View File

@@ -410,8 +410,14 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
}
else
{
_log.Warning("[{0}] Connection failed: {1}. Retrying in {2}s",
_connectionName, result.Error, _options.ReconnectInterval.TotalSeconds);
// DataConnectionLayer-015: the INITIAL connect must participate in the
// failover counter exactly like a reconnect. Without this a primary that is
// unreachable when the actor first starts (fresh deployment, site restart, or
// a primary simply down) is retried forever and the configured backup is
// never tried. Count the failure and switch endpoint once the retry count is
// exhausted, then re-arm the timer.
_consecutiveFailures++;
CountFailureAndMaybeFailover(result.Error);
Timers.StartSingleTimer("reconnect", new AttemptConnect(), _options.ReconnectInterval);
}
}
@@ -439,59 +445,69 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
else
{
_consecutiveFailures++;
// Failover: switch endpoint after exhausting retry count (only if backup is configured)
if (_backupConfig != null && _consecutiveFailures >= _failoverRetryCount)
{
var previousEndpoint = _activeEndpoint;
_activeEndpoint = _activeEndpoint == ActiveEndpoint.Primary
? ActiveEndpoint.Backup
: ActiveEndpoint.Primary;
_consecutiveFailures = 0;
var newConfig = _activeEndpoint == ActiveEndpoint.Primary
? _primaryConfig
: _backupConfig;
// Dispose old adapter (fire-and-forget — don't await in actor context)
_adapter.Disconnected -= OnAdapterDisconnected;
_ = _adapter.DisposeAsync().AsTask();
// Create new adapter for the target endpoint
_adapter = _factory.Create(_protocolType, newConfig);
_connectionDetails = newConfig;
// Wire disconnect handler on new adapter
_adapter.Disconnected += OnAdapterDisconnected;
// DataConnectionLayer-011: new adapter — bump the generation so callbacks
// from the disposed adapter are recognised as stale and dropped.
_adapterGeneration++;
_log.Warning("[{0}] Failing over from {1} to {2}",
_connectionName, previousEndpoint, _activeEndpoint);
// Log failover event to site event log
if (_siteEventLogger != null)
{
_ = _siteEventLogger.LogEventAsync(
"connection", "Warning", null, _connectionName,
$"Failover from {previousEndpoint} to {_activeEndpoint}",
$"After {_failoverRetryCount} consecutive failures");
}
}
else
{
var retryLimit = _backupConfig != null ? _failoverRetryCount.ToString() : "∞";
_log.Warning("[{0}] Reconnect failed: {1}. Retrying in {2}s (attempt {3}/{4})",
_connectionName, result.Error, _options.ReconnectInterval.TotalSeconds,
_consecutiveFailures, retryLimit);
}
CountFailureAndMaybeFailover(result.Error);
Timers.StartSingleTimer("reconnect", new AttemptConnect(), _options.ReconnectInterval);
}
}
/// <summary>
/// Shared connect-failure handling for both the initial connect (Connecting state)
/// and reconnect (Reconnecting state). Assumes <see cref="_consecutiveFailures"/> has
/// already been incremented for the current failure. Switches to the other endpoint
/// once the retry count is exhausted and a backup is configured
/// (DataConnectionLayer-015 brought the initial connect onto this path).
/// </summary>
private void CountFailureAndMaybeFailover(string? error)
{
// Failover: switch endpoint after exhausting retry count (only if backup is configured)
if (_backupConfig != null && _consecutiveFailures >= _failoverRetryCount)
{
var previousEndpoint = _activeEndpoint;
_activeEndpoint = _activeEndpoint == ActiveEndpoint.Primary
? ActiveEndpoint.Backup
: ActiveEndpoint.Primary;
_consecutiveFailures = 0;
var newConfig = _activeEndpoint == ActiveEndpoint.Primary
? _primaryConfig
: _backupConfig;
// Dispose old adapter (fire-and-forget — don't await in actor context)
_adapter.Disconnected -= OnAdapterDisconnected;
_ = _adapter.DisposeAsync().AsTask();
// Create new adapter for the target endpoint
_adapter = _factory.Create(_protocolType, newConfig);
_connectionDetails = newConfig;
// Wire disconnect handler on new adapter
_adapter.Disconnected += OnAdapterDisconnected;
// DataConnectionLayer-011: new adapter — bump the generation so callbacks
// from the disposed adapter are recognised as stale and dropped.
_adapterGeneration++;
_log.Warning("[{0}] Failing over from {1} to {2}",
_connectionName, previousEndpoint, _activeEndpoint);
// Log failover event to site event log
if (_siteEventLogger != null)
{
_ = _siteEventLogger.LogEventAsync(
"connection", "Warning", null, _connectionName,
$"Failover from {previousEndpoint} to {_activeEndpoint}",
$"After {_failoverRetryCount} consecutive failures");
}
}
else
{
var retryLimit = _backupConfig != null ? _failoverRetryCount.ToString() : "∞";
_log.Warning("[{0}] Connect failed: {1}. Retrying in {2}s (attempt {3}/{4})",
_connectionName, error, _options.ReconnectInterval.TotalSeconds,
_consecutiveFailures, retryLimit);
}
}
private void HandleDisconnect()
{
_log.Warning("[{0}] AdapterDisconnected message received — transitioning to Reconnecting", _connectionName);
@@ -663,8 +679,18 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
_options.TagResolutionRetryInterval);
}
msg.ReplyTo.Tell(new SubscribeTagsResponse(
msg.Request.CorrelationId, instanceName, true, null, DateTimeOffset.UtcNow));
// DataConnectionLayer-016: the response must match the actor's own assessment.
// When a connection-level failure is driving the actor into Reconnecting, the
// tags were never subscribed at the adapter — replying Success: true would tell
// the Instance Actor the subscribe succeeded when it did not. Genuine
// tag-resolution failures stay Success: true (they are a runtime quality concern
// tracked via _unresolvedTags, with a Bad-quality TagValueUpdate already pushed).
msg.ReplyTo.Tell(connectionLevelFailure
? new SubscribeTagsResponse(
msg.Request.CorrelationId, instanceName, false,
"connection unavailable — will re-subscribe on reconnect", DateTimeOffset.UtcNow)
: new SubscribeTagsResponse(
msg.Request.CorrelationId, instanceName, true, null, DateTimeOffset.UtcNow));
// The caller (Connected state only) decides whether to enter Reconnecting.
// In Connecting/Reconnecting the connection is not established anyway, so the

View File

@@ -228,10 +228,28 @@ public class OpcUaDataConnection : IDataConnection
public async Task<IReadOnlyDictionary<string, WriteResult>> WriteBatchAsync(IDictionary<string, object?> values, CancellationToken cancellationToken = default)
{
// DataConnectionLayer-017: a mid-batch fault must not abort the whole batch.
// WriteAsync calls EnsureConnected(), which throws InvalidOperationException when
// the connection drops partway through; catch per-tag exceptions and record a
// failed WriteResult so the caller (including WriteBatchAndWaitAsync) receives a
// complete result map. OperationCanceledException is still propagated so a
// cancelled batch aborts as a whole — mirrors the DCL-007 fix for ReadBatchAsync.
var results = new Dictionary<string, WriteResult>();
foreach (var (tagPath, value) in values)
{
results[tagPath] = await WriteAsync(tagPath, value, cancellationToken);
try
{
results[tagPath] = await WriteAsync(tagPath, value, cancellationToken);
}
catch (OperationCanceledException)
{
// Cancellation aborts the whole batch — propagate it.
throw;
}
catch (Exception ex)
{
results[tagPath] = new WriteResult(false, ex.Message);
}
}
return results;
}

View File

@@ -316,11 +316,24 @@ public class RealOpcUaClientFactory : IOpcUaClientFactory
{
private readonly OpcUaGlobalOptions _globalOptions;
// DataConnectionLayer-014: a real logger must be threaded through to every
// RealOpcUaClient this factory builds, otherwise the DCL-012 auto-accept-certificate
// warning emitted in RealOpcUaClient.ConnectAsync sinks into NullLogger and is never
// seen in production. The factory is constructed by DataConnectionFactory, which has
// an ILoggerFactory available.
private readonly ILoggerFactory _loggerFactory;
public RealOpcUaClientFactory() : this(new OpcUaGlobalOptions()) { }
public RealOpcUaClientFactory(OpcUaGlobalOptions globalOptions)
: this(globalOptions, NullLoggerFactory.Instance) { }
public RealOpcUaClientFactory(OpcUaGlobalOptions globalOptions, ILoggerFactory loggerFactory)
{
_globalOptions = globalOptions;
_loggerFactory = loggerFactory;
}
public IOpcUaClient Create() => new RealOpcUaClient(_globalOptions);
public IOpcUaClient Create() =>
new RealOpcUaClient(_globalOptions, _loggerFactory.CreateLogger<RealOpcUaClient>());
}

View File

@@ -22,9 +22,12 @@ public class DataConnectionFactory : IDataConnectionFactory
_loggerFactory = loggerFactory;
var globalOptions = opcUaGlobalOptions.Value;
// Register built-in protocols
// Register built-in protocols.
// DataConnectionLayer-014: pass the ILoggerFactory into RealOpcUaClientFactory so
// the RealOpcUaClient it builds gets a real logger — without it the DCL-012
// auto-accept-certificate security warning is silently discarded by NullLogger.
RegisterAdapter("OpcUa", details => new OpcUaDataConnection(
new RealOpcUaClientFactory(globalOptions),
new RealOpcUaClientFactory(globalOptions, _loggerFactory),
_loggerFactory.CreateLogger<OpcUaDataConnection>()));
}

View File

@@ -803,6 +803,132 @@ public class DataConnectionActorTests : TestKit
ExpectNoMsg(TimeSpan.FromSeconds(1));
}
// ── DataConnectionLayer-015: initial-connect failures must trigger failover ──
[Fact]
public async Task DCL015_PrimaryDownAtStartup_FailsOverToBackup()
{
// Regression test for DataConnectionLayer-015. HandleConnectResult — the handler
// for the INITIAL connection attempt in the Connecting state — only logged and
// re-armed the reconnect timer. It never incremented _consecutiveFailures and
// never switched endpoint, so a primary that is unreachable when the actor first
// starts (a fresh deployment, a site restart, a primary simply down) retried the
// primary forever and never tried the configured backup. After the fix the
// initial connect participates in the failover counter like HandleReconnectResult.
var primaryConfig = new Dictionary<string, string> { ["Endpoint"] = "opc.tcp://primary:4840" };
var backupConfig = new Dictionary<string, string> { ["Endpoint"] = "opc.tcp://backup:4840" };
var primaryAdapter = Substitute.For<IDataConnection>();
var backupAdapter = Substitute.For<IDataConnection>();
// Primary is down from the very first attempt — it never connects.
primaryAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
.Returns(Task.FromException(new Exception("Connection refused")));
// Factory returns the backup adapter when called with the backup config.
_mockFactory.Create("OpcUa", Arg.Is<IDictionary<string, string>>(d => d["Endpoint"] == "opc.tcp://backup:4840"))
.Returns(backupAdapter);
backupAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
.Returns(Task.CompletedTask);
var actor = CreateFailoverActor(primaryAdapter, "dcl015-startup-failover",
primaryConfig, backupConfig, failoverRetryCount: 2);
// After failoverRetryCount initial-connect failures on the primary, the actor
// must build the backup adapter. Pre-fix the factory was never called.
AwaitCondition(() =>
_mockFactory.ReceivedCalls().Any(c =>
c.GetMethodInfo().Name == "Create" &&
c.GetArguments()[1] is IDictionary<string, string> d &&
d["Endpoint"] == "opc.tcp://backup:4840"),
TimeSpan.FromSeconds(5));
}
[Fact]
public async Task DCL015_SingleEndpointDownAtStartup_RetriesIndefinitely_NoFailover()
{
// Companion guard: a single-endpoint connection (no backup) whose primary is
// unreachable at startup must keep retrying the same endpoint indefinitely — the
// initial-connect failover counter must not synthesise a non-existent backup.
var primaryConfig = new Dictionary<string, string> { ["Endpoint"] = "opc.tcp://primary:4840" };
var primaryAdapter = Substitute.For<IDataConnection>();
var connectCount = 0;
primaryAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
.Returns(_ =>
{
Interlocked.Increment(ref connectCount);
return Task.FromException(new Exception("Connection refused"));
});
var actor = CreateFailoverActor(primaryAdapter, "dcl015-no-backup",
primaryConfig, backupConfig: null, failoverRetryCount: 2);
// Many retries occur (well past the failover threshold) but no adapter is ever
// created via the factory — there is nothing to fail over to.
AwaitCondition(() => connectCount >= 6, TimeSpan.FromSeconds(10));
_mockFactory.DidNotReceive().Create(Arg.Any<string>(), Arg.Any<IDictionary<string, string>>());
}
// ── DataConnectionLayer-016: subscribe response must reflect a connection-level failure ──
[Fact]
public async Task DCL016_ConnectionLevelSubscribeFailure_RepliesWithUnsuccessfulResponse()
{
// Regression test for DataConnectionLayer-016. When a subscribe arrives while the
// adapter is silently down, HandleSubscribeCompleted drove the actor into
// Reconnecting (a connection-level failure) but still replied to the caller with
// SubscribeTagsResponse(Success: true, Error: null). The Instance Actor was told
// the subscribe succeeded while the tags were never actually subscribed at the
// adapter. After the fix the response matches the actor's own assessment:
// Success: false with an explanatory error.
_mockAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
.Returns(Task.CompletedTask);
_mockAdapter.Status.Returns(ConnectionHealth.Connected);
// Subscribe fails at connection level (InvalidOperationException from EnsureConnected).
_mockAdapter.SubscribeAsync(Arg.Any<string>(), Arg.Any<SubscriptionCallback>(), Arg.Any<CancellationToken>())
.Returns(Task.FromException<string>(
new InvalidOperationException("OPC UA client is not connected.")));
_mockAdapter.ReadAsync(Arg.Any<string>(), Arg.Any<CancellationToken>())
.Returns(new ReadResult(false, null, null));
var actor = CreateConnectionActor("dcl016-conn-fail");
await Task.Delay(300);
actor.Tell(new SubscribeTagsRequest(
"c1", "inst1", "dcl016-conn-fail", ["some/tag"], DateTimeOffset.UtcNow));
// The response must NOT claim success — the connection-level failure that drove
// Reconnecting means the tags were never subscribed at the adapter.
var response = ExpectMsg<SubscribeTagsResponse>(TimeSpan.FromSeconds(5));
Assert.False(response.Success);
Assert.NotNull(response.ErrorMessage);
}
[Fact]
public async Task DCL016_GenuineResolutionFailure_StillRepliesSuccess()
{
// Companion guard: a genuine tag-resolution failure (the node does not exist) is
// a runtime quality concern, not a connection-level fault — the design tracks it
// via _unresolvedTags and a Bad-quality TagValueUpdate. The overall subscribe
// response stays Success: true so this case is not regressed by the 016 fix.
_mockAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
.Returns(Task.CompletedTask);
_mockAdapter.Status.Returns(ConnectionHealth.Connected);
_mockAdapter.SubscribeAsync("missing/tag", Arg.Any<SubscriptionCallback>(), Arg.Any<CancellationToken>())
.Returns(Task.FromException<string>(new KeyNotFoundException("node not found")));
_mockAdapter.ReadAsync(Arg.Any<string>(), Arg.Any<CancellationToken>())
.Returns(new ReadResult(false, null, null));
var actor = CreateConnectionActor("dcl016-genuine");
await Task.Delay(300);
actor.Tell(new SubscribeTagsRequest(
"c1", "inst1", "dcl016-genuine", ["missing/tag"], DateTimeOffset.UtcNow));
var ack = FishForMessage<SubscribeTagsResponse>(_ => true, TimeSpan.FromSeconds(5));
Assert.True(ack.Success);
Assert.Null(ack.ErrorMessage);
}
[Fact]
public async Task DCL001_SubscribeWithFailedTags_CountsResolvedAndUnresolvedSeparately()
{

View File

@@ -1,7 +1,9 @@
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using NSubstitute;
using ScadaLink.Commons.Interfaces.Protocol;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.DataConnectionLayer;
using ScadaLink.DataConnectionLayer.Adapters;
namespace ScadaLink.DataConnectionLayer.Tests;
@@ -54,6 +56,63 @@ public class OpcUaCertificateDefaultTests
}
}
/// <summary>
/// DataConnectionLayer-014: the DCL-012 auto-accept-certificate security warning is
/// only effective if RealOpcUaClient is built with a real logger. The only production
/// path that constructs one is RealOpcUaClientFactory.Create(); that factory must
/// thread a logger through, otherwise the warning sinks into NullLogger and an
/// operator who enables AutoAcceptUntrustedCerts sees no signal anywhere.
/// </summary>
public class RealOpcUaClientFactoryLoggerTests
{
private static ILogger<RealOpcUaClient> ReadLogger(RealOpcUaClient client)
{
var field = typeof(RealOpcUaClient).GetField("_logger",
System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic);
Assert.NotNull(field);
return (ILogger<RealOpcUaClient>)field!.GetValue(client)!;
}
[Fact]
public void DCL014_RealOpcUaClientFactory_CreatesClientWithRealLogger()
{
// Regression test for DataConnectionLayer-014. RealOpcUaClientFactory.Create()
// constructed `new RealOpcUaClient(_globalOptions)` with no logger, so the
// DCL-012 man-in-the-middle warning was always discarded by NullLogger in
// production. After the fix the factory accepts an ILoggerFactory and passes a
// real ILogger<RealOpcUaClient> into every client it creates.
using var loggerFactory = LoggerFactory.Create(b => { });
var factory = new RealOpcUaClientFactory(new OpcUaGlobalOptions(), loggerFactory);
var client = factory.Create();
var logger = ReadLogger((RealOpcUaClient)client);
Assert.NotSame(NullLogger<RealOpcUaClient>.Instance, logger);
}
[Fact]
public void DCL014_DataConnectionFactory_ThreadsLoggerToRealOpcUaClient()
{
// The full production wiring: DataConnectionFactory holds an ILoggerFactory and
// registers the OpcUa adapter. The RealOpcUaClient it ultimately builds must end
// up with a real (non-Null) logger so the auto-accept-cert warning is visible.
using var loggerFactory = LoggerFactory.Create(b => { });
var dataConnectionFactory = new DataConnectionFactory(loggerFactory);
var adapter = (OpcUaDataConnection)dataConnectionFactory.Create(
"OpcUa", new Dictionary<string, string>());
// Reach the RealOpcUaClient the adapter would create on connect.
var clientFactoryField = typeof(OpcUaDataConnection).GetField("_clientFactory",
System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic);
Assert.NotNull(clientFactoryField);
var clientFactory = (RealOpcUaClientFactory)clientFactoryField!.GetValue(adapter)!;
var client = (RealOpcUaClient)clientFactory.Create();
var logger = ReadLogger(client);
Assert.NotSame(NullLogger<RealOpcUaClient>.Instance, logger);
}
}
/// <summary>
/// DataConnectionLayer-009: failover-stability tunables must be configurable.
/// </summary>
@@ -269,6 +328,64 @@ public class OpcUaDataConnectionTests
Assert.NotNull(results["bad"].ErrorMessage);
}
[Fact]
public async Task DCL017_WriteBatch_ReturnsPerTagResults_WhenConnectionDropsMidBatch()
{
// Regression test for DataConnectionLayer-017. WriteBatchAsync looped calling
// WriteAsync per tag; WriteAsync first calls EnsureConnected(), which throws
// InvalidOperationException when the client is disconnected. WriteBatchAsync did
// not catch that, so a connection dropping partway through a batch made the whole
// WriteBatchAsync throw — the caller lost the per-tag outcomes for the tags that
// already wrote. After the fix (mirroring DCL-007's ReadBatchAsync) each per-tag
// failure is recorded as a failed WriteResult and the batch returns a complete map.
var writeCount = 0;
// First write succeeds; then the client "disconnects" so EnsureConnected throws.
_mockClient.IsConnected.Returns(_ => Interlocked.Increment(ref writeCount) <= 1);
_mockClient.WriteValueAsync(Arg.Any<string>(), Arg.Any<object?>(), Arg.Any<CancellationToken>())
.Returns((uint)0);
// Connect leaves IsConnected true for the first WriteAsync's EnsureConnected check.
_mockClient.IsConnected.Returns(true);
await _adapter.ConnectAsync(new Dictionary<string, string>());
// Re-arm: IsConnected true for tag1's check, false for tag2 and tag3.
var checks = 0;
_mockClient.IsConnected.Returns(_ => Interlocked.Increment(ref checks) <= 1);
var results = await _adapter.WriteBatchAsync(new Dictionary<string, object?>
{
["tag1"] = 1,
["tag2"] = 2,
["tag3"] = 3
});
// Every requested tag is present in the result map — the batch was not aborted.
Assert.Equal(3, results.Count);
Assert.True(results["tag1"].Success);
// tag2 and tag3 fail at the connection check but are reported per-tag.
Assert.False(results["tag2"].Success);
Assert.NotNull(results["tag2"].ErrorMessage);
Assert.False(results["tag3"].Success);
Assert.NotNull(results["tag3"].ErrorMessage);
}
[Fact]
public async Task DCL017_WriteBatch_CancellationAbortsWholeBatch()
{
// Companion guard: a cancelled batch must still abort as a whole — only
// connection/device faults are demoted to per-tag results, never cancellation.
_mockClient.IsConnected.Returns(true);
await _adapter.ConnectAsync(new Dictionary<string, string>());
using var cts = new CancellationTokenSource();
cts.Cancel();
_mockClient.WriteValueAsync(Arg.Any<string>(), Arg.Any<object?>(), Arg.Any<CancellationToken>())
.Returns<uint>(_ => throw new OperationCanceledException());
await Assert.ThrowsAnyAsync<OperationCanceledException>(() =>
_adapter.WriteBatchAsync(new Dictionary<string, object?> { ["tag1"] = 1 }, cts.Token));
}
[Fact]
public async Task NotConnected_ThrowsOnOperations()
{