fix(data-connection-layer): resolve DataConnectionLayer-014..017 — real logger for OPC UA client, initial-connect failover, accurate subscribe response, per-tag write-batch results
This commit is contained in:
@@ -8,7 +8,7 @@
|
||||
| Last reviewed | 2026-05-17 |
|
||||
| Reviewer | claude-agent |
|
||||
| Commit reviewed | `39d737e` |
|
||||
| Open findings | 4 |
|
||||
| Open findings | 0 |
|
||||
|
||||
## Summary
|
||||
|
||||
@@ -688,7 +688,7 @@ guard), and it passes against the atomic fix.
|
||||
|--|--|
|
||||
| Severity | High |
|
||||
| Category | Security |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.DataConnectionLayer/Adapters/RealOpcUaClient.cs:325`, `src/ScadaLink.DataConnectionLayer/Adapters/RealOpcUaClient.cs:35-39,79-83` |
|
||||
|
||||
**Description**
|
||||
@@ -724,7 +724,18 @@ the default is `false`.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit pending). `RealOpcUaClientFactory` gained an
|
||||
`ILoggerFactory` constructor parameter and `Create()` now passes
|
||||
`_loggerFactory.CreateLogger<RealOpcUaClient>()` into every `RealOpcUaClient` it builds;
|
||||
`DataConnectionFactory` (which already holds an `ILoggerFactory`) now constructs
|
||||
`RealOpcUaClientFactory(globalOptions, _loggerFactory)`, so the DCL-012 auto-accept-cert
|
||||
warning reaches a real logger in production instead of being discarded by `NullLogger`.
|
||||
The parameterless / single-arg factory constructors still default to `NullLoggerFactory`
|
||||
so existing callers are unaffected. Regression tests
|
||||
`DCL014_RealOpcUaClientFactory_CreatesClientWithRealLogger` and
|
||||
`DCL014_DataConnectionFactory_ThreadsLoggerToRealOpcUaClient` fail against the pre-fix
|
||||
code (the first fails to compile — no logger ctor; the second observes a `NullLogger`)
|
||||
and pass after.
|
||||
|
||||
### DataConnectionLayer-015 — Initial-connect failures never trigger failover; an unreachable primary at startup never tries the backup
|
||||
|
||||
@@ -732,7 +743,7 @@ _Unresolved._
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:404-417`, `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:419-493` |
|
||||
|
||||
**Description**
|
||||
@@ -769,7 +780,16 @@ backup".
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit pending). The endpoint-switch failover logic was extracted
|
||||
from `HandleReconnectResult` into a shared `CountFailureAndMaybeFailover` helper, and
|
||||
`HandleConnectResult` (the initial-connect handler in the `Connecting` state) now
|
||||
increments `_consecutiveFailures` and calls that helper on failure — so a primary that
|
||||
is unreachable at startup fails over to the configured backup after
|
||||
`FailoverRetryCount` attempts instead of retrying the primary forever. Single-endpoint
|
||||
connections (no backup) still retry indefinitely. Regression test
|
||||
`DCL015_PrimaryDownAtStartup_FailsOverToBackup` fails against the pre-fix code (the
|
||||
backup adapter is never created) and passes after; `DCL015_SingleEndpointDownAtStartup_RetriesIndefinitely_NoFailover`
|
||||
guards the no-backup path.
|
||||
|
||||
### DataConnectionLayer-016 — `HandleSubscribeCompleted` reports `SubscribeTagsResponse` success even on a connection-level subscribe failure
|
||||
|
||||
@@ -777,7 +797,7 @@ _Unresolved._
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:606,666-672`, `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs:232-240` |
|
||||
|
||||
**Description**
|
||||
@@ -814,7 +834,16 @@ Instance Actor can reflect partial success. Add a test asserting the response is
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit pending). `HandleSubscribeCompleted` now replies
|
||||
`SubscribeTagsResponse(success: false, error: "connection unavailable — will
|
||||
re-subscribe on reconnect")` when `connectionLevelFailure` is true — matching the
|
||||
actor's own assessment as it drives into `Reconnecting` — instead of unconditionally
|
||||
replying `success: true`. Genuine tag-resolution failures still reply `success: true`
|
||||
(they are a runtime quality concern tracked via `_unresolvedTags`, with a `Bad`-quality
|
||||
`TagValueUpdate` already pushed), so that case is not regressed. Regression test
|
||||
`DCL016_ConnectionLevelSubscribeFailure_RepliesWithUnsuccessfulResponse` fails against
|
||||
the pre-fix code (it returned `Success: true`) and passes after;
|
||||
`DCL016_GenuineResolutionFailure_StillRepliesSuccess` guards the resolution-failure path.
|
||||
|
||||
### DataConnectionLayer-017 — `WriteBatchAsync` aborts the whole batch on a mid-batch disconnect
|
||||
|
||||
@@ -822,7 +851,7 @@ _Unresolved._
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.DataConnectionLayer/Adapters/OpcUaDataConnection.cs:229-237`, `src/ScadaLink.DataConnectionLayer/Adapters/OpcUaDataConnection.cs:218-227` |
|
||||
|
||||
**Description**
|
||||
@@ -856,4 +885,14 @@ complete, consistent result map.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit pending). `WriteBatchAsync` now wraps each per-tag
|
||||
`WriteAsync` call in a try/catch — mirroring the DCL-007 fix for `ReadBatchAsync` — so a
|
||||
mid-batch fault (the `InvalidOperationException` from `EnsureConnected` on a dropped
|
||||
connection) is recorded as a failed `WriteResult(false, ex.Message)` for that tag and
|
||||
the batch returns a complete result map; `OperationCanceledException` is still
|
||||
propagated so a cancelled batch aborts as a whole. Callers (including
|
||||
`WriteBatchAndWaitAsync`) now get a consistent per-tag result shape instead of an
|
||||
unhandled exception. Regression test
|
||||
`DCL017_WriteBatch_ReturnsPerTagResults_WhenConnectionDropsMidBatch` fails against the
|
||||
pre-fix code (the batch throws, no map returned) and passes after;
|
||||
`DCL017_WriteBatch_CancellationAbortsWholeBatch` guards that cancellation still aborts.
|
||||
|
||||
@@ -410,8 +410,14 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
|
||||
}
|
||||
else
|
||||
{
|
||||
_log.Warning("[{0}] Connection failed: {1}. Retrying in {2}s",
|
||||
_connectionName, result.Error, _options.ReconnectInterval.TotalSeconds);
|
||||
// DataConnectionLayer-015: the INITIAL connect must participate in the
|
||||
// failover counter exactly like a reconnect. Without this a primary that is
|
||||
// unreachable when the actor first starts (fresh deployment, site restart, or
|
||||
// a primary simply down) is retried forever and the configured backup is
|
||||
// never tried. Count the failure and switch endpoint once the retry count is
|
||||
// exhausted, then re-arm the timer.
|
||||
_consecutiveFailures++;
|
||||
CountFailureAndMaybeFailover(result.Error);
|
||||
Timers.StartSingleTimer("reconnect", new AttemptConnect(), _options.ReconnectInterval);
|
||||
}
|
||||
}
|
||||
@@ -439,59 +445,69 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
|
||||
else
|
||||
{
|
||||
_consecutiveFailures++;
|
||||
|
||||
// Failover: switch endpoint after exhausting retry count (only if backup is configured)
|
||||
if (_backupConfig != null && _consecutiveFailures >= _failoverRetryCount)
|
||||
{
|
||||
var previousEndpoint = _activeEndpoint;
|
||||
_activeEndpoint = _activeEndpoint == ActiveEndpoint.Primary
|
||||
? ActiveEndpoint.Backup
|
||||
: ActiveEndpoint.Primary;
|
||||
_consecutiveFailures = 0;
|
||||
|
||||
var newConfig = _activeEndpoint == ActiveEndpoint.Primary
|
||||
? _primaryConfig
|
||||
: _backupConfig;
|
||||
|
||||
// Dispose old adapter (fire-and-forget — don't await in actor context)
|
||||
_adapter.Disconnected -= OnAdapterDisconnected;
|
||||
_ = _adapter.DisposeAsync().AsTask();
|
||||
|
||||
// Create new adapter for the target endpoint
|
||||
_adapter = _factory.Create(_protocolType, newConfig);
|
||||
_connectionDetails = newConfig;
|
||||
|
||||
// Wire disconnect handler on new adapter
|
||||
_adapter.Disconnected += OnAdapterDisconnected;
|
||||
|
||||
// DataConnectionLayer-011: new adapter — bump the generation so callbacks
|
||||
// from the disposed adapter are recognised as stale and dropped.
|
||||
_adapterGeneration++;
|
||||
|
||||
_log.Warning("[{0}] Failing over from {1} to {2}",
|
||||
_connectionName, previousEndpoint, _activeEndpoint);
|
||||
|
||||
// Log failover event to site event log
|
||||
if (_siteEventLogger != null)
|
||||
{
|
||||
_ = _siteEventLogger.LogEventAsync(
|
||||
"connection", "Warning", null, _connectionName,
|
||||
$"Failover from {previousEndpoint} to {_activeEndpoint}",
|
||||
$"After {_failoverRetryCount} consecutive failures");
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
var retryLimit = _backupConfig != null ? _failoverRetryCount.ToString() : "∞";
|
||||
_log.Warning("[{0}] Reconnect failed: {1}. Retrying in {2}s (attempt {3}/{4})",
|
||||
_connectionName, result.Error, _options.ReconnectInterval.TotalSeconds,
|
||||
_consecutiveFailures, retryLimit);
|
||||
}
|
||||
|
||||
CountFailureAndMaybeFailover(result.Error);
|
||||
Timers.StartSingleTimer("reconnect", new AttemptConnect(), _options.ReconnectInterval);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Shared connect-failure handling for both the initial connect (Connecting state)
|
||||
/// and reconnect (Reconnecting state). Assumes <see cref="_consecutiveFailures"/> has
|
||||
/// already been incremented for the current failure. Switches to the other endpoint
|
||||
/// once the retry count is exhausted and a backup is configured
|
||||
/// (DataConnectionLayer-015 brought the initial connect onto this path).
|
||||
/// </summary>
|
||||
private void CountFailureAndMaybeFailover(string? error)
|
||||
{
|
||||
// Failover: switch endpoint after exhausting retry count (only if backup is configured)
|
||||
if (_backupConfig != null && _consecutiveFailures >= _failoverRetryCount)
|
||||
{
|
||||
var previousEndpoint = _activeEndpoint;
|
||||
_activeEndpoint = _activeEndpoint == ActiveEndpoint.Primary
|
||||
? ActiveEndpoint.Backup
|
||||
: ActiveEndpoint.Primary;
|
||||
_consecutiveFailures = 0;
|
||||
|
||||
var newConfig = _activeEndpoint == ActiveEndpoint.Primary
|
||||
? _primaryConfig
|
||||
: _backupConfig;
|
||||
|
||||
// Dispose old adapter (fire-and-forget — don't await in actor context)
|
||||
_adapter.Disconnected -= OnAdapterDisconnected;
|
||||
_ = _adapter.DisposeAsync().AsTask();
|
||||
|
||||
// Create new adapter for the target endpoint
|
||||
_adapter = _factory.Create(_protocolType, newConfig);
|
||||
_connectionDetails = newConfig;
|
||||
|
||||
// Wire disconnect handler on new adapter
|
||||
_adapter.Disconnected += OnAdapterDisconnected;
|
||||
|
||||
// DataConnectionLayer-011: new adapter — bump the generation so callbacks
|
||||
// from the disposed adapter are recognised as stale and dropped.
|
||||
_adapterGeneration++;
|
||||
|
||||
_log.Warning("[{0}] Failing over from {1} to {2}",
|
||||
_connectionName, previousEndpoint, _activeEndpoint);
|
||||
|
||||
// Log failover event to site event log
|
||||
if (_siteEventLogger != null)
|
||||
{
|
||||
_ = _siteEventLogger.LogEventAsync(
|
||||
"connection", "Warning", null, _connectionName,
|
||||
$"Failover from {previousEndpoint} to {_activeEndpoint}",
|
||||
$"After {_failoverRetryCount} consecutive failures");
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
var retryLimit = _backupConfig != null ? _failoverRetryCount.ToString() : "∞";
|
||||
_log.Warning("[{0}] Connect failed: {1}. Retrying in {2}s (attempt {3}/{4})",
|
||||
_connectionName, error, _options.ReconnectInterval.TotalSeconds,
|
||||
_consecutiveFailures, retryLimit);
|
||||
}
|
||||
}
|
||||
|
||||
private void HandleDisconnect()
|
||||
{
|
||||
_log.Warning("[{0}] AdapterDisconnected message received — transitioning to Reconnecting", _connectionName);
|
||||
@@ -663,8 +679,18 @@ public class DataConnectionActor : UntypedActor, IWithStash, IWithTimers
|
||||
_options.TagResolutionRetryInterval);
|
||||
}
|
||||
|
||||
msg.ReplyTo.Tell(new SubscribeTagsResponse(
|
||||
msg.Request.CorrelationId, instanceName, true, null, DateTimeOffset.UtcNow));
|
||||
// DataConnectionLayer-016: the response must match the actor's own assessment.
|
||||
// When a connection-level failure is driving the actor into Reconnecting, the
|
||||
// tags were never subscribed at the adapter — replying Success: true would tell
|
||||
// the Instance Actor the subscribe succeeded when it did not. Genuine
|
||||
// tag-resolution failures stay Success: true (they are a runtime quality concern
|
||||
// tracked via _unresolvedTags, with a Bad-quality TagValueUpdate already pushed).
|
||||
msg.ReplyTo.Tell(connectionLevelFailure
|
||||
? new SubscribeTagsResponse(
|
||||
msg.Request.CorrelationId, instanceName, false,
|
||||
"connection unavailable — will re-subscribe on reconnect", DateTimeOffset.UtcNow)
|
||||
: new SubscribeTagsResponse(
|
||||
msg.Request.CorrelationId, instanceName, true, null, DateTimeOffset.UtcNow));
|
||||
|
||||
// The caller (Connected state only) decides whether to enter Reconnecting.
|
||||
// In Connecting/Reconnecting the connection is not established anyway, so the
|
||||
|
||||
@@ -228,10 +228,28 @@ public class OpcUaDataConnection : IDataConnection
|
||||
|
||||
public async Task<IReadOnlyDictionary<string, WriteResult>> WriteBatchAsync(IDictionary<string, object?> values, CancellationToken cancellationToken = default)
|
||||
{
|
||||
// DataConnectionLayer-017: a mid-batch fault must not abort the whole batch.
|
||||
// WriteAsync calls EnsureConnected(), which throws InvalidOperationException when
|
||||
// the connection drops partway through; catch per-tag exceptions and record a
|
||||
// failed WriteResult so the caller (including WriteBatchAndWaitAsync) receives a
|
||||
// complete result map. OperationCanceledException is still propagated so a
|
||||
// cancelled batch aborts as a whole — mirrors the DCL-007 fix for ReadBatchAsync.
|
||||
var results = new Dictionary<string, WriteResult>();
|
||||
foreach (var (tagPath, value) in values)
|
||||
{
|
||||
results[tagPath] = await WriteAsync(tagPath, value, cancellationToken);
|
||||
try
|
||||
{
|
||||
results[tagPath] = await WriteAsync(tagPath, value, cancellationToken);
|
||||
}
|
||||
catch (OperationCanceledException)
|
||||
{
|
||||
// Cancellation aborts the whole batch — propagate it.
|
||||
throw;
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
results[tagPath] = new WriteResult(false, ex.Message);
|
||||
}
|
||||
}
|
||||
return results;
|
||||
}
|
||||
|
||||
@@ -316,11 +316,24 @@ public class RealOpcUaClientFactory : IOpcUaClientFactory
|
||||
{
|
||||
private readonly OpcUaGlobalOptions _globalOptions;
|
||||
|
||||
// DataConnectionLayer-014: a real logger must be threaded through to every
|
||||
// RealOpcUaClient this factory builds, otherwise the DCL-012 auto-accept-certificate
|
||||
// warning emitted in RealOpcUaClient.ConnectAsync sinks into NullLogger and is never
|
||||
// seen in production. The factory is constructed by DataConnectionFactory, which has
|
||||
// an ILoggerFactory available.
|
||||
private readonly ILoggerFactory _loggerFactory;
|
||||
|
||||
public RealOpcUaClientFactory() : this(new OpcUaGlobalOptions()) { }
|
||||
|
||||
public RealOpcUaClientFactory(OpcUaGlobalOptions globalOptions)
|
||||
: this(globalOptions, NullLoggerFactory.Instance) { }
|
||||
|
||||
public RealOpcUaClientFactory(OpcUaGlobalOptions globalOptions, ILoggerFactory loggerFactory)
|
||||
{
|
||||
_globalOptions = globalOptions;
|
||||
_loggerFactory = loggerFactory;
|
||||
}
|
||||
|
||||
public IOpcUaClient Create() => new RealOpcUaClient(_globalOptions);
|
||||
public IOpcUaClient Create() =>
|
||||
new RealOpcUaClient(_globalOptions, _loggerFactory.CreateLogger<RealOpcUaClient>());
|
||||
}
|
||||
|
||||
@@ -22,9 +22,12 @@ public class DataConnectionFactory : IDataConnectionFactory
|
||||
_loggerFactory = loggerFactory;
|
||||
var globalOptions = opcUaGlobalOptions.Value;
|
||||
|
||||
// Register built-in protocols
|
||||
// Register built-in protocols.
|
||||
// DataConnectionLayer-014: pass the ILoggerFactory into RealOpcUaClientFactory so
|
||||
// the RealOpcUaClient it builds gets a real logger — without it the DCL-012
|
||||
// auto-accept-certificate security warning is silently discarded by NullLogger.
|
||||
RegisterAdapter("OpcUa", details => new OpcUaDataConnection(
|
||||
new RealOpcUaClientFactory(globalOptions),
|
||||
new RealOpcUaClientFactory(globalOptions, _loggerFactory),
|
||||
_loggerFactory.CreateLogger<OpcUaDataConnection>()));
|
||||
}
|
||||
|
||||
|
||||
@@ -803,6 +803,132 @@ public class DataConnectionActorTests : TestKit
|
||||
ExpectNoMsg(TimeSpan.FromSeconds(1));
|
||||
}
|
||||
|
||||
// ── DataConnectionLayer-015: initial-connect failures must trigger failover ──
|
||||
|
||||
[Fact]
|
||||
public async Task DCL015_PrimaryDownAtStartup_FailsOverToBackup()
|
||||
{
|
||||
// Regression test for DataConnectionLayer-015. HandleConnectResult — the handler
|
||||
// for the INITIAL connection attempt in the Connecting state — only logged and
|
||||
// re-armed the reconnect timer. It never incremented _consecutiveFailures and
|
||||
// never switched endpoint, so a primary that is unreachable when the actor first
|
||||
// starts (a fresh deployment, a site restart, a primary simply down) retried the
|
||||
// primary forever and never tried the configured backup. After the fix the
|
||||
// initial connect participates in the failover counter like HandleReconnectResult.
|
||||
var primaryConfig = new Dictionary<string, string> { ["Endpoint"] = "opc.tcp://primary:4840" };
|
||||
var backupConfig = new Dictionary<string, string> { ["Endpoint"] = "opc.tcp://backup:4840" };
|
||||
var primaryAdapter = Substitute.For<IDataConnection>();
|
||||
var backupAdapter = Substitute.For<IDataConnection>();
|
||||
|
||||
// Primary is down from the very first attempt — it never connects.
|
||||
primaryAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.FromException(new Exception("Connection refused")));
|
||||
|
||||
// Factory returns the backup adapter when called with the backup config.
|
||||
_mockFactory.Create("OpcUa", Arg.Is<IDictionary<string, string>>(d => d["Endpoint"] == "opc.tcp://backup:4840"))
|
||||
.Returns(backupAdapter);
|
||||
backupAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.CompletedTask);
|
||||
|
||||
var actor = CreateFailoverActor(primaryAdapter, "dcl015-startup-failover",
|
||||
primaryConfig, backupConfig, failoverRetryCount: 2);
|
||||
|
||||
// After failoverRetryCount initial-connect failures on the primary, the actor
|
||||
// must build the backup adapter. Pre-fix the factory was never called.
|
||||
AwaitCondition(() =>
|
||||
_mockFactory.ReceivedCalls().Any(c =>
|
||||
c.GetMethodInfo().Name == "Create" &&
|
||||
c.GetArguments()[1] is IDictionary<string, string> d &&
|
||||
d["Endpoint"] == "opc.tcp://backup:4840"),
|
||||
TimeSpan.FromSeconds(5));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DCL015_SingleEndpointDownAtStartup_RetriesIndefinitely_NoFailover()
|
||||
{
|
||||
// Companion guard: a single-endpoint connection (no backup) whose primary is
|
||||
// unreachable at startup must keep retrying the same endpoint indefinitely — the
|
||||
// initial-connect failover counter must not synthesise a non-existent backup.
|
||||
var primaryConfig = new Dictionary<string, string> { ["Endpoint"] = "opc.tcp://primary:4840" };
|
||||
var primaryAdapter = Substitute.For<IDataConnection>();
|
||||
var connectCount = 0;
|
||||
primaryAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
|
||||
.Returns(_ =>
|
||||
{
|
||||
Interlocked.Increment(ref connectCount);
|
||||
return Task.FromException(new Exception("Connection refused"));
|
||||
});
|
||||
|
||||
var actor = CreateFailoverActor(primaryAdapter, "dcl015-no-backup",
|
||||
primaryConfig, backupConfig: null, failoverRetryCount: 2);
|
||||
|
||||
// Many retries occur (well past the failover threshold) but no adapter is ever
|
||||
// created via the factory — there is nothing to fail over to.
|
||||
AwaitCondition(() => connectCount >= 6, TimeSpan.FromSeconds(10));
|
||||
_mockFactory.DidNotReceive().Create(Arg.Any<string>(), Arg.Any<IDictionary<string, string>>());
|
||||
}
|
||||
|
||||
// ── DataConnectionLayer-016: subscribe response must reflect a connection-level failure ──
|
||||
|
||||
[Fact]
|
||||
public async Task DCL016_ConnectionLevelSubscribeFailure_RepliesWithUnsuccessfulResponse()
|
||||
{
|
||||
// Regression test for DataConnectionLayer-016. When a subscribe arrives while the
|
||||
// adapter is silently down, HandleSubscribeCompleted drove the actor into
|
||||
// Reconnecting (a connection-level failure) but still replied to the caller with
|
||||
// SubscribeTagsResponse(Success: true, Error: null). The Instance Actor was told
|
||||
// the subscribe succeeded while the tags were never actually subscribed at the
|
||||
// adapter. After the fix the response matches the actor's own assessment:
|
||||
// Success: false with an explanatory error.
|
||||
_mockAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.CompletedTask);
|
||||
_mockAdapter.Status.Returns(ConnectionHealth.Connected);
|
||||
// Subscribe fails at connection level (InvalidOperationException from EnsureConnected).
|
||||
_mockAdapter.SubscribeAsync(Arg.Any<string>(), Arg.Any<SubscriptionCallback>(), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.FromException<string>(
|
||||
new InvalidOperationException("OPC UA client is not connected.")));
|
||||
_mockAdapter.ReadAsync(Arg.Any<string>(), Arg.Any<CancellationToken>())
|
||||
.Returns(new ReadResult(false, null, null));
|
||||
|
||||
var actor = CreateConnectionActor("dcl016-conn-fail");
|
||||
await Task.Delay(300);
|
||||
|
||||
actor.Tell(new SubscribeTagsRequest(
|
||||
"c1", "inst1", "dcl016-conn-fail", ["some/tag"], DateTimeOffset.UtcNow));
|
||||
|
||||
// The response must NOT claim success — the connection-level failure that drove
|
||||
// Reconnecting means the tags were never subscribed at the adapter.
|
||||
var response = ExpectMsg<SubscribeTagsResponse>(TimeSpan.FromSeconds(5));
|
||||
Assert.False(response.Success);
|
||||
Assert.NotNull(response.ErrorMessage);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DCL016_GenuineResolutionFailure_StillRepliesSuccess()
|
||||
{
|
||||
// Companion guard: a genuine tag-resolution failure (the node does not exist) is
|
||||
// a runtime quality concern, not a connection-level fault — the design tracks it
|
||||
// via _unresolvedTags and a Bad-quality TagValueUpdate. The overall subscribe
|
||||
// response stays Success: true so this case is not regressed by the 016 fix.
|
||||
_mockAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.CompletedTask);
|
||||
_mockAdapter.Status.Returns(ConnectionHealth.Connected);
|
||||
_mockAdapter.SubscribeAsync("missing/tag", Arg.Any<SubscriptionCallback>(), Arg.Any<CancellationToken>())
|
||||
.Returns(Task.FromException<string>(new KeyNotFoundException("node not found")));
|
||||
_mockAdapter.ReadAsync(Arg.Any<string>(), Arg.Any<CancellationToken>())
|
||||
.Returns(new ReadResult(false, null, null));
|
||||
|
||||
var actor = CreateConnectionActor("dcl016-genuine");
|
||||
await Task.Delay(300);
|
||||
|
||||
actor.Tell(new SubscribeTagsRequest(
|
||||
"c1", "inst1", "dcl016-genuine", ["missing/tag"], DateTimeOffset.UtcNow));
|
||||
|
||||
var ack = FishForMessage<SubscribeTagsResponse>(_ => true, TimeSpan.FromSeconds(5));
|
||||
Assert.True(ack.Success);
|
||||
Assert.Null(ack.ErrorMessage);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DCL001_SubscribeWithFailedTags_CountsResolvedAndUnresolvedSeparately()
|
||||
{
|
||||
|
||||
@@ -1,7 +1,9 @@
|
||||
using Microsoft.Extensions.Logging;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using NSubstitute;
|
||||
using ScadaLink.Commons.Interfaces.Protocol;
|
||||
using ScadaLink.Commons.Types.Enums;
|
||||
using ScadaLink.DataConnectionLayer;
|
||||
using ScadaLink.DataConnectionLayer.Adapters;
|
||||
|
||||
namespace ScadaLink.DataConnectionLayer.Tests;
|
||||
@@ -54,6 +56,63 @@ public class OpcUaCertificateDefaultTests
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// DataConnectionLayer-014: the DCL-012 auto-accept-certificate security warning is
|
||||
/// only effective if RealOpcUaClient is built with a real logger. The only production
|
||||
/// path that constructs one is RealOpcUaClientFactory.Create(); that factory must
|
||||
/// thread a logger through, otherwise the warning sinks into NullLogger and an
|
||||
/// operator who enables AutoAcceptUntrustedCerts sees no signal anywhere.
|
||||
/// </summary>
|
||||
public class RealOpcUaClientFactoryLoggerTests
|
||||
{
|
||||
private static ILogger<RealOpcUaClient> ReadLogger(RealOpcUaClient client)
|
||||
{
|
||||
var field = typeof(RealOpcUaClient).GetField("_logger",
|
||||
System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic);
|
||||
Assert.NotNull(field);
|
||||
return (ILogger<RealOpcUaClient>)field!.GetValue(client)!;
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void DCL014_RealOpcUaClientFactory_CreatesClientWithRealLogger()
|
||||
{
|
||||
// Regression test for DataConnectionLayer-014. RealOpcUaClientFactory.Create()
|
||||
// constructed `new RealOpcUaClient(_globalOptions)` with no logger, so the
|
||||
// DCL-012 man-in-the-middle warning was always discarded by NullLogger in
|
||||
// production. After the fix the factory accepts an ILoggerFactory and passes a
|
||||
// real ILogger<RealOpcUaClient> into every client it creates.
|
||||
using var loggerFactory = LoggerFactory.Create(b => { });
|
||||
var factory = new RealOpcUaClientFactory(new OpcUaGlobalOptions(), loggerFactory);
|
||||
|
||||
var client = factory.Create();
|
||||
|
||||
var logger = ReadLogger((RealOpcUaClient)client);
|
||||
Assert.NotSame(NullLogger<RealOpcUaClient>.Instance, logger);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void DCL014_DataConnectionFactory_ThreadsLoggerToRealOpcUaClient()
|
||||
{
|
||||
// The full production wiring: DataConnectionFactory holds an ILoggerFactory and
|
||||
// registers the OpcUa adapter. The RealOpcUaClient it ultimately builds must end
|
||||
// up with a real (non-Null) logger so the auto-accept-cert warning is visible.
|
||||
using var loggerFactory = LoggerFactory.Create(b => { });
|
||||
var dataConnectionFactory = new DataConnectionFactory(loggerFactory);
|
||||
var adapter = (OpcUaDataConnection)dataConnectionFactory.Create(
|
||||
"OpcUa", new Dictionary<string, string>());
|
||||
|
||||
// Reach the RealOpcUaClient the adapter would create on connect.
|
||||
var clientFactoryField = typeof(OpcUaDataConnection).GetField("_clientFactory",
|
||||
System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic);
|
||||
Assert.NotNull(clientFactoryField);
|
||||
var clientFactory = (RealOpcUaClientFactory)clientFactoryField!.GetValue(adapter)!;
|
||||
var client = (RealOpcUaClient)clientFactory.Create();
|
||||
|
||||
var logger = ReadLogger(client);
|
||||
Assert.NotSame(NullLogger<RealOpcUaClient>.Instance, logger);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// DataConnectionLayer-009: failover-stability tunables must be configurable.
|
||||
/// </summary>
|
||||
@@ -269,6 +328,64 @@ public class OpcUaDataConnectionTests
|
||||
Assert.NotNull(results["bad"].ErrorMessage);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DCL017_WriteBatch_ReturnsPerTagResults_WhenConnectionDropsMidBatch()
|
||||
{
|
||||
// Regression test for DataConnectionLayer-017. WriteBatchAsync looped calling
|
||||
// WriteAsync per tag; WriteAsync first calls EnsureConnected(), which throws
|
||||
// InvalidOperationException when the client is disconnected. WriteBatchAsync did
|
||||
// not catch that, so a connection dropping partway through a batch made the whole
|
||||
// WriteBatchAsync throw — the caller lost the per-tag outcomes for the tags that
|
||||
// already wrote. After the fix (mirroring DCL-007's ReadBatchAsync) each per-tag
|
||||
// failure is recorded as a failed WriteResult and the batch returns a complete map.
|
||||
var writeCount = 0;
|
||||
// First write succeeds; then the client "disconnects" so EnsureConnected throws.
|
||||
_mockClient.IsConnected.Returns(_ => Interlocked.Increment(ref writeCount) <= 1);
|
||||
_mockClient.WriteValueAsync(Arg.Any<string>(), Arg.Any<object?>(), Arg.Any<CancellationToken>())
|
||||
.Returns((uint)0);
|
||||
|
||||
// Connect leaves IsConnected true for the first WriteAsync's EnsureConnected check.
|
||||
_mockClient.IsConnected.Returns(true);
|
||||
await _adapter.ConnectAsync(new Dictionary<string, string>());
|
||||
|
||||
// Re-arm: IsConnected true for tag1's check, false for tag2 and tag3.
|
||||
var checks = 0;
|
||||
_mockClient.IsConnected.Returns(_ => Interlocked.Increment(ref checks) <= 1);
|
||||
|
||||
var results = await _adapter.WriteBatchAsync(new Dictionary<string, object?>
|
||||
{
|
||||
["tag1"] = 1,
|
||||
["tag2"] = 2,
|
||||
["tag3"] = 3
|
||||
});
|
||||
|
||||
// Every requested tag is present in the result map — the batch was not aborted.
|
||||
Assert.Equal(3, results.Count);
|
||||
Assert.True(results["tag1"].Success);
|
||||
// tag2 and tag3 fail at the connection check but are reported per-tag.
|
||||
Assert.False(results["tag2"].Success);
|
||||
Assert.NotNull(results["tag2"].ErrorMessage);
|
||||
Assert.False(results["tag3"].Success);
|
||||
Assert.NotNull(results["tag3"].ErrorMessage);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DCL017_WriteBatch_CancellationAbortsWholeBatch()
|
||||
{
|
||||
// Companion guard: a cancelled batch must still abort as a whole — only
|
||||
// connection/device faults are demoted to per-tag results, never cancellation.
|
||||
_mockClient.IsConnected.Returns(true);
|
||||
await _adapter.ConnectAsync(new Dictionary<string, string>());
|
||||
|
||||
using var cts = new CancellationTokenSource();
|
||||
cts.Cancel();
|
||||
_mockClient.WriteValueAsync(Arg.Any<string>(), Arg.Any<object?>(), Arg.Any<CancellationToken>())
|
||||
.Returns<uint>(_ => throw new OperationCanceledException());
|
||||
|
||||
await Assert.ThrowsAnyAsync<OperationCanceledException>(() =>
|
||||
_adapter.WriteBatchAsync(new Dictionary<string, object?> { ["tag1"] = 1 }, cts.Token));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task NotConnected_ThrowsOnOperations()
|
||||
{
|
||||
|
||||
Reference in New Issue
Block a user