docs(dcl): add primary/backup data connections implementation plan
8 tasks with TDD steps, exact file paths, and code samples. Covers entity model, failover state machine, health reporting, UI, CLI, management API, deployment, and documentation.
695
docs/plans/2026-03-22-primary-backup-data-connections.md
Normal file
@@ -0,0 +1,695 @@
# Primary/Backup Data Connection Endpoints: Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

**Goal:** Add optional backup endpoints to data connections, with automatic failover after a configurable retry count.

**Architecture:** The `DataConnectionActor` gains failover logic in its Reconnecting state: after N failed retries on the active endpoint, it disposes the adapter and creates a fresh one with the other endpoint's config. Adapters remain single-endpoint. The entity model splits `Configuration` into `PrimaryConfiguration` + `BackupConfiguration`.

**Tech Stack:** C# / .NET 10, Akka.NET, EF Core, Blazor Server, System.CommandLine

**Design doc:** `docs/plans/2026-03-22-primary-backup-data-connections-design.md`

---

## Task 1: Entity Model & Database Migration

**Files:**
- Modify: `src/ScadaLink.Commons/Entities/Sites/DataConnection.cs`
- Modify: `src/ScadaLink.ConfigurationDatabase/Configurations/SiteConfiguration.cs` (lines 32-56)
- Modify: `src/ScadaLink.Commons/Messages/Artifacts/DataConnectionArtifact.cs`

### Step 1: Update DataConnection entity

In `DataConnection.cs`, rename `Configuration` to `PrimaryConfiguration`, add `BackupConfiguration` and `FailoverRetryCount`:

```csharp
public class DataConnection
{
    public int Id { get; set; }
    public int SiteId { get; set; }
    public string Name { get; set; }
    public string Protocol { get; set; }
    public string? PrimaryConfiguration { get; set; }
    public string? BackupConfiguration { get; set; }
    public int FailoverRetryCount { get; set; } = 3;

    public DataConnection(int siteId, string name, string protocol)
    {
        SiteId = siteId;
        Name = name ?? throw new ArgumentNullException(nameof(name));
        Protocol = protocol ?? throw new ArgumentNullException(nameof(protocol));
    }
}
```

### Step 2: Update EF Core mapping

In `SiteConfiguration.cs`, update the DataConnection mapping (around lines 46-47):

- Rename `Configuration` property mapping to `PrimaryConfiguration` (MaxLength 4000)
- Add `BackupConfiguration` property (optional, MaxLength 4000)
- Add `FailoverRetryCount` property (required, default 3)

```csharp
builder.Property(d => d.PrimaryConfiguration).HasMaxLength(4000);
builder.Property(d => d.BackupConfiguration).HasMaxLength(4000);
builder.Property(d => d.FailoverRetryCount).HasDefaultValue(3);
```

### Step 3: Create EF Core migration

Run:
```bash
cd src/ScadaLink.ConfigurationDatabase
dotnet ef migrations add AddDataConnectionBackupEndpoint \
  --startup-project ../ScadaLink.Host
```

Verify that the migration renames `Configuration` → `PrimaryConfiguration`. It should use `RenameColumn`, not drop + add, because dropping the column would discard the existing configuration data. If the scaffolded migration drops and recreates the column, fix it manually:

```csharp
migrationBuilder.RenameColumn(
    name: "Configuration",
    table: "DataConnections",
    newName: "PrimaryConfiguration");

migrationBuilder.AddColumn<string>(
    name: "BackupConfiguration",
    table: "DataConnections",
    maxLength: 4000,
    nullable: true);

migrationBuilder.AddColumn<int>(
    name: "FailoverRetryCount",
    table: "DataConnections",
    nullable: false,
    defaultValue: 3);
```

### Step 4: Update DataConnectionArtifact

In `DataConnectionArtifact.cs`, replace the single `ConfigurationJson` parameter with `PrimaryConfigurationJson` and `BackupConfigurationJson`, and add `FailoverRetryCount`:

```csharp
public record DataConnectionArtifact(
    string Name,
    string Protocol,
    string? PrimaryConfigurationJson,
    string? BackupConfigurationJson,
    int FailoverRetryCount = 3);
```

### Step 5: Build and fix compile errors

Run: `dotnet build ScadaLink.slnx`

This will surface all references to the old `Configuration` and `ConfigurationJson` fields across the codebase. Fix each one, including:
- ManagementActor handlers
- CLI commands
- UI pages
- Deployment/flattening code
- Tests

Fix only the field renames in this step (use `PrimaryConfiguration` where `Configuration` was). Don't add backup logic yet; just make it compile.

### Step 6: Run tests, fix failures

Run: `dotnet test ScadaLink.slnx`

Fix any test failures caused by the rename.

### Step 7: Commit

```bash
git add -A
git commit -m "feat(dcl): rename Configuration to PrimaryConfiguration, add BackupConfiguration and FailoverRetryCount"
```

---

## Task 2: Update CreateConnectionCommand & Manager Actor

**Files:**
- Modify: `src/ScadaLink.Commons/Messages/DataConnection/CreateConnectionCommand.cs`
- Modify: `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionManagerActor.cs` (lines 39-62)

### Step 1: Update CreateConnectionCommand message

```csharp
public record CreateConnectionCommand(
    string ConnectionName,
    string ProtocolType,
    IDictionary<string, string> PrimaryConnectionDetails,
    IDictionary<string, string>? BackupConnectionDetails = null,
    int FailoverRetryCount = 3);
```

### Step 2: Update DataConnectionManagerActor.HandleCreateConnection

Update the handler (around lines 39-62) to pass both configs to `DataConnectionActor`:

```csharp
private void HandleCreateConnection(CreateConnectionCommand command)
{
    if (_connectionActors.ContainsKey(command.ConnectionName))
    {
        _log.Warning("Connection {0} already exists", command.ConnectionName);
        return;
    }

    // The initial adapter always targets the primary endpoint.
    var adapter = _factory.Create(command.ProtocolType, command.PrimaryConnectionDetails);

    var props = Props.Create(() => new DataConnectionActor(
        command.ConnectionName,
        adapter,
        _options,
        _healthCollector,
        command.ProtocolType,
        command.PrimaryConnectionDetails,
        command.BackupConnectionDetails,
        command.FailoverRetryCount));

    // Replace characters that are not valid in an actor name with '-'.
    var actorName = new string(command.ConnectionName
        .Select(c => char.IsLetterOrDigit(c) || "-_.*$+:@&=,!~';()".Contains(c) ? c : '-')
        .ToArray());
    var actorRef = Context.ActorOf(props, actorName);
    _connectionActors[command.ConnectionName] = actorRef;

    _log.Info("Created DataConnectionActor for {0} (protocol={1}, backup={2})",
        command.ConnectionName, command.ProtocolType, command.BackupConnectionDetails != null ? "yes" : "none");
}
```
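
The name-sanitizing expression in the handler is easy to get subtly wrong, so it may be worth extracting into a helper that can be unit-tested in isolation. The sketch below assumes a hypothetical `ActorNames.Sanitize` helper (not part of this plan or the existing codebase); the allowed-character set is copied from the handler above.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the existing codebase.
public static class ActorNames
{
    // Punctuation the handler above lets through unchanged.
    private const string AllowedPunctuation = "-_.*$+:@&=,!~';()";

    // Replaces every character that is neither a letter, a digit, nor
    // allowed punctuation with '-'.
    public static string Sanitize(string connectionName)
    {
        if (connectionName is null)
            throw new ArgumentNullException(nameof(connectionName));

        return new string(connectionName
            .Select(c => char.IsLetterOrDigit(c) || AllowedPunctuation.Contains(c) ? c : '-')
            .ToArray());
    }
}
```

Two distinct connection names can map to the same sanitized name (for example, `Plant A` and `Plant/A` both become `Plant-A`). The `_connectionActors` check is keyed on the original name, so it would not catch that case; the plan does not say how such collisions should be handled.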

### Step 3: Update all callers of CreateConnectionCommand

Search for all places that construct `CreateConnectionCommand` and update them to use the new signature. The primary caller is the site-side deployment handler.
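
Callers that start from a `DataConnectionArtifact` hold the endpoint configs as JSON strings, while `CreateConnectionCommand` takes `IDictionary<string, string>`. How the existing deployment handler performs that conversion is not shown in this plan; the sketch below is one possible approach using `System.Text.Json`, with a hypothetical `ConnectionDetailsParser` helper. It assumes each config is a flat JSON object and keeps non-string values (such as a numeric port) as their raw JSON text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper; the real deployment handler may already have an equivalent.
public static class ConnectionDetailsParser
{
    // Returns null for a missing config so that an absent backup endpoint
    // maps naturally onto BackupConnectionDetails = null.
    public static IDictionary<string, string>? Parse(string? configurationJson)
    {
        if (string.IsNullOrWhiteSpace(configurationJson))
            return null;

        using var document = JsonDocument.Parse(configurationJson);
        if (document.RootElement.ValueKind != JsonValueKind.Object)
            throw new FormatException("Connection configuration must be a JSON object.");

        var details = new Dictionary<string, string>();
        foreach (var property in document.RootElement.EnumerateObject())
        {
            // Strings are unwrapped; numbers, booleans, etc. keep their raw JSON text.
            details[property.Name] = property.Value.ValueKind == JsonValueKind.String
                ? property.Value.GetString() ?? string.Empty
                : property.Value.GetRawText();
        }
        return details;
    }
}
```

With this shape, a caller could pass `ConnectionDetailsParser.Parse(artifact.BackupConfigurationJson)` straight into the `BackupConnectionDetails` argument, and connections without a backup would keep the single-endpoint behavior.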

### Step 4: Build and test

Run: `dotnet build ScadaLink.slnx && dotnet test tests/ScadaLink.DataConnectionLayer.Tests`

### Step 5: Commit

```bash
git add -A
git commit -m "feat(dcl): extend CreateConnectionCommand with backup config and failover retry count"
```

---

## Task 3: DataConnectionActor Failover State Machine

**Files:**
- Modify: `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs`
- Modify: `src/ScadaLink.DataConnectionLayer/DataConnectionFactory.cs`

This is the core change. The actor gains failover logic in its Reconnecting state.

### Step 1: Add new state fields to DataConnectionActor

Add these fields alongside the existing ones (around line 30):

```csharp
private readonly string _protocolType;
private readonly IDictionary<string, string> _primaryConfig;
private readonly IDictionary<string, string>? _backupConfig;
private readonly int _failoverRetryCount;
private readonly IDataConnectionFactory _factory;
private ActiveEndpoint _activeEndpoint = ActiveEndpoint.Primary;
private int _consecutiveFailures;

public enum ActiveEndpoint { Primary, Backup }
```

### Step 2: Update constructor

Extend the constructor to accept both configs and the retry count:

```csharp
public DataConnectionActor(
    string connectionName,
    IDataConnection adapter,
    DataConnectionOptions options,
    ISiteHealthCollector healthCollector,
    string protocolType,
    IDictionary<string, string> primaryConfig,
    IDictionary<string, string>? backupConfig = null,
    int failoverRetryCount = 3)
{
    _connectionName = connectionName;
    _adapter = adapter;
    _options = options;
    _healthCollector = healthCollector;
    _protocolType = protocolType;
    _primaryConfig = primaryConfig;
    _backupConfig = backupConfig;
    _failoverRetryCount = failoverRetryCount;
    _connectionDetails = primaryConfig; // start with primary
}
```

Note: The actor also needs an `IDataConnectionFactory` to create new adapters on failover. Pass it through the constructor or resolve it via DI. The `DataConnectionManagerActor` already has the factory, so the simplest option is to pass it through to the actor constructor (see Step 4).

### Step 3: Extend HandleReconnectResult with failover logic

Replace the reconnect-failure handling (around lines 279-296) so that it includes failover:

```csharp
private void HandleReconnectResult(ConnectResult result)
{
    if (result.Success)
    {
        _consecutiveFailures = 0;
        _log.Info("Reconnected {0} on {1} endpoint", _connectionName, _activeEndpoint);
        ReSubscribeAll();
        BecomeConnected();
        return;
    }

    _consecutiveFailures++;
    _log.Warning("Reconnect attempt {0}/{1} failed for {2} on {3}: {4}",
        _consecutiveFailures, _failoverRetryCount, _connectionName, _activeEndpoint, result.Error);

    // Fail over only when a backup is configured; otherwise keep retrying the primary.
    if (_consecutiveFailures >= _failoverRetryCount && _backupConfig != null)
    {
        // Switch endpoint (round-robin between Primary and Backup)
        var previousEndpoint = _activeEndpoint;
        _activeEndpoint = _activeEndpoint == ActiveEndpoint.Primary
            ? ActiveEndpoint.Backup
            : ActiveEndpoint.Primary;
        _consecutiveFailures = 0;

        var newConfig = _activeEndpoint == ActiveEndpoint.Primary ? _primaryConfig : _backupConfig;

        _log.Warning("Failing over {0} from {1} to {2}", _connectionName, previousEndpoint, _activeEndpoint);

        // Dispose old adapter (fire-and-forget), create new one.
        // _adapter and _connectionDetails must be non-readonly fields for this reassignment.
        _ = _adapter.DisposeAsync();
        _adapter = _factory.Create(_protocolType, newConfig);
        _connectionDetails = newConfig;

        // Wire up disconnect handler on new adapter
        _adapter.Disconnected += () => _self.Tell(new AdapterDisconnected());
    }

    // Schedule next retry
    Context.System.Scheduler.ScheduleTellOnce(
        _options.ReconnectInterval, Self, AttemptConnect.Instance, ActorRefs.NoSender);
}
```
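
The switching rule inside `HandleReconnectResult` can also be expressed without Akka.NET, which makes the round-robin behavior easier to reason about and to test. The following is a minimal sketch under that assumption; `FailoverTracker` and its members are hypothetical names, not part of this plan, and rejecting a retry count below 1 is an added assumption (the handler above does not validate it).

```csharp
using System;

public enum ActiveEndpoint { Primary, Backup }

// Hypothetical, actor-free model of the retry counting and endpoint switching.
public sealed class FailoverTracker
{
    private readonly int _failoverRetryCount;
    private readonly bool _hasBackup;
    private int _consecutiveFailures;

    public FailoverTracker(int failoverRetryCount, bool hasBackup)
    {
        // Assumption: a retry count below 1 is treated as a configuration error.
        if (failoverRetryCount < 1)
            throw new ArgumentOutOfRangeException(nameof(failoverRetryCount));
        _failoverRetryCount = failoverRetryCount;
        _hasBackup = hasBackup;
    }

    public ActiveEndpoint Active { get; private set; } = ActiveEndpoint.Primary;

    // A successful connect clears the failure count for the active endpoint.
    public void RecordSuccess() => _consecutiveFailures = 0;

    // Returns true when this failure caused a switch to the other endpoint.
    public bool RecordFailure()
    {
        _consecutiveFailures++;
        if (!_hasBackup || _consecutiveFailures < _failoverRetryCount)
            return false; // keep retrying the same endpoint

        Active = Active == ActiveEndpoint.Primary
            ? ActiveEndpoint.Backup
            : ActiveEndpoint.Primary;
        _consecutiveFailures = 0;
        return true;
    }
}
```

With a retry count of 2 and a backup configured, the first failure leaves the tracker on `Primary`, the second switches it to `Backup`, and two further failures switch it back. Without a backup, `RecordFailure` never reports a switch, which matches the "retries indefinitely" case.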

### Step 4: Pass IDataConnectionFactory to DataConnectionActor

Update `DataConnectionManagerActor.HandleCreateConnection` to pass the factory:

```csharp
var props = Props.Create(() => new DataConnectionActor(
    command.ConnectionName, adapter, _options, _healthCollector,
    _factory, // pass factory for failover adapter creation
    command.ProtocolType, command.PrimaryConnectionDetails,
    command.BackupConnectionDetails, command.FailoverRetryCount));
```

Then update the `DataConnectionActor` constructor to accept an `IDataConnectionFactory` parameter (after `healthCollector`, matching the argument order above) and store it in `_factory`.

### Step 5: Build and run existing tests

Run: `dotnet build ScadaLink.slnx && dotnet test tests/ScadaLink.DataConnectionLayer.Tests`

Existing tests must pass (they use single-endpoint configs, so no failover is triggered).

### Step 6: Commit

```bash
git add -A
git commit -m "feat(dcl): add failover state machine to DataConnectionActor with round-robin endpoint switching"
```

---

## Task 4: Failover Tests

**Files:**
- Modify: `tests/ScadaLink.DataConnectionLayer.Tests/DataConnectionActorTests.cs`

### Step 1: Write test for failover after N retries

```csharp
[Fact]
public async Task Reconnecting_AfterFailoverRetryCount_SwitchesToBackup()
{
    // Arrange: create actor with primary + backup, failoverRetryCount = 2
    var primaryAdapter = Substitute.For<IDataConnection>();
    var backupAdapter = Substitute.For<IDataConnection>();
    var factory = Substitute.For<IDataConnectionFactory>();
    factory.Create("OpcUa", Arg.Is<IDictionary<string, string>>(d => d["endpoint"] == "backup"))
        .Returns(backupAdapter);

    // Primary connects then disconnects
    primaryAdapter.ConnectAsync(Arg.Any<IDictionary<string, string>>(), Arg.Any<CancellationToken>())
        .Returns(Task.CompletedTask);
    primaryAdapter.Status.Returns(ConnectionHealth.Connected);

    var primaryConfig = new Dictionary<string, string> { ["endpoint"] = "primary" };
    var backupConfig = new Dictionary<string, string> { ["endpoint"] = "backup" };

    // Create actor, connect on primary
    // ... (use test kit patterns from existing tests)
    // Simulate disconnect, verify 2 failures then factory.Create called with backup config
}
```

### Step 2: Write test for single endpoint retrying forever

```csharp
[Fact]
public async Task Reconnecting_NoBackup_RetriesIndefinitely()
{
    // Arrange: create actor with primary only, no backup
    // Simulate 10 reconnect failures
    // Verify: factory.Create never called with backup, just keeps retrying
}
```

### Step 3: Write test for round-robin back to primary after backup fails

```csharp
[Fact]
public async Task Reconnecting_BackupFails_SwitchesBackToPrimary()
{
    // Arrange: primary + backup, failoverRetryCount = 1
    // Simulate: primary fails 1x → switch to backup → backup fails 1x → switch to primary
    // Verify: round-robin pattern
}
```

### Step 4: Write test for successful reconnect resetting the counter

```csharp
[Fact]
public async Task Reconnecting_SuccessfulConnect_ResetsConsecutiveFailures()
{
    // Arrange: failoverRetryCount = 3
    // Simulate: 2 failures on primary, then success
    // Verify: no failover, counter reset
}
```

### Step 5: Write test for ReSubscribeAll being called after failover

```csharp
[Fact]
public async Task Failover_ReSubscribesAllTagsOnNewAdapter()
{
    // Arrange: actor with subscriptions, then failover
    // Verify: new adapter receives SubscribeAsync calls for all previously subscribed tags
}
```

### Step 6: Run all tests

Run: `dotnet test tests/ScadaLink.DataConnectionLayer.Tests -v normal`

### Step 7: Commit

```bash
git add -A
git commit -m "test(dcl): add failover state machine tests for DataConnectionActor"
```

---

## Task 5: Health Reporting & Site Event Logging

**Files:**
- Modify: `src/ScadaLink.Commons/Messages/DataConnection/DataConnectionHealthReport.cs`
- Modify: `src/ScadaLink.DataConnectionLayer/Actors/DataConnectionActor.cs` (ReplyWithHealthReport, HandleReconnectResult)

### Step 1: Add ActiveEndpoint to health report

```csharp
public record DataConnectionHealthReport(
    string ConnectionName,
    ConnectionHealth Status,
    int TotalSubscribedTags,
    int ResolvedTags,
    string ActiveEndpoint,
    DateTimeOffset Timestamp);
```

### Step 2: Update ReplyWithHealthReport in DataConnectionActor

Update the health report method (around line 516) to include the active endpoint:

```csharp
private void ReplyWithHealthReport()
{
    var endpointLabel = _backupConfig == null
        ? "Primary (no backup)"
        : _activeEndpoint.ToString();

    Sender.Tell(new DataConnectionHealthReport(
        _connectionName, _adapter.Status,
        _subscriptionsByInstance.Values.Sum(s => s.Count),
        _resolvedTags,
        endpointLabel,
        DateTimeOffset.UtcNow));
}
```

### Step 3: Add site event logging on failover

In `HandleReconnectResult`, after switching endpoints, log a site event:

```csharp
if (_siteEventLogger != null)
{
    _ = _siteEventLogger.LogEventAsync(
        "connection", "Warning", null, _connectionName,
        $"Failover from {previousEndpoint} to {_activeEndpoint}",
        $"After {_failoverRetryCount} consecutive failures");
}
```

Note: The actor needs `ISiteEventLogger` injected. Add it as an optional constructor parameter.

### Step 4: Add site event logging on successful reconnect after failover

In the `HandleReconnectResult` success path, log a site event when the active endpoint differs from the last known-good endpoint. Tracking that endpoint needs an additional field, and the snippet below shows only the logging call, not the comparison:

```csharp
if (_siteEventLogger != null)
{
    _ = _siteEventLogger.LogEventAsync(
        "connection", "Info", null, _connectionName,
        $"Connection restored on {_activeEndpoint} endpoint", null);
}
```
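
Step 4 depends on knowing which endpoint last held a healthy connection, but none of the fields added in Task 3 records that. One way to track it, sketched here with hypothetical names (`RestoreDetector`, `OnConnected`), is to remember the endpoint of the most recent successful connect and compare against it on the next success.

```csharp
public enum ActiveEndpoint { Primary, Backup }

// Hypothetical helper: decides whether a successful connect happened on a
// different endpoint than the previous successful connect.
public sealed class RestoreDetector
{
    private ActiveEndpoint? _lastGoodEndpoint;

    // Call on every successful connect. Returns true only when an earlier
    // good endpoint exists and differs from the current one.
    public bool OnConnected(ActiveEndpoint current)
    {
        var changed = _lastGoodEndpoint.HasValue && _lastGoodEndpoint.Value != current;
        _lastGoodEndpoint = current;
        return changed;
    }
}
```

In the actor, the equivalent would be a nullable field updated in the success path, with the "Connection restored" event logged only when the comparison reports a change. A plain reconnect on the same endpoint would then produce no extra site event.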

### Step 5: Build and test

Run: `dotnet build ScadaLink.slnx && dotnet test tests/ScadaLink.DataConnectionLayer.Tests`

### Step 6: Commit

```bash
git add -A
git commit -m "feat(dcl): add active endpoint to health reports and log failover events"
```

---

## Task 6: Central UI Changes

**Files:**
- Modify: `src/ScadaLink.CentralUI/Components/Pages/Admin/DataConnections.razor`
- Modify: `src/ScadaLink.CentralUI/Components/Pages/Admin/DataConnectionForm.razor`

### Step 1: Update DataConnections list page

Add an `Active Endpoint` column to the table (around lines 28-64). Insert it after the Protocol column:

```html
<th>Active Endpoint</th>
```

And in the row template:

```html
<td>@connection.ActiveEndpoint</td>
```

This requires the list page to fetch health data alongside the connection list. Add a health status lookup or include `ActiveEndpoint` in the data connection response.

### Step 2: Update DataConnectionForm (rename Configuration label)

Change the "Configuration" label to "Primary Endpoint Configuration" (around lines 44-61).

### Step 3: Add backup endpoint section

Below the primary config field, add:

```html
@if (!_showBackup)
{
    <button type="button" class="btn btn-outline-secondary btn-sm mt-2"
            @onclick="() => _showBackup = true">
        Add Backup Endpoint
    </button>
}
else
{
    <div class="mt-3">
        <div class="d-flex justify-content-between align-items-center">
            <label class="form-label">Backup Endpoint Configuration</label>
            <button type="button" class="btn btn-outline-danger btn-sm"
                    @onclick="RemoveBackup">
                Remove Backup
            </button>
        </div>
        <textarea class="form-control" rows="4"
                  @bind="_model.BackupConfiguration"
                  placeholder='{"Host": "backup-host", "Port": 50101}' />
    </div>

    <div class="mt-3">
        <label class="form-label">Failover Retry Count</label>
        <input type="number" class="form-control" min="1" max="20"
               @bind="_model.FailoverRetryCount" />
        <small class="text-muted">Retries before switching to backup (default: 3)</small>
    </div>
}
```

### Step 4: Update form model and save logic

Add `BackupConfiguration` and `FailoverRetryCount` to the form model. Update the save method to pass both configs to the management API.

In edit mode, set `_showBackup = true` if `BackupConfiguration` is not null.
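
The markup in Step 3 refers to `_showBackup`, `RemoveBackup`, and `_model`, none of which are defined in this plan. A possible shape for that state, written as plain classes so it can be tested outside Blazor, might look like the following. `DataConnectionFormModel` and `DataConnectionFormState` are hypothetical names, and resetting the retry count when the backup is removed is an assumption.

```csharp
// Hypothetical form model; property names follow the entity in Task 1.
public sealed class DataConnectionFormModel
{
    public string? PrimaryConfiguration { get; set; }
    public string? BackupConfiguration { get; set; }
    public int FailoverRetryCount { get; set; } = 3;
}

// Hypothetical stand-in for the component's @code block.
public sealed class DataConnectionFormState
{
    public DataConnectionFormModel Model { get; }
    public bool ShowBackup { get; private set; }

    public DataConnectionFormState(DataConnectionFormModel model)
    {
        Model = model;
        // Edit mode: open the backup section when a backup is already stored.
        ShowBackup = model.BackupConfiguration != null;
    }

    public void AddBackup() => ShowBackup = true;

    // Clears the backup so a hidden, stale value is not saved by accident.
    public void RemoveBackup()
    {
        Model.BackupConfiguration = null;
        Model.FailoverRetryCount = 3; // assumption: fall back to the default
        ShowBackup = false;
    }
}
```

Clearing `BackupConfiguration` inside `RemoveBackup`, rather than only hiding the section, means the save method can pass the model through unchanged and the stored backup is removed.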

### Step 5: Build and verify visually

Run: `dotnet build ScadaLink.slnx`

Visual verification requires running the cluster, so document it as a manual test.

### Step 6: Commit

```bash
git add -A
git commit -m "feat(ui): add primary/backup endpoint fields to data connection form"
```

---

## Task 7: CLI, Management API, and Deployment

**Files:**
- Modify: `src/ScadaLink.Commons/Messages/Management/DataConnectionCommands.cs`
- Modify: `src/ScadaLink.CLI/Commands/DataConnectionCommands.cs`
- Modify: `src/ScadaLink.ManagementService/ManagementActor.cs` (lines 689-711)
- Modify: Deployment/flattening code that creates DataConnectionArtifact

### Step 1: Update management command messages

```csharp
public record CreateDataConnectionCommand(
    int SiteId, string Name, string Protocol,
    string? PrimaryConfiguration,
    string? BackupConfiguration = null,
    int FailoverRetryCount = 3);

public record UpdateDataConnectionCommand(
    int DataConnectionId, string Name, string Protocol,
    string? PrimaryConfiguration,
    string? BackupConfiguration = null,
    int FailoverRetryCount = 3);
```

### Step 2: Update ManagementActor handlers

In `HandleCreateDataConnection` (around line 689): set `PrimaryConfiguration`, `BackupConfiguration`, and `FailoverRetryCount` from the command.

In `HandleUpdateDataConnection` (around line 699): set the same fields.
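
The plan does not say whether the handlers validate the new fields. If they should, a check along these lines could run before the entity is saved. This is a sketch only: `DataConnectionInputValidator` is a hypothetical name, treating each config as a JSON object is an assumption, and the 1 to 20 range simply mirrors the `min`/`max` attributes on the UI input in Task 6.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical validator; not part of the existing ManagementActor.
public static class DataConnectionInputValidator
{
    // Returns a list of human-readable problems; empty means the input is acceptable.
    public static IReadOnlyList<string> Validate(
        string? primaryConfiguration, string? backupConfiguration, int failoverRetryCount)
    {
        var errors = new List<string>();
        if (!IsNullOrJsonObject(primaryConfiguration))
            errors.Add("PrimaryConfiguration must be a JSON object.");
        if (!IsNullOrJsonObject(backupConfiguration))
            errors.Add("BackupConfiguration must be a JSON object.");
        if (failoverRetryCount < 1 || failoverRetryCount > 20)
            errors.Add("FailoverRetryCount must be between 1 and 20.");
        return errors;
    }

    // Null is allowed because both configuration fields are optional.
    private static bool IsNullOrJsonObject(string? json)
    {
        if (json is null) return true;
        try
        {
            using var document = JsonDocument.Parse(json);
            return document.RootElement.ValueKind == JsonValueKind.Object;
        }
        catch (JsonException)
        {
            return false;
        }
    }
}
```

Running the same check in both the create and update handlers would keep CLI and UI submissions consistent, since both go through the management API.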

### Step 3: Update CLI commands

In `BuildCreate` (around lines 75-98):
- Rename `--configuration` to `--primary-config`
- Add hidden alias `--configuration` pointing to the same option
- Add `--backup-config` option (optional)
- Add `--failover-retry-count` option (optional, default 3)

In `BuildUpdate` (around lines 36-59): same changes.

In `BuildGet` (around lines 22-34): update output to show both configs.

### Step 4: Update deployment artifact creation

Find where `DataConnectionArtifact` is constructed (in deployment/flattening code). Update it to pass `PrimaryConfigurationJson` and `BackupConfigurationJson` from the entity.
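
Assuming the artifact is built directly from the `DataConnection` entity, the mapping might look like the sketch below. The record and entity are reduced copies of the definitions in Task 1 so the example compiles on its own; `ToArtifact` is a hypothetical extension method, and the real flattening code may be structured differently.

```csharp
public record DataConnectionArtifact(
    string Name,
    string Protocol,
    string? PrimaryConfigurationJson,
    string? BackupConfigurationJson,
    int FailoverRetryCount = 3);

// Reduced copy of the Task 1 entity, just enough for the mapping.
public class DataConnection
{
    public string Name { get; set; } = "";
    public string Protocol { get; set; } = "";
    public string? PrimaryConfiguration { get; set; }
    public string? BackupConfiguration { get; set; }
    public int FailoverRetryCount { get; set; } = 3;
}

// Hypothetical mapping helper.
public static class DataConnectionArtifactMapping
{
    public static DataConnectionArtifact ToArtifact(this DataConnection connection) =>
        new(connection.Name,
            connection.Protocol,
            connection.PrimaryConfiguration,
            connection.BackupConfiguration,
            connection.FailoverRetryCount);
}
```

Because `FailoverRetryCount` has a default value on the record, a call site that omits it still compiles but always deploys 3. Passing it explicitly, as above, avoids silently dropping the configured value.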

### Step 5: Build and test CLI

Run: `dotnet build ScadaLink.slnx`

Test CLI manually:
```bash
scadalink data-connection create --site-id 1 --name "Test" --protocol OpcUa \
  --primary-config '{"endpoint":"opc.tcp://localhost:50000"}' \
  --backup-config '{"endpoint":"opc.tcp://localhost:50010"}' \
  --failover-retry-count 3
```

### Step 6: Commit

```bash
git add -A
git commit -m "feat(cli): add --primary-config, --backup-config, --failover-retry-count to data connection commands"
```

---

## Task 8: Documentation Updates

**Files:**
- Modify: `docs/requirements/Component-DataConnectionLayer.md`
- Modify: `docs/requirements/HighLevelReqs.md`
- Modify: `docs/requirements/Component-CentralUI.md`
- Modify: `docs/test_infra/test_infra.md`

### Step 1: Update Component-DataConnectionLayer.md

Add new section "Endpoint Redundancy" covering:
- Optional backup endpoints
- Failover state machine (include ASCII diagram from design doc)
- Configuration model (PrimaryConfiguration + BackupConfiguration)
- Failover retry count and round-robin behavior
- Subscription re-creation on failover
- Health reporting (ActiveEndpoint field)
- Site event logging (DataConnectionFailover, DataConnectionRestored)

Update the configuration reference tables to show the new entity fields.

### Step 2: Update HighLevelReqs.md

Add requirement: "Data connections support optional backup endpoints with automatic failover after configurable retry count. On failover, all subscriptions are transparently re-created on the new endpoint."

### Step 3: Update Component-CentralUI.md

Update the Data Connections workflow section to describe:
- Primary/backup config fields on the form
- Collapsible backup section
- Failover retry count field
- Active endpoint column on list page

### Step 4: Update test_infra.md

Add a note in the Remote Test Infrastructure section that the dual OPC UA servers (50000/50010) and dual LmxProxy instances (50100/50101) enable primary/backup testing.

### Step 5: Commit

```bash
git add -A
git commit -m "docs(dcl): document primary/backup endpoint redundancy across requirements and test infra"
```
@@ -0,0 +1,14 @@
{
  "planPath": "docs/plans/2026-03-22-primary-backup-data-connections.md",
  "tasks": [
    {"id": 1, "subject": "Task 1: Entity Model & Database Migration", "status": "pending"},
    {"id": 2, "subject": "Task 2: Update CreateConnectionCommand & Manager Actor", "status": "pending", "blockedBy": [1]},
    {"id": 3, "subject": "Task 3: DataConnectionActor Failover State Machine", "status": "pending", "blockedBy": [1, 2]},
    {"id": 4, "subject": "Task 4: Failover Tests", "status": "pending", "blockedBy": [3]},
    {"id": 5, "subject": "Task 5: Health Reporting & Site Event Logging", "status": "pending", "blockedBy": [3]},
    {"id": 6, "subject": "Task 6: Central UI Changes", "status": "pending", "blockedBy": [1]},
    {"id": 7, "subject": "Task 7: CLI, Management API, and Deployment", "status": "pending", "blockedBy": [1]},
    {"id": 8, "subject": "Task 8: Documentation Updates", "status": "pending", "blockedBy": [3]}
  ],
  "lastUpdated": "2026-03-22T12:00:00Z"
}