lmxopcua/docs/plans/2026-05-26-akka-hosting-alignment-plan.md
Joseph Doherty fac32ad69b docs(plans): add v2 implementation plan with 66 bite-sized tasks
Converts the akka-hosting-alignment design into an executable plan:
12 phases covering branch/scaffold, ConfigDb schema, Commons,
Cluster, Security, ControlPlane singletons, Runtime per-node actors,
OpcUaServer extraction, AdminUI migration, Host entry point, cleanup,
integration tests, deploy scripts, and docs. Each task has files,
TDD steps, exact commands, classification, time estimate, and
parallelizable list. Co-located .tasks.json drives executing-plans
resume from any session.
2026-05-26 03:17:29 -04:00


OtOpcUa v2 — Akka.NET + Fused Hosting Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

Goal: Fuse OtOpcUa.Server and OtOpcUa.Admin into a single role-gated binary (OtOpcUa.Host), introduce an Akka.NET cluster (admin/driver roles) for control-plane singletons and per-node runtime actors, replace the draft/publish ConfigGeneration lifecycle with a live-edit + snapshot-deploy model, and drive OPC UA ServiceLevel from Akka cluster leadership while preserving the dual-endpoint warm-redundancy client behavior.

Architecture: Single solution with new component libraries (Cluster, Security, ControlPlane, Runtime, OpcUaServer, AdminUI, Commons) reused by one Host web binary. Akka 1.5.62 with Akka.Hosting + Akka.Cluster.Hosting + Akka.Cluster.Tools. Cluster singletons pinned to admin role; per-node actor trees on driver-role nodes. Existing ZB.MOM.WW.OtOpcUa.Configuration project keeps the EF Core DbContext (renamed-in-place, no project rename) and grows new tables for Deployment, NodeDeploymentState, ConfigEdit, DataProtectionKeys. EF migrations run via auto-migration on dev and, for prod, via an idempotent SQL script applied by Migrate-To-V2.ps1.

Tech Stack: .NET 10, Akka.NET 1.5.62 (Akka.Hosting, Akka.Cluster.Hosting, Akka.Cluster.Tools, Akka.Remote.Hosting, Akka.Streams), EF Core 10.0.7 (SQL Server), Blazor Server, SignalR, OPCFoundation .NET Standard stack, LDAP (Novell.Directory.Ldap.NETStandard), Bootstrap 5 (vendored).

Design source: docs/plans/2026-05-26-akka-hosting-alignment-design.md. Always read it before starting a task; it is the spec.

Branch: v2-akka-fuse off master.

Reference project: Sister repo ~/Desktop/scadalink-design — copy patterns, not code (different domain). Pattern files to copy from:

  • ScadaLink HOCON: src/ScadaLink.Host/Akka/akka.conf
  • ScadaLink Security setup: src/ScadaLink.Security/ServiceCollectionExtensions.cs
  • ScadaLink Cluster bootstrap: src/ScadaLink.Host/Program.cs:60-228
  • ScadaLink ClusterSingleton pattern: src/ScadaLink.ManagementService/

Conventions for every task

  • Branch: Stay on v2-akka-fuse. Never commit to master while plan is running.
  • TDD where it makes sense: New actors, new domain logic — write the test first. Pure refactors / file moves — verify-by-build is enough.
  • Build command: dotnet build ZB.MOM.WW.OtOpcUa.slnx — must be green before commit.
  • Test command: dotnet test ZB.MOM.WW.OtOpcUa.slnx --no-build — relevant new/changed tests must pass.
  • Commit format: Conventional Commits — feat(scope):, refactor(scope):, chore(scope):, test(scope):. Scope examples: host, cluster, runtime, controlplane, security, adminui, configdb.
  • Mac compatibility: All code must build on macOS. Windows-only APIs (AddWindowsService, Galaxy/Wonderware drivers) must be gated by OperatingSystem.IsWindows() or [SupportedOSPlatform].
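
The Mac-compatibility rule can be sketched with BCL-only calls. PlatformGate, SelectHostMode, and the returned strings are hypothetical names used only for illustration; a real host would put its Windows-only call (for example AddWindowsService) inside the guarded method.

```csharp
using System;
using System.Runtime.Versioning;

public static class PlatformGate
{
    // Marks the method as Windows-only. The platform-compatibility analyzer
    // (CA1416) flags any call site that is not behind an OS guard.
    [SupportedOSPlatform("windows")]
    private static string ConfigureWindowsOnly() => "windows-service";

    // The OperatingSystem.IsWindows() guard satisfies the analyzer and keeps
    // the Windows-only path from ever running on macOS or Linux.
    public static string SelectHostMode() =>
        OperatingSystem.IsWindows() ? ConfigureWindowsOnly() : "console";
}
```

Because Directory.Build.props in Task 0 turns warnings into errors, an unguarded call to a Windows-only API should fail the build rather than surface at runtime.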

Phase 0 — Branch & scaffolding

Task 0: Create branch and central package management

Classification: small Estimated implement time: ~3 min Parallelizable with: none (first task)

Files:

  • Create: /Users/dohertj2/Desktop/OtOpcUa/Directory.Packages.props
  • Create: /Users/dohertj2/Desktop/OtOpcUa/Directory.Build.props

Step 1: Create branch

cd ~/Desktop/OtOpcUa
git checkout -b v2-akka-fuse

Step 2: Create Directory.Packages.props with central package management for Akka + EF Core + ASP.NET Core. Source versions from ~/Desktop/scadalink-design/Directory.Packages.props. At minimum include:

<Project>
  <PropertyGroup>
    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
  </PropertyGroup>
  <ItemGroup>
    <PackageVersion Include="Akka" Version="1.5.62" />
    <PackageVersion Include="Akka.Cluster" Version="1.5.62" />
    <PackageVersion Include="Akka.Cluster.Hosting" Version="1.5.62" />
    <PackageVersion Include="Akka.Cluster.Tools" Version="1.5.62" />
    <PackageVersion Include="Akka.Hosting" Version="1.5.62" />
    <PackageVersion Include="Akka.Remote" Version="1.5.62" />
    <PackageVersion Include="Akka.Remote.Hosting" Version="1.5.62" />
    <PackageVersion Include="Akka.Streams" Version="1.5.62" />
    <PackageVersion Include="Akka.Streams.TestKit" Version="1.5.62" />
    <PackageVersion Include="Akka.TestKit.Xunit2" Version="1.5.62" />
    <PackageVersion Include="Microsoft.AspNetCore.Authentication.JwtBearer" Version="10.0.7" />
    <PackageVersion Include="Microsoft.AspNetCore.DataProtection.EntityFrameworkCore" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.Design" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.InMemory" Version="10.0.7" />
  </ItemGroup>
</Project>

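Audit the existing .csproj files for any package not listed; add it to Directory.Packages.props and strip the Version attribute from the csprojs. Do the same for packages that the new projects in Tasks 1-8 reference but the list above omits (for example Microsoft.Extensions.Hosting, Serilog.AspNetCore, Novell.Directory.Ldap.NETStandard): with central package management enabled, a PackageReference without a matching PackageVersion fails restore.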

Step 3: Create minimal Directory.Build.props:

<Project>
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>
</Project>

Step 4: Build green check

Run: dotnet build ZB.MOM.WW.OtOpcUa.slnx Expected: Build succeeded. If restore fails with NU1008, a csproj still carries a Version attribute on a PackageReference after centralization; remove it.

Step 5: Commit

git add Directory.Packages.props Directory.Build.props
git commit -m "chore(build): introduce central package management for v2"

Task 1: Create OtOpcUa.Commons project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 2, 3, 4, 5, 6, 7, 8

Files:

  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/ZB.MOM.WW.OtOpcUa.Commons.csproj
  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/.gitkeep
  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/.gitkeep
  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/.gitkeep
  • Modify: /Users/dohertj2/Desktop/OtOpcUa/ZB.MOM.WW.OtOpcUa.slnx (add Commons project)

Step 1: Create csproj

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Commons</RootNamespace>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Akka" />
  </ItemGroup>
</Project>

Step 2: Add to solution

Run: dotnet sln ZB.MOM.WW.OtOpcUa.slnx add src/Core/ZB.MOM.WW.OtOpcUa.Commons/ZB.MOM.WW.OtOpcUa.Commons.csproj

Step 3: Build green

Run: dotnet build ZB.MOM.WW.OtOpcUa.slnx Expected: Build succeeded.

Step 4: Commit

git add src/Core/ZB.MOM.WW.OtOpcUa.Commons/ ZB.MOM.WW.OtOpcUa.slnx
git commit -m "feat(commons): scaffold OtOpcUa.Commons project"

Task 2: Create OtOpcUa.Cluster project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 3, 4, 5, 6, 7, 8

Files:

  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ZB.MOM.WW.OtOpcUa.Cluster.csproj
  • Modify: ZB.MOM.WW.OtOpcUa.slnx

Step 1: Create csproj

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Cluster</RootNamespace>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Akka.Hosting" />
    <PackageReference Include="Akka.Cluster" />
    <PackageReference Include="Akka.Cluster.Hosting" />
    <PackageReference Include="Akka.Cluster.Tools" />
    <PackageReference Include="Akka.Remote.Hosting" />
    <PackageReference Include="Microsoft.Extensions.Hosting" />
    <PackageReference Include="Microsoft.Extensions.Options.ConfigurationExtensions" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Commons\ZB.MOM.WW.OtOpcUa.Commons.csproj" />
  </ItemGroup>
</Project>

Steps 2-4: add to solution, build, commit (feat(cluster): scaffold OtOpcUa.Cluster project).


Task 3: Create OtOpcUa.Security project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 4, 5, 6, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/ZB.MOM.WW.OtOpcUa.Security.csproj
  • Modify: ZB.MOM.WW.OtOpcUa.slnx

csproj: classlib targeting net10.0, references OtOpcUa.Commons, OtOpcUa.Configuration. Packages: Microsoft.AspNetCore.Authentication.Cookies, Microsoft.AspNetCore.Authentication.JwtBearer, Microsoft.IdentityModel.Tokens, System.IdentityModel.Tokens.Jwt, Novell.Directory.Ldap.NETStandard.

Commit: feat(security): scaffold OtOpcUa.Security project.


Task 4: Create OtOpcUa.ControlPlane project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 5, 6, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ZB.MOM.WW.OtOpcUa.ControlPlane.csproj

csproj: classlib, references OtOpcUa.Commons, OtOpcUa.Cluster, OtOpcUa.Configuration. Packages: Akka.Hosting, Akka.Cluster.Tools, Microsoft.AspNetCore.SignalR.Core.

Commit: feat(controlplane): scaffold OtOpcUa.ControlPlane project.


Task 5: Create OtOpcUa.Runtime project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 4, 6, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ZB.MOM.WW.OtOpcUa.Runtime.csproj

csproj: classlib, references OtOpcUa.Commons, OtOpcUa.Cluster, OtOpcUa.Configuration, OtOpcUa.OpcUaServer, all OtOpcUa.Driver.* abstraction projects (NOT concrete driver implementations — those are loaded reflectively). Packages: Akka.Hosting, Akka.Cluster.Tools.

Commit: feat(runtime): scaffold OtOpcUa.Runtime project.


Task 6: Create OtOpcUa.OpcUaServer project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 4, 5, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/ZB.MOM.WW.OtOpcUa.OpcUaServer.csproj

csproj: classlib, references OtOpcUa.Commons, OtOpcUa.Configuration. Packages: OPCFoundation.NetStandard.Opc.Ua.Server, OPCFoundation.NetStandard.Opc.Ua.Configuration. Copy exact versions from current ZB.MOM.WW.OtOpcUa.Server.csproj.

Commit: feat(opcua): scaffold OtOpcUa.OpcUaServer project.


Task 7: Create OtOpcUa.AdminUI Razor class library

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 4, 5, 6, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/ZB.MOM.WW.OtOpcUa.AdminUI.csproj

csproj:

<Project Sdk="Microsoft.NET.Sdk.Razor">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.AdminUI</RootNamespace>
    <AddRazorSupportForMvc>true</AddRazorSupportForMvc>
  </PropertyGroup>
  <ItemGroup>
    <FrameworkReference Include="Microsoft.AspNetCore.App" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Commons\ZB.MOM.WW.OtOpcUa.Commons.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Security\ZB.MOM.WW.OtOpcUa.Security.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.ControlPlane\ZB.MOM.WW.OtOpcUa.ControlPlane.csproj" />
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj" />
  </ItemGroup>
</Project>

Commit: feat(adminui): scaffold OtOpcUa.AdminUI Razor class library.


Task 8: Create OtOpcUa.Host Web SDK project

Classification: small Estimated implement time: ~5 min Parallelizable with: Task 1, 2, 3, 4, 5, 6, 7

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs (minimal "Hello, host" stub)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Properties/launchSettings.json

csproj:

<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Host</RootNamespace>
    <UserSecretsId>zb-mom-ww-otopcua-host</UserSecretsId>
    <AssemblyName>OtOpcUa.Host</AssemblyName>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.Hosting.WindowsServices" Condition="'$([System.Runtime.InteropServices.RuntimeInformation]::IsOSPlatform($([System.Runtime.InteropServices.OSPlatform]::Windows)))' == 'true'" />
    <PackageReference Include="Serilog.AspNetCore" />
    <PackageReference Include="Akka.Hosting" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Commons\ZB.MOM.WW.OtOpcUa.Commons.csproj" />
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Cluster\ZB.MOM.WW.OtOpcUa.Cluster.csproj" />
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Security\ZB.MOM.WW.OtOpcUa.Security.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.ControlPlane\ZB.MOM.WW.OtOpcUa.ControlPlane.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Runtime\ZB.MOM.WW.OtOpcUa.Runtime.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.OpcUaServer\ZB.MOM.WW.OtOpcUa.OpcUaServer.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.AdminUI\ZB.MOM.WW.OtOpcUa.AdminUI.csproj" />
  </ItemGroup>
</Project>

Stub Program.cs:

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () => "OtOpcUa.Host scaffold");
await app.RunAsync();

appsettings.json: empty {} for now.

launchSettings.json: profile OtOpcUa.Host with applicationUrl=http://localhost:9000.

Commit: feat(host): scaffold OtOpcUa.Host web project.


Task 9: Build green smoke

Classification: trivial Estimated implement time: ~2 min Parallelizable with: none (depends on Tasks 0-8)

Step 1: Run dotnet build ZB.MOM.WW.OtOpcUa.slnx. Expected: Build succeeded with no warnings (TreatWarningsAsErrors is on, so any warning fails the build). Fix anything that broke. No commit (verification only).


Phase 1 — ConfigDb schema (live-edit + deploy model)

Task 10: Add Deployment entity

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 11, 12, 13

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/Deployment.cs
  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs (add DbSet<Deployment> + OnModelCreating mapping)

Step 1: Create Deployment.cs:

namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;

public sealed class Deployment
{
    public Guid DeploymentId { get; init; } = Guid.NewGuid();
    public required string RevisionHash { get; init; }
    public DeploymentStatus Status { get; set; } = DeploymentStatus.Dispatching;
    public required string CreatedBy { get; init; }
    public DateTime CreatedAtUtc { get; init; } = DateTime.UtcNow;
    public byte[] ArtifactBlob { get; init; } = Array.Empty<byte>();
    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
    public string? FailureReason { get; set; }
    public DateTime? SealedAtUtc { get; set; }
}

public enum DeploymentStatus
{
    Dispatching = 0,
    AwaitingApplyAcks = 1,
    Sealed = 2,
    PartiallyFailed = 3,
    TimedOut = 4
}

Step 2: Add mapping in OtOpcUaConfigDbContext.OnModelCreating:

modelBuilder.Entity<Deployment>(b =>
{
    b.ToTable("Deployment");
    b.HasKey(d => d.DeploymentId);
    b.Property(d => d.RevisionHash).HasMaxLength(64).IsRequired();
    b.Property(d => d.Status).HasConversion<int>();
    b.Property(d => d.CreatedBy).HasMaxLength(128).IsRequired();
    b.Property(d => d.FailureReason).HasMaxLength(2048);
    b.Property(d => d.RowVersion).IsRowVersion();
    b.HasIndex(d => d.Status);
    b.HasIndex(d => d.CreatedAtUtc);
});

Step 3: Build green. Commit: feat(configdb): add Deployment entity.


Task 11: Add NodeDeploymentState entity (replaces ClusterNodeGenerationState)

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 10, 12, 13

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeDeploymentState.cs
  • Modify: OtOpcUaConfigDbContext.cs (add DbSet + mapping)

Schema: (NodeId, DeploymentId) composite key; Status enum Applying|Applied|Failed; StartedAtUtc, AppliedAtUtc?, FailureReason?, RowVersion.
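
A minimal sketch of the entity this schema describes, modeled on Deployment from Task 10. The enum name NodeDeploymentStatus, the string type for NodeId, and the default values are assumptions; the plan only fixes the column list.

```csharp
using System;

// Would live in ZB.MOM.WW.OtOpcUa.Configuration.Entities; namespace omitted here.
public sealed class NodeDeploymentState
{
    // Composite key parts. In OnModelCreating the key would be declared as
    // b.HasKey(s => new { s.NodeId, s.DeploymentId });
    public required string NodeId { get; init; }
    public required Guid DeploymentId { get; init; }

    public NodeDeploymentStatus Status { get; set; } = NodeDeploymentStatus.Applying;
    public DateTime StartedAtUtc { get; init; } = DateTime.UtcNow;
    public DateTime? AppliedAtUtc { get; set; }    // null until the node reports Applied
    public string? FailureReason { get; set; }     // null unless Status is Failed
    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}

// Hypothetical enum name; the values follow the Applying|Applied|Failed list.
public enum NodeDeploymentStatus
{
    Applying = 0,
    Applied = 1,
    Failed = 2
}
```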

Do NOT delete ClusterNodeGenerationState.cs yet — keep it for the migration step in Task 14.

Commit: feat(configdb): add NodeDeploymentState entity.


Task 12: Add ConfigEdit audit entity

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 10, 11, 13

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigEdit.cs
  • Modify: OtOpcUaConfigDbContext.cs

Schema: (EditId GUID PK, EntityType string, EntityId GUID, FieldsJson nvarchar(max), ExecutionId GUID NULL, EditedBy, EditedAtUtc, SourceNode).

Captures per-row edits to Equipment, Driver, DriverInstance, Script, etc. Inserted by AdminOperationsActor on every mutating op.
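
A possible shape for the entity, assuming string columns for EditedBy and SourceNode and a JSON object of changed fields in FieldsJson. The ForChange helper is hypothetical and only shows what an insert from AdminOperationsActor might carry.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Would live in ZB.MOM.WW.OtOpcUa.Configuration.Entities; namespace omitted here.
public sealed class ConfigEdit
{
    public Guid EditId { get; init; } = Guid.NewGuid();
    public required string EntityType { get; init; }   // e.g. "Equipment", "DriverInstance"
    public required Guid EntityId { get; init; }
    public required string FieldsJson { get; init; }   // nvarchar(max)
    public Guid? ExecutionId { get; init; }            // nullable per the schema
    public required string EditedBy { get; init; }
    public DateTime EditedAtUtc { get; init; } = DateTime.UtcNow;
    public required string SourceNode { get; init; }

    // Hypothetical helper: serializes only the changed fields into FieldsJson.
    public static ConfigEdit ForChange(
        string entityType,
        Guid entityId,
        IReadOnlyDictionary<string, object?> changedFields,
        string editedBy,
        string sourceNode) =>
        new()
        {
            EntityType = entityType,
            EntityId = entityId,
            FieldsJson = JsonSerializer.Serialize(changedFields),
            EditedBy = editedBy,
            SourceNode = sourceNode
        };
}
```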

Commit: feat(configdb): add ConfigEdit audit entity.


Task 13: Add DataProtection keys table

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 10, 11, 12

Files:

  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/ZB.MOM.WW.OtOpcUa.Configuration.csproj — add Microsoft.AspNetCore.DataProtection.EntityFrameworkCore package
  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs (implement IDataProtectionKeyContext):

public DbSet<Microsoft.AspNetCore.DataProtection.EntityFrameworkCore.DataProtectionKey> DataProtectionKeys
    => Set<Microsoft.AspNetCore.DataProtection.EntityFrameworkCore.DataProtectionKey>();

Commit: feat(configdb): persist DataProtection keys in ConfigDb.


Task 14: EF migration — drop ConfigGeneration and ClusterNode.RedundancyRole, add new tables

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: none (depends on Tasks 10-13)

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/<timestamp>_V2HostingAlignment.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/<timestamp>_V2HostingAlignment.Designer.cs
  • Modify: OtOpcUaConfigDbContext.cs — remove DbSet<ConfigGeneration> and DbSet<ClusterNodeGenerationState>; remove ClusterNode.RedundancyRole property
  • Delete: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigGeneration.cs
  • Delete: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ClusterNodeGenerationState.cs

Step 1: Generate migration

Run: dotnet ef migrations add V2HostingAlignment --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration --startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host

If dotnet-ef isn't installed: dotnet tool install --global dotnet-ef --version 10.0.7.

Step 2: Audit the generated migration — it should:

  • DropTable("ConfigGeneration")
  • DropTable("ClusterNodeGenerationState")
  • DropColumn("RedundancyRole", "ClusterNode")
  • CreateTable("Deployment", ...)
  • CreateTable("NodeDeploymentState", ...)
  • CreateTable("ConfigEdit", ...)
  • CreateTable("DataProtectionKeys", ...)

If extra changes appear (e.g., column-type drift), reconcile by editing the entity classes — do not edit the migration directly.

Step 3: Verify on a scratch SQL Server

docker run --rm -d --name v2-migration-test -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Pass@word123" -p 14333:1433 mcr.microsoft.com/mssql/server:2022-latest
# Wait ~10s for SQL Server to start
ConnectionStrings__ConfigDb="Server=localhost,14333;Database=OtOpcUaV2Test;User Id=sa;Password=Pass@word123;TrustServerCertificate=true" \
  dotnet ef database update --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration --startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host

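Expected: completes without error. Verify with docker exec v2-migration-test /opt/mssql-tools18/bin/sqlcmd -C -S localhost -U sa -P Pass@word123 -d OtOpcUaV2Test -Q "SELECT name FROM sys.tables ORDER BY name". (Recent 2022-latest images ship sqlcmd under /opt/mssql-tools18, where -C trusts the container's self-signed certificate; older images use /opt/mssql-tools/bin/sqlcmd without -C.)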

Step 4: Tear down

docker stop v2-migration-test.

Step 5: Commit

git add src/Core/ZB.MOM.WW.OtOpcUa.Configuration/
git commit -m "feat(configdb): V2HostingAlignment migration — drop ConfigGeneration, add Deployment+NodeDeploymentState+ConfigEdit"

Task 15: Migrate-To-V2.ps1 idempotent prod migration script

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 16, 17, 18 (Phase 2)

Files:

  • Create: scripts/migration/Migrate-To-V2.ps1
  • Create: scripts/migration/Migrate-To-V2.sql (the idempotent SQL output)

Step 1: Generate idempotent SQL from EF

Run: dotnet ef migrations script --idempotent --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration --startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host --output scripts/migration/Migrate-To-V2.sql

Step 2: PowerShell wrapper:

[CmdletBinding()]
param(
  [Parameter(Mandatory)][string] $ConnectionString,
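  [string] $BackupPath = "$env:TEMP\OtOpcUa-V1-Backup-$(Get-Date -Format yyyyMMddHHmmss).bak"
)

# Make any failed statement terminate the script, so a failed backup can never be followed by the migration.
$ErrorActionPreference = 'Stop'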

Write-Host "Step 1/4 — Backup ConfigDb to $BackupPath"
Invoke-Sqlcmd -ConnectionString $ConnectionString -Query "BACKUP DATABASE [OtOpcUaConfigDb] TO DISK = '$BackupPath' WITH FORMAT, COMPRESSION"

Write-Host "Step 2/4 — Row counts before"
$beforeCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$beforeCounts | Format-Table

Write-Host "Step 3/4 — Apply Migrate-To-V2.sql"
Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\Migrate-To-V2.sql"

Write-Host "Step 4/4 — Row counts after + validation"
$afterCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$afterCounts | Format-Table

# Validation gates
$tablesNow = (Invoke-Sqlcmd -ConnectionString $ConnectionString -Query "SELECT name FROM sys.tables ORDER BY name").name
foreach ($t in 'Deployment','NodeDeploymentState','ConfigEdit','DataProtectionKeys') {
  if ($tablesNow -notcontains $t) { throw "Expected table $t missing." }
}
foreach ($t in 'ConfigGeneration','ClusterNodeGenerationState') {
  if ($tablesNow -contains $t) { throw "Legacy table $t still present." }
}
Write-Host "Migration complete. Backup at $BackupPath"

Also create scripts/migration/count-rows.sql listing per-table row counts for the audit.

Commit: feat(migration): add Migrate-To-V2.ps1 idempotent migration runner.


Phase 2 — Commons types and contracts

Task 16: Common types

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 17, 18

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/CorrelationId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/ExecutionId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/NodeId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/DeploymentId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/RevisionHash.cs

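Each is a readonly record struct wrapping a Guid (CorrelationId, ExecutionId, DeploymentId) or a string (NodeId, RevisionHash). Implement ToString() and a static Parse; the record struct already supplies IEquatable<T> and value equality.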

Example (CorrelationId.cs):

namespace ZB.MOM.WW.OtOpcUa.Commons.Types;

public readonly record struct CorrelationId(Guid Value)
{
    public static CorrelationId NewId() => new(Guid.NewGuid());
    public override string ToString() => Value.ToString("N");
    public static CorrelationId Parse(string s) => new(Guid.ParseExact(s, "N"));
}

Same pattern for ExecutionId, DeploymentId, NodeId (string), RevisionHash (string).
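
For the string-backed types the pattern differs slightly from CorrelationId. The sketch below is one way to write RevisionHash; the 64-character limit is borrowed from the RevisionHash column mapping in Task 10, and the empty-string fallback in ToString() is an assumption made so that default(RevisionHash) does not return null.

```csharp
using System;

public readonly record struct RevisionHash(string Value)
{
    // default(RevisionHash) leaves Value null, so fall back to an empty string.
    public override string ToString() => Value ?? string.Empty;

    // Rejects empty input and anything longer than the 64-character column.
    public static RevisionHash Parse(string s)
    {
        if (string.IsNullOrWhiteSpace(s))
            throw new FormatException("Revision hash must not be empty.");
        if (s.Length > 64)
            throw new FormatException("Revision hash exceeds 64 characters.");
        return new RevisionHash(s);
    }
}
```

NodeId could follow the same shape with its own validation rules.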

Commit: feat(commons): add correlation/execution/node/deployment/revisionhash types.


Task 17: Akka message contracts

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 16, 18

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DispatchDeployment.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/ApplyAck.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DeploymentSealed.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DeploymentFailed.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Admin/StartDeployment.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Admin/StartDeploymentResult.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Redundancy/RedundancyStateChanged.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Redundancy/NodeRedundancyState.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Fleet/FleetStatusChanged.cs

All are sealed records carrying a CorrelationId field. Example:

using ZB.MOM.WW.OtOpcUa.Commons.Types;

namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Deploy;

public sealed record DispatchDeployment(
    DeploymentId DeploymentId,
    RevisionHash RevisionHash,
    CorrelationId CorrelationId);

public sealed record ApplyAck(
    DeploymentId DeploymentId,
    NodeId NodeId,
    ApplyAckOutcome Outcome,
    string? FailureReason,
    CorrelationId CorrelationId);

public enum ApplyAckOutcome { Applied, Failed }

Commit: feat(commons): add deploy/admin/audit/redundancy/fleet message contracts.


Task 18: Common interfaces

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 16, 17

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IClusterRoleInfo.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IAdminOperationsClient.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IFleetDiagnosticsClient.cs

public interface IClusterRoleInfo
{
    NodeId LocalNode { get; }
    IReadOnlySet<string> LocalRoles { get; }
    bool HasRole(string role);
    IReadOnlyList<NodeId> MembersWithRole(string role);
    NodeId? RoleLeader(string role);
    event EventHandler<RoleLeaderChangedEventArgs>? RoleLeaderChanged;
}

public interface IAdminOperationsClient
{
    Task<StartDeploymentResult> StartDeploymentAsync(string createdBy, CancellationToken ct);
    // … other mutating ops added in later tasks
}

public interface IFleetDiagnosticsClient
{
    Task<NodeDiagnosticsSnapshot> GetDiagnosticsAsync(NodeId nodeId, CancellationToken ct);
}

Commit: feat(commons): add cluster/admin/diagnostics client interfaces.


Phase 3 — Cluster library

Task 19: HOCON config

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 20, 21, 22

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/Resources/akka.conf
  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ZB.MOM.WW.OtOpcUa.Cluster.csproj (embed resource)

Step 1: Copy ~/Desktop/scadalink-design/src/ScadaLink.Host/Akka/akka.conf (or equivalent path — check what ScadaLink actually has) as a starting template, then adapt:

  • actor.provider = cluster
  • remote.dot-netty.tcp { hostname = "0.0.0.0", port = 4053 }
  • cluster.roles = [] (populated dynamically by Task 21)
  • cluster.split-brain-resolver.active-strategy = keep-oldest
  • cluster.split-brain-resolver.stable-after = 15s
  • cluster.down-removal-margin = 15s
  • cluster.failure-detector.heartbeat-interval = 2s
  • cluster.failure-detector.threshold = 10.0
  • cluster.singleton.singleton-name = "singleton"
  • cluster.singleton-proxy.singleton-identification-interval = 1s
  • Synchronized dispatcher for OPC UA actors (Task 44):
    opcua-synchronized-dispatcher {
      type = "PinnedDispatcher"
      executor = "thread-pool-executor"
    }
    

If ScadaLink puts HOCON inline in Program.cs rather than a .conf file, embed it the same way — but a separate .conf file is preferred for editability.

Step 2: Mark as embedded resource in csproj:

<ItemGroup>
  <EmbeddedResource Include="Resources\akka.conf" />
</ItemGroup>

Step 3: Add a loader helper src/Core/ZB.MOM.WW.OtOpcUa.Cluster/HoconLoader.cs:

public static class HoconLoader
{
    public static string LoadBaseConfig()
    {
        using var stream = typeof(HoconLoader).Assembly
            .GetManifestResourceStream("ZB.MOM.WW.OtOpcUa.Cluster.Resources.akka.conf")
            ?? throw new InvalidOperationException("akka.conf resource not found");
        using var reader = new StreamReader(stream);
        return reader.ReadToEnd();
    }
}

Commit: feat(cluster): embed Akka HOCON config matching ScadaLink tuning.


Task 20: AkkaHostedService implementation

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 19, 21, 22

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/AkkaHostedService.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/AkkaClusterOptions.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ServiceCollectionExtensions.cs

AkkaClusterOptions.cs:

public sealed class AkkaClusterOptions
{
    public string SystemName { get; set; } = "otopcua";
    public string Hostname { get; set; } = "0.0.0.0";
    public int Port { get; set; } = 4053;
    public string PublicHostname { get; set; } = "127.0.0.1";
    public string[] SeedNodes { get; set; } = Array.Empty<string>();
    public string[] Roles { get; set; } = Array.Empty<string>();
}

AkkaHostedService.cs: Implements IHostedService. On Start, builds ActorSystem from HoconLoader.LoadBaseConfig() + overlay from AkkaClusterOptions, then joins the cluster by passing the parsed seed-node addresses to Cluster.Get(system).JoinSeedNodes(...). On Stop, calls CoordinatedShutdown.Get(system).Run(CoordinatedShutdown.ClusterLeavingReason.Instance) with a 30s timeout.

ServiceCollectionExtensions.AddOtOpcUaCluster(IConfiguration): binds AkkaClusterOptions, registers AkkaHostedService as IHostedService, registers ActorSystem as a singleton resolved from the hosted service.
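
The "overlay from AkkaClusterOptions" step could look roughly like the sketch below. HoconOverlay and Build are hypothetical names; the configuration keys are the standard Akka.NET ones. Seed nodes are left out on purpose, on the assumption that the hosted service joins them programmatically rather than through config, and values are assumed not to contain quote characters.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HoconOverlay
{
    // Quotes each item and joins them into a HOCON list such as ["admin","driver"].
    private static string List(IEnumerable<string> items) =>
        "[" + string.Join(",", items.Select(i => $"\"{i}\"")) + "]";

    // Produces only the per-node settings; everything else comes from akka.conf.
    public static string Build(
        string hostname, int port, string publicHostname, IEnumerable<string> roles) =>
        $$"""
        akka.remote.dot-netty.tcp {
          hostname = "{{hostname}}"
          public-hostname = "{{publicHostname}}"
          port = {{port}}
        }
        akka.cluster.roles = {{List(roles)}}
        """;
}
```

The overlay would then take precedence over the embedded base config, for example via ConfigurationFactory.ParseString(overlay).WithFallback(ConfigurationFactory.ParseString(HoconLoader.LoadBaseConfig())).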

Mirror the wiring in ~/Desktop/scadalink-design/src/ScadaLink.Host/Program.cs Akka block. Don't deviate on tuning.

Commit: feat(cluster): AkkaHostedService and DI extension.


Task 21: Role parsing from OTOPCUA_ROLES env

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 19, 20, 22

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/RoleParser.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/RoleParserTests.cs (also creates the test project — see Task 23 for the csproj)

public static class RoleParser
{
    public static string[] Parse(string? raw)
    {
        if (string.IsNullOrWhiteSpace(raw)) return Array.Empty<string>();
        var roles = raw.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
                       .Select(r => r.ToLowerInvariant())
                       .Distinct()
                       .ToArray();
        foreach (var r in roles)
            if (r is not ("admin" or "driver" or "dev"))
                throw new ArgumentException($"Unknown role '{r}'. Allowed: admin, driver, dev.");
        return roles;
    }
}

Tests cover: empty input → empty; "admin" → ["admin"]; "admin,driver" → both; whitespace tolerant; case-insensitive; throws on unknown role.
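
One way the truth table might be written with plain xUnit assertions. The sketch assumes the RoleParser from this task is in scope (for example through a using for the namespace it ends up in) and that the test project references xunit as described in Task 23; the test names are illustrative.

```csharp
using System;
using Xunit;

public class RoleParserTests
{
    [Fact]
    public void Null_input_yields_no_roles() =>
        Assert.Empty(RoleParser.Parse(null));

    [Theory]
    [InlineData("")]
    [InlineData("   ")]
    public void Blank_input_yields_no_roles(string raw) =>
        Assert.Empty(RoleParser.Parse(raw));

    [Fact]
    public void Single_role_is_returned() =>
        Assert.Equal(new[] { "admin" }, RoleParser.Parse("admin"));

    [Fact]
    public void Two_roles_are_both_returned() =>
        Assert.Equal(new[] { "admin", "driver" }, RoleParser.Parse("admin,driver"));

    // Covers trimming, lower-casing, de-duplication, and a trailing comma at once.
    [Fact]
    public void Whitespace_case_and_duplicates_are_normalized() =>
        Assert.Equal(new[] { "admin", "driver" }, RoleParser.Parse(" Admin , DRIVER ,admin,"));

    [Fact]
    public void Unknown_role_throws() =>
        Assert.Throws<ArgumentException>(() => RoleParser.Parse("admin,operator"));
}
```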

Commit: feat(cluster): parse OTOPCUA_ROLES env var with validation.


Task 22: IClusterRoleInfo implementation

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 19, 20, 21

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ClusterRoleInfo.cs

Implements IClusterRoleInfo (from Task 18). Wraps Akka.Cluster.Cluster.Get(ActorSystem). Subscribes to ClusterEvent.LeaderChanged, ClusterEvent.RoleLeaderChanged, ClusterEvent.IMemberEvent via an internal subscriber actor, raises CLR event.

Commit: feat(cluster): ClusterRoleInfo wraps Akka.Cluster for app-facing role queries.


Task 23: Cluster test project + initial tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (verification task — depends on Tasks 19-22)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/HoconLoaderTests.cs — asserts HOCON parses and key values present
  • Move: tests/.../RoleParserTests.cs if Task 21 dropped it elsewhere

csproj: xUnit test project, references OtOpcUa.Cluster, OtOpcUa.Commons. Packages: xunit, xunit.runner.visualstudio, Microsoft.NET.Test.Sdk, FluentAssertions.

HoconLoaderTests.cs: parses HOCON via Akka.Configuration.ConfigurationFactory.ParseString(HoconLoader.LoadBaseConfig()), asserts actor.provider == "cluster", cluster.split-brain-resolver.active-strategy == "keep-oldest", etc.

Run: dotnet test tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/. Expected: all green.

Add to solution: dotnet sln ZB.MOM.WW.OtOpcUa.slnx add tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests.csproj.

Commit: test(cluster): HOCON parses, role parser truth table.


Phase 4 — Security library

Task 24: Move LdapAuthService into OtOpcUa.Security

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 25 (different file)

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Security/LdapAuthService.cs → src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs
  • Rename namespace: ZB.MOM.WW.OtOpcUa.Admin.Security → ZB.MOM.WW.OtOpcUa.Security.Ldap
  • Update all callers (use grep -rl 'OtOpcUa.Admin.Security' to find them; update with sed or by hand)

Commit: refactor(security): move LdapAuthService into OtOpcUa.Security library.


Task 25: JwtTokenService

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 24

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Jwt/JwtTokenService.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Jwt/JwtOptions.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/JwtTokenServiceTests.cs (also creates test csproj)

Mirror ~/Desktop/scadalink-design/src/ScadaLink.Security/JwtTokenService.cs. Options: SigningKey (HS256, ≥32 bytes), Issuer, Audience, ExpiryMinutes (default 15). Issue(claims) → string. TryValidate(token, out principal) → bool.

Tests cover: valid token roundtrip; expired token rejected; tampered token rejected; missing required claim rejected.
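The mechanics behind those test cases can be illustrated with a BCL-only sketch. This is not the ScadaLink implementation that the task mirrors, and the real service would normally delegate to a JWT library; the class name Hs256Sketch, the two-claim payload (sub, exp) and the explicit `now` parameter are assumptions made for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical illustration of HS256 issue/validate; not the plan's JwtTokenService.
public static class Hs256Sketch
{
    static string B64Url(byte[] b) =>
        Convert.ToBase64String(b).TrimEnd('=').Replace('+', '-').Replace('/', '_');

    static byte[] FromB64Url(string s)
    {
        s = s.Replace('-', '+').Replace('_', '/');
        return Convert.FromBase64String(s.PadRight(s.Length + (4 - s.Length % 4) % 4, '='));
    }

    public static string Issue(byte[] key, string subject, DateTimeOffset expiresAt)
    {
        if (key.Length < 32) throw new ArgumentException("HS256 key must be at least 32 bytes.");
        var header = B64Url(Encoding.UTF8.GetBytes("{\"alg\":\"HS256\",\"typ\":\"JWT\"}"));
        var payload = B64Url(JsonSerializer.SerializeToUtf8Bytes(
            new { sub = subject, exp = expiresAt.ToUnixTimeSeconds() }));
        var sig = B64Url(HMACSHA256.HashData(key, Encoding.ASCII.GetBytes($"{header}.{payload}")));
        return $"{header}.{payload}.{sig}";
    }

    public static bool TryValidate(byte[] key, string token, DateTimeOffset now, out string? subject)
    {
        subject = null;
        var parts = token.Split('.');
        if (parts.Length != 3) return false;
        try
        {
            // Recompute the signature and compare in constant time (rejects tampering).
            var expected = HMACSHA256.HashData(key, Encoding.ASCII.GetBytes($"{parts[0]}.{parts[1]}"));
            if (!CryptographicOperations.FixedTimeEquals(expected, FromB64Url(parts[2]))) return false;

            using var doc = JsonDocument.Parse(FromB64Url(parts[1]));
            // Both claims are required; a token missing either one is rejected.
            if (!doc.RootElement.TryGetProperty("sub", out var sub) ||
                !doc.RootElement.TryGetProperty("exp", out var exp)) return false;
            if (exp.GetInt64() <= now.ToUnixTimeSeconds()) return false; // expired
            subject = sub.GetString();
            return subject is not null;
        }
        catch (Exception e) when (e is FormatException or JsonException or InvalidOperationException)
        {
            return false;
        }
    }
}
```

The validator never reads the header's alg field and always verifies with HMAC-SHA256, so a token cannot choose its own algorithm. Passing `now` explicitly keeps the expiry test deterministic.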

Commit: feat(security): JwtTokenService with HS256 + 15-min expiry.


Task 26: Cookie+JWT hybrid registration extension

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 27, 28

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs

AddOtOpcUaAuth(IConfiguration):

  1. Bind JwtOptions from Security:Jwt, bind CookieOptions from Security:Cookie.
  2. services.AddDataProtection().PersistKeysToDbContext<OtOpcUaConfigDbContext>().SetApplicationName("OtOpcUa").
  3. services.AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    • .AddCookie(o => { o.Cookie.Name = "OtOpcUa.Auth"; o.Cookie.HttpOnly = true; o.Cookie.SameSite = SameSiteMode.Strict; o.Cookie.SecurePolicy = CookieSecurePolicy.SameAsRequest; o.SlidingExpiration = true; o.ExpireTimeSpan = TimeSpan.FromMinutes(30); })
    • .AddJwtBearer(JwtBearerDefaults.AuthenticationScheme, o => { /* HS256 with JwtOptions.SigningKey */ }).
  4. services.AddAuthorization() + fallback policy requiring authenticated user.
  5. Register LdapAuthService, JwtTokenService, RoleMapper.

Mirror the wiring in ~/Desktop/scadalink-design/src/ScadaLink.Security/ServiceCollectionExtensions.cs exactly for the cookie/JWT/DataProtection plumbing.

Commit: feat(security): cookie+JWT hybrid auth via AddOtOpcUaAuth.


Task 27: /auth/login, /auth/ping, /auth/token endpoints

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 26, 28

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Endpoints/AuthEndpoints.cs

Mirror ~/Desktop/scadalink-design/src/ScadaLink.Security/Endpoints/AuthEndpoints.cs. Three minimal-API endpoints:

  • POST /auth/login — accepts {username, password}, calls LdapAuthService.AuthenticateAsync, builds claims (sub, roles), issues cookie via HttpContext.SignInAsync AND embeds JWT in cookie. Returns 204 on success / 401 on bad creds / 503 on LDAP unreachable.
  • GET /auth/ping is [AllowAnonymous]; it returns 200 if User.Identity.IsAuthenticated, 401 otherwise.
  • POST /auth/token — authenticated, returns {token: "..."} JWT bearer for external clients.

Extension method MapOtOpcUaAuth(this IEndpointRouteBuilder). Wire in Host Program.cs at Task 53.

Commit: feat(security): /auth/login, /auth/ping, /auth/token endpoints.


Task 28: CookieAuthenticationStateProvider for Blazor circuits

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 26, 27

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Blazor/CookieAuthenticationStateProvider.cs

Standard pattern: snapshots HttpContext.User at circuit construction, polls /auth/ping every 60s to detect expiry, calls NotifyAuthenticationStateChanged on transition. Mirror ScadaLink's equivalent — search ~/Desktop/scadalink-design/src/ScadaLink.CentralUI/ for the *AuthenticationStateProvider* file.

Commit: feat(security): CookieAuthenticationStateProvider for Blazor circuit expiry detection.


Task 29: Security test project + tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (verification — depends on Tasks 24-28)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/ZB.MOM.WW.OtOpcUa.Security.Tests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/JwtTokenServiceTests.cs (moved from Task 25 if dropped elsewhere)
  • Create: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/AuthEndpointsTests.cs — uses Microsoft.AspNetCore.Mvc.Testing with a WebApplicationFactory<Program> against a stubbed LDAP

Tests cover: login happy path issues cookie+JWT; login bad password returns 401; login with LDAP outage returns 503; /auth/ping after expired cookie returns 401; /auth/token issues a valid JWT for authenticated user.

Add to solution. Run: dotnet test tests/ZB.MOM.WW.OtOpcUa.Security.Tests/. Expected: all green.

Commit: test(security): cookie+JWT roundtrip, login/ping/token endpoint tests.


Phase 5 — ControlPlane cluster singletons

Task 30: ConfigPublishCoordinator — happy path

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 32 (different files; sibling singletons)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Coordinators/ConfigPublishCoordinator.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorTests.cs (also creates test csproj)

Step 1: Write failing test (Akka.TestKit.Xunit2)

[Fact]
public async Task HappyPath_AllNodesAck_SealsDeployment()
{
    using var harness = new ControlPlaneHarness();
    var coord = harness.Sys.ActorOf(ConfigPublishCoordinator.Props(harness.DbFactory));

    var ack1 = new ApplyAck(harness.DeploymentId, NodeId.Of("node-a"), ApplyAckOutcome.Applied, null, CorrelationId.NewId());
    var ack2 = new ApplyAck(harness.DeploymentId, NodeId.Of("node-b"), ApplyAckOutcome.Applied, null, CorrelationId.NewId());

    coord.Tell(new DispatchDeployment(harness.DeploymentId, harness.RevisionHash, CorrelationId.NewId()));
    coord.Tell(ack1);
    coord.Tell(ack2);

    await harness.WaitUntil(() => harness.LoadDeploymentStatus() == DeploymentStatus.Sealed, TimeSpan.FromSeconds(5));
}

ControlPlaneHarness is a helper that spins up Akka TestKit + in-memory EF Core ConfigDb seeded with a Deployment row in Dispatching and two ClusterNode rows.

Step 2: Run test, expect FAIL (class doesn't exist).

Step 3: Implement ConfigPublishCoordinator minimal:

public sealed class ConfigPublishCoordinator : ReceiveActor
{
    public static Props Props(IDbContextFactory<OtOpcUaConfigDbContext> dbFactory) =>
        Akka.Actor.Props.Create(() => new ConfigPublishCoordinator(dbFactory));

    private readonly IDbContextFactory<OtOpcUaConfigDbContext> _dbFactory;
    private readonly HashSet<NodeId> _expectedAcks = new();
    private DeploymentId _current;
    private readonly Dictionary<NodeId, ApplyAckOutcome> _acks = new();

    public ConfigPublishCoordinator(IDbContextFactory<OtOpcUaConfigDbContext> dbFactory)
    {
        _dbFactory = dbFactory;
        Receive<DispatchDeployment>(HandleDispatch);
        Receive<ApplyAck>(HandleAck);
    }

    private void HandleDispatch(DispatchDeployment msg)
    {
        _current = msg.DeploymentId;
        // Reset per-deployment tracking so acks from an earlier deployment cannot count toward this one.
        _expectedAcks.Clear();
        _acks.Clear();
        using var ctx = _dbFactory.CreateDbContext();
        _expectedAcks.UnionWith(ctx.ClusterNodes.Where(n => n.RolesCsv.Contains("driver")).Select(n => NodeId.Of(n.NodeId)).ToList());
        DistributedPubSub.Get(Context.System).Mediator.Tell(new Publish("deployments", msg));
    }

    private void HandleAck(ApplyAck msg)
    {
        if (msg.DeploymentId != _current) return; // stale
        _acks[msg.NodeId] = msg.Outcome;
        if (_acks.Count == _expectedAcks.Count && _acks.Values.All(o => o == ApplyAckOutcome.Applied))
            SealDeployment();
    }

    private void SealDeployment()
    {
        using var ctx = _dbFactory.CreateDbContext();
        var d = ctx.Deployments.Single(x => x.DeploymentId == _current.Value);
        d.Status = DeploymentStatus.Sealed;
        d.SealedAtUtc = DateTime.UtcNow;
        ctx.SaveChanges();
    }
}

Step 4: Run test, expect PASS.

Step 5: Commit: feat(controlplane): ConfigPublishCoordinator happy path.


Task 31: ConfigPublishCoordinator — timeout + failover recovery

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 32

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Coordinators/ConfigPublishCoordinator.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorTimeoutTests.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorRecoveryTests.cs

Step 1: Add tests for:

  • Deadline elapses with one node unacked → Deployment.Status = TimedOut.
  • New Coordinator started with in-flight Dispatching deployment recovers state via PreStart (queries Deployment + NodeDeploymentState).

Step 2: Extend Coordinator with:

  • Context.System.Scheduler.ScheduleTellOnce(applyMaxDuration, Self, new DeadlineElapsed(_current)) after dispatch.
  • Receive<DeadlineElapsed> handler that marks TimedOut if any node unacked.
  • protected override void PreStart(): read Deployment rows where Status is one of {Dispatching, AwaitingApplyAcks}; for each, repopulate _current, _expectedAcks, _acks from NodeDeploymentState; schedule remaining deadline.
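The deadline and recovery rules can be kept out of the actor as pure functions, which makes them easy to unit-test without TestKit. The following is a minimal sketch under assumptions: the names AckTracking, DeploymentOutcome, Evaluate and RemainingDeadline are hypothetical, node ids are plain strings, and a node that reports anything other than Applied is simply treated as not yet applied.

```csharp
using System;
using System.Collections.Generic;

public enum DeploymentOutcome { InFlight, Sealed, TimedOut }

// Hypothetical helper; the coordinator would call it from HandleAck and the DeadlineElapsed handler.
public static class AckTracking
{
    // expected: driver nodes that must ack; applied: nodes that have acked with Applied.
    public static DeploymentOutcome Evaluate(
        IReadOnlySet<string> expected, IReadOnlySet<string> applied, bool deadlineElapsed)
    {
        if (expected.IsSubsetOf(applied)) return DeploymentOutcome.Sealed;
        return deadlineElapsed ? DeploymentOutcome.TimedOut : DeploymentOutcome.InFlight;
    }

    // After failover the new singleton schedules only what is left of the original window.
    public static TimeSpan RemainingDeadline(
        DateTime dispatchedAtUtc, TimeSpan applyMaxDuration, DateTime nowUtc)
    {
        var remaining = dispatchedAtUtc + applyMaxDuration - nowUtc;
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }
}
```

A zero remaining deadline means the recovered coordinator should process DeadlineElapsed immediately rather than wait a full window again.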

Step 3: Run all ConfigPublishCoordinatorTests. Expected: all green.

Commit: feat(controlplane): ConfigPublishCoordinator deadline timeout + failover recovery.


Task 32: AdminOperationsActor + StartDeployment handler

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 33, 34, 35

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/AdminOperationsActor.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/ConfigComposer.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AdminOperationsActorTests.cs

Responsibilities:

  1. Receive StartDeployment(createdBy, correlationId).
  2. ConfigComposer.SnapshotAndFlatten(dbContext) → byte[] ArtifactBlob (DataContract-serialized or System.Text.Json over the flat artifact). Pure function.
  3. Compute RevisionHash = Convert.ToHexString(SHA256.HashData(artifactBlob)).
  4. Insert Deployment row (Status = Dispatching).
  5. Insert one ConfigEdit audit row marking the deployment snapshot.
  6. coordinator.Tell(new DispatchDeployment(deploymentId, revisionHash, correlationId)).
  7. Reply StartDeploymentResult(deploymentId, revisionHash) to sender.
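Steps 2 and 3 only yield a stable RevisionHash if the flattening is deterministic. A minimal sketch of that idea, assuming a hypothetical two-field EquipmentRow shape and System.Text.Json as the serializer (the real artifact covers far more than equipment rows):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text.Json;

// Hypothetical row shape, for illustration only.
public sealed record EquipmentRow(string EquipmentId, string Name);

public static class RevisionHashSketch
{
    // Sort by a stable key with an ordinal comparer so row order in the database
    // (or the current culture) cannot change the bytes.
    public static byte[] Flatten(IEnumerable<EquipmentRow> rows) =>
        JsonSerializer.SerializeToUtf8Bytes(
            rows.OrderBy(r => r.EquipmentId, StringComparer.Ordinal).ToArray());

    // Upper-case hex of the SHA-256 digest of the artifact bytes.
    public static string RevisionHash(byte[] artifactBlob) =>
        Convert.ToHexString(SHA256.HashData(artifactBlob));
}
```

Any nondeterministic input (timestamps, unordered dictionaries, culture-sensitive formatting) would need the same treatment before it reaches the serializer.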

For now stub CRUD ops as TODO comments — they'll be filled in Task 51 (UI wiring).

Tests: snapshot is deterministic given a fixed seed of equipment rows; hash matches; Deployment row inserted; DispatchDeployment dispatched to mocked coordinator.

Commit: feat(controlplane): AdminOperationsActor + ConfigComposer + StartDeployment flow.


Task 33: AuditWriterActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 32, 34, 35

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AuditWriterActorTests.cs

Receives AuditEvent messages, batches into in-memory buffer (cap 500 events / 5s flush window), bulk-inserts to ConfigAuditLog. Idempotent on EventId (INSERT IF NOT EXISTS or MERGE). On PreRestart flushes buffer.

Tests: 1000 events with random duplicates → ConfigAuditLog has correct count, no duplicates; a simulated supervisor restart triggers PreRestart, and the test verifies the buffer is flushed before the failing actor instance is replaced.
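The buffering part can be sketched as a small class that the actor owns. This assumes a hypothetical AuditEvent shape keyed by a Guid EventId and covers only in-buffer de-duplication; duplicates that arrive in different batches still rely on the idempotent insert, and the 5s flush timer is left to the actor.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message shape, for illustration only.
public sealed record AuditEvent(Guid EventId, string Message);

public sealed class AuditBatchBuffer
{
    private readonly int _capacity;
    private readonly Dictionary<Guid, AuditEvent> _pending = new();

    public AuditBatchBuffer(int capacity = 500) => _capacity = capacity;

    public int Count => _pending.Count;

    // Returns true when the buffer has reached capacity and should be flushed now.
    // A repeated EventId is ignored, so the first copy of an event wins.
    public bool Add(AuditEvent e)
    {
        _pending.TryAdd(e.EventId, e);
        return _pending.Count >= _capacity;
    }

    // Hands the current batch to the caller and empties the buffer.
    // The actor would call this on the flush timer, on capacity, and in PreRestart.
    public IReadOnlyList<AuditEvent> Drain()
    {
        var batch = new List<AuditEvent>(_pending.Values);
        _pending.Clear();
        return batch;
    }
}
```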

Commit: feat(controlplane): AuditWriterActor with batched idempotent insert.


Task 34: FleetStatusBroadcaster

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 32, 33, 35

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Fleet/FleetStatusBroadcaster.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/FleetStatusBroadcasterTests.cs

Subscribes to ClusterEvent.MemberUp, MemberRemoved, UnreachableMember, ReachableMember, LeaderChanged, RoleLeaderChanged. Receives per-node DriverHostStatusHeartbeat Tells. Maintains in-memory FleetSnapshot. Pushes diffs via injected IHubContext<FleetStatusHub> and IHubContext<AlertHub>.

Hubs themselves are not built yet; at this stage inject a mock IHubContext for tests. The hubs are rewired to this broadcaster in Task 49.

Tests: cluster member up → diff broadcast; heartbeat staleness → unreachable broadcast; full snapshot on OnConnectedAsync request.

Commit: feat(controlplane): FleetStatusBroadcaster push-driven from Akka cluster events.


Task 35: RedundancyStateActor

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 32, 33, 34

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Redundancy/RedundancyStateActor.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Redundancy/ServiceLevelCalculator.cs (pure function)
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ServiceLevelCalculatorTests.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/RedundancyStateActorTests.cs

ServiceLevelCalculator: pure static function per design §6:

public static byte Compute(NodeHealthInputs h)
{
    if (h.MemberState is not (MemberStatus.Up or MemberStatus.Joining))
        return 0;
    byte basis = (h.DbReachable, h.OpcUaProbeOk, h.Stale) switch
    {
        (true,  true,  false) => 240,
        (true,  _,     true)  => 200,
        (false, _,     true)  => 100,
        _ => 0
    };
    return (byte)Math.Clamp(basis + (h.IsDriverRoleLeader ? 10 : 0), 0, 255);
}

Tests: every combination of inputs → expected byte (FsCheck or table-driven).
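For the table-driven option, the full truth table is small enough to enumerate. The sketch below restates the rule over stand-in types so it compiles without Akka: MemberState replaces Akka's MemberStatus, and the NodeHealthInputs shape is assumed from the property names used in Compute.

```csharp
using System;
using System.Collections.Generic;

public enum MemberState { Joining, Up, Leaving, Down } // hypothetical stand-in for MemberStatus

public readonly record struct NodeHealthInputs(
    MemberState Status, bool DbReachable, bool OpcUaProbeOk, bool Stale, bool IsDriverRoleLeader);

public static class ServiceLevelTable
{
    // Same rule as the calculator above, restated over the stand-in types.
    public static byte Compute(NodeHealthInputs h)
    {
        if (h.Status is not (MemberState.Up or MemberState.Joining)) return 0;
        int basis = (h.DbReachable, h.OpcUaProbeOk, h.Stale) switch
        {
            (true, true, false) => 240,
            (true, _, true) => 200,
            (false, _, true) => 100,
            _ => 0
        };
        return (byte)Math.Clamp(basis + (h.IsDriverRoleLeader ? 10 : 0), 0, 255);
    }

    // Enumerates all 16 boolean combinations for one member state.
    public static IEnumerable<(NodeHealthInputs Input, byte Level)> Rows(MemberState state)
    {
        var flags = new[] { false, true };
        foreach (var db in flags)
        foreach (var probe in flags)
        foreach (var stale in flags)
        foreach (var leader in flags)
        {
            var h = new NodeHealthInputs(state, db, probe, stale, leader);
            yield return (h, Compute(h));
        }
    }
}
```

Enumerating the table makes one property of the rule visible: the leader bonus is added even when the basis is 0, so a driver-role leader that is not stale but has a failed DB or probe check reports 10 rather than 0. Whether that matches design §6 is worth confirming before the expected values are frozen into tests.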

RedundancyStateActor: subscribes to cluster events, debounces 250ms, recomputes per-node NodeRedundancyState, publishes RedundancyStateChanged via DistributedPubSub topic redundancy-state.

Commit: feat(controlplane): RedundancyStateActor + pure ServiceLevelCalculator.


Task 36: Singleton registration extension

Classification: standard Estimated implement time: ~4 min Parallelizable with: none (depends on Tasks 30-35)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ServiceCollectionExtensions.cs

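AddOtOpcUaControlPlane(): registers all five singletons through the Akka.Cluster.Hosting singleton extension methods (written as WithClusterSingleton<TActor> below), all pinned to admin role. Nodes that only need to talk to a singleton use the matching proxy registration.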

Pattern (mirror ~/Desktop/scadalink-design/src/ScadaLink.ManagementService/ServiceCollectionExtensions.cs):

public static IServiceCollection AddOtOpcUaControlPlane(this IServiceCollection services)
{
    services.AddSingleton<IControlPlaneStartup, ControlPlaneStartup>();
    return services;
}

internal sealed class ControlPlaneStartup : IControlPlaneStartup
{
    public void Configure(AkkaConfigurationBuilder cb)
    {
        cb.WithClusterSingleton<ConfigPublishCoordinator>("config-publish", new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<AdminOperationsActor>("admin-ops",         new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<AuditWriterActor>("audit-writer",          new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<FleetStatusBroadcaster>("fleet-status",    new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<RedundancyStateActor>("redundancy-state",  new ClusterSingletonOptions { Role = "admin" });
    }
}

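Verify against ScadaLink's actual API surface; Akka.Hosting syntax may differ across versions. In particular, Akka.Cluster.Hosting publishes this extension as WithSingleton<TKey>(singletonName, actorProps, options), with WithSingletonProxy<TKey> for proxy-only nodes, so treat WithClusterSingleton above as a placeholder name and supply Props for each actor.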

Commit: feat(controlplane): singleton registration extension pinned to admin role.


Phase 6 — Runtime per-node actors

Task 37: DriverHostActor scaffolding + bootstrap

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 43, 44 (different actors)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorBootstrapTests.cs (also creates test csproj)

DriverHostActor responsibilities (this task):

  • PreStart: read NodeDeploymentState for self; if Applied → Become Steady(currentDeployment); if Applying (orphan) → discard, replay; if no row + ConfigDb unreachable → fall back to LiteDb cache → Become Stale.
  • Subscribe to DistributedPubSub topic deployments.

State machine via Become: Bootstrapping → Steady | Applying(id) | Stale.

Tests: orphan Applying row → re-runs apply on PreStart; missing row + DB unreachable → Stale state.

Commit: feat(runtime): DriverHostActor scaffolding + PreStart recovery.


Task 38: DriverHostActor DispatchDeployment handler

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 43, 44 (different actors)

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorDispatchTests.cs

Add:

  • Receive<DispatchDeployment>:
    • If currentRevisionHash == msg.RevisionHash → reply ApplyAck(Applied) immediately.
    • Else → write NodeDeploymentState(Applying), Become Applying(msg.DeploymentId), fetch artifact, compute delta, dispatch ApplyDelta to children, collect acks, write NodeDeploymentState(Applied|Failed), reply ApplyAck to coordinator, Become Steady.

For now children dispatch is mocked — actual DriverInstanceActor integration in Task 41.

Tests: idempotent dispatch (same hash → ack, no work); successful apply → ack Applied; child failure → ack Failed.

Commit: feat(runtime): DriverHostActor handles DispatchDeployment idempotently.


Task 39: DriverHostActor stale-config fallback

Classification: standard Estimated implement time: ~4 min Parallelizable with: Task 41, 42, 43, 44

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs (stale fallback + background reconnect)
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorStaleTests.cs

Background Context.System.Scheduler.ScheduleTellRepeatedly(30s, 30s, Self, RetryConfigDbConnection.Instance). On RetryConfigDbConnection: try ConfigDb; on success and current state is Stale, pull latest sealed deployment, apply, Become Steady; publish NodeRedundancyState(Stale=false) to redundancy-state topic.

Tests: simulated DB outage → Stale published; DB recovery → state advances + Stale=false published.

Commit: feat(runtime): DriverHostActor stale-config fallback + reconnect.


Task 40: Runtime test project bootstrap

Classification: small Estimated implement time: ~3 min Parallelizable with: none (depends on Tasks 37-39)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/RuntimeHarness.cs — TestKit base with EF in-memory + driver mocks

Confirm all DriverHostActor tests from Tasks 37-39 pass. Add to solution.

Commit: test(runtime): test project scaffold + DriverHostActor tests passing.


Task 41: DriverInstanceActor state machine

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 37-39 already done; parallel with Task 42, 43, 44

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverInstanceActorTests.cs

States via Become: Connecting → Connected → Reconnecting → Failed.

  • PreStart → enter Connecting; call IDriver.InitializeAsync.
  • On connect success → Become Connected; subscribe tags; publish OpcUaPublishActor.AttributeValueUpdate.
  • On disconnect → Become Reconnecting; publish bad quality to all subscribed tags; schedule retry at fixed interval (driver.ReconnectIntervalSeconds, default 10).
  • On ApplyDelta(plan) → idempotent diff against current state; only changed attributes update; reply ApplyResult to parent.
  • On write request via Ask<WriteAttribute> → synchronous; failure returned to caller.
  • Crashes are supervised by the parent, which restarts the actor with exponential backoff.
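The two retry policies in this list differ: reconnect attempts use a fixed interval, while restarts after a crash back off exponentially. The delay schedule for the latter can be sketched as below; the name BackoffSketch and the doubling-with-cap formula are assumptions, and Akka.NET's own BackoffSupervisor pattern may be the better fit in the real actor.

```csharp
using System;

// Hypothetical helper illustrating an exponential backoff schedule.
public static class BackoffSketch
{
    // Delay before restart attempt `attempt` (0-based): min * 2^attempt, capped at max.
    public static TimeSpan Delay(int attempt, TimeSpan min, TimeSpan max)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));
        // Cap the exponent so the multiplication cannot overflow for long outages.
        var factor = Math.Pow(2, Math.Min(attempt, 30));
        var ticks = Math.Min(min.Ticks * factor, max.Ticks);
        return TimeSpan.FromTicks((long)ticks);
    }
}
```

With min = 1s and max = 30s the schedule runs 1s, 2s, 4s, 8s, 16s and then stays at 30s. Production backoff usually adds random jitter so that several drivers do not retry in lockstep.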

Reuse existing IDriver interface (from current OtOpcUa.Driver.* projects).

Tests: connecting transitions to Connected on success; disconnect triggers bad-quality publish + Reconnecting; write failure returned to Ask caller; ApplyDelta diffs correctly.

Commit: feat(runtime): DriverInstanceActor with Connecting/Connected/Reconnecting/Failed.


Task 42: VirtualTagActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 41, 43, 44

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/VirtualTags/VirtualTagActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/VirtualTagActorTests.cs

Wraps existing VirtualTagEngine from ~/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/. When a value update arrives for one of the tag's subscribed dependencies, the actor recomputes the expression and publishes the result to OpcUaPublishActor.

Expression compile errors fail the actor; the parent restarts it with backoff.

Commit: feat(runtime): VirtualTagActor wrapping VirtualTagEngine.


Task 43: ScriptedAlarmActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 44

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/ScriptedAlarmActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ScriptedAlarmActorTests.cs

Wraps existing AlarmConditionService. State machine Inactive → Active → Acknowledged → Inactive. On state change, emits history row to HistorianAdapterActor. The PreRestart hook serializes current alarm state to the ScriptedAlarmState ConfigDb table; PreStart (which the default PostRestart invokes on the replacement instance) rehydrates from it.

Commit: feat(runtime): ScriptedAlarmActor with state preservation across restart.


Task 44: OpcUaPublishActor on synchronized dispatcher

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 43

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/OpcUa/OpcUaPublishActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/OpcUaPublishActorTests.cs

Bridge between Akka messages and OPCFoundation address space. Pinned dispatcher: opcua-synchronized-dispatcher (from HOCON, Task 19) — Props.WithDispatcher("opcua-synchronized-dispatcher").

Responsibilities:

  • Receive AttributeValueUpdate(nodeId, value, quality, timestampUtc) → write to OPC UA address space.
  • Receive AlarmStateUpdate(...) → write alarm node.
  • Subscribe to DistributedPubSub topic redundancy-state → on NodeRedundancyState for this node, write ServiceLevel byte + ServerUriArray nodes.
  • Receive RebuildAddressSpace → marshal address-space rebuild via OPC UA SDK API; bump sequence number.

OPC UA SDK objects are NEVER exposed in message payloads — actor owns them internally.

Tests: receive update → SDK write invoked; ServiceLevel update → ServiceLevel node written with correct byte.

Commit: feat(runtime): OpcUaPublishActor bridges Akka and OPCFoundation address space.


Task 45: HistorianAdapterActor, PeerOpcUaProbeActor, DbHealthProbeActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (last Phase 6 task — combines three small actors)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Historian/HistorianAdapterActor.cs

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Health/PeerOpcUaProbeActor.cs

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Health/DbHealthProbeActor.cs

  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/HealthProbeActorTests.cs

  • HistorianAdapterActor: wraps existing named-pipe IPC to Wonderware sidecar. Buffers writes to SQLite store-and-forward on pipe disconnect. Reuses existing SqliteStoreAndForwardSink from current OtOpcUa code (find via grep -rln SqliteStoreAndForwardSink ~/Desktop/OtOpcUa/src).

  • PeerOpcUaProbeActor: per-peer-node periodic OPC UA opc.tcp://peer:4840 ping. Publishes OpcUaProbeResult(nodeId, ok) to redundancy-state topic input.

  • DbHealthProbeActor: cached DB probe (single-flight) feeding /health/ready + RedundancyStateActor. Reuses DbHealthCache if present.

Spawn all three actors as children of DriverHostActor.

Commit: feat(runtime): HistorianAdapter + PeerOpcUaProbe + DbHealthProbe actors.


Phase 7 — OpcUaServer extraction

Task 46: Move OpcUaApplicationHost + Phase7Composer

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (large file move with namespace rename)

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs → src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs
  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs → src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs
  • Update all callers (namespace rename to ZB.MOM.WW.OtOpcUa.OpcUaServer)

Use grep -rln 'ZB.MOM.WW.OtOpcUa.Server.OpcUa' ~/Desktop/OtOpcUa/src and grep -rln 'ZB.MOM.WW.OtOpcUa.Server.Phase7' ~/Desktop/OtOpcUa/src to find imports; update them.

Build green check.

Commit: refactor(opcua): extract OpcUaApplicationHost and Phase7Composer to OpcUaServer library.


Task 47: Make Phase7Composer pure + property test

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 48-52 (Phase 8)

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs (remove side effects; take inputs as parameters)
  • Test: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/Phase7ComposerPurityTests.cs

Refactor: remove static state, remove logging side effects (or make logging optional via injected ILogger?), return a Phase7CompositionResult record. Same inputs must always produce identical output.

Property test (FsCheck or hand-rolled): generate random EquipmentRow[], DriverInstanceRow[], ScriptRow[] arrays; call ComposeAsync twice; assert results structurally equal.
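A hand-rolled version of that property can be as small as a seeded loop. In the sketch below Compose is a hypothetical stand-in for the real composer, EquipmentRow has an assumed two-field shape, and results are compared with SequenceEqual; the real test would call ComposeAsync and compare Phase7CompositionResult structurally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape, for illustration only.
public sealed record EquipmentRow(string EquipmentId, string Name);

public static class PurityCheck
{
    // Stand-in for Phase7Composer: any pure function from rows to a comparable result.
    public static string[] Compose(IEnumerable<EquipmentRow> rows) =>
        rows.OrderBy(r => r.EquipmentId, StringComparer.Ordinal)
            .Select(r => $"{r.EquipmentId}|{r.Name}")
            .ToArray();

    // Generates between 0 and max rows from a seeded generator, so failures are reproducible.
    public static EquipmentRow[] RandomRows(Random rng, int max = 20) =>
        Enumerable.Range(0, rng.Next(max + 1))
                  .Select(i => new EquipmentRow($"eq-{rng.Next(1000)}-{i}", $"name-{rng.Next(1000)}"))
                  .ToArray();

    // Returns the first seed for which two calls on the same input disagree, or null if none does.
    public static int? FirstCounterexample(int cases)
    {
        for (var seed = 0; seed < cases; seed++)
        {
            var rows = RandomRows(new Random(seed));
            if (!Compose(rows).SequenceEqual(Compose(rows))) return seed;
        }
        return null;
    }
}
```

Reporting the failing seed rather than a bare boolean lets a failure be replayed exactly, which is most of what FsCheck's shrinking would otherwise provide here.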

Commit: refactor(opcua): make Phase7Composer pure + property tests.


Phase 8 — AdminUI library migration

Task 48: Move Blazor components into AdminUI library

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 47

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Components/* → src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/*
  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/wwwroot/* → src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/wwwroot/*

Namespace rename across all .razor + .razor.cs: ZB.MOM.WW.OtOpcUa.Admin.Components → ZB.MOM.WW.OtOpcUa.AdminUI.Components.

Add a MapAdminUI<TApp>(this IEndpointRouteBuilder) extension method in src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/EndpointRouteBuilderExtensions.cs that maps Razor components and static assets; Task 53 calls it as app.MapAdminUI<App>() with no further arguments. Mirror ScadaLink's MapCentralUI<TApp> exactly.

Build green check.

Commit: refactor(adminui): move Blazor components from Admin into AdminUI Razor class library.


Task 49: Move SignalR hubs into AdminUI; rewire to FleetStatusBroadcaster

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 50, 51, 52

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Hubs/*.cs → src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Hubs/*.cs
  • Delete: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Hubs/FleetStatusPoller.cs (replaced by FleetStatusBroadcaster)
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Fleet/FleetStatusBroadcaster.cs — inject IHubContext<FleetStatusHub>, push diffs to it; do same for AlertHub, ScriptLogHub

Note: hubs in AdminUI reference ControlPlane only for telemetry types; ControlPlane references hub interfaces via DI'd IHubContext<T> — no project-reference cycle.

Build green check.

Commit: refactor(adminui): SignalR hubs fed by FleetStatusBroadcaster push, no polling.


Task 50: IAdminOperationsClient wrapper

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 49, 51, 52

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Clients/AdminOperationsClient.cs

Implements IAdminOperationsClient (from Task 18) via ClusterSingletonProxy to admin-ops. Each method does proxy.Ask<TResult>(message, timeout) with 10s timeout + propagated cancellation.

Register in DI: services.AddScoped<IAdminOperationsClient, AdminOperationsClient>() (scoped because per-circuit HttpContext.User flows in claims).

Commit: feat(adminui): IAdminOperationsClient backed by ClusterSingletonProxy.


Task 51: Replace DriverDiagnosticsClient with IFleetDiagnosticsClient

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 49, 50, 52

Files:

  • Delete: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Services/DriverDiagnosticsClient.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Clients/FleetDiagnosticsClient.cs
  • Modify: any Blazor pages that referenced DriverDiagnosticsClient (use grep -rln DriverDiagnosticsClient ~/Desktop/OtOpcUa/src)

FleetDiagnosticsClient uses ClusterClient (or ActorSelection if same cluster) to send GetDiagnosticsRequest(nodeId) to /user/driver-host at the target node and await response.

Pages updated to inject IFleetDiagnosticsClient instead.

Commit: refactor(adminui): replace HTTP DriverDiagnosticsClient with actor-based IFleetDiagnosticsClient.


Task 52: Drift indicator + Deploy button

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 49, 50, 51

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Deployments.razor
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Layout/MainLayout.razor (add drift badge if applicable)

Deployments.razor:

  • Table of Deployment rows (most recent first), columns: DeploymentId (short), RevisionHash (short), Status, CreatedBy, CreatedAtUtc, SealedAtUtc.
  • "Deploy current configuration" button (requires FleetAdmin or ConfigEditor role) → calls IAdminOperationsClient.StartDeploymentAsync(User.Identity.Name, ct) → toast + auto-refresh table.
  • Drift badge: green "in sync" if latest sealed Deployment's revision hash matches ConfigComposer.SnapshotAndFlatten() of current ConfigDb state; yellow "drift" otherwise.

Use frontend-design skill aesthetic: clean corporate Bootstrap, vertical stacking per feedback_form_layout.md.

Commit: feat(adminui): Deployments page with drift indicator and Deploy button.


Phase 9 — Host entry point

Task 53: Host/Program.cs role-gated startup

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 54, 55

Files:

  • Replace: src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs

Mirror ~/Desktop/scadalink-design/src/ScadaLink.Host/Program.cs structure. Pseudocode:

var roles = RoleParser.Parse(Environment.GetEnvironmentVariable("OTOPCUA_ROLES"));
var builder = WebApplication.CreateBuilder(args);
builder.Configuration.AddJsonFile($"appsettings.{string.Join('-', roles.OrderBy(r=>r))}.json", optional: true);
builder.Host.UseSerilog(...);
if (OperatingSystem.IsWindows()) builder.Host.UseWindowsService();

builder.Services.AddOtOpcUaConfigDb(builder.Configuration);
builder.Services.AddOtOpcUaCluster(builder.Configuration);
builder.Services.AddOtOpcUaAuth(builder.Configuration); // extension defined in Task 26
builder.Services.AddAkka("otopcua", (ab, sp) => {
    ab.AddOtOpcUaClusterConfig(roles);
    if (roles.Contains("admin"))  sp.GetRequiredService<IControlPlaneStartup>().Configure(ab);
    if (roles.Contains("driver")) sp.GetRequiredService<IRuntimeStartup>().Configure(ab);
});
if (roles.Contains("admin"))
{
    builder.Services.AddRazorComponents().AddInteractiveServerComponents();
    builder.Services.AddSignalR();
    builder.Services.AddOtOpcUaAdminUI();
}

var app = builder.Build();
app.UseSerilogRequestLogging();
if (roles.Contains("admin"))
{
    app.UseAuthentication();
    app.UseAuthorization();
    app.UseAntiforgery();
    app.MapOtOpcUaAuth();
    app.MapAdminUI<App>();
    app.MapHub<FleetStatusHub>("/hubs/fleet");
    app.MapHub<AlertHub>("/hubs/alerts");
    app.MapHub<ScriptLogHub>("/hubs/script-log");
}
app.MapHealthEndpoints();
await app.RunAsync();

Reads the roles from the OTOPCUA_ROLES environment variable, binds the Akka cluster config, and maps Blazor and the SignalR hubs only when the node has the admin role.
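The role parsing and per-role appsettings naming used above could look roughly like this. It is a hedged sketch only: the real RoleParser is defined in Task 21, and the class name RoleSketch, the OverlayFileName helper, and the choice to reject unknown roles are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Task 21 RoleParser.
public static class RoleSketch
{
    private static readonly HashSet<string> Known =
        new(StringComparer.OrdinalIgnoreCase) { "admin", "driver", "dev" };

    // Parses "admin,driver" (any order, any casing, optional spaces) into a sorted set.
    public static IReadOnlySet<string> Parse(string? raw)
    {
        var roles = new SortedSet<string>(StringComparer.Ordinal);
        var tokens = (raw ?? string.Empty).Split(
            ',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        foreach (var token in tokens)
        {
            // Assumption: unknown roles fail fast instead of being ignored.
            if (!Known.Contains(token))
                throw new ArgumentException($"Unknown role '{token}'.", nameof(raw));
            roles.Add(token.ToLowerInvariant());
        }
        return roles;
    }

    // Builds the overlay file name the same way Program.cs does.
    public static string OverlayFileName(IEnumerable<string> roles) =>
        $"appsettings.{string.Join('-', roles.OrderBy(r => r, StringComparer.Ordinal))}.json";
}
```

With this sketch, OTOPCUA_ROLES="driver, admin" resolves to appsettings.admin-driver.json, which matches the combined-role file created in Task 54.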

Commit: feat(host): role-gated Program.cs composes all components.


Task 54: Health endpoints + appsettings layout

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 53, 55

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/DatabaseHealthCheck.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AkkaClusterHealthCheck.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AdminRoleLeaderHealthCheck.cs
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json (full Cluster/Security/ConfigDb/OpcUa/Drivers/Historian sections)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.admin-driver.json (combined-role default)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.admin.json
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.driver.json

Three endpoints (mirror ScadaLink's pattern):

  • MapHealthChecks("/health/ready", new HealthCheckOptions { Predicate = c => c.Tags.Contains("ready") })
  • MapHealthChecks("/health/active", new HealthCheckOptions { Predicate = c => c.Tags.Contains("active") })
  • /healthz on port 4841 — preserve current OPC UA stack health probe semantics
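The tag predicates only select checks that were registered with a matching tag. A minimal sketch of that registration follows, assuming hypothetical names (DelegateHealthCheck, AddSketchHealthChecks) and assuming the cluster check is tagged "ready" and the admin-role-leader check is tagged "active". The probes are plain delegates here so the sketch makes no claims about the Akka API; the real AkkaClusterHealthCheck and AdminRoleLeaderHealthCheck would query the cluster state instead.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check that reports Healthy when the supplied probe returns true.
public sealed class DelegateHealthCheck : IHealthCheck
{
    private readonly Func<bool> _probe;

    public DelegateHealthCheck(Func<bool> probe) => _probe = probe;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
        => Task.FromResult(_probe() ? HealthCheckResult.Healthy() : HealthCheckResult.Unhealthy());
}

public static class HealthRegistrationSketch
{
    // Assumption: "ready" gates cluster membership, "active" gates admin-role leadership.
    public static IServiceCollection AddSketchHealthChecks(
        this IServiceCollection services, Func<bool> isClusterUp, Func<bool> isAdminRoleLeader)
    {
        services.AddHealthChecks()
            .AddCheck("akka-cluster", new DelegateHealthCheck(isClusterUp), tags: new[] { "ready" })
            .AddCheck("admin-role-leader", new DelegateHealthCheck(isAdminRoleLeader), tags: new[] { "active" });
        return services;
    }
}
```

By default the ASP.NET Core health check middleware answers 503 for an Unhealthy result, so a load balancer that expects 200 on /health/active would route only to the node whose "active" checks pass.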

Commit: feat(host): health endpoints + per-role appsettings split.


Task 55: Mac dev mode + dev-stub drivers

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 53, 54

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs (add Stubbed Become state)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.Development.json
  • Modify (only if needed): src/Server/ZB.MOM.WW.OtOpcUa.Cluster/RoleParser.cs (should already allow the "dev" role per Task 21)

DriverInstanceActor: at PreStart, if any of:

  • roles.Contains("dev") AND driverType is "Galaxy" or "Historian.Wonderware"
  • !OperatingSystem.IsWindows() AND driverType is Windows-only

→ Become Stubbed immediately; log INFO [DEV-STUB] driver={Name} reason={dev-role|non-windows}. Stubbed state returns deterministic test values for read; no-op for write.
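The stub decision above can be isolated into a small pure function that PreStart calls before choosing a behavior. This is a sketch under stated assumptions: DevStubPolicy and StubReason are hypothetical names, the "is Windows-only" flag is passed in rather than looked up, and the dev-role rule is checked first when both rules apply.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical policy helper for DriverInstanceActor.PreStart.
public static class DevStubPolicy
{
    private static readonly HashSet<string> DevStubbedTypes =
        new(StringComparer.Ordinal) { "Galaxy", "Historian.Wonderware" };

    // Returns the stub reason ("dev-role" or "non-windows"), or null when the real driver should start.
    public static string? StubReason(
        IReadOnlySet<string> roles, string driverType, bool isWindows, bool driverIsWindowsOnly)
    {
        // Assumption: the dev-role rule takes precedence when both rules match.
        if (roles.Contains("dev") && DevStubbedTypes.Contains(driverType)) return "dev-role";
        if (!isWindows && driverIsWindowsOnly) return "non-windows";
        return null;
    }
}
```

In the actor, a non-null result would trigger Become(Stubbed) and supply the reason field of the [DEV-STUB] log line; isWindows would come from OperatingSystem.IsWindows().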

appsettings.Development.json sets Security:Ldap:DevStubMode = true.

Commit: feat(runtime): DEV-STUB mode for Galaxy/Wonderware on non-Windows or dev role.


Phase 10 — Cleanup & deletions

Task 56: Delete OtOpcUa.Server and OtOpcUa.Admin projects

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: none (depends on Tasks 0-55)

Files:

  • Delete (directory): src/Server/ZB.MOM.WW.OtOpcUa.Server/
  • Delete (directory): src/Server/ZB.MOM.WW.OtOpcUa.Admin/
  • Modify: ZB.MOM.WW.OtOpcUa.slnx (remove the two project entries)
  • Sweep & delete files referenced in design §10 step 12:
    • DriverInstanceBootstrapper.cs (should be in Server, already deleted)
    • Redundancy/RedundancyCoordinator.cs
    • Redundancy/RedundancyStatePublisher.cs
    • Redundancy/ApplyLeaseRegistry.cs
    • Hosting/PeerHttpProbeLoop.cs
    • Hosting/PeerUaProbeLoop.cs — if not yet ported to PeerOpcUaProbeActor, port it now
    • Hubs/FleetStatusPoller.cs (should be moved/deleted in Task 49)
    • Security/HubTokenService.cs
  • Grep sweep: grep -rln 'RedundancyRole\|ConfigGeneration\|ApplyLeaseRegistry\|PeerHttpProbeLoop\|FleetStatusPoller\|HubTokenService' ~/Desktop/OtOpcUa/src — if any reference survives, fix it.
  • Delete the corresponding tests/ZB.MOM.WW.OtOpcUa.Server.Tests/ and tests/ZB.MOM.WW.OtOpcUa.Admin.Tests/ (or keep and gut them, depending on what is salvageable; recommended: delete in full and rebuild in Phase 11)

Build green:

dotnet build ZB.MOM.WW.OtOpcUa.slnx

Run all surviving tests:

dotnet test ZB.MOM.WW.OtOpcUa.slnx --no-build

Commit: chore(cleanup): delete OtOpcUa.Server, OtOpcUa.Admin, and obsoleted v1 services.


Task 57: Build & test green check

Classification: trivial Estimated implement time: ~3 min Parallelizable with: none

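Verify that dotnet build ZB.MOM.WW.OtOpcUa.slnx and dotnet test ZB.MOM.WW.OtOpcUa.slnx both pass after the Task 56 deletions. No commit unless cleanup is needed.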


Phase 11 — Integration & E2E tests

Task 58: Host integration test harness (2-node in-process cluster)

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (foundational)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/TwoNodeClusterHarness.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/docker-compose.yml (SQL Server + OpenLDAP for Mac-friendly local runs)

TwoNodeClusterHarness spins up two WebApplicationFactory<Program> instances on different HTTP ports and different Akka ports against a shared SQL Server, forming a 2-member cluster (both nodes run admin+driver). It exposes AdminA, AdminB, DriverA, DriverB references; in this harness AdminA == DriverA and AdminB == DriverB, since both roles run on both nodes.

Commit: test(host): 2-node integration test harness.


Task 59: Deploy happy path + failover integration tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 60

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DeployHappyPathTests.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/FailoverDuringDeployTests.cs

Test cases mirror design §8 "Failover-specific test cases" 1-7. Each test spins up the 2-node harness, performs the scenario, asserts final ConfigDb + actor state.

Commit: test(host): deploy happy path + failover-during-deploy integration tests.


Task 60: OPC UA integration tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 59

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/DualEndpointTests.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/ServiceLevelTests.cs

Tests: real OPCFoundation client → both endpoints visible in ServerUriArray; ServiceLevel byte = 250 on leader, 240 on follower (with the +10 leader bonus); write through OpcUaPublishActor returns synchronous failure on driver write error.
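The leader-bonus arithmetic these tests assert can be sketched as follows. This covers only the healthy case stated above (240 base, +10 for the leader); the full ServiceLevelCalculator truth table is defined in the design doc and is not reproduced here. ServiceLevelSketch and its members are hypothetical names.

```csharp
using System;

// Hypothetical sketch of the healthy-case ServiceLevel arithmetic only.
public static class ServiceLevelSketch
{
    public const byte HealthyBase = 240; // follower value asserted by the tests
    public const byte LeaderBonus = 10;  // leader value becomes 250

    // Adds the leader bonus and clamps to the byte range used by OPC UA ServiceLevel.
    public static byte Compute(bool isLeader, byte baseLevel = HealthyBase)
    {
        int level = baseLevel + (isLeader ? LeaderBonus : 0);
        return (byte)Math.Min(level, byte.MaxValue);
    }
}
```

ServiceLevelSketch.Compute(true) yields 250 and ServiceLevelSketch.Compute(false) yields 240, so a redundancy-aware client that prefers the higher ServiceLevel connects to the leader's endpoint.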

Commit: test(opcua): dual-endpoint visibility + ServiceLevel leader-bonus tests.


Task 61: E2E test infrastructure + CI

Classification: standard Estimated implement time: ~5 min Parallelizable with: none

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.E2ETests/ZB.MOM.WW.OtOpcUa.E2ETests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.E2ETests/docker-compose.yml (4 Host processes — 2 admin+driver + 2 driver-only + Traefik + SQL + LDAP)
  • Create: .github/workflows/v2-ci.yml — unit + integration jobs; nightly E2E job

The unit + integration jobs run dotnet build and dotnet test --filter "Category!=E2E"; dotnet test --filter "Category=E2E" runs in the nightly job only.
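For the Category filter to split the suites, each E2E test has to carry the category. A minimal sketch, assuming the test projects use xUnit (the test framework is not stated in this section) and using a hypothetical class and test name:

```csharp
using Xunit;

// Hypothetical E2E test class; only the Trait attribute matters here.
public class E2ECategorySketch
{
    [Fact]
    [Trait("Category", "E2E")] // selected by --filter "Category=E2E", excluded by "Category!=E2E"
    public void Placeholder_runs_only_in_nightly_job()
    {
        Assert.True(true);
    }
}
```

Tests without the trait stay in the default unit + integration run, so only the docker-compose-backed tests need tagging.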

Commit: ci(v2): integration test workflow + nightly E2E.


Phase 12 — Deploy scripts & docs

Task 62: Rewrite Install-Services.ps1

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 63, 64, 65

Files:

  • Replace: scripts/install/Install-Services.ps1

The new script installs a single Windows Service, OtOpcUaHost, per node. It takes a -Roles parameter, writes OTOPCUA_ROLES to the service environment, and binds to a configurable port (default 9000). It uses sc.exe create to register the service and sc.exe failure to configure restart-on-failure.

Update Refresh-Services.ps1 and Uninstall-Services.ps1 to match.

Commit: feat(install): single-service Install-Services.ps1 with -Roles parameter.


Task 63: Traefik config + docker-dev/

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 62, 64, 65

Files:

  • Create: scripts/install/Install-Traefik.ps1
  • Create: scripts/install/traefik.toml (or traefik.yml)
  • Create: docker-dev/docker-compose.yml
  • Create: docker-dev/README.md
  • Create: docker-dev/Dockerfile

traefik.toml: one entrypoint :80, one router host=otopcua.*, one service load-balancing admin-a:9000 + admin-b:9000 with /health/active health check (interval 5s, timeout 2s, expected 200).

docker-dev/ runs four Host containers (admin-a, admin-b, driver-a, driver-b) + SQL Server + OpenLDAP + Traefik. Mac-friendly. README walks through docker compose up -d and access at http://localhost.

Commit: feat(deploy): Traefik config + docker-dev Mac dev compose.


Task 64: Update existing docs

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 62, 63, 65

Files:

  • Rewrite: docs/Redundancy.md
  • Rewrite: docs/ServiceHosting.md
  • Update: docs/security.md
  • Update: docs/README.md

Redundancy.md: replace operator-managed RedundancyRole story with Akka-leader-driven ServiceLevel. Document the ServiceLevelCalculator truth table. ServiceHosting.md: single fused service, role gating, Traefik, health endpoints. security.md: cookie+JWT hybrid, DataProtection keys in ConfigDb, /auth/ping polling.

Commit: docs: rewrite Redundancy + ServiceHosting + security for v2.


Task 65: New v2 architecture docs

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 62, 63, 64

Files:

  • Create: docs/Architecture-v2.md (high-level summary, references design doc)
  • Create: docs/Cluster.md (Akka HOCON, roles, split-brain, failure detector)
  • Create: docs/ControlPlane.md (singletons, their state machines, ConfigDb tables)
  • Create: docs/Runtime.md (per-node actor tree, OPC UA bridge, dev-stub mode)

Each ~1-2 pages. Link to design doc as source of truth.

Commit: docs: v2 architecture overview + Cluster/ControlPlane/Runtime guides.


Final verification

After Task 65:

  1. dotnet build ZB.MOM.WW.OtOpcUa.slnx — green
  2. dotnet test ZB.MOM.WW.OtOpcUa.slnx — all green (unit + integration)
  3. cd docker-dev && docker compose up -d — manual smoke: login at http://localhost, deploy from UI, verify OPC UA dual endpoint via UaExpert
  4. Run scripts/migration/Migrate-To-V2.ps1 against a copy of a real ConfigDb backup; verify row counts match expectations.
  5. Tag v1.x.x-final on master for backport-only fixes.
  6. Open PR v2-akka-fuse → master titled "v2: Akka.NET cluster + fused hosting alignment".

Task index

| # | Title | Class | Time | Parallel with |
|---|---|---|---|---|
| 0 | Branch + Directory.Packages.props | small | 3m | |
| 1 | Commons project | small | 3m | 2-8 |
| 2 | Cluster project | small | 3m | 1,3-8 |
| 3 | Security project | small | 3m | 1,2,4-8 |
| 4 | ControlPlane project | small | 3m | 1-3,5-8 |
| 5 | Runtime project | small | 3m | 1-4,6-8 |
| 6 | OpcUaServer project | small | 3m | 1-5,7,8 |
| 7 | AdminUI project | small | 3m | 1-6,8 |
| 8 | Host project | small | 5m | 1-7 |
| 9 | Build green | trivial | 2m | |
| 10 | Deployment entity | standard | 5m | 11-13 |
| 11 | NodeDeploymentState entity | standard | 5m | 10,12,13 |
| 12 | ConfigEdit entity | small | 4m | 10,11,13 |
| 13 | DataProtection keys | small | 3m | 10-12 |
| 14 | V2 migration | high-risk | 5m | |
| 15 | Migrate-To-V2.ps1 | standard | 5m | 16-18 |
| 16 | Common types | standard | 5m | 17,18 |
| 17 | Message contracts | standard | 5m | 16,18 |
| 18 | Common interfaces | small | 4m | 16,17 |
| 19 | HOCON | standard | 5m | 20-22 |
| 20 | AkkaHostedService | standard | 5m | 19,21,22 |
| 21 | Role parser | small | 3m | 19,20,22 |
| 22 | ClusterRoleInfo | standard | 5m | 19-21 |
| 23 | Cluster tests | standard | 5m | |
| 24 | Move LdapAuthService | standard | 5m | 25 |
| 25 | JwtTokenService | standard | 5m | 24 |
| 26 | AddOtOpcUaAuth | standard | 5m | 27,28 |
| 27 | Auth endpoints | standard | 5m | 26,28 |
| 28 | CookieAuthStateProvider | small | 4m | 26,27 |
| 29 | Security tests | standard | 5m | |
| 30 | ConfigPublishCoordinator happy | high-risk | 5m | 32-35 |
| 31 | Coordinator timeout/recovery | high-risk | 5m | 32-35 |
| 32 | AdminOperationsActor | standard | 5m | 30,31,33-35 |
| 33 | AuditWriterActor | standard | 5m | 30-32,34,35 |
| 34 | FleetStatusBroadcaster | standard | 5m | 30-33,35 |
| 35 | RedundancyStateActor | high-risk | 5m | 30-34 |
| 36 | Singleton registration | standard | 4m | |
| 37 | DriverHostActor bootstrap | high-risk | 5m | 41-44 |
| 38 | DriverHostActor dispatch | high-risk | 5m | 41-44 |
| 39 | DriverHostActor stale | standard | 4m | 41-44 |
| 40 | Runtime test scaffold | small | 3m | |
| 41 | DriverInstanceActor | high-risk | 5m | 42-44 |
| 42 | VirtualTagActor | standard | 5m | 41,43,44 |
| 43 | ScriptedAlarmActor | standard | 5m | 41,42,44 |
| 44 | OpcUaPublishActor | high-risk | 5m | 41-43 |
| 45 | Health probe actors | standard | 5m | |
| 46 | Extract OpcUaApplicationHost | standard | 5m | |
| 47 | Phase7Composer purity | standard | 5m | 48-52 |
| 48 | Move Blazor → AdminUI | standard | 5m | 47 |
| 49 | Move hubs, rewire | standard | 5m | 50-52 |
| 50 | IAdminOperationsClient | standard | 5m | 49,51,52 |
| 51 | IFleetDiagnosticsClient | standard | 5m | 49,50,52 |
| 52 | Drift + Deploy UI | standard | 5m | 49-51 |
| 53 | Host Program.cs | high-risk | 5m | 54,55 |
| 54 | Health + appsettings | standard | 5m | 53,55 |
| 55 | DEV-STUB drivers | standard | 5m | 53,54 |
| 56 | Delete Server + Admin | high-risk | 5m | |
| 57 | Build & test green | trivial | 3m | |
| 58 | Integration harness | standard | 5m | |
| 59 | Deploy + failover IT | standard | 5m | 60 |
| 60 | OPC UA IT | standard | 5m | 59 |
| 61 | E2E + CI | standard | 5m | |
| 62 | Install-Services.ps1 | standard | 5m | 63-65 |
| 63 | Traefik + docker-dev | standard | 5m | 62,64,65 |
| 64 | Update existing docs | standard | 5m | 62,63,65 |
| 65 | New v2 docs | standard | 5m | 62-64 |

Total estimated subagent time: ~5 hours of focused execution, well-suited to subagent-driven dispatch with parallel scheduling on independent tasks.