lmxopcua/docs/plans/2026-05-26-akka-hosting-alignment-plan.md
Joseph Doherty 990ce343fe docs(plans): split Task 14 into 14a-14f (entity-model rewrite)
The original Task 14 (5-min EF migration that "drops ConfigGeneration") was
under-scoped: the design doc (live-edit model, ~line 208) requires removing
GenerationId from 13 entities (Equipment, DriverInstance, Device, Tag,
PollGroup, Namespace, UnsArea, UnsLine, NodeAcl, Script, VirtualTag,
ScriptedAlarm) and adding RowVersion columns for last-write-wins detection.
That cascades into GenerationApplier / GenerationDiff / GenerationSealedCache
and the legacy Server/Admin CRUD services.

New decomposition (~85 min total, replacing the original 5-min estimate):

  14a  standard   10m  Add RowVersion to live-edit entities
  14b  high-risk  30m  Drop GenerationId FK from those entities
  14c  high-risk  20m  Obsolete GenerationApplier/Diff/SealedCache
  14d  standard   5m   Drop ClusterNode.RedundancyRole
  14e  small      5m   Delete ConfigGeneration + ClusterNodeGenerationState
  14f  high-risk  15m  Consolidator: generate V2HostingAlignment migration

Policy decision (recorded with user): OtOpcUa.Server + OtOpcUa.Admin are
allowed to fail-to-compile between 14b and Task 56 - only the new v2 projects
need to stay green. Task 56 deletes the legacy projects.

Plan markdown: replaces the original Task 14 section with the 6-task
decomposition + a header explaining the rewrite. Task index table at the
bottom of the plan updated.

Tasks JSON: replaces the single Task 14 row with 6 string-id rows
("14a", "14b", ..., "14f"). Task 15 (Migrate-To-V2.ps1) and downstream
consumers re-pointed at "14f".

Verification step in 14f rewritten to use the shared docker host at
10.100.0.35 per CLAUDE.md (Docker is not installed on this Mac dev VM).
2026-05-26 03:55:48 -04:00

OtOpcUa v2 — Akka.NET + Fused Hosting Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

Goal: Fuse OtOpcUa.Server and OtOpcUa.Admin into a single role-gated binary (OtOpcUa.Host), introduce an Akka.NET cluster (admin/driver roles) for control-plane singletons and per-node runtime actors, replace the draft/publish ConfigGeneration lifecycle with a live-edit + snapshot-deploy model, and drive OPC UA ServiceLevel from Akka cluster leadership while preserving the dual-endpoint warm-redundancy client behavior.

Architecture: Single solution with new component libraries (Cluster, Security, ControlPlane, Runtime, OpcUaServer, AdminUI, Commons) reused by one Host web binary. Akka 1.5.62 with Akka.Hosting + Akka.Cluster.Hosting + Akka.Cluster.Tools. Cluster singletons pinned to admin role; per-node actor trees on driver-role nodes. Existing ZB.MOM.WW.OtOpcUa.Configuration project keeps the EF Core DbContext (renamed-in-place, no project rename) and grows new tables for Deployment, NodeDeploymentState, ConfigEdit, DataProtectionKeys. EF migrations run via auto-migration on dev and via an idempotent SQL script (Migrate-To-V2.sql, applied by Migrate-To-V2.ps1) on prod.

Tech Stack: .NET 10, Akka.NET 1.5.62 (Akka.Hosting, Akka.Cluster.Hosting, Akka.Cluster.Tools, Akka.Remote.Hosting, Akka.Streams), EF Core 10.0.7 (SQL Server), Blazor Server, SignalR, OPCFoundation .NET Standard stack, LDAP (Novell.Directory.Ldap.NETStandard), Bootstrap 5 (vendored).

Design source: docs/plans/2026-05-26-akka-hosting-alignment-design.md. Always read it before starting a task; it is the spec.

Branch: v2-akka-fuse off master.

Reference project: Sister repo ~/Desktop/scadalink-design — copy patterns, not code (different domain). Pattern files to copy from:

  • ScadaLink HOCON: src/ScadaLink.Host/Akka/akka.conf
  • ScadaLink Security setup: src/ScadaLink.Security/ServiceCollectionExtensions.cs
  • ScadaLink Cluster bootstrap: src/ScadaLink.Host/Program.cs:60-228
  • ScadaLink ClusterSingleton pattern: src/ScadaLink.ManagementService/

Conventions for every task

  • Branch: Stay on v2-akka-fuse. Never commit to master while plan is running.
  • TDD where it makes sense: New actors, new domain logic — write the test first. Pure refactors / file moves — verify-by-build is enough.
  • Build command: dotnet build ZB.MOM.WW.OtOpcUa.slnx — must be green before commit.
  • Test command: dotnet test ZB.MOM.WW.OtOpcUa.slnx --no-build — relevant new/changed tests must pass.
  • Commit format: Conventional Commits — feat(scope):, refactor(scope):, chore(scope):, test(scope):. Scope examples: host, cluster, runtime, controlplane, security, adminui, configdb.
  • Mac compatibility: All code must build on macOS. Windows-only APIs (AddWindowsService, Galaxy/Wonderware drivers) must be gated by OperatingSystem.IsWindows() or [SupportedOSPlatform].
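
As an illustration of the Mac-compatibility rule, a Windows-only call can sit behind a runtime guard so the same assembly builds and runs on macOS. This is a minimal sketch, not code from the repository: HostLifetimeSelector and its return strings are hypothetical, and only OperatingSystem.IsWindows() and [SupportedOSPlatform] come from the rule itself.

```csharp
using System;
using System.Runtime.Versioning;

// Hypothetical helper, for illustration only.
public static class HostLifetimeSelector
{
    // The platform analyzer treats this method as Windows-only and flags
    // any call site that is not behind an OS guard.
    [SupportedOSPlatform("windows")]
    private static string UseWindowsLifetime() => "windows-service";

    // Safe to call on any OS: the Windows-only path is reached only
    // after the runtime check succeeds.
    public static string Select()
    {
        if (OperatingSystem.IsWindows())
        {
            return UseWindowsLifetime();
        }

        return "console";
    }
}
```

On macOS, Select() takes the second branch, so the Windows-only method is never invoked and the build stays warning-free under TreatWarningsAsErrors.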

Phase 0 — Branch & scaffolding

Task 0: Create branch and central package management

Classification: small Estimated implement time: ~3 min Parallelizable with: none (first task)

Files:

  • Create: /Users/dohertj2/Desktop/OtOpcUa/Directory.Packages.props
  • Create: /Users/dohertj2/Desktop/OtOpcUa/Directory.Build.props

Step 1: Create branch

cd ~/Desktop/OtOpcUa
git checkout -b v2-akka-fuse

Step 2: Create Directory.Packages.props with central package management for Akka + EF Core + ASP.NET Core. Source versions from ~/Desktop/scadalink-design/Directory.Packages.props. At minimum include:

<Project>
  <PropertyGroup>
    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
  </PropertyGroup>
  <ItemGroup>
    <PackageVersion Include="Akka" Version="1.5.62" />
    <PackageVersion Include="Akka.Cluster" Version="1.5.62" />
    <PackageVersion Include="Akka.Cluster.Hosting" Version="1.5.62" />
    <PackageVersion Include="Akka.Cluster.Tools" Version="1.5.62" />
    <PackageVersion Include="Akka.Hosting" Version="1.5.62" />
    <PackageVersion Include="Akka.Remote" Version="1.5.62" />
    <PackageVersion Include="Akka.Remote.Hosting" Version="1.5.62" />
    <PackageVersion Include="Akka.Streams" Version="1.5.62" />
    <PackageVersion Include="Akka.Streams.TestKit" Version="1.5.62" />
    <PackageVersion Include="Akka.TestKit.Xunit2" Version="1.5.62" />
    <PackageVersion Include="Microsoft.AspNetCore.Authentication.JwtBearer" Version="10.0.7" />
    <PackageVersion Include="Microsoft.AspNetCore.DataProtection.EntityFrameworkCore" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.Design" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.InMemory" Version="10.0.7" />
  </ItemGroup>
</Project>

Audit the existing .csproj files for any package not listed; add it to Directory.Packages.props and strip the Version attribute from the csprojs.

Step 3: Create minimal Directory.Build.props:

<Project>
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>
</Project>

Step 4: Build green check

Run: dotnet build ZB.MOM.WW.OtOpcUa.slnx Expected: Build succeeded. If any csproj still carries a Version attribute on a centrally managed PackageReference (NuGet reports this as NU1008), remove the attribute.

Step 5: Commit

git add Directory.Packages.props Directory.Build.props
git commit -m "chore(build): introduce central package management for v2"

Task 1: Create OtOpcUa.Commons project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 2, 3, 4, 5, 6, 7, 8

Files:

  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/ZB.MOM.WW.OtOpcUa.Commons.csproj
  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/.gitkeep
  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/.gitkeep
  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/.gitkeep
  • Modify: /Users/dohertj2/Desktop/OtOpcUa/ZB.MOM.WW.OtOpcUa.slnx (add Commons project)

Step 1: Create csproj

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Commons</RootNamespace>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Akka" />
  </ItemGroup>
</Project>

Step 2: Add to solution

Run: dotnet sln ZB.MOM.WW.OtOpcUa.slnx add src/Core/ZB.MOM.WW.OtOpcUa.Commons/ZB.MOM.WW.OtOpcUa.Commons.csproj

Step 3: Build green

Run: dotnet build ZB.MOM.WW.OtOpcUa.slnx Expected: Build succeeded.

Step 4: Commit

git add src/Core/ZB.MOM.WW.OtOpcUa.Commons/ ZB.MOM.WW.OtOpcUa.slnx
git commit -m "feat(commons): scaffold OtOpcUa.Commons project"

Task 2: Create OtOpcUa.Cluster project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 3, 4, 5, 6, 7, 8

Files:

  • Create: /Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ZB.MOM.WW.OtOpcUa.Cluster.csproj
  • Modify: ZB.MOM.WW.OtOpcUa.slnx

Step 1: Create csproj

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Cluster</RootNamespace>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Akka.Hosting" />
    <PackageReference Include="Akka.Cluster" />
    <PackageReference Include="Akka.Cluster.Hosting" />
    <PackageReference Include="Akka.Cluster.Tools" />
    <PackageReference Include="Akka.Remote.Hosting" />
    <PackageReference Include="Microsoft.Extensions.Hosting" />
    <PackageReference Include="Microsoft.Extensions.Options.ConfigurationExtensions" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Commons\ZB.MOM.WW.OtOpcUa.Commons.csproj" />
  </ItemGroup>
</Project>

Steps 2-4: add to solution, build, commit (feat(cluster): scaffold OtOpcUa.Cluster project).


Task 3: Create OtOpcUa.Security project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 4, 5, 6, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/ZB.MOM.WW.OtOpcUa.Security.csproj
  • Modify: ZB.MOM.WW.OtOpcUa.slnx

csproj: classlib targeting net10.0, references OtOpcUa.Commons, OtOpcUa.Configuration. Packages: Microsoft.AspNetCore.Authentication.Cookies, Microsoft.AspNetCore.Authentication.JwtBearer, Microsoft.IdentityModel.Tokens, System.IdentityModel.Tokens.Jwt, Novell.Directory.Ldap.NETStandard.

Commit: feat(security): scaffold OtOpcUa.Security project.


Task 4: Create OtOpcUa.ControlPlane project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 5, 6, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ZB.MOM.WW.OtOpcUa.ControlPlane.csproj

csproj: classlib, references OtOpcUa.Commons, OtOpcUa.Cluster, OtOpcUa.Configuration. Packages: Akka.Hosting, Akka.Cluster.Tools, Microsoft.AspNetCore.SignalR.Core.

Commit: feat(controlplane): scaffold OtOpcUa.ControlPlane project.


Task 5: Create OtOpcUa.Runtime project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 4, 6, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ZB.MOM.WW.OtOpcUa.Runtime.csproj

csproj: classlib, references OtOpcUa.Commons, OtOpcUa.Cluster, OtOpcUa.Configuration, OtOpcUa.OpcUaServer, all OtOpcUa.Driver.* abstraction projects (NOT concrete driver implementations — those are loaded reflectively). Packages: Akka.Hosting, Akka.Cluster.Tools.

Commit: feat(runtime): scaffold OtOpcUa.Runtime project.


Task 6: Create OtOpcUa.OpcUaServer project

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 4, 5, 7, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/ZB.MOM.WW.OtOpcUa.OpcUaServer.csproj

csproj: classlib, references OtOpcUa.Commons, OtOpcUa.Configuration. Packages: OPCFoundation.NetStandard.Opc.Ua.Server, OPCFoundation.NetStandard.Opc.Ua.Configuration. Copy exact versions from current ZB.MOM.WW.OtOpcUa.Server.csproj.

Commit: feat(opcua): scaffold OtOpcUa.OpcUaServer project.


Task 7: Create OtOpcUa.AdminUI Razor class library

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 1, 2, 3, 4, 5, 6, 8

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/ZB.MOM.WW.OtOpcUa.AdminUI.csproj

csproj:

<Project Sdk="Microsoft.NET.Sdk.Razor">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.AdminUI</RootNamespace>
    <AddRazorSupportForMvc>true</AddRazorSupportForMvc>
  </PropertyGroup>
  <ItemGroup>
    <FrameworkReference Include="Microsoft.AspNetCore.App" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Commons\ZB.MOM.WW.OtOpcUa.Commons.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Security\ZB.MOM.WW.OtOpcUa.Security.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.ControlPlane\ZB.MOM.WW.OtOpcUa.ControlPlane.csproj" />
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj" />
  </ItemGroup>
</Project>

Commit: feat(adminui): scaffold OtOpcUa.AdminUI Razor class library.


Task 8: Create OtOpcUa.Host Web SDK project

Classification: small Estimated implement time: ~5 min Parallelizable with: Task 1, 2, 3, 4, 5, 6, 7

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs (minimal "Hello, host" stub)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Properties/launchSettings.json

csproj:

<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Host</RootNamespace>
    <UserSecretsId>zb-mom-ww-otopcua-host</UserSecretsId>
    <AssemblyName>OtOpcUa.Host</AssemblyName>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.Hosting.WindowsServices" Condition="'$([System.Runtime.InteropServices.RuntimeInformation]::IsOSPlatform($([System.Runtime.InteropServices.OSPlatform]::Windows)))' == 'true'" />
    <PackageReference Include="Serilog.AspNetCore" />
    <PackageReference Include="Akka.Hosting" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Commons\ZB.MOM.WW.OtOpcUa.Commons.csproj" />
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Cluster\ZB.MOM.WW.OtOpcUa.Cluster.csproj" />
    <ProjectReference Include="..\..\Core\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Security\ZB.MOM.WW.OtOpcUa.Security.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.ControlPlane\ZB.MOM.WW.OtOpcUa.ControlPlane.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Runtime\ZB.MOM.WW.OtOpcUa.Runtime.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.OpcUaServer\ZB.MOM.WW.OtOpcUa.OpcUaServer.csproj" />
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.AdminUI\ZB.MOM.WW.OtOpcUa.AdminUI.csproj" />
  </ItemGroup>
</Project>

Stub Program.cs:

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () => "OtOpcUa.Host scaffold");
await app.RunAsync();

appsettings.json: empty {} for now.

launchSettings.json: profile OtOpcUa.Host with applicationUrl=http://localhost:9000.

Commit: feat(host): scaffold OtOpcUa.Host web project.


Task 9: Build green smoke

Classification: trivial Estimated implement time: ~2 min Parallelizable with: none (depends on Tasks 0-8)

Step 1: Run dotnet build ZB.MOM.WW.OtOpcUa.slnx. Expected: succeeded, no warnings-as-errors. Fix anything that broke. No commit (verification only).


Phase 1 — ConfigDb schema (live-edit + deploy model)

Task 10: Add Deployment entity

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 11, 12, 13

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/Deployment.cs
  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs (add DbSet<Deployment> + OnModelCreating mapping)

Step 1: Create Deployment.cs:

namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;

public sealed class Deployment
{
    public Guid DeploymentId { get; init; } = Guid.NewGuid();
    public required string RevisionHash { get; init; }
    public DeploymentStatus Status { get; set; } = DeploymentStatus.Dispatching;
    public required string CreatedBy { get; init; }
    public DateTime CreatedAtUtc { get; init; } = DateTime.UtcNow;
    public byte[] ArtifactBlob { get; init; } = Array.Empty<byte>();
    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
    public string? FailureReason { get; set; }
    public DateTime? SealedAtUtc { get; set; }
}

public enum DeploymentStatus
{
    Dispatching = 0,
    AwaitingApplyAcks = 1,
    Sealed = 2,
    PartiallyFailed = 3,
    TimedOut = 4
}

Step 2: Add mapping in OtOpcUaConfigDbContext.OnModelCreating:

modelBuilder.Entity<Deployment>(b =>
{
    b.ToTable("Deployment");
    b.HasKey(d => d.DeploymentId);
    b.Property(d => d.RevisionHash).HasMaxLength(64).IsRequired();
    b.Property(d => d.Status).HasConversion<int>();
    b.Property(d => d.CreatedBy).HasMaxLength(128).IsRequired();
    b.Property(d => d.FailureReason).HasMaxLength(2048);
    b.Property(d => d.RowVersion).IsRowVersion();
    b.HasIndex(d => d.Status);
    b.HasIndex(d => d.CreatedAtUtc);
});

Step 3: Build green. Commit: feat(configdb): add Deployment entity.


Task 11: Add NodeDeploymentState entity (replaces ClusterNodeGenerationState)

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 10, 12, 13

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeDeploymentState.cs
  • Modify: OtOpcUaConfigDbContext.cs (add DbSet + mapping)

Schema: (NodeId, DeploymentId) composite key; Status enum Applying|Applied|Failed; StartedAtUtc, AppliedAtUtc?, FailureReason?, RowVersion.
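
The schema line above could translate into an entity along these lines. This is a sketch under assumptions: the enum name NodeDeploymentStatus, the string type for NodeId, and the defaults are not specified by the plan and are chosen here to mirror the Deployment entity from Task 10.

```csharp
#nullable enable
using System;

// Assumed enum name; the plan only lists the values Applying|Applied|Failed.
public enum NodeDeploymentStatus
{
    Applying = 0,
    Applied = 1,
    Failed = 2
}

public sealed class NodeDeploymentState
{
    // Composite key part 1. Assumption: node identity is stored as a string.
    public required string NodeId { get; init; }

    // Composite key part 2: the deployment this row reports on.
    public required Guid DeploymentId { get; init; }

    public NodeDeploymentStatus Status { get; set; } = NodeDeploymentStatus.Applying;
    public DateTime StartedAtUtc { get; init; } = DateTime.UtcNow;

    // Null until the node reports a successful apply.
    public DateTime? AppliedAtUtc { get; set; }

    // Null unless Status is Failed.
    public string? FailureReason { get; set; }

    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
```

In OnModelCreating, the composite key would be declared with b.HasKey(s => new { s.NodeId, s.DeploymentId }) and the concurrency column with b.Property(s => s.RowVersion).IsRowVersion(), following the Deployment mapping.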

Do NOT delete ClusterNodeGenerationState.cs yet. Task 14e removes it, and Task 14f generates the migration that drops its table.

Commit: feat(configdb): add NodeDeploymentState entity.


Task 12: Add ConfigEdit audit entity

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 10, 11, 13

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigEdit.cs
  • Modify: OtOpcUaConfigDbContext.cs

Schema: (EditId GUID PK, EntityType string, EntityId GUID, FieldsJson nvarchar(max), ExecutionId GUID NULL, EditedBy, EditedAtUtc, SourceNode).

Captures per-row edits to Equipment, Driver, DriverInstance, Script, etc. Inserted by AdminOperationsActor on every mutating op.
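
A possible shape for the audit row, sketched under assumptions: the plan gives types only for EditId, EntityId, FieldsJson and ExecutionId, so the string and DateTime types for the remaining columns, and the Create helper, are hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

public sealed class ConfigEdit
{
    public Guid EditId { get; init; } = Guid.NewGuid();
    public required string EntityType { get; init; }
    public required Guid EntityId { get; init; }

    // JSON object of changed field names and their new values.
    public string FieldsJson { get; init; } = "{}";

    // Nullable per the schema (ExecutionId GUID NULL).
    public Guid? ExecutionId { get; init; }

    // Assumed types for the columns the schema leaves untyped.
    public required string EditedBy { get; init; }
    public DateTime EditedAtUtc { get; init; } = DateTime.UtcNow;
    public required string SourceNode { get; init; }

    // Hypothetical convenience factory: serializes the changed fields once
    // so callers cannot store malformed JSON.
    public static ConfigEdit Create(
        string entityType,
        Guid entityId,
        IReadOnlyDictionary<string, object?> changedFields,
        string editedBy,
        string sourceNode,
        Guid? executionId = null) =>
        new ConfigEdit
        {
            EntityType = entityType,
            EntityId = entityId,
            FieldsJson = JsonSerializer.Serialize(changedFields),
            ExecutionId = executionId,
            EditedBy = editedBy,
            SourceNode = sourceNode
        };
}
```

For example, an edit that renames one Equipment row would pass a dictionary with a single "Name" entry and leave ExecutionId null.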

Commit: feat(configdb): add ConfigEdit audit entity.


Task 13: Add DataProtection keys table

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 10, 11, 12

Files:

  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/ZB.MOM.WW.OtOpcUa.Configuration.csproj — add Microsoft.AspNetCore.DataProtection.EntityFrameworkCore package
  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs — implement IDataProtectionKeyContext:
public DbSet<Microsoft.AspNetCore.DataProtection.EntityFrameworkCore.DataProtectionKey> DataProtectionKeys
    => Set<Microsoft.AspNetCore.DataProtection.EntityFrameworkCore.DataProtectionKey>();

Commit: feat(configdb): persist DataProtection keys in ConfigDb.


Tasks 14a-14f: Entity-model rewrite + V2HostingAlignment migration

Plan rewrite, 2026-05-26: the original single Task 14 (5-min EF migration) was under-scoped. It listed only the schema drops and adds, without addressing the 13+ entities whose foreign keys and indexes are keyed on GenerationId. The design doc (§ live-edit model) requires removing GenerationId from Equipment, Driver, DriverInstance, Namespace, UnsArea, UnsLine, Device, Tag, PollGroup, NodeAcl, Script, VirtualTag, ScriptedAlarm and adding RowVersion columns for last-write-wins stale-write detection. That cascades into GenerationApplier/GenerationDiff/GenerationSealedCache and the legacy Server/Admin CRUD services. Policy decision (recorded with the user): the legacy OtOpcUa.Server + OtOpcUa.Admin projects are allowed to fail to compile between Task 14b and Task 56; only the new v2 projects need to stay green.

Task 14a: Add RowVersion to live-edit entities

Classification: standard Estimated implement time: ~10 min Parallelizable with: none (foundation for 14b)

Files: every live-edit entity class: Equipment, DriverInstance, Device, Tag, PollGroup, Namespace, UnsArea, UnsLine, NodeAcl, Script, VirtualTag, ScriptedAlarm. To each, add public byte[] RowVersion { get; set; } = Array.Empty<byte>(); and an e.Property(x => x.RowVersion).IsRowVersion(); mapping in OtOpcUaConfigDbContext.
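
How a RowVersion column exposes a stale write can be shown without a database. The sketch below is an in-memory stand-in, not the EF Core mechanism: VersionedRow and TryUpdate are hypothetical names. In EF Core itself, a property mapped with IsRowVersion() acts as a concurrency token, and a mismatch surfaces as DbUpdateConcurrencyException from SaveChanges.

```csharp
#nullable enable
using System;

// Hypothetical in-memory stand-in for a row with a rowversion column.
public sealed class VersionedRow
{
    private long _counter = 1;

    public VersionedRow(string name) => Name = name;

    public string Name { get; private set; }

    // Eight bytes, like SQL Server's rowversion; replaced on every write.
    public byte[] RowVersion { get; private set; } = BitConverter.GetBytes(1L);

    // Applies the write only if the caller read the current version.
    // Returns false when the row changed after the caller's read.
    public bool TryUpdate(string newName, byte[] expectedRowVersion)
    {
        if (!RowVersion.AsSpan().SequenceEqual(expectedRowVersion))
        {
            return false; // stale write detected
        }

        Name = newName;
        RowVersion = BitConverter.GetBytes(++_counter);
        return true;
    }
}
```

Two editors who both read version 1 illustrate the outcome: the first TryUpdate succeeds and bumps the version, and the second returns false because its expected version no longer matches.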

Commit: feat(configdb): add RowVersion to live-edit entities for last-write-wins detection.


Task 14b: Decouple live-edit entities from ConfigGeneration

Classification: high-risk Estimated implement time: ~30 min Parallelizable with: none

Remove GenerationId property, Generation navigation property, and the HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId) mapping from each of the 13 live-edit entities listed above. Rewrite the UX_<Table>_Generation_LogicalId indexes to drop the GenerationId column (logical IDs become globally unique). Drop UX_*_Generation_* filtered indexes where the filter referenced generation scope.

Will break OtOpcUa.Server + OtOpcUa.Admin compilation — that is accepted (Task 56 deletes them).

Commit: refactor(configdb): drop GenerationId FK from live-edit entities.


Task 14c: Mark GenerationApplier / GenerationDiff / GenerationSealedCache obsolete

Classification: high-risk Estimated implement time: ~20 min Parallelizable with: none

src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Apply/ contains GenerationApplier.cs, GenerationDiff.cs, ApplyCallbacks.cs, ChangeKind.cs, IGenerationApplier.cs. These implement the v1 draft/publish lifecycle that v2 replaces with AdminOperationsActor + ConfigComposer.

Inventory callers via grep -rln 'GenerationApplier\|GenerationDiff' src tests. Either:

  • Mark types [Obsolete("Replaced by AdminOperationsActor in v2", error: true)] so surviving call sites become hard build errors (cleaner; surfaces the Server-breakage),
  • Or delete the files and accept the Server-side build break.

Sweep GenerationSealedCache similarly. Keep the LiteDb cache concept (it's repurposed in Task 39 for stale-config fallback) but rename references to use DeploymentArtifact.

Commit: refactor(configdb): obsolete GenerationApplier/Diff/SealedCache (replaced by AdminOperationsActor).


Task 14d: Drop RedundancyRole from ClusterNode

Classification: standard Estimated implement time: ~5 min Parallelizable with: none

Remove ClusterNode.RedundancyRole property + the e.Property(x => x.RedundancyRole).HasConversion<string>() mapping + the UX_ClusterNode_Primary_Per_Cluster filtered unique index from OtOpcUaConfigDbContext.ConfigureClusterNode. Akka cluster leader-of-driver-role becomes the source of truth (Phase 5, Task 35).

Commit: refactor(configdb): drop ClusterNode.RedundancyRole (replaced by Akka leader).


Task 14e: Delete ConfigGeneration + ClusterNodeGenerationState

Classification: small Estimated implement time: ~5 min Parallelizable with: none (depends on 14b clearing the FKs)

Delete Entities/ConfigGeneration.cs and Entities/ClusterNodeGenerationState.cs. Remove the corresponding DbSet<> entries and Configure* methods from OtOpcUaConfigDbContext. Drop GenerationStatus and NodeApplyStatus enums.

Commit: refactor(configdb): delete ConfigGeneration + ClusterNodeGenerationState.


Task 14f: Generate V2HostingAlignment EF migration

Classification: high-risk Estimated implement time: ~15 min Parallelizable with: none (consolidates 14a-14e)

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/<timestamp>_V2HostingAlignment.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/<timestamp>_V2HostingAlignment.Designer.cs
  • Modify: tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/SchemaComplianceTests.cs — update the expected table list (remove ConfigGeneration + ClusterNodeGenerationState; add Deployment + NodeDeploymentState + ConfigEdit + DataProtectionKeys).

Step 1: Generate migration

dotnet ef migrations add V2HostingAlignment \
  --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration \
  --startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host

If dotnet-ef isn't installed: dotnet tool install --global dotnet-ef --version 10.0.7.

Step 2: Audit the generated migration — it should:

  • DropTable("ConfigGeneration") and DropTable("ClusterNodeGenerationState")
  • DropColumn("RedundancyRole", "ClusterNode")
  • For each of the 13 live-edit tables: DropForeignKey on GenerationId, DropIndex on UX_*_Generation_LogicalId (and any UX_*_Generation_*), DropColumn on GenerationId, AddColumn("RowVersion", "rowversion"), CreateIndex on the new globally-unique logical-id pattern.
  • CreateTable("Deployment", ...), CreateTable("NodeDeploymentState", ...), CreateTable("ConfigEdit", ...), CreateTable("DataProtectionKeys", ...).

If extra changes appear (e.g., column-type drift), reconcile by editing the entity classes — do not edit the migration directly.

Step 3: Verify on a scratch SQL Server (per CLAUDE.md, Docker is on the shared host 10.100.0.35, not local).

# from this Mac dev:
ssh dohertj2@10.100.0.35 'docker run --rm -d --name v2-migration-test \
  -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Pass@word123" \
  -p 14333:1433 mcr.microsoft.com/mssql/server:2022-latest'
# Wait ~10s for SQL Server to start
ConnectionStrings__ConfigDb="Server=10.100.0.35,14333;Database=OtOpcUaV2Test;User Id=sa;Password=Pass@word123;TrustServerCertificate=true" \
  dotnet ef database update \
    --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration \
    --startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host
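# Current 2022-latest images ship sqlcmd under /opt/mssql-tools18; -C trusts the container's self-signed certificate.
ssh dohertj2@10.100.0.35 'docker exec v2-migration-test /opt/mssql-tools18/bin/sqlcmd \
  -C -S localhost -U sa -P Pass@word123 -d OtOpcUaV2Test \
  -Q "SELECT name FROM sys.tables ORDER BY name"'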
ssh dohertj2@10.100.0.35 'docker stop v2-migration-test'

Expected: migration completes; sys.tables contains the 4 new tables and not the 2 dropped ones; live-edit tables have RowVersion column.

Step 4: Update SchemaComplianceTests so its expected array matches the new schema.

Step 5: Commit

git add src/Core/ZB.MOM.WW.OtOpcUa.Configuration/ \
        tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/SchemaComplianceTests.cs
git commit -m "feat(configdb): V2HostingAlignment migration — drop generation lifecycle, add deploy tables"

Task 15: Migrate-To-V2.ps1 idempotent prod migration script

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 16, 17, 18 (Phase 2)

Files:

  • Create: scripts/migration/Migrate-To-V2.ps1
  • Create: scripts/migration/Migrate-To-V2.sql (the idempotent SQL output)

Step 1: Generate idempotent SQL from EF

Run: dotnet ef migrations script --idempotent --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration --startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host --output scripts/migration/Migrate-To-V2.sql

Step 2: PowerShell wrapper:

[CmdletBinding()]
param(
  [Parameter(Mandatory)][string] $ConnectionString,
  [string] $BackupPath = "$env:TEMP\OtOpcUa-V1-Backup-$(Get-Date -Format yyyyMMddHHmmss).bak"
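)

# Abort on the first failing step so a failed backup never leads into the schema change.
$ErrorActionPreference = 'Stop'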

Write-Host "Step 1/4 — Backup ConfigDb to $BackupPath"
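# BACKUP ... TO DISK resolves the path on the SQL Server host, not on the machine running this script.
Invoke-Sqlcmd -ConnectionString $ConnectionString -Query "BACKUP DATABASE [OtOpcUaConfigDb] TO DISK = '$BackupPath' WITH FORMAT, COMPRESSION"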

Write-Host "Step 2/4 — Row counts before"
$beforeCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$beforeCounts | Format-Table

Write-Host "Step 3/4 — Apply Migrate-To-V2.sql"
Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\Migrate-To-V2.sql"

Write-Host "Step 4/4 — Row counts after + validation"
$afterCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$afterCounts | Format-Table

# Validation gates
$tablesNow = (Invoke-Sqlcmd -ConnectionString $ConnectionString -Query "SELECT name FROM sys.tables ORDER BY name").name
foreach ($t in 'Deployment','NodeDeploymentState','ConfigEdit','DataProtectionKeys') {
  if ($tablesNow -notcontains $t) { throw "Expected table $t missing." }
}
foreach ($t in 'ConfigGeneration','ClusterNodeGenerationState') {
  if ($tablesNow -contains $t) { throw "Legacy table $t still present." }
}
Write-Host "Migration complete. Backup at $BackupPath"

Also create scripts/migration/count-rows.sql listing per-table row counts for the audit.

Commit: feat(migration): add Migrate-To-V2.ps1 idempotent migration runner.


Phase 2 — Commons types and contracts

Task 16: Common types

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 17, 18

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/CorrelationId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/ExecutionId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/NodeId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/DeploymentId.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/RevisionHash.cs

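Each is a readonly record struct wrapping a Guid (CorrelationId, ExecutionId, DeploymentId) or a string (NodeId, RevisionHash). Implement ToString() and a static Parse; record structs already get IEquatable<T> from the compiler.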

Example (CorrelationId.cs):

namespace ZB.MOM.WW.OtOpcUa.Commons.Types;

public readonly record struct CorrelationId(Guid Value)
{
    public static CorrelationId NewId() => new(Guid.NewGuid());
    public override string ToString() => Value.ToString("N");
    public static CorrelationId Parse(string s) => new(Guid.ParseExact(s, "N"));
}

Same pattern for ExecutionId, DeploymentId, NodeId (string), RevisionHash (string).
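
For the string-backed types, the same pattern might look like the sketch below. The validation rules are assumptions: the 64-character limit is borrowed from the RevisionHash column mapping in Task 10, and rejecting empty input is a choice made here rather than something the plan specifies.

```csharp
#nullable enable
using System;

public readonly record struct RevisionHash(string Value)
{
    // Assumption: matches HasMaxLength(64) on Deployment.RevisionHash.
    public const int MaxLength = 64;

    // default(RevisionHash) carries a null Value, so fall back to empty.
    public override string ToString() => Value ?? string.Empty;

    public static RevisionHash Parse(string s)
    {
        if (string.IsNullOrWhiteSpace(s))
        {
            throw new FormatException("Revision hash must not be empty.");
        }

        if (s.Length > MaxLength)
        {
            throw new FormatException($"Revision hash exceeds {MaxLength} characters.");
        }

        return new RevisionHash(s);
    }
}
```

Because it is a record struct, two RevisionHash values compare equal whenever their strings are equal, with no extra equality code.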

Commit: feat(commons): add correlation/execution/node/deployment/revisionhash types.


Task 17: Akka message contracts

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 16, 18

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DispatchDeployment.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/ApplyAck.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DeploymentSealed.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DeploymentFailed.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Admin/StartDeployment.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Admin/StartDeploymentResult.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Redundancy/RedundancyStateChanged.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Redundancy/NodeRedundancyState.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Fleet/FleetStatusChanged.cs

Define each as a sealed record carrying a CorrelationId field. Example:

namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Deploy;

public sealed record DispatchDeployment(
    DeploymentId DeploymentId,
    RevisionHash RevisionHash,
    CorrelationId CorrelationId);

public sealed record ApplyAck(
    DeploymentId DeploymentId,
    NodeId NodeId,
    ApplyAckOutcome Outcome,
    string? FailureReason,
    CorrelationId CorrelationId);

public enum ApplyAckOutcome { Applied, Failed }

Commit: feat(commons): add deploy/admin/audit/redundancy/fleet message contracts.


Task 18: Common interfaces

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 16, 17

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IClusterRoleInfo.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IAdminOperationsClient.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IFleetDiagnosticsClient.cs
public interface IClusterRoleInfo
{
    NodeId LocalNode { get; }
    IReadOnlySet<string> LocalRoles { get; }
    bool HasRole(string role);
    IReadOnlyList<NodeId> MembersWithRole(string role);
    NodeId? RoleLeader(string role);
    event EventHandler<RoleLeaderChangedEventArgs>? RoleLeaderChanged;
}

public interface IAdminOperationsClient
{
    Task<StartDeploymentResult> StartDeploymentAsync(string createdBy, CancellationToken ct);
    // … other mutating ops added in later tasks
}

public interface IFleetDiagnosticsClient
{
    Task<NodeDiagnosticsSnapshot> GetDiagnosticsAsync(NodeId nodeId, CancellationToken ct);
}

Commit: feat(commons): add cluster/admin/diagnostics client interfaces.


Phase 3 — Cluster library

Task 19: HOCON config

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 20, 21, 22

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/Resources/akka.conf
  • Modify: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ZB.MOM.WW.OtOpcUa.Cluster.csproj (embed resource)

Step 1: Copy ~/Desktop/scadalink-design/src/ScadaLink.Host/Akka/akka.conf (or equivalent path — check what ScadaLink actually has) as a starting template, then adapt:

  • actor.provider = cluster
  • remote.dot-netty.tcp { hostname = "0.0.0.0", port = 4053 }
  • cluster.roles = [] (populated dynamically by Task 21)
  • cluster.split-brain-resolver.active-strategy = keep-oldest
  • cluster.split-brain-resolver.stable-after = 15s
  • cluster.down-removal-margin = 15s
  • cluster.failure-detector.heartbeat-interval = 2s
  • cluster.failure-detector.threshold = 10.0
  • cluster.singleton.singleton-name = "singleton"
  • cluster.singleton-proxy.singleton-identification-interval = 1s
  • Synchronized dispatcher for OPC UA actors (Task 44):
    opcua-synchronized-dispatcher {
      type = "PinnedDispatcher"
      executor = "thread-pool-executor"
    }
    

If ScadaLink puts HOCON inline in Program.cs rather than a .conf file, embed it the same way — but a separate .conf file is preferred for editability.

Step 2: Mark as embedded resource in csproj:

<ItemGroup>
  <EmbeddedResource Include="Resources\akka.conf" />
</ItemGroup>

Step 3: Add a loader helper src/Core/ZB.MOM.WW.OtOpcUa.Cluster/HoconLoader.cs:

public static class HoconLoader
{
    public static string LoadBaseConfig()
    {
        using var stream = typeof(HoconLoader).Assembly
            .GetManifestResourceStream("ZB.MOM.WW.OtOpcUa.Cluster.Resources.akka.conf")
            ?? throw new InvalidOperationException("akka.conf resource not found");
        using var reader = new StreamReader(stream);
        return reader.ReadToEnd();
    }
}

Commit: feat(cluster): embed Akka HOCON config matching ScadaLink tuning.


Task 20: AkkaHostedService implementation

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 19, 21, 22

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/AkkaHostedService.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/AkkaClusterOptions.cs
  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ServiceCollectionExtensions.cs

AkkaClusterOptions.cs:

public sealed class AkkaClusterOptions
{
    public string SystemName { get; set; } = "otopcua";
    public string Hostname { get; set; } = "0.0.0.0";
    public int Port { get; set; } = 4053;
    public string PublicHostname { get; set; } = "127.0.0.1";
    public string[] SeedNodes { get; set; } = Array.Empty<string>();
    public string[] Roles { get; set; } = Array.Empty<string>();
}

AkkaHostedService.cs: Implements IHostedService. On Start, builds the ActorSystem from HoconLoader.LoadBaseConfig() plus an overlay from AkkaClusterOptions, then joins the cluster by passing the configured seed nodes to Cluster.Get(system).JoinSeedNodes (Join takes a single address). On Stop, calls CoordinatedShutdown.Get(system).Run(CoordinatedShutdown.ClusterLeavingReason.Instance) with a 30s timeout.

ServiceCollectionExtensions.AddOtOpcUaCluster(IConfiguration): binds AkkaClusterOptions, registers AkkaHostedService as IHostedService, registers ActorSystem as a singleton resolved from the hosted service.

Mirror the wiring in ~/Desktop/scadalink-design/src/ScadaLink.Host/Program.cs Akka block. Don't deviate on tuning.
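The overlay step can be sketched as a pure function that turns AkkaClusterOptions into a HOCON fragment. This is a minimal sketch, not the ScadaLink wiring: HoconOverlay and its helpers are hypothetical names, the options class is repeated only so the block compiles on its own, and the HOCON keys are the standard Akka.NET remote and cluster settings.

```csharp
using System;
using System.Linq;

// Stand-in copy of the options class from this task, repeated for self-containment.
public sealed class AkkaClusterOptions
{
    public string SystemName { get; set; } = "otopcua";
    public string Hostname { get; set; } = "0.0.0.0";
    public int Port { get; set; } = 4053;
    public string PublicHostname { get; set; } = "127.0.0.1";
    public string[] SeedNodes { get; set; } = Array.Empty<string>();
    public string[] Roles { get; set; } = Array.Empty<string>();
}

// Hypothetical helper: renders the per-node settings as a HOCON overlay string.
public static class HoconOverlay
{
    // Quote a value as a HOCON string, escaping backslashes and double quotes.
    private static string Quote(string s) =>
        "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";

    // Render a string array as a HOCON list of quoted strings.
    private static string List(string[] items) =>
        "[" + string.Join(", ", items.Select(Quote)) + "]";

    public static string Build(AkkaClusterOptions o) =>
        $"akka.remote.dot-netty.tcp.hostname = {Quote(o.Hostname)}\n" +
        $"akka.remote.dot-netty.tcp.public-hostname = {Quote(o.PublicHostname)}\n" +
        $"akka.remote.dot-netty.tcp.port = {o.Port}\n" +
        $"akka.cluster.seed-nodes = {List(o.SeedNodes)}\n" +
        $"akka.cluster.roles = {List(o.Roles)}\n";
}
```

In the hosted service the result would typically be parsed with ConfigurationFactory.ParseString and layered over the embedded base config with WithFallback, so that per-node values win over the defaults in akka.conf.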

Commit: feat(cluster): AkkaHostedService and DI extension.


Task 21: Role parsing from OTOPCUA_ROLES env

Classification: small Estimated implement time: ~3 min Parallelizable with: Task 19, 20, 22

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/RoleParser.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/RoleParserTests.cs (also creates the test project; see Task 23 for the csproj)

public static class RoleParser
{
    public static string[] Parse(string? raw)
    {
        if (string.IsNullOrWhiteSpace(raw)) return Array.Empty<string>();
        var roles = raw.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
                       .Select(r => r.ToLowerInvariant())
                       .Distinct()
                       .ToArray();
        foreach (var r in roles)
            if (r is not ("admin" or "driver" or "dev"))
                throw new ArgumentException($"Unknown role '{r}'. Allowed: admin, driver, dev.");
        return roles;
    }
}

Tests cover: empty input → empty; "admin" → ["admin"]; "admin,driver" → both; whitespace tolerant; case-insensitive; throws on unknown role.

Commit: feat(cluster): parse OTOPCUA_ROLES env var with validation.


Task 22: IClusterRoleInfo implementation

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 19, 20, 21

Files:

  • Create: src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ClusterRoleInfo.cs

Implements IClusterRoleInfo (from Task 18). Wraps Akka.Cluster.Cluster.Get(ActorSystem). Subscribes to ClusterEvent.LeaderChanged, ClusterEvent.RoleLeaderChanged, ClusterEvent.IMemberEvent via an internal subscriber actor, raises CLR event.

Commit: feat(cluster): ClusterRoleInfo wraps Akka.Cluster for app-facing role queries.


Task 23: Cluster test project + initial tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (verification task — depends on Tasks 19-22)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/HoconLoaderTests.cs — asserts HOCON parses and key values present
  • Move: tests/.../RoleParserTests.cs if Task 21 dropped it elsewhere

csproj: xUnit test project, references OtOpcUa.Cluster, OtOpcUa.Commons. Packages: xunit, xunit.runner.visualstudio, Microsoft.NET.Test.Sdk, FluentAssertions.

HoconLoaderTests.cs: parses HOCON via Akka.Configuration.ConfigurationFactory.ParseString(HoconLoader.LoadBaseConfig()), asserts actor.provider == "cluster", cluster.split-brain-resolver.active-strategy == "keep-oldest", etc.

Run: dotnet test tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/. Expected: all green.

Add to solution: dotnet sln ZB.MOM.WW.OtOpcUa.slnx add tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests.csproj.

Commit: test(cluster): HOCON parses, role parser truth table.


Phase 4 — Security library

Task 24: Move LdapAuthService into OtOpcUa.Security

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 25 (different file)

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Security/LdapAuthService.cs → src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs
  • Rename namespace: ZB.MOM.WW.OtOpcUa.Admin.Security → ZB.MOM.WW.OtOpcUa.Security.Ldap
  • Update all callers (use grep -rl 'OtOpcUa.Admin.Security' to find them; update with sed or by hand)

Commit: refactor(security): move LdapAuthService into OtOpcUa.Security library.


Task 25: JwtTokenService

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 24

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Jwt/JwtTokenService.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Jwt/JwtOptions.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/JwtTokenServiceTests.cs (also creates test csproj)

Mirror ~/Desktop/scadalink-design/src/ScadaLink.Security/JwtTokenService.cs. Options: SigningKey (HS256, ≥32 bytes), Issuer, Audience, ExpiryMinutes (default 15). Issue(claims) → string. TryValidate(token, out principal) → bool.

Tests cover: valid token roundtrip; expired token rejected; tampered token rejected; missing required claim rejected.

Commit: feat(security): JwtTokenService with HS256 + 15-min expiry.


Task 26: Cookie+JWT hybrid registration extension

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 27, 28

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs

AddOtOpcUaAuth(IConfiguration):

  1. Bind JwtOptions from Security:Jwt, bind CookieOptions from Security:Cookie.
  2. services.AddDataProtection().PersistKeysToDbContext<OtOpcUaConfigDbContext>().SetApplicationName("OtOpcUa").
  3. services.AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    • .AddCookie(o => { o.Cookie.Name = "OtOpcUa.Auth"; o.Cookie.HttpOnly = true; o.Cookie.SameSite = SameSiteMode.Strict; o.Cookie.SecurePolicy = CookieSecurePolicy.SameAsRequest; o.SlidingExpiration = true; o.ExpireTimeSpan = TimeSpan.FromMinutes(30); })
    • .AddJwtBearer(JwtBearerDefaults.AuthenticationScheme, o => { /* HS256 with JwtOptions.SigningKey */ }).
  4. services.AddAuthorization() + fallback policy requiring authenticated user.
  5. Register LdapAuthService, JwtTokenService, RoleMapper.

Mirror the wiring in ~/Desktop/scadalink-design/src/ScadaLink.Security/ServiceCollectionExtensions.cs exactly for the cookie/JWT/DataProtection plumbing.

Commit: feat(security): cookie+JWT hybrid auth via AddOtOpcUaAuth.


Task 27: /auth/login, /auth/ping, /auth/token endpoints

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 26, 28

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Endpoints/AuthEndpoints.cs

Mirror ~/Desktop/scadalink-design/src/ScadaLink.Security/Endpoints/AuthEndpoints.cs. Three minimal-API endpoints:

  • POST /auth/login — accepts {username, password}, calls LdapAuthService.AuthenticateAsync, builds claims (sub, roles), issues cookie via HttpContext.SignInAsync AND embeds JWT in cookie. Returns 204 on success / 401 on bad creds / 503 on LDAP unreachable.
  • GET /auth/ping: [AllowAnonymous], returns 200 if User.Identity.IsAuthenticated, 401 otherwise.
  • POST /auth/token — authenticated, returns {token: "..."} JWT bearer for external clients.

Extension method MapOtOpcUaAuth(this IEndpointRouteBuilder). Wire in Host Program.cs at Task 53.

Commit: feat(security): /auth/login, /auth/ping, /auth/token endpoints.


Task 28: CookieAuthenticationStateProvider for Blazor circuits

Classification: small Estimated implement time: ~4 min Parallelizable with: Task 26, 27

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Security/Blazor/CookieAuthenticationStateProvider.cs

Standard pattern: snapshots HttpContext.User at circuit construction, polls /auth/ping every 60s to detect expiry, calls NotifyAuthenticationStateChanged on transition. Mirror ScadaLink's equivalent — search ~/Desktop/scadalink-design/src/ScadaLink.CentralUI/ for the *AuthenticationStateProvider* file.

Commit: feat(security): CookieAuthenticationStateProvider for Blazor circuit expiry detection.


Task 29: Security test project + tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (verification — depends on Tasks 24-28)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/ZB.MOM.WW.OtOpcUa.Security.Tests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/JwtTokenServiceTests.cs (moved from Task 25 if dropped elsewhere)
  • Create: tests/ZB.MOM.WW.OtOpcUa.Security.Tests/AuthEndpointsTests.cs — uses Microsoft.AspNetCore.Mvc.Testing with a WebApplicationFactory<Program> against a stubbed LDAP

Tests cover: login happy path issues cookie+JWT; login bad password returns 401; login with LDAP outage returns 503; /auth/ping after expired cookie returns 401; /auth/token issues a valid JWT for authenticated user.

Add to solution. Run: dotnet test tests/ZB.MOM.WW.OtOpcUa.Security.Tests/. Expected: all green.

Commit: test(security): cookie+JWT roundtrip, login/ping/token endpoint tests.


Phase 5 — ControlPlane cluster singletons

Task 30: ConfigPublishCoordinator — happy path

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 32 (different files; sibling singletons)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Coordinators/ConfigPublishCoordinator.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorTests.cs (also creates test csproj)

Step 1: Write failing test (Akka.TestKit.Xunit2)

[Fact]
public async Task HappyPath_AllNodesAck_SealsDeployment()
{
    using var harness = new ControlPlaneHarness();
    var coord = harness.Sys.ActorOf(ConfigPublishCoordinator.Props(harness.DbFactory));

    var ack1 = new ApplyAck(harness.DeploymentId, NodeId.Of("node-a"), ApplyAckOutcome.Applied, null, CorrelationId.NewId());
    var ack2 = new ApplyAck(harness.DeploymentId, NodeId.Of("node-b"), ApplyAckOutcome.Applied, null, CorrelationId.NewId());

    coord.Tell(new DispatchDeployment(harness.DeploymentId, harness.RevisionHash, CorrelationId.NewId()));
    coord.Tell(ack1);
    coord.Tell(ack2);

    await harness.WaitUntil(() => harness.LoadDeploymentStatus() == DeploymentStatus.Sealed, TimeSpan.FromSeconds(5));
}

ControlPlaneHarness is a helper that spins up Akka TestKit + in-memory EF Core ConfigDb seeded with a Deployment row in Dispatching and two ClusterNode rows.

Step 2: Run test, expect FAIL (class doesn't exist).

Step 3: Implement ConfigPublishCoordinator minimal:

public sealed class ConfigPublishCoordinator : ReceiveActor
{
    public static Props Props(IDbContextFactory<OtOpcUaConfigDbContext> dbFactory) =>
        Akka.Actor.Props.Create(() => new ConfigPublishCoordinator(dbFactory));

    private readonly IDbContextFactory<OtOpcUaConfigDbContext> _dbFactory;
    private readonly HashSet<NodeId> _expectedAcks = new();
    private DeploymentId _current;
    private readonly Dictionary<NodeId, ApplyAckOutcome> _acks = new();

    public ConfigPublishCoordinator(IDbContextFactory<OtOpcUaConfigDbContext> dbFactory)
    {
        _dbFactory = dbFactory;
        Receive<DispatchDeployment>(HandleDispatch);
        Receive<ApplyAck>(HandleAck);
    }

    private void HandleDispatch(DispatchDeployment msg)
    {
        _current = msg.DeploymentId;
        // Reset per-deployment state so acks from an earlier deployment cannot count toward this one.
        _expectedAcks.Clear();
        _acks.Clear();
        using var ctx = _dbFactory.CreateDbContext();
        _expectedAcks.UnionWith(ctx.ClusterNodes.Where(n => n.RolesCsv.Contains("driver")).Select(n => NodeId.Of(n.NodeId)).ToList());
        DistributedPubSub.Get(Context.System).Mediator.Tell(new Publish("deployments", msg));
    }

    private void HandleAck(ApplyAck msg)
    {
        if (msg.DeploymentId != _current) return; // stale
        if (!_expectedAcks.Contains(msg.NodeId)) return; // ack from a node we are not waiting on
        _acks[msg.NodeId] = msg.Outcome;
        if (_acks.Count == _expectedAcks.Count && _acks.Values.All(o => o == ApplyAckOutcome.Applied))
            SealDeployment();
    }

    private void SealDeployment()
    {
        using var ctx = _dbFactory.CreateDbContext();
        var d = ctx.Deployments.Single(x => x.DeploymentId == _current.Value);
        d.Status = DeploymentStatus.Sealed;
        d.SealedAtUtc = DateTime.UtcNow;
        ctx.SaveChanges();
    }
}

Step 4: Run test, expect PASS.

Step 5: Commit: feat(controlplane): ConfigPublishCoordinator happy path.


Task 31: ConfigPublishCoordinator — timeout + failover recovery

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 32

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Coordinators/ConfigPublishCoordinator.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorTimeoutTests.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorRecoveryTests.cs

Step 1: Add tests for:

  • Deadline elapses with one node unacked → Deployment.Status = TimedOut.
  • New Coordinator started with in-flight Dispatching deployment recovers state via PreStart (queries Deployment + NodeDeploymentState).

Step 2: Extend Coordinator with:

  • Context.System.Scheduler.ScheduleTellOnce(applyMaxDuration, Self, new DeadlineElapsed(_current), Self) after dispatch (the final argument is the sender the scheduler API requires).
  • Receive<DeadlineElapsed> handler that marks TimedOut if any node unacked.
  • protected override void PreStart(): read Deployment rows where Status ∈ {Dispatching, AwaitingApplyAcks}; for each, repopulate _current, _expectedAcks, _acks from NodeDeploymentState; schedule remaining deadline.
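The "remaining deadline" in the recovery step can be sketched as a small pure function. This is only a sketch: DeadlineMath is a hypothetical helper, and it assumes the dispatch time is persisted on the Deployment row, which the plan does not state explicitly.

```csharp
using System;

// Hypothetical helper for computing how long a recovered deployment may still wait for acks.
public static class DeadlineMath
{
    // dispatchedAtUtc is assumed to be stored with the Deployment row (assumption, not in the plan).
    // Returns TimeSpan.Zero when the deadline has already passed, so the caller can
    // schedule DeadlineElapsed immediately instead of passing a negative delay.
    public static TimeSpan Remaining(DateTime dispatchedAtUtc, TimeSpan applyMaxDuration, DateTime nowUtc)
    {
        var remaining = dispatchedAtUtc + applyMaxDuration - nowUtc;
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }
}
```

Keeping this outside the actor lets the timeout arithmetic be tested with fixed timestamps rather than a real scheduler.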

Step 3: Run all ConfigPublishCoordinatorTests. Expected: all green.

Commit: feat(controlplane): ConfigPublishCoordinator deadline timeout + failover recovery.


Task 32: AdminOperationsActor + StartDeployment handler

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 33, 34, 35

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/AdminOperationsActor.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/ConfigComposer.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AdminOperationsActorTests.cs

Responsibilities:

  1. Receive StartDeployment(createdBy, correlationId).
  2. ConfigComposer.SnapshotAndFlatten(dbContext) → byte[] ArtifactBlob (DataContract-serialized or System.Text.Json over the flat artifact). Pure function.
  3. Compute RevisionHash = SHA256(artifactBlob).ToHexString().
  4. Insert Deployment row (Status = Dispatching).
  5. Insert one ConfigEdit audit row marking the deployment snapshot.
  6. coordinator.Tell(new DispatchDeployment(deploymentId, revisionHash, correlationId)).
  7. Reply StartDeploymentResult(deploymentId, revisionHash) to sender.

For now stub CRUD ops as TODO comments — they'll be filled in Task 51 (UI wiring).

Tests: snapshot is deterministic given a fixed seed of equipment rows; hash matches; Deployment row inserted; DispatchDeployment dispatched to mocked coordinator.
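Steps 2 and 3 can be sketched with BCL calls only. The EquipmentRow record and the ordering key are hypothetical stand-ins for the real flat artifact; the point is that the hash is only reproducible if the rows are serialized in a stable order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text.Json;

// Hypothetical row shape; the real artifact flattens many entity types.
public sealed record EquipmentRow(string Id, string Name);

public static class RevisionHasher
{
    // Serialize rows in a stable (ordinal, by Id) order so equal data yields equal bytes.
    public static byte[] Snapshot(IEnumerable<EquipmentRow> rows) =>
        JsonSerializer.SerializeToUtf8Bytes(
            rows.OrderBy(r => r.Id, StringComparer.Ordinal).ToArray());

    // SHA-256 over the artifact bytes, rendered as 64 uppercase hex characters.
    public static string Compute(byte[] artifactBlob) =>
        Convert.ToHexString(SHA256.HashData(artifactBlob));
}
```

Convert.ToHexString emits uppercase hex; if the RevisionHash column or existing data expects lowercase, normalize it in one place rather than at each call site.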

Commit: feat(controlplane): AdminOperationsActor + ConfigComposer + StartDeployment flow.


Task 33: AuditWriterActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 32, 34, 35

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AuditWriterActorTests.cs

Receives AuditEvent messages, batches into in-memory buffer (cap 500 events / 5s flush window), bulk-inserts to ConfigAuditLog. Idempotent on EventId (INSERT IF NOT EXISTS or MERGE). On PreRestart flushes buffer.

Tests: 1000 events with random duplicates → ConfigAuditLog has correct count, no duplicates; PreRestart simulates supervisor restart and verifies buffer is flushed before death.
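The buffering rule (500 events or a 5s window, whichever comes first) can be sketched as a plain class that the actor would own. AuditBuffer and the AuditEvent shape shown here are hypothetical; in-buffer de-duplication only covers duplicates within one batch, so the database-level idempotent insert is still needed for duplicates that arrive in different batches.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the AuditEvent contract; only EventId matters for this sketch.
public sealed record AuditEvent(Guid EventId, string Payload);

public sealed class AuditBuffer
{
    private readonly int _cap;
    private readonly TimeSpan _window;
    private readonly Dictionary<Guid, AuditEvent> _pending = new();
    private DateTime _windowStartUtc;

    public AuditBuffer(int cap, TimeSpan window, DateTime nowUtc)
    {
        _cap = cap;
        _window = window;
        _windowStartUtc = nowUtc;
    }

    public int Count => _pending.Count;

    // Buffers the event unless its EventId is already pending.
    // Returns true when the cap or the flush window has been reached.
    public bool Add(AuditEvent e, DateTime nowUtc)
    {
        _pending.TryAdd(e.EventId, e);
        return _pending.Count >= _cap || nowUtc - _windowStartUtc >= _window;
    }

    // Hands the current batch to the caller and starts a new window.
    public IReadOnlyList<AuditEvent> Drain(DateTime nowUtc)
    {
        var batch = new List<AuditEvent>(_pending.Values);
        _pending.Clear();
        _windowStartUtc = nowUtc;
        return batch;
    }
}
```

Passing the clock in as a parameter keeps the flush-window behaviour testable without sleeping; the actor would call Drain from its flush handler and from PreRestart.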

Commit: feat(controlplane): AuditWriterActor with batched idempotent insert.


Task 34: FleetStatusBroadcaster

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 32, 33, 35

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Fleet/FleetStatusBroadcaster.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/FleetStatusBroadcasterTests.cs

Subscribes to ClusterEvent.MemberUp, MemberRemoved, UnreachableMember, ReachableMember, LeaderChanged, RoleLeaderChanged. Receives per-node DriverHostStatusHeartbeat Tells. Maintains in-memory FleetSnapshot. Pushes diffs via injected IHubContext<FleetStatusHub> and IHubContext<AlertHub>.

Hubs themselves are not built yet — at this stage inject mock IHubContext for tests. UI rewiring happens in Task 50.

Tests: cluster member up → diff broadcast; heartbeat staleness → unreachable broadcast; full snapshot on OnConnectedAsync request.

Commit: feat(controlplane): FleetStatusBroadcaster push-driven from Akka cluster events.


Task 35: RedundancyStateActor

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 30, 31, 32, 33, 34

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Redundancy/RedundancyStateActor.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Redundancy/ServiceLevelCalculator.cs (pure function)
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ServiceLevelCalculatorTests.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/RedundancyStateActorTests.cs

ServiceLevelCalculator: pure static function per design §6:

public static byte Compute(NodeHealthInputs h)
{
    if (h.MemberState is not (MemberStatus.Up or MemberStatus.Joining))
        return 0;
    byte basis = (h.DbReachable, h.OpcUaProbeOk, h.Stale) switch
    {
        (true,  true,  false) => 240,
        (true,  _,     true)  => 200,
        (false, _,     true)  => 100,
        _ => 0
    };
    return (byte)Math.Clamp(basis + (h.IsDriverRoleLeader ? 10 : 0), 0, 255);
}

Tests: every combination of inputs → expected byte (FsCheck or table-driven).
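A table-driven check of Compute might look like the following. It is a sketch under assumptions: the MemberStatus enum is a stand-in for Akka.Cluster.MemberStatus so the block compiles without Akka, the NodeHealthInputs shape is inferred from the fields Compute reads, and the expected bytes are derived from the switch as written rather than from design §6. One thing the table makes visible is that the leader bonus also applies to the fall-through arm, so a driver-role leader with a basis of 0 reports 10; whether that is intended is worth confirming against the design.

```csharp
using System;
using System.Linq;

// Stand-in for Akka.Cluster.MemberStatus (assumption, to avoid the Akka dependency here).
public enum MemberStatus { Joining, Up, Leaving, Exiting, Down, Removed }

// Hypothetical shape, inferred from the fields that Compute reads.
public sealed record NodeHealthInputs(
    MemberStatus MemberState, bool DbReachable, bool OpcUaProbeOk, bool Stale, bool IsDriverRoleLeader);

public static class ServiceLevelCalculator
{
    // Same logic as the Compute function in this task.
    public static byte Compute(NodeHealthInputs h)
    {
        if (h.MemberState is not (MemberStatus.Up or MemberStatus.Joining))
            return 0;
        byte basis = (h.DbReachable, h.OpcUaProbeOk, h.Stale) switch
        {
            (true,  true,  false) => 240,
            (true,  _,     true)  => 200,
            (false, _,     true)  => 100,
            _ => 0
        };
        return (byte)Math.Clamp(basis + (h.IsDriverRoleLeader ? 10 : 0), 0, 255);
    }
}

public static class ServiceLevelTable
{
    // Expected byte for every (DbReachable, OpcUaProbeOk, Stale, IsDriverRoleLeader)
    // combination of a member whose state is Up.
    public static readonly (bool Db, bool Probe, bool Stale, bool Leader, byte Expected)[] Rows =
    {
        (true,  true,  false, false, 240), (true,  true,  false, true, 250),
        (true,  true,  true,  false, 200), (true,  true,  true,  true, 210),
        (true,  false, true,  false, 200), (true,  false, true,  true, 210),
        (true,  false, false, false, 0),   (true,  false, false, true, 10),
        (false, true,  true,  false, 100), (false, true,  true,  true, 110),
        (false, false, true,  false, 100), (false, false, true,  true, 110),
        (false, true,  false, false, 0),   (false, true,  false, true, 10),
        (false, false, false, false, 0),   (false, false, false, true, 10),
    };

    // Number of rows where Compute disagrees with the table (0 when they match).
    public static int CountMismatches() =>
        Rows.Count(r => ServiceLevelCalculator.Compute(
            new NodeHealthInputs(MemberStatus.Up, r.Db, r.Probe, r.Stale, r.Leader)) != r.Expected);
}
```

In the real test project the same rows would more naturally be fed to an xUnit theory, with one additional case per non-Up member state asserting 0.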

RedundancyStateActor: subscribes to cluster events, debounces 250ms, recomputes per-node NodeRedundancyState, publishes RedundancyStateChanged via DistributedPubSub topic redundancy-state.

Commit: feat(controlplane): RedundancyStateActor + pure ServiceLevelCalculator.


Task 36: Singleton registration extension

Classification: standard Estimated implement time: ~4 min Parallelizable with: none (depends on Tasks 30-35)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ServiceCollectionExtensions.cs

AddOtOpcUaControlPlane(): registers all five singletons via the Akka.Cluster.Hosting singleton extension methods (written as WithClusterSingleton<TActor> in the pattern below), all pinned to the admin role.

Pattern (mirror ~/Desktop/scadalink-design/src/ScadaLink.ManagementService/ServiceCollectionExtensions.cs):

public static IServiceCollection AddOtOpcUaControlPlane(this IServiceCollection services)
{
    services.AddSingleton<IControlPlaneStartup, ControlPlaneStartup>();
    return services;
}

internal sealed class ControlPlaneStartup : IControlPlaneStartup
{
    public void Configure(AkkaConfigurationBuilder cb)
    {
        cb.WithClusterSingleton<ConfigPublishCoordinator>("config-publish", new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<AdminOperationsActor>("admin-ops",         new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<AuditWriterActor>("audit-writer",          new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<FleetStatusBroadcaster>("fleet-status",    new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<RedundancyStateActor>("redundancy-state",  new ClusterSingletonOptions { Role = "admin" });
    }
}

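Verify against ScadaLink's actual API surface, since Akka.Hosting syntax differs across versions. Recent Akka.Cluster.Hosting releases name these methods WithSingleton<TKey> and WithSingletonProxy<TKey> and require the actor's Props, so treat WithClusterSingleton<T> above as shorthand for whatever the pinned version exposes.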

Commit: feat(controlplane): singleton registration extension pinned to admin role.


Phase 6 — Runtime per-node actors

Task 37: DriverHostActor scaffolding + bootstrap

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 43, 44 (different actors)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorBootstrapTests.cs (also creates test csproj)

DriverHostActor responsibilities (this task):

  • PreStart: read NodeDeploymentState for self; if Applied → Become Steady(currentDeployment); if Applying (orphan) → discard, replay; if no row + ConfigDb unreachable → fall back to LiteDb cache → Become Stale.
  • Subscribe to DistributedPubSub topic deployments.

State machine via Become: Bootstrapping → Steady | Applying(id) | Stale.

Tests: orphan Applying row → re-runs apply on PreStart; missing row + DB unreachable → Stale state.

Commit: feat(runtime): DriverHostActor scaffolding + PreStart recovery.


Task 38: DriverHostActor DispatchDeployment handler

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 43, 44 (different actors)

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorDispatchTests.cs

Add:

  • Receive<DispatchDeployment>:
    • If currentRevisionHash == msg.RevisionHash → reply ApplyAck(Applied) immediately.
    • Else → write NodeDeploymentState(Applying), Become Applying(msg.DeploymentId), fetch artifact, compute delta, dispatch ApplyDelta to children, collect acks, write NodeDeploymentState(Applied|Failed), reply ApplyAck to coordinator, Become Steady.

For now children dispatch is mocked — actual DriverInstanceActor integration in Task 41.

Tests: idempotent dispatch (same hash → ack, no work); successful apply → ack Applied; child failure → ack Failed.

Commit: feat(runtime): DriverHostActor handles DispatchDeployment idempotently.


Task 39: DriverHostActor stale-config fallback

Classification: standard Estimated implement time: ~4 min Parallelizable with: Task 41, 42, 43, 44

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs (stale fallback + background reconnect)
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorStaleTests.cs

Schedule a background retry with Context.System.Scheduler.ScheduleTellRepeatedly(30s, 30s, Self, RetryConfigDbConnection.Instance, Self). On RetryConfigDbConnection: try ConfigDb; if it succeeds and the current state is Stale, pull the latest sealed deployment, apply it, Become Steady, and publish NodeRedundancyState(Stale=false) to the redundancy-state topic.

Tests: simulated DB outage → Stale published; DB recovery → state advances + Stale=false published.

Commit: feat(runtime): DriverHostActor stale-config fallback + reconnect.


Task 40: Runtime test project bootstrap

Classification: small Estimated implement time: ~3 min Parallelizable with: none (depends on Tasks 37-39)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/RuntimeHarness.cs — TestKit base with EF in-memory + driver mocks

Confirm all DriverHostActor tests from Tasks 37-39 pass. Add to solution.

Commit: test(runtime): test project scaffold + DriverHostActor tests passing.


Task 41: DriverInstanceActor state machine

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 42, 43, 44 (Tasks 37-39 are already complete at this point)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverInstanceActorTests.cs

States via Become: Connecting → Connected → Reconnecting → Failed.

  • PreStart → enter Connecting; call IDriver.InitializeAsync.
  • On connect success → Become Connected; subscribe tags; publish OpcUaPublishActor.AttributeValueUpdate.
  • On disconnect → Become Reconnecting; publish bad quality to all subscribed tags; schedule retry at fixed interval (driver.ReconnectIntervalSeconds, default 10).
  • On ApplyDelta(plan) → idempotent diff against current state; only changed attributes update; reply ApplyResult to parent.
  • On write request via Ask<WriteAttribute> → synchronous; failure returned to caller.
  • Crashes are handled by the parent's supervision, which restarts the actor with exponential backoff.

Reuse existing IDriver interface (from current OtOpcUa.Driver.* projects).

Tests: connecting transitions to Connected on success; disconnect triggers bad-quality publish + Reconnecting; write failure returned to Ask caller; ApplyDelta diffs correctly.

Commit: feat(runtime): DriverInstanceActor with Connecting/Connected/Reconnecting/Failed.


Task 42: VirtualTagActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 41, 43, 44

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/VirtualTags/VirtualTagActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/VirtualTagActorTests.cs

Wraps existing VirtualTagEngine from ~/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/. On subscribe-to-dependencies value update, recomputes expression, publishes result to OpcUaPublishActor.

Restart with backoff; expression compile errors fail the actor (parent restarts with backoff).

Commit: feat(runtime): VirtualTagActor wrapping VirtualTagEngine.


Task 43: ScriptedAlarmActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 44

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/ScriptedAlarmActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ScriptedAlarmActorTests.cs

Wraps existing AlarmConditionService. State machine Inactive → Active → Acknowledged → Inactive. On state change, emits history row to HistorianAdapterActor. PreRestart hook serializes current alarm state to the ScriptedAlarmState ConfigDb table; PreStart/PostRestart on the new instance rehydrates from it.

Commit: feat(runtime): ScriptedAlarmActor with state preservation across restart.


Task 44: OpcUaPublishActor on synchronized dispatcher

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 41, 42, 43

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/OpcUa/OpcUaPublishActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/OpcUaPublishActorTests.cs

Bridge between Akka messages and OPCFoundation address space. Pinned dispatcher: opcua-synchronized-dispatcher (from HOCON, Task 19) — Props.WithDispatcher("opcua-synchronized-dispatcher").

Responsibilities:

  • Receive AttributeValueUpdate(nodeId, value, quality, timestampUtc) → write to OPC UA address space.
  • Receive AlarmStateUpdate(...) → write alarm node.
  • Subscribe to DistributedPubSub topic redundancy-state → on NodeRedundancyState for this node, write ServiceLevel byte + ServerUriArray nodes.
  • Receive RebuildAddressSpace → marshal address-space rebuild via OPC UA SDK API; bump sequence number.

OPC UA SDK objects are NEVER exposed in message payloads — actor owns them internally.

Tests: receive update → SDK write invoked; ServiceLevel update → ServiceLevel node written with correct byte.

Commit: feat(runtime): OpcUaPublishActor bridges Akka and OPCFoundation address space.


Task 45: HistorianAdapterActor, PeerOpcUaProbeActor, DbHealthProbeActor

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (last Phase 6 task — combines three small actors)

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Historian/HistorianAdapterActor.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Health/PeerOpcUaProbeActor.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Health/DbHealthProbeActor.cs
  • Test: tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/HealthProbeActorTests.cs

  • HistorianAdapterActor: wraps existing named-pipe IPC to Wonderware sidecar. Buffers writes to SQLite store-and-forward on pipe disconnect. Reuses existing SqliteStoreAndForwardSink from current OtOpcUa code (find via grep -rln SqliteStoreAndForwardSink ~/Desktop/OtOpcUa/src).
  • PeerOpcUaProbeActor: per-peer-node periodic OPC UA opc.tcp://peer:4840 ping. Publishes OpcUaProbeResult(nodeId, ok) to redundancy-state topic input.
  • DbHealthProbeActor: cached DB probe (single-flight) feeding /health/ready + RedundancyStateActor. Reuses DbHealthCache if present.

Wrap all three actors as children under DriverHostActor.

Commit: feat(runtime): HistorianAdapter + PeerOpcUaProbe + DbHealthProbe actors.


Phase 7 — OpcUaServer extraction

Task 46: Move OpcUaApplicationHost + Phase7Composer

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (large file move with namespace rename)

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs → src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs
  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs → src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs
  • Update all callers (namespace rename to ZB.MOM.WW.OtOpcUa.OpcUaServer)

Use grep -rln 'ZB.MOM.WW.OtOpcUa.Server.OpcUa' ~/Desktop/OtOpcUa/src and grep -rln 'ZB.MOM.WW.OtOpcUa.Server.Phase7' ~/Desktop/OtOpcUa/src to find imports; update them.

Build green check.

Commit: refactor(opcua): extract OpcUaApplicationHost and Phase7Composer to OpcUaServer library.


Task 47: Make Phase7Composer pure + property test

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 48-52 (Phase 8)

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs (remove side effects; take inputs as parameters)
  • Test: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/Phase7ComposerPurityTests.cs

Refactor: remove static state, remove logging side effects (or make logging optional via injected ILogger?), return a Phase7CompositionResult record. Same inputs must always produce identical output.

Property test (FsCheck or hand-rolled): generate random EquipmentRow[], DriverInstanceRow[], ScriptRow[] arrays; call ComposeAsync twice; assert results structurally equal.
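A hand-rolled version of the property test could look like the sketch below. It uses a stand-in composer, because the real Phase7Composer signature is not shown in this plan: EquipmentRow's shape, CompositionResult, and StandInComposer.Compose are all hypothetical. The detail worth carrying over is that C# records compare collection members by reference, so "structurally equal" has to be checked element-wise.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Demo: run the property for one seed.
Console.WriteLine(PurityProperty.Holds(seed: 42));

public sealed record EquipmentRow(int Id, string Name);                   // hypothetical shape
public sealed record CompositionResult(IReadOnlyList<string> NodePaths); // hypothetical shape

public static class StandInComposer
{
    // Pure stand-in: the output depends only on the argument (no statics, no logging).
    public static CompositionResult Compose(IReadOnlyList<EquipmentRow> equipment) =>
        new(equipment.OrderBy(e => e.Id).Select(e => $"Equipment/{e.Name}").ToArray());
}

public static class PurityProperty
{
    // Generates random inputs, composes twice, and checks both results match.
    public static bool Holds(int seed, int iterations = 100)
    {
        var rng = new Random(seed);
        for (var i = 0; i < iterations; i++)
        {
            var rows = Enumerable.Range(0, rng.Next(0, 20))
                .Select(_ => new EquipmentRow(rng.Next(1000), $"eq-{rng.Next(1000)}"))
                .ToArray();

            var first = StandInComposer.Compose(rows);
            var second = StandInComposer.Compose(rows);

            // Record equality would compare the two lists by reference,
            // so compare the contents element by element instead.
            if (!first.NodePaths.SequenceEqual(second.NodePaths)) return false;
        }
        return true;
    }
}
```

If FsCheck is chosen instead, the same body becomes the property and the generator loop is replaced by FsCheck's own input generation.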

Commit: refactor(opcua): make Phase7Composer pure + property tests.


Phase 8 — AdminUI library migration

Task 48: Move Blazor components into AdminUI library

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 47

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Components/* → src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/*
  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/wwwroot/* → src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/wwwroot/*

Namespace rename across all .razor + .razor.cs: ZB.MOM.WW.OtOpcUa.Admin.Components → ZB.MOM.WW.OtOpcUa.AdminUI.Components.

Add a MapAdminUI<TApp>(this IEndpointRouteBuilder, IServiceCollection) extension method in src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/EndpointRouteBuilderExtensions.cs that maps Razor components and static assets. Mirror ScadaLink's MapCentralUI<TApp> exactly.

Build green check.

Commit: refactor(adminui): move Blazor components from Admin into AdminUI Razor class library.


Task 49: Move SignalR hubs into AdminUI; rewire to FleetStatusBroadcaster

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 50, 51, 52

Files:

  • Move: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Hubs/*.cs → src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Hubs/*.cs
  • Delete: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Hubs/FleetStatusPoller.cs (replaced by FleetStatusBroadcaster)
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Fleet/FleetStatusBroadcaster.cs — inject IHubContext<FleetStatusHub>, push diffs to it; do same for AlertHub, ScriptLogHub

Note: hubs in AdminUI reference ControlPlane only for telemetry types; ControlPlane references hub interfaces via DI'd IHubContext<T> — no project-reference cycle.

Build green check.

Commit: refactor(adminui): SignalR hubs fed by FleetStatusBroadcaster push, no polling.


Task 50: IAdminOperationsClient wrapper

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 49, 51, 52

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Clients/AdminOperationsClient.cs

Implements IAdminOperationsClient (from Task 18) via ClusterSingletonProxy to admin-ops. Each method does proxy.Ask<TResult>(message, timeout) with 10s timeout + propagated cancellation.

Register in DI: services.AddScoped<IAdminOperationsClient, AdminOperationsClient>() (scoped so that each circuit's HttpContext.User claims flow into the client).

Commit: feat(adminui): IAdminOperationsClient backed by ClusterSingletonProxy.


Task 51: Replace DriverDiagnosticsClient with IFleetDiagnosticsClient

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 49, 50, 52

Files:

  • Delete: src/Server/ZB.MOM.WW.OtOpcUa.Admin/Services/DriverDiagnosticsClient.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Clients/FleetDiagnosticsClient.cs
  • Modify: any Blazor pages that referenced DriverDiagnosticsClient (use grep -rln DriverDiagnosticsClient ~/Desktop/OtOpcUa/src)

FleetDiagnosticsClient uses ClusterClient (or ActorSelection if same cluster) to send GetDiagnosticsRequest(nodeId) to /user/driver-host at the target node and await response.

Pages updated to inject IFleetDiagnosticsClient instead.

Commit: refactor(adminui): replace HTTP DriverDiagnosticsClient with actor-based IFleetDiagnosticsClient.


Task 52: Drift indicator + Deploy button

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 49, 50, 51

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Deployments.razor
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Layout/MainLayout.razor (add drift badge if applicable)

Deployments.razor:

  • Table of Deployment rows (most recent first), columns: DeploymentId (short), RevisionHash (short), Status, CreatedBy, CreatedAtUtc, SealedAtUtc.
  • "Deploy current configuration" button (requires FleetAdmin or ConfigEditor role) → calls IAdminOperationsClient.StartDeploymentAsync(User.Identity.Name, ct) → toast + auto-refresh table.
  • Drift badge: green "in sync" if latest sealed Deployment's revision hash matches ConfigComposer.SnapshotAndFlatten() of current ConfigDb state; yellow "drift" otherwise.
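The drift comparison in the last bullet reduces to "hash the current flattened configuration and compare it with the latest sealed deployment's revision hash". The sketch below assumes the revision hash is a hex SHA-256 of the flattened snapshot text; the plan does not state the hash algorithm or the return type of ConfigComposer.SnapshotAndFlatten(), so DriftBadge, HashOf, and the string input are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Demo: a sealed hash taken from the same snapshot reports InSync.
var demoSealed = DriftBadge.HashOf("equipment=3;tags=120");
Console.WriteLine(DriftBadge.Evaluate(demoSealed, "equipment=3;tags=120"));

public enum DriftState { InSync, Drift }

public static class DriftBadge
{
    // Assumption: revision hash = lowercase hex SHA-256 of the flattened snapshot text.
    public static string HashOf(string flattenedSnapshot) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(flattenedSnapshot)))
            .ToLowerInvariant();

    // A null hash means nothing has been sealed yet, which is reported as drift.
    public static DriftState Evaluate(string? latestSealedRevisionHash, string flattenedSnapshot) =>
        string.Equals(latestSealedRevisionHash, HashOf(flattenedSnapshot),
                      StringComparison.OrdinalIgnoreCase)
            ? DriftState.InSync
            : DriftState.Drift;
}
```

Treating "no sealed deployment yet" as drift is a design choice of this sketch: it keeps the badge yellow until the first deploy completes.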

Use frontend-design skill aesthetic: clean corporate Bootstrap, vertical stacking per feedback_form_layout.md.

Commit: feat(adminui): Deployments page with drift indicator and Deploy button.


Phase 9 — Host entry point

Task 53: Host/Program.cs role-gated startup

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 54, 55

Files:

  • Replace: src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs

Mirror ~/Desktop/scadalink-design/src/ScadaLink.Host/Program.cs structure. Pseudocode:

var roles = RoleParser.Parse(Environment.GetEnvironmentVariable("OTOPCUA_ROLES"));
var builder = WebApplication.CreateBuilder(args);
builder.Configuration.AddJsonFile($"appsettings.{string.Join('-', roles.OrderBy(r=>r))}.json", optional: true);
builder.Host.UseSerilog(...);
if (OperatingSystem.IsWindows()) builder.Host.UseWindowsService();

builder.Services.AddOtOpcUaConfigDb(builder.Configuration);
builder.Services.AddOtOpcUaCluster(builder.Configuration);
builder.Services.AddOtOpcUaSecurity(builder.Configuration);
builder.Services.AddAkka("otopcua", (ab, sp) => {
    ab.AddOtOpcUaClusterConfig(roles);
    if (roles.Contains("admin"))  sp.GetRequiredService<IControlPlaneStartup>().Configure(ab);
    if (roles.Contains("driver")) sp.GetRequiredService<IRuntimeStartup>().Configure(ab);
});
if (roles.Contains("admin"))
{
    builder.Services.AddRazorComponents().AddInteractiveServerComponents();
    builder.Services.AddSignalR();
    builder.Services.AddOtOpcUaAdminUI();
}

var app = builder.Build();
app.UseSerilogRequestLogging();
if (roles.Contains("admin"))
{
    app.UseAuthentication();
    app.UseAuthorization();
    app.UseAntiforgery();
    app.MapOtOpcUaAuth();
    app.MapAdminUI<App>();
    app.MapHub<FleetStatusHub>("/hubs/fleet");
    app.MapHub<AlertHub>("/hubs/alerts");
    app.MapHub<ScriptLogHub>("/hubs/script-log");
}
app.MapHealthEndpoints();
await app.RunAsync();

Reads roles from the OTOPCUA_ROLES environment variable; binds Akka cluster config; maps Blazor and the SignalR hubs only when the admin role is present.
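The pseudocode leans on two small pieces of logic: parsing OTOPCUA_ROLES and deriving the per-role settings file name. The sketch below shows one way they could behave. It is not the Task 21 RoleParser: the comma separator, the case folding, and the rejection of unknown roles are assumptions, and RoleParserSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Demo: order and case in the variable do not affect the file name.
Console.WriteLine(RoleParserSketch.SettingsFileFor(RoleParserSketch.Parse("driver, Admin")));

public static class RoleParserSketch
{
    // Roles named in this plan; the real parser may accept a different set.
    private static readonly HashSet<string> Known =
        new(StringComparer.Ordinal) { "admin", "driver", "dev" };

    // Assumption: roles are comma-separated and case-insensitive.
    public static IReadOnlySet<string> Parse(string? raw)
    {
        var roles = (raw ?? string.Empty)
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Select(r => r.ToLowerInvariant())
            .ToHashSet(StringComparer.Ordinal);

        var unknown = roles.Where(r => !Known.Contains(r)).ToArray();
        if (unknown.Length > 0)
            throw new ArgumentException($"Unknown role(s): {string.Join(", ", unknown)}");
        return roles;
    }

    // Same expression as the pseudocode: sorted roles joined with '-'.
    public static string SettingsFileFor(IEnumerable<string> roles) =>
        $"appsettings.{string.Join('-', roles.OrderBy(r => r, StringComparer.Ordinal))}.json";
}
```

Sorting before joining is what makes "admin,driver" and "driver,admin" both resolve to appsettings.admin-driver.json, the combined-role default created in Task 54.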

Commit: feat(host): role-gated Program.cs composes all components.


Task 54: Health endpoints + appsettings layout

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 53, 55

Files:

  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/DatabaseHealthCheck.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AkkaClusterHealthCheck.cs
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AdminRoleLeaderHealthCheck.cs
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json (full Cluster/Security/ConfigDb/OpcUa/Drivers/Historian sections)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.admin-driver.json (combined-role default)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.admin.json
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.driver.json

Three endpoints (mirror ScadaLink's pattern):

  • MapHealthChecks("/health/ready", new HealthCheckOptions { Predicate = c => c.Tags.Contains("ready") })
  • MapHealthChecks("/health/active", new HealthCheckOptions { Predicate = c => c.Tags.Contains("active") })
  • /healthz on port 4841 — preserve current OPC UA stack health probe semantics

Commit: feat(host): health endpoints + per-role appsettings split.


Task 55: Mac dev mode + dev-stub drivers

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 53, 54

Files:

  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs (add Stubbed Become state)
  • Create: src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.Development.json
  • Modify: src/Server/ZB.MOM.WW.OtOpcUa.Cluster/RoleParser.cs (already allows "dev" role per Task 21)

DriverInstanceActor: at PreStart, if any of:

  • roles.Contains("dev") AND driverType is "Galaxy" or "Historian.Wonderware"
  • !OperatingSystem.IsWindows() AND driverType is Windows-only

→ Become Stubbed immediately; log INFO [DEV-STUB] driver={Name} reason={dev-role|non-windows}. Stubbed state returns deterministic test values for read; no-op for write.
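The two stub conditions can be isolated into a pure function that PreStart calls before deciding whether to Become Stubbed. This is a sketch under assumptions: DevStubPolicy is a hypothetical name, and the Windows-only set is assumed to be the same two driver types named above, since the plan does not list which drivers are Windows-only.

```csharp
using System;
using System.Collections.Generic;

// Demo: a dev-role node stubs Galaxy even on Windows.
Console.WriteLine(DevStubPolicy.StubReason(new HashSet<string> { "dev", "driver" }, "Galaxy", isWindows: true));

public static class DevStubPolicy
{
    // From the task text: driver types stubbed whenever the dev role is present.
    private static readonly HashSet<string> DevStubbed =
        new(StringComparer.Ordinal) { "Galaxy", "Historian.Wonderware" };

    // Assumption: the Windows-only drivers are the same two; the real list may differ.
    private static readonly HashSet<string> WindowsOnly =
        new(StringComparer.Ordinal) { "Galaxy", "Historian.Wonderware" };

    // Returns the reason token used in the [DEV-STUB] log line, or null to run the real driver.
    public static string? StubReason(IReadOnlySet<string> roles, string driverType, bool isWindows)
    {
        if (roles.Contains("dev") && DevStubbed.Contains(driverType)) return "dev-role";
        if (!isWindows && WindowsOnly.Contains(driverType)) return "non-windows";
        return null;
    }
}
```

In the actor, isWindows would come from OperatingSystem.IsWindows(); passing it as a parameter keeps the decision testable on any platform.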

appsettings.Development.json sets Security:Ldap:DevStubMode = true.

Commit: feat(runtime): DEV-STUB mode for Galaxy/Wonderware on non-Windows or dev role.


Phase 10 — Cleanup & deletions

Task 56: Delete OtOpcUa.Server and OtOpcUa.Admin projects

Classification: high-risk Estimated implement time: ~5 min Parallelizable with: none (depends on Tasks 0-55)

Files:

  • Delete (directory): src/Server/ZB.MOM.WW.OtOpcUa.Server/
  • Delete (directory): src/Server/ZB.MOM.WW.OtOpcUa.Admin/
  • Modify: ZB.MOM.WW.OtOpcUa.slnx (remove the two project entries)
  • Sweep & delete files referenced in design §10 step 12:
    • DriverInstanceBootstrapper.cs (should be in Server, already deleted)
    • Redundancy/RedundancyCoordinator.cs
    • Redundancy/RedundancyStatePublisher.cs
    • Redundancy/ApplyLeaseRegistry.cs
    • Hosting/PeerHttpProbeLoop.cs
    • Hosting/PeerUaProbeLoop.cs — if not yet ported to PeerOpcUaProbeActor, port it now
    • Hubs/FleetStatusPoller.cs (should be moved/deleted in Task 49)
    • Security/HubTokenService.cs
  • Grep sweep: grep -rln 'RedundancyRole\|ConfigGeneration\|ApplyLeaseRegistry\|PeerHttpProbeLoop\|FleetStatusPoller\|HubTokenService' ~/Desktop/OtOpcUa/src — if any reference survives, fix it.
  • Delete corresponding tests/ZB.MOM.WW.OtOpcUa.Server.Tests/ and tests/ZB.MOM.WW.OtOpcUa.Admin.Tests/ (or keep and gut, depending on what's salvageable — recommend full delete and rebuild from Phase 11)

Build green:

dotnet build ZB.MOM.WW.OtOpcUa.slnx

Run all surviving tests:

dotnet test ZB.MOM.WW.OtOpcUa.slnx --no-build

Commit: chore(cleanup): delete OtOpcUa.Server, OtOpcUa.Admin, and obsoleted v1 services.


Task 57: Build & test green check

Classification: trivial Estimated implement time: ~3 min Parallelizable with: none

Verify that dotnet build ZB.MOM.WW.OtOpcUa.slnx and dotnet test ZB.MOM.WW.OtOpcUa.slnx are both green. No commit unless cleanup is needed.


Phase 11 — Integration & E2E tests

Task 58: Host integration test harness (2-node in-process cluster)

Classification: standard Estimated implement time: ~5 min Parallelizable with: none (foundational)

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/TwoNodeClusterHarness.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/docker-compose.yml (SQL Server + OpenLDAP for Mac-friendly local runs)

TwoNodeClusterHarness spins up two WebApplicationFactory<Program> instances on different HTTP ports and different Akka ports, sharing one SQL Server. Together they form a 2-member cluster (both admin+driver). Exposes AdminA, AdminB, DriverA, DriverB references (in this harness AdminA and DriverA are the same node, as are AdminB and DriverB, since both roles run on both nodes).

Commit: test(host): 2-node integration test harness.


Task 59: Deploy happy path + failover integration tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 60

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DeployHappyPathTests.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/FailoverDuringDeployTests.cs

Test cases mirror design §8 "Failover-specific test cases" 1-7. Each test spins up the 2-node harness, performs the scenario, asserts final ConfigDb + actor state.

Commit: test(host): deploy happy path + failover-during-deploy integration tests.


Task 60: OPC UA integration tests

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 59

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/DualEndpointTests.cs
  • Create: tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/ServiceLevelTests.cs

Tests: real OPCFoundation client → both endpoints visible in ServerUriArray; ServiceLevel byte = 250 on leader, 240 on follower (with the +10 leader bonus); write through OpcUaPublishActor returns synchronous failure on driver write error.

Commit: test(opcua): dual-endpoint visibility + ServiceLevel leader-bonus tests.


Task 61: E2E test infrastructure + CI

Classification: standard Estimated implement time: ~5 min Parallelizable with: none

Files:

  • Create: tests/ZB.MOM.WW.OtOpcUa.E2ETests/ZB.MOM.WW.OtOpcUa.E2ETests.csproj
  • Create: tests/ZB.MOM.WW.OtOpcUa.E2ETests/docker-compose.yml (4 Host processes — 2 admin+driver + 2 driver-only + Traefik + SQL + LDAP)
  • Create: .github/workflows/v2-ci.yml — unit + integration jobs; nightly E2E job

CI runs dotnet build and dotnet test --filter Category!=E2E in the unit + integration jobs; dotnet test --filter Category=E2E runs in the nightly job only.

Commit: ci(v2): integration test workflow + nightly E2E.


Phase 12 — Deploy scripts & docs

Task 62: Rewrite Install-Services.ps1

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 63, 64, 65

Files:

  • Replace: scripts/install/Install-Services.ps1

New script installs a single Windows Service OtOpcUaHost per node; takes a -Roles parameter, writes OTOPCUA_ROLES to the service environment; binds to a configurable port (default 9000). Uses sc.exe create to register the service and sc.exe failure to configure restart-on-failure.

Update Refresh-Services.ps1 and Uninstall-Services.ps1 to match.

Commit: feat(install): single-service Install-Services.ps1 with -Roles parameter.


Task 63: Traefik config + docker-dev/

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 62, 64, 65

Files:

  • Create: scripts/install/Install-Traefik.ps1
  • Create: scripts/install/traefik.toml (or traefik.yml)
  • Create: docker-dev/docker-compose.yml
  • Create: docker-dev/README.md
  • Create: docker-dev/Dockerfile

traefik.toml: one entrypoint :80, one router host=otopcua.*, one service load-balancing admin-a:9000 + admin-b:9000 with /health/active health check (interval 5s, timeout 2s, expected 200).

docker-dev/ runs four Host containers (admin-a, admin-b, driver-a, driver-b) + SQL Server + OpenLDAP + Traefik. Mac-friendly. README walks through docker compose up -d and access at http://localhost.

Commit: feat(deploy): Traefik config + docker-dev Mac dev compose.


Task 64: Update existing docs

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 62, 63, 65

Files:

  • Rewrite: docs/Redundancy.md
  • Rewrite: docs/ServiceHosting.md
  • Update: docs/security.md
  • Update: docs/README.md

Redundancy.md: replace operator-managed RedundancyRole story with Akka-leader-driven ServiceLevel. Document the ServiceLevelCalculator truth table. ServiceHosting.md: single fused service, role gating, Traefik, health endpoints. security.md: cookie+JWT hybrid, DataProtection keys in ConfigDb, /auth/ping polling.

Commit: docs: rewrite Redundancy + ServiceHosting + security for v2.


Task 65: New v2 architecture docs

Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 62, 63, 64

Files:

  • Create: docs/Architecture-v2.md (high-level summary, references design doc)
  • Create: docs/Cluster.md (Akka HOCON, roles, split-brain, failure detector)
  • Create: docs/ControlPlane.md (singletons, their state machines, ConfigDb tables)
  • Create: docs/Runtime.md (per-node actor tree, OPC UA bridge, dev-stub mode)

Each ~1-2 pages. Link to design doc as source of truth.

Commit: docs: v2 architecture overview + Cluster/ControlPlane/Runtime guides.


Final verification

After Task 65:

  1. dotnet build ZB.MOM.WW.OtOpcUa.slnx — green
  2. dotnet test ZB.MOM.WW.OtOpcUa.slnx — all green (unit + integration)
  3. cd docker-dev && docker compose up -d — manual smoke: login at http://localhost, deploy from UI, verify OPC UA dual endpoint via UaExpert
  4. Run scripts/migration/Migrate-To-V2.ps1 against a copy of a real ConfigDb backup; verify row counts match expectations.
  5. Tag v1.x.x-final on master for backport-only fixes.
  6. Open PR v2-akka-fuse → master titled "v2: Akka.NET cluster + fused hosting alignment".

Task index

| # | Title | Class | Time | Parallel with |
|---|---|---|---|---|
| 0 | Branch + Directory.Packages.props | small | 3m | |
| 1 | Commons project | small | 3m | 2-8 |
| 2 | Cluster project | small | 3m | 1,3-8 |
| 3 | Security project | small | 3m | 1,2,4-8 |
| 4 | ControlPlane project | small | 3m | 1-3,5-8 |
| 5 | Runtime project | small | 3m | 1-4,6-8 |
| 6 | OpcUaServer project | small | 3m | 1-5,7,8 |
| 7 | AdminUI project | small | 3m | 1-6,8 |
| 8 | Host project | small | 5m | 1-7 |
| 9 | Build green | trivial | 2m | |
| 10 | Deployment entity | standard | 5m | 11-13 |
| 11 | NodeDeploymentState entity | standard | 5m | 10,12,13 |
| 12 | ConfigEdit entity | small | 4m | 10,11,13 |
| 13 | DataProtection keys | small | 3m | 10-12 |
| 14a | RowVersion on live-edit entities | standard | 10m | |
| 14b | Drop GenerationId FK from entities | high-risk | 30m | |
| 14c | Obsolete GenerationApplier/Diff/Cache | high-risk | 20m | |
| 14d | Drop ClusterNode.RedundancyRole | standard | 5m | |
| 14e | Delete ConfigGeneration + ClusterNodeGenerationState | small | 5m | |
| 14f | V2HostingAlignment migration (consolidator) | high-risk | 15m | |
| 15 | Migrate-To-V2.ps1 | standard | 5m | 16-18 |
| 16 | Common types | standard | 5m | 17,18 |
| 17 | Message contracts | standard | 5m | 16,18 |
| 18 | Common interfaces | small | 4m | 16,17 |
| 19 | HOCON | standard | 5m | 20-22 |
| 20 | AkkaHostedService | standard | 5m | 19,21,22 |
| 21 | Role parser | small | 3m | 19,20,22 |
| 22 | ClusterRoleInfo | standard | 5m | 19-21 |
| 23 | Cluster tests | standard | 5m | |
| 24 | Move LdapAuthService | standard | 5m | 25 |
| 25 | JwtTokenService | standard | 5m | 24 |
| 26 | AddOtOpcUaAuth | standard | 5m | 27,28 |
| 27 | Auth endpoints | standard | 5m | 26,28 |
| 28 | CookieAuthStateProvider | small | 4m | 26,27 |
| 29 | Security tests | standard | 5m | |
| 30 | ConfigPublishCoordinator happy | high-risk | 5m | 32-35 |
| 31 | Coordinator timeout/recovery | high-risk | 5m | 32-35 |
| 32 | AdminOperationsActor | standard | 5m | 30,31,33-35 |
| 33 | AuditWriterActor | standard | 5m | 30-32,34,35 |
| 34 | FleetStatusBroadcaster | standard | 5m | 30-33,35 |
| 35 | RedundancyStateActor | high-risk | 5m | 30-34 |
| 36 | Singleton registration | standard | 4m | |
| 37 | DriverHostActor bootstrap | high-risk | 5m | 41-44 |
| 38 | DriverHostActor dispatch | high-risk | 5m | 41-44 |
| 39 | DriverHostActor stale | standard | 4m | 41-44 |
| 40 | Runtime test scaffold | small | 3m | |
| 41 | DriverInstanceActor | high-risk | 5m | 42-44 |
| 42 | VirtualTagActor | standard | 5m | 41,43,44 |
| 43 | ScriptedAlarmActor | standard | 5m | 41,42,44 |
| 44 | OpcUaPublishActor | high-risk | 5m | 41-43 |
| 45 | Health probe actors | standard | 5m | |
| 46 | Extract OpcUaApplicationHost | standard | 5m | |
| 47 | Phase7Composer purity | standard | 5m | 48-52 |
| 48 | Move Blazor → AdminUI | standard | 5m | 47 |
| 49 | Move hubs, rewire | standard | 5m | 50-52 |
| 50 | IAdminOperationsClient | standard | 5m | 49,51,52 |
| 51 | IFleetDiagnosticsClient | standard | 5m | 49,50,52 |
| 52 | Drift + Deploy UI | standard | 5m | 49-51 |
| 53 | Host Program.cs | high-risk | 5m | 54,55 |
| 54 | Health + appsettings | standard | 5m | 53,55 |
| 55 | DEV-STUB drivers | standard | 5m | 53,54 |
| 56 | Delete Server + Admin | high-risk | 5m | |
| 57 | Build & test green | trivial | 3m | |
| 58 | Integration harness | standard | 5m | |
| 59 | Deploy + failover IT | standard | 5m | 60 |
| 60 | OPC UA IT | standard | 5m | 59 |
| 61 | E2E + CI | standard | 5m | |
| 62 | Install-Services.ps1 | standard | 5m | 63-65 |
| 63 | Traefik + docker-dev | standard | 5m | 62,64,65 |
| 64 | Update existing docs | standard | 5m | 62,63,65 |
| 65 | New v2 docs | standard | 5m | 62-64 |

Total estimated subagent time: the index estimates sum to 378 min (about 6.3 hours) of focused execution when run serially, well-suited to subagent-driven dispatch with parallel scheduling on independent tasks.