# OtOpcUa v2 — Akka.NET + Fused Hosting Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use `superpowers-extended-cc:executing-plans` to implement this plan task-by-task.
**Goal:** Fuse `OtOpcUa.Server` and `OtOpcUa.Admin` into a single role-gated binary (`OtOpcUa.Host`), introduce an Akka.NET cluster (admin/driver roles) for control-plane singletons and per-node runtime actors, replace the draft/publish `ConfigGeneration` lifecycle with a live-edit + snapshot-deploy model, and drive OPC UA `ServiceLevel` from Akka cluster leadership while preserving the dual-endpoint warm-redundancy client behavior.
**Architecture:** Single solution with new component libraries (`Cluster`, `Security`, `ControlPlane`, `Runtime`, `OpcUaServer`, `AdminUI`, `Commons`) reused by one `Host` web binary. Akka 1.5.62 with `Akka.Hosting` + `Akka.Cluster.Hosting` + `Akka.Cluster.Tools`. Cluster singletons pinned to `admin` role; per-node actor trees on `driver`-role nodes. Existing `ZB.MOM.WW.OtOpcUa.Configuration` project keeps the EF Core `DbContext` (renamed-in-place, no project rename) and grows new tables for `Deployment`, `NodeDeploymentState`, `ConfigEdit`, `DataProtectionKeys`. EF migrations run via auto-migration in dev and via the idempotent SQL script `Migrate-To-V2.sql` (applied by the `Migrate-To-V2.ps1` wrapper) in prod.
**Tech Stack:** .NET 10, Akka.NET 1.5.62 (`Akka.Hosting`, `Akka.Cluster.Hosting`, `Akka.Cluster.Tools`, `Akka.Remote.Hosting`, `Akka.Streams`), EF Core 10.0.7 (SQL Server), Blazor Server, SignalR, OPCFoundation .NET Standard stack, LDAP (`Novell.Directory.Ldap.NETStandard`), Bootstrap 5 (vendored).
**Design source:** `docs/plans/2026-05-26-akka-hosting-alignment-design.md`. Always read it before starting a task; it is the spec.
**Branch:** `v2-akka-fuse` off `master`.
**Reference project:** Sister repo `~/Desktop/scadalink-design` — copy patterns, not code (different domain). Pattern files to copy from:
- ScadaLink HOCON: `src/ScadaLink.Host/Akka/akka.conf`
- ScadaLink Security setup: `src/ScadaLink.Security/ServiceCollectionExtensions.cs`
- ScadaLink Cluster bootstrap: `src/ScadaLink.Host/Program.cs:60-228`
- ScadaLink ClusterSingleton pattern: `src/ScadaLink.ManagementService/`
---
## Conventions for every task
- **Branch:** Stay on `v2-akka-fuse`. Never commit to `master` while plan is running.
- **TDD where it makes sense:** New actors, new domain logic — write the test first. Pure refactors / file moves — verify-by-build is enough.
- **Build command:** `dotnet build ZB.MOM.WW.OtOpcUa.slnx` — must be green before commit.
- **Test command:** `dotnet test ZB.MOM.WW.OtOpcUa.slnx --no-build` — relevant new/changed tests must pass.
- **Commit format:** Conventional Commits — `feat(scope):`, `refactor(scope):`, `chore(scope):`, `test(scope):`. Scope examples: `host`, `cluster`, `runtime`, `controlplane`, `security`, `adminui`, `configdb`.
- **Mac compatibility:** All code must build on macOS. Windows-only APIs (`AddWindowsService`, Galaxy/Wonderware drivers) must be gated by `OperatingSystem.IsWindows()` or `[SupportedOSPlatform]`.
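
The gating rule in the last bullet can be sketched as follows. This is a minimal illustration, assuming a hypothetical `WindowsOnlyFeatures` helper; the class, method names, and return strings are placeholders and not part of the plan.

```csharp
using System;
using System.Runtime.Versioning;

// Hypothetical helper illustrating the macOS-compatibility convention.
public static class WindowsOnlyFeatures
{
    // The platform-compatibility analyzer (CA1416) flags calls to this method
    // unless the caller is guarded by OperatingSystem.IsWindows() or carries
    // the same attribute.
    [SupportedOSPlatform("windows")]
    public static string DescribeServiceHosting() => "windows-service";

    // Safe to call on any OS: the Windows-only path is reached only on Windows.
    public static string SelectHostingMode() =>
        OperatingSystem.IsWindows()
            ? DescribeServiceHosting()
            : "console";
}
```

The same guard shape would apply around `AddWindowsService` and the Galaxy/Wonderware driver registration, so the code still compiles and runs on macOS.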
---
## Phase 0 — Branch & scaffolding
### Task 0: Create branch and central package management
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (first task)
**Files:**
- Create: `/Users/dohertj2/Desktop/OtOpcUa/Directory.Packages.props`
- Create: `/Users/dohertj2/Desktop/OtOpcUa/Directory.Build.props`
**Step 1: Create branch**
```bash
cd ~/Desktop/OtOpcUa
git checkout -b v2-akka-fuse
```
**Step 2: Create `Directory.Packages.props`** with central package management for Akka + EF Core + ASP.NET Core. Source versions from `~/Desktop/scadalink-design/Directory.Packages.props`. At minimum include:
```xml
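<Project>
  <PropertyGroup>
    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
  </PropertyGroup>
  <ItemGroup>
    <!-- One <PackageVersion Include="..." Version="..." /> entry per package, versions sourced from the ScadaLink file. -->
  </ItemGroup>
</Project>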
```
Audit the existing `.csproj` files for any package not listed; add it to `Directory.Packages.props` and strip the `Version` attribute from the csprojs.
**Step 3: Create minimal `Directory.Build.props`:**
```xml
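<Project>
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>
</Project>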
```
**Step 4: Build green check**
Run: `dotnet build ZB.MOM.WW.OtOpcUa.slnx`
Expected: Build succeeded. If any csproj has a duplicate `Version` after centralization, fix.
**Step 5: Commit**
```bash
git add Directory.Packages.props Directory.Build.props
git commit -m "chore(build): introduce central package management for v2"
```
---
### Task 1: Create `OtOpcUa.Commons` project
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 2, 3, 4, 5, 6, 7, 8
**Files:**
- Create: `/Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/ZB.MOM.WW.OtOpcUa.Commons.csproj`
- Create: `/Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/.gitkeep`
- Create: `/Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/.gitkeep`
- Create: `/Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/.gitkeep`
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/ZB.MOM.WW.OtOpcUa.slnx` (add Commons project)
**Step 1: Create csproj**
```xml
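<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Commons</RootNamespace>
  </PropertyGroup>
</Project>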
```
**Step 2: Add to solution**
Run: `dotnet sln ZB.MOM.WW.OtOpcUa.slnx add src/Core/ZB.MOM.WW.OtOpcUa.Commons/ZB.MOM.WW.OtOpcUa.Commons.csproj`
**Step 3: Build green**
Run: `dotnet build ZB.MOM.WW.OtOpcUa.slnx`
Expected: Build succeeded.
**Step 4: Commit**
```bash
git add src/Core/ZB.MOM.WW.OtOpcUa.Commons/ ZB.MOM.WW.OtOpcUa.slnx
git commit -m "feat(commons): scaffold OtOpcUa.Commons project"
```
---
### Task 2: Create `OtOpcUa.Cluster` project
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 3, 4, 5, 6, 7, 8
**Files:**
- Create: `/Users/dohertj2/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ZB.MOM.WW.OtOpcUa.Cluster.csproj`
- Modify: `ZB.MOM.WW.OtOpcUa.slnx`
**Step 1: Create csproj**
```xml
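<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Cluster</RootNamespace>
  </PropertyGroup>
</Project>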
```
**Step 2-4:** add to solution, build, commit (`feat(cluster): scaffold OtOpcUa.Cluster project`).
---
### Task 3: Create `OtOpcUa.Security` project
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 2, 4, 5, 6, 7, 8
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Security/ZB.MOM.WW.OtOpcUa.Security.csproj`
- Modify: `ZB.MOM.WW.OtOpcUa.slnx`
**csproj:** classlib targeting `net10.0`, references `OtOpcUa.Commons`, `OtOpcUa.Configuration`. Packages: `Microsoft.AspNetCore.Authentication.Cookies`, `Microsoft.AspNetCore.Authentication.JwtBearer`, `Microsoft.IdentityModel.Tokens`, `System.IdentityModel.Tokens.Jwt`, `Novell.Directory.Ldap.NETStandard`.
Commit: `feat(security): scaffold OtOpcUa.Security project`.
---
### Task 4: Create `OtOpcUa.ControlPlane` project
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 2, 3, 5, 6, 7, 8
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ZB.MOM.WW.OtOpcUa.ControlPlane.csproj`
**csproj:** classlib, references `OtOpcUa.Commons`, `OtOpcUa.Cluster`, `OtOpcUa.Configuration`. Packages: `Akka.Hosting`, `Akka.Cluster.Tools`, `Microsoft.AspNetCore.SignalR.Core`.
Commit: `feat(controlplane): scaffold OtOpcUa.ControlPlane project`.
---
### Task 5: Create `OtOpcUa.Runtime` project
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 2, 3, 4, 6, 7, 8
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ZB.MOM.WW.OtOpcUa.Runtime.csproj`
**csproj:** classlib, references `OtOpcUa.Commons`, `OtOpcUa.Cluster`, `OtOpcUa.Configuration`, `OtOpcUa.OpcUaServer`, all `OtOpcUa.Driver.*` abstraction projects (NOT concrete driver implementations — those are loaded reflectively). Packages: `Akka.Hosting`, `Akka.Cluster.Tools`.
Commit: `feat(runtime): scaffold OtOpcUa.Runtime project`.
---
### Task 6: Create `OtOpcUa.OpcUaServer` project
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 2, 3, 4, 5, 7, 8
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/ZB.MOM.WW.OtOpcUa.OpcUaServer.csproj`
**csproj:** classlib, references `OtOpcUa.Commons`, `OtOpcUa.Configuration`. Packages: `OPCFoundation.NetStandard.Opc.Ua.Server`, `OPCFoundation.NetStandard.Opc.Ua.Configuration`. Copy exact versions from current `ZB.MOM.WW.OtOpcUa.Server.csproj`.
Commit: `feat(opcua): scaffold OtOpcUa.OpcUaServer project`.
---
### Task 7: Create `OtOpcUa.AdminUI` Razor class library
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 1, 2, 3, 4, 5, 6, 8
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/ZB.MOM.WW.OtOpcUa.AdminUI.csproj`
**csproj:**
```xml
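<Project Sdk="Microsoft.NET.Sdk.Razor">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.AdminUI</RootNamespace>
    <AddRazorSupportForMvc>true</AddRazorSupportForMvc>
  </PropertyGroup>
</Project>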
```
Commit: `feat(adminui): scaffold OtOpcUa.AdminUI Razor class library`.
---
### Task 8: Create `OtOpcUa.Host` Web SDK project
**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1, 2, 3, 4, 5, 6, 7
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs` (minimal "Hello, host" stub)
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/Properties/launchSettings.json`
**csproj:**
```xml
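<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Host</RootNamespace>
    <UserSecretsId>zb-mom-ww-otopcua-host</UserSecretsId>
    <AssemblyName>OtOpcUa.Host</AssemblyName>
  </PropertyGroup>
</Project>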
```
**Stub Program.cs:**
```csharp
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
app.MapGet("/", () => "OtOpcUa.Host scaffold");
await app.RunAsync();
```
**appsettings.json:** empty `{}` for now.
**launchSettings.json:** profile `OtOpcUa.Host` with `applicationUrl=http://localhost:9000`.
Commit: `feat(host): scaffold OtOpcUa.Host web project`.
---
### Task 9: Build green smoke
**Classification:** trivial
**Estimated implement time:** ~2 min
**Parallelizable with:** none (depends on Tasks 0-8)
**Step 1:** Run `dotnet build ZB.MOM.WW.OtOpcUa.slnx`. Expected: succeeded, no warnings-as-errors. Fix anything that broke. No commit (verification only).
---
## Phase 1 — ConfigDb schema (live-edit + deploy model)
### Task 10: Add `Deployment` entity
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 11, 12, 13
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/Deployment.cs`
- Modify: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs` (add `DbSet` + `OnModelCreating` mapping)
**Step 1: Create `Deployment.cs`:**
```csharp
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
public sealed class Deployment
{
public Guid DeploymentId { get; init; } = Guid.NewGuid();
public required string RevisionHash { get; init; }
public DeploymentStatus Status { get; set; } = DeploymentStatus.Dispatching;
public required string CreatedBy { get; init; }
public DateTime CreatedAtUtc { get; init; } = DateTime.UtcNow;
public byte[] ArtifactBlob { get; init; } = Array.Empty<byte>();
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
public string? FailureReason { get; set; }
public DateTime? SealedAtUtc { get; set; }
}
public enum DeploymentStatus
{
Dispatching = 0,
AwaitingApplyAcks = 1,
Sealed = 2,
PartiallyFailed = 3,
TimedOut = 4
}
```
**Step 2: Add mapping in `OtOpcUaConfigDbContext.OnModelCreating`:**
```csharp
modelBuilder.Entity<Deployment>(b =>
{
b.ToTable("Deployment");
b.HasKey(d => d.DeploymentId);
b.Property(d => d.RevisionHash).HasMaxLength(64).IsRequired();
b.Property(d => d.Status).HasConversion<string>();
b.Property(d => d.CreatedBy).HasMaxLength(128).IsRequired();
b.Property(d => d.FailureReason).HasMaxLength(2048);
b.Property(d => d.RowVersion).IsRowVersion();
b.HasIndex(d => d.Status);
b.HasIndex(d => d.CreatedAtUtc);
});
```
**Step 3:** Build green. Commit: `feat(configdb): add Deployment entity`.
---
### Task 11: Add `NodeDeploymentState` entity (replaces ClusterNodeGenerationState)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 10, 12, 13
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/NodeDeploymentState.cs`
- Modify: `OtOpcUaConfigDbContext.cs` (add DbSet + mapping)
**Schema:** `(NodeId, DeploymentId)` composite key; `Status` enum `Applying|Applied|Failed`; `StartedAtUtc`, `AppliedAtUtc?`, `FailureReason?`, `RowVersion`.
Do NOT delete `ClusterNodeGenerationState.cs` yet; it stays in place until Task 14e removes it ahead of the Task 14f migration.
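
The schema line above could translate into an entity along these lines. This is a sketch only: the enum name `NodeDeploymentStatus`, the `string` type for `NodeId`, and the default values are assumptions, since the plan names the columns but not their CLR types.

```csharp
using System;

// Assumed enum name; the plan only lists the values Applying|Applied|Failed.
public enum NodeDeploymentStatus
{
    Applying = 0,
    Applied = 1,
    Failed = 2
}

public sealed class NodeDeploymentState
{
    // (NodeId, DeploymentId) together form the composite primary key.
    public required string NodeId { get; init; }
    public required Guid DeploymentId { get; init; }

    public NodeDeploymentStatus Status { get; set; } = NodeDeploymentStatus.Applying;
    public DateTime StartedAtUtc { get; init; } = DateTime.UtcNow;
    public DateTime? AppliedAtUtc { get; set; }
    public string? FailureReason { get; set; }

    // Intended to be mapped with IsRowVersion(), like Deployment.RowVersion.
    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
```

In `OnModelCreating`, the composite key would be declared with `b.HasKey(s => new { s.NodeId, s.DeploymentId })`.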
Commit: `feat(configdb): add NodeDeploymentState entity`.
---
### Task 12: Add `ConfigEdit` audit entity
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 10, 11, 13
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigEdit.cs`
- Modify: `OtOpcUaConfigDbContext.cs`
**Schema:** `(EditId GUID PK, EntityType string, EntityId GUID, FieldsJson nvarchar(max), ExecutionId GUID NULL, EditedBy, EditedAtUtc, SourceNode)`.
Captures per-row edits to `Equipment`, `Driver`, `DriverInstance`, `Script`, etc. Inserted by `AdminOperationsActor` on every mutating op.
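
A possible shape for the audit entity, sketched from the schema tuple above. The CLR types of `EditedBy`, `EditedAtUtc`, and `SourceNode` are assumptions, and the `ForFields` factory is a hypothetical convenience that the plan does not prescribe.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public sealed class ConfigEdit
{
    public Guid EditId { get; init; } = Guid.NewGuid();   // primary key
    public required string EntityType { get; init; }       // e.g. "Equipment"
    public required Guid EntityId { get; init; }
    public required string FieldsJson { get; init; }       // nvarchar(max)
    public Guid? ExecutionId { get; init; }                // nullable column
    public required string EditedBy { get; init; }
    public DateTime EditedAtUtc { get; init; } = DateTime.UtcNow;
    public required string SourceNode { get; init; }

    // Hypothetical helper: serializes the changed fields into FieldsJson.
    public static ConfigEdit ForFields(
        string entityType,
        Guid entityId,
        IReadOnlyDictionary<string, string?> changedFields,
        string editedBy,
        string sourceNode,
        Guid? executionId = null) =>
        new()
        {
            EntityType = entityType,
            EntityId = entityId,
            FieldsJson = JsonSerializer.Serialize(changedFields),
            ExecutionId = executionId,
            EditedBy = editedBy,
            SourceNode = sourceNode
        };
}
```

`AdminOperationsActor` would insert one such row per mutating operation, in the same transaction as the edit itself.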
Commit: `feat(configdb): add ConfigEdit audit entity`.
---
### Task 13: Add DataProtection keys table
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 10, 11, 12
**Files:**
- Modify: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/ZB.MOM.WW.OtOpcUa.Configuration.csproj` — add `Microsoft.AspNetCore.DataProtection.EntityFrameworkCore` package
- Modify: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs` — implement `IDataProtectionKeyContext`:
```csharp
public DbSet<DataProtectionKey> DataProtectionKeys
=> Set<DataProtectionKey>();
```
Commit: `feat(configdb): persist DataProtection keys in ConfigDb`.
---
### Tasks 14a-14f: Entity-model rewrite + V2HostingAlignment migration
> **Plan rewrite, 2026-05-26**: the original single Task 14 (5-min EF migration) was
> under-scoped — it only listed the schema drops/adds without addressing the 13+ entities
> whose foreign keys + indexes are keyed on `GenerationId`. The design doc (§ live-edit
> model) requires removing `GenerationId` from `Equipment`, `Driver`, `DriverInstance`,
> `Namespace`, `UnsArea`, `UnsLine`, `Device`, `Tag`, `PollGroup`, `NodeAcl`, `Script`,
> `VirtualTag`, `ScriptedAlarm` and adding `RowVersion` columns for last-write-wins
> stale-write detection. That cascades into `GenerationApplier`/`GenerationDiff`/
> `GenerationSealedCache` and the legacy Server/Admin CRUD services. Policy decision
> (recorded with the user): the legacy `OtOpcUa.Server` + `OtOpcUa.Admin` projects are
> allowed to fail-to-compile between Task 14c and Task 56 — only the new v2 projects need
> to stay green.
#### Task 14a: Add `RowVersion` to live-edit entities
**Classification:** standard
**Estimated implement time:** ~10 min
**Parallelizable with:** none (foundation for 14b)
**Files:** every live-edit entity class: `Equipment`, `DriverInstance`, `Device`, `Tag`,
`PollGroup`, `Namespace`, `UnsArea`, `UnsLine`, `NodeAcl`, `Script`, `VirtualTag`,
`ScriptedAlarm`. Add `public byte[] RowVersion { get; set; } = Array.Empty<byte>();` and a
`e.Property(x => x.RowVersion).IsRowVersion();` mapping in `OtOpcUaConfigDbContext`.
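
The stale-write check that `RowVersion` enables can be illustrated in memory. With `IsRowVersion()`, EF Core treats the column as a concurrency token and throws `DbUpdateConcurrencyException` when the stored value no longer matches the one that was loaded. The `VersionedRow` class below is a hypothetical stand-in that mimics that comparison without a database.

```csharp
using System;

// Hypothetical in-memory stand-in for a row guarded by a rowversion column.
public sealed class VersionedRow
{
    public string Name { get; private set; } = "";
    public byte[] RowVersion { get; private set; } = BitConverter.GetBytes(1L);

    // Applies the update only if the caller still holds the current version,
    // then bumps the version the way SQL Server would on each write.
    public bool TryUpdate(string newName, byte[] expectedRowVersion)
    {
        if (!RowVersion.AsSpan().SequenceEqual(expectedRowVersion))
            return false; // stale write: someone else saved first

        Name = newName;
        RowVersion = BitConverter.GetBytes(BitConverter.ToInt64(RowVersion, 0) + 1);
        return true;
    }
}
```

Two editors that load the same version and both save will see the first write succeed and the second rejected, which is the point at which the UI can prompt for a reload.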
Commit: `feat(configdb): add RowVersion to live-edit entities for last-write-wins detection`.
---
#### Task 14b: Decouple live-edit entities from `ConfigGeneration`
**Classification:** high-risk
**Estimated implement time:** ~30 min
**Parallelizable with:** none
Remove `GenerationId` property, `Generation` navigation property, and the
`HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId)` mapping from each
of the 13 live-edit entities listed above. Rewrite the `UX_*_Generation_LogicalId`
indexes to drop the `GenerationId` column (logical IDs become globally unique). Drop
`UX_*_Generation_*` filtered indexes where the filter referenced generation scope.
Will break `OtOpcUa.Server` + `OtOpcUa.Admin` compilation — that is accepted (Task 56
deletes them).
Commit: `refactor(configdb): drop GenerationId FK from live-edit entities`.
---
#### Task 14c: Mark `GenerationApplier` / `GenerationDiff` / `GenerationSealedCache` obsolete
**Classification:** high-risk
**Estimated implement time:** ~20 min
**Parallelizable with:** none
`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Apply/` contains `GenerationApplier.cs`,
`GenerationDiff.cs`, `ApplyCallbacks.cs`, `ChangeKind.cs`, `IGenerationApplier.cs`. These
implement the v1 draft/publish lifecycle that v2 replaces with `AdminOperationsActor` +
`ConfigComposer`.
Inventory callers via `grep -rln 'GenerationApplier\|GenerationDiff' src tests`. Either:
- Mark types `[Obsolete("Replaced by AdminOperationsActor in v2", error: true)]` so
surviving call sites become hard build errors (cleaner; surfaces the Server-breakage),
- Or delete the files and accept the Server-side build break.
Sweep `GenerationSealedCache` similarly. Keep the LiteDb cache concept (it's repurposed
in Task 39 for stale-config fallback) but rename references to use `DeploymentArtifact`.
Commit: `refactor(configdb): obsolete GenerationApplier/Diff/SealedCache (replaced by AdminOperationsActor)`.
---
#### Task 14d: Drop `RedundancyRole` from `ClusterNode`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
Remove `ClusterNode.RedundancyRole` property + the
`e.Property(x => x.RedundancyRole).HasConversion()` mapping + the
`UX_ClusterNode_Primary_Per_Cluster` filtered unique index from
`OtOpcUaConfigDbContext.ConfigureClusterNode`. Akka cluster leader-of-driver-role becomes
the source of truth (Phase 5, Task 35).
Commit: `refactor(configdb): drop ClusterNode.RedundancyRole (replaced by Akka leader)`.
---
#### Task 14e: Delete `ConfigGeneration` + `ClusterNodeGenerationState`
**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on 14b clearing the FKs)
Delete `Entities/ConfigGeneration.cs` and `Entities/ClusterNodeGenerationState.cs`. Remove
the corresponding `DbSet<>` entries and `Configure*` methods from
`OtOpcUaConfigDbContext`. Drop `GenerationStatus` and `NodeApplyStatus` enums.
Commit: `refactor(configdb): delete ConfigGeneration + ClusterNodeGenerationState`.
---
#### Task 14f: Generate `V2HostingAlignment` EF migration
**Classification:** high-risk
**Estimated implement time:** ~15 min
**Parallelizable with:** none (consolidates 14a-14e)
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/<timestamp>_V2HostingAlignment.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/<timestamp>_V2HostingAlignment.Designer.cs`
- Modify: `tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/SchemaComplianceTests.cs` — update
the `expected` table list (remove ConfigGeneration + ClusterNodeGenerationState; add
Deployment + NodeDeploymentState + ConfigEdit + DataProtectionKeys).
**Step 1: Generate migration**
```bash
dotnet ef migrations add V2HostingAlignment \
--project src/Core/ZB.MOM.WW.OtOpcUa.Configuration \
--startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host
```
If `dotnet-ef` isn't installed: `dotnet tool install --global dotnet-ef --version 10.0.7`.
**Step 2: Audit the generated migration** — it should:
- `DropTable("ConfigGeneration")` and `DropTable("ClusterNodeGenerationState")`
- `DropColumn("RedundancyRole", "ClusterNode")`
- For each of the 13 live-edit tables: `DropForeignKey` on `GenerationId`,
`DropIndex` on `UX_*_Generation_LogicalId` (and any `UX_*_Generation_*`), `DropColumn` on
`GenerationId`, `AddColumn("RowVersion", "rowversion")`, `CreateIndex` on the new
globally-unique logical-id pattern.
- `CreateTable("Deployment", ...)`, `CreateTable("NodeDeploymentState", ...)`,
`CreateTable("ConfigEdit", ...)`, `CreateTable("DataProtectionKeys", ...)`.
If extra changes appear (e.g., column-type drift), reconcile by editing the entity classes
— do not edit the migration directly.
**Step 3: Verify on a scratch SQL Server** (per CLAUDE.md, Docker is on the shared host
`10.100.0.35`, not local).
```bash
# from this Mac dev:
ssh dohertj2@10.100.0.35 'docker run --rm -d --name v2-migration-test \
-e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Pass@word123" \
-p 14333:1433 mcr.microsoft.com/mssql/server:2022-latest'
# Wait ~10s for SQL Server to start
ConnectionStrings__ConfigDb="Server=10.100.0.35,14333;Database=OtOpcUaV2Test;User Id=sa;Password=Pass@word123;TrustServerCertificate=true" \
dotnet ef database update \
--project src/Core/ZB.MOM.WW.OtOpcUa.Configuration \
--startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host
ssh dohertj2@10.100.0.35 'docker exec v2-migration-test /opt/mssql-tools18/bin/sqlcmd \
-S localhost -U sa -P Pass@word123 -C -d OtOpcUaV2Test \
-Q "SELECT name FROM sys.tables ORDER BY name"'
ssh dohertj2@10.100.0.35 'docker stop v2-migration-test'
```
Expected: migration completes; sys.tables contains the 4 new tables and not the 2 dropped
ones; live-edit tables have `RowVersion` column.
**Step 4: Update `SchemaComplianceTests`** so its `expected` array matches the new schema.
**Step 5: Commit**
```bash
git add src/Core/ZB.MOM.WW.OtOpcUa.Configuration/ \
tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/SchemaComplianceTests.cs
git commit -m "feat(configdb): V2HostingAlignment migration — drop generation lifecycle, add deploy tables"
```
---
### Task 15: `Migrate-To-V2.ps1` idempotent prod migration script
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 16, 17, 18 (Phase 2)
**Files:**
- Create: `scripts/migration/Migrate-To-V2.ps1`
- Create: `scripts/migration/Migrate-To-V2.sql` (the idempotent SQL output)
**Step 1: Generate idempotent SQL from EF**
Run: `dotnet ef migrations script --idempotent --project src/Core/ZB.MOM.WW.OtOpcUa.Configuration --startup-project src/Server/ZB.MOM.WW.OtOpcUa.Host --output scripts/migration/Migrate-To-V2.sql`
**Step 2: PowerShell wrapper:**
```powershell
[CmdletBinding()]
param(
[Parameter(Mandatory)][string] $ConnectionString,
[string] $BackupPath = "$env:TEMP\OtOpcUa-V1-Backup-$(Get-Date -Format yyyyMMddHHmmss).bak"
)
Write-Host "Step 1/4 — Backup ConfigDb to $BackupPath"
Invoke-Sqlcmd -ConnectionString $ConnectionString -Query "BACKUP DATABASE [OtOpcUaConfigDb] TO DISK = '$BackupPath' WITH FORMAT, COMPRESSION"
Write-Host "Step 2/4 — Row counts before"
$beforeCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$beforeCounts | Format-Table
Write-Host "Step 3/4 — Apply Migrate-To-V2.sql"
Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\Migrate-To-V2.sql"
Write-Host "Step 4/4 — Row counts after + validation"
$afterCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$afterCounts | Format-Table
# Validation gates
$tablesNow = (Invoke-Sqlcmd -ConnectionString $ConnectionString -Query "SELECT name FROM sys.tables ORDER BY name").name
foreach ($t in 'Deployment','NodeDeploymentState','ConfigEdit','DataProtectionKeys') {
if ($tablesNow -notcontains $t) { throw "Expected table $t missing." }
}
foreach ($t in 'ConfigGeneration','ClusterNodeGenerationState') {
if ($tablesNow -contains $t) { throw "Legacy table $t still present." }
}
Write-Host "Migration complete. Backup at $BackupPath"
```
Also create `scripts/migration/count-rows.sql` listing per-table row counts for the audit.
Commit: `feat(migration): add Migrate-To-V2.ps1 idempotent migration runner`.
---
## Phase 2 — Commons types and contracts
### Task 16: Common types
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 17, 18
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/CorrelationId.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/ExecutionId.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/NodeId.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/DeploymentId.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/RevisionHash.cs`
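Each is a `readonly record struct` wrapping a `Guid` (`CorrelationId`, `ExecutionId`, `DeploymentId`) or a `string` (`NodeId`, `RevisionHash`). Implement `ToString()` and a static `Parse`; `IEquatable<T>` comes for free because record structs synthesize value equality.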
**Example (`CorrelationId.cs`):**
```csharp
namespace ZB.MOM.WW.OtOpcUa.Commons.Types;
public readonly record struct CorrelationId(Guid Value)
{
public static CorrelationId NewId() => new(Guid.NewGuid());
public override string ToString() => Value.ToString("N");
public static CorrelationId Parse(string s) => new(Guid.ParseExact(s, "N"));
}
```
Same pattern for `ExecutionId`, `DeploymentId`, `NodeId` (string), `RevisionHash` (string).
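
For the string-backed types, the pattern might look like the sketch below. The validation rules are assumptions: the hex-only check and lower-casing are illustrative, and the 64-character limit simply mirrors the `HasMaxLength(64)` mapping on `Deployment.RevisionHash`.

```csharp
using System;

public readonly record struct RevisionHash(string Value)
{
    public override string ToString() => Value;

    // Assumed format: 1-64 hex characters, normalized to lower case.
    public static RevisionHash Parse(string s)
    {
        if (string.IsNullOrWhiteSpace(s) || s.Length > 64)
            throw new FormatException("Revision hash must be 1-64 characters.");

        foreach (var c in s)
        {
            if (!char.IsAsciiHexDigit(c))
                throw new FormatException($"Invalid hex character '{c}'.");
        }

        return new RevisionHash(s.ToLowerInvariant());
    }
}
```

Because the type is a record struct, two hashes with the same `Value` compare equal without any hand-written `Equals`.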
Commit: `feat(commons): add correlation/execution/node/deployment/revisionhash types`.
---
### Task 17: Akka message contracts
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 16, 18
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DispatchDeployment.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/ApplyAck.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DeploymentSealed.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Deploy/DeploymentFailed.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Admin/StartDeployment.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Admin/StartDeploymentResult.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Redundancy/RedundancyStateChanged.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Redundancy/NodeRedundancyState.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Fleet/FleetStatusChanged.cs`
All as `sealed record` with `CorrelationId` field. Example:
```csharp
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Deploy;
public sealed record DispatchDeployment(
DeploymentId DeploymentId,
RevisionHash RevisionHash,
CorrelationId CorrelationId);
public sealed record ApplyAck(
DeploymentId DeploymentId,
NodeId NodeId,
ApplyAckOutcome Outcome,
string? FailureReason,
CorrelationId CorrelationId);
public enum ApplyAckOutcome { Applied, Failed }
```
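
To show how `ApplyAck` messages might roll up into a deployment status, here is a hedged sketch. The roll-up rule itself is an assumption (the design doc is the authority), and the types are simplified stand-ins that use plain strings for node IDs so the block is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for ApplyAckOutcome and DeploymentStatus.
public enum AckOutcome { Applied, Failed }
public enum RollupStatus { AwaitingApplyAcks, Sealed, PartiallyFailed }

public static class AckRollup
{
    // Assumed rule: keep waiting until every expected node has acknowledged,
    // then seal only if all of them applied successfully.
    public static RollupStatus Evaluate(
        IReadOnlyCollection<string> expectedNodes,
        IReadOnlyDictionary<string, AckOutcome> acksByNode)
    {
        if (expectedNodes.Any(n => !acksByNode.ContainsKey(n)))
            return RollupStatus.AwaitingApplyAcks;

        return expectedNodes.All(n => acksByNode[n] == AckOutcome.Applied)
            ? RollupStatus.Sealed
            : RollupStatus.PartiallyFailed;
    }
}
```

A timeout path (the `TimedOut` status) would sit outside this function, driven by a scheduled message in the coordinating actor.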
Commit: `feat(commons): add deploy/admin/audit/redundancy/fleet message contracts`.
---
### Task 18: Common interfaces
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 16, 17
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IClusterRoleInfo.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IAdminOperationsClient.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Interfaces/IFleetDiagnosticsClient.cs`
```csharp
public interface IClusterRoleInfo
{
NodeId LocalNode { get; }
IReadOnlySet<string> LocalRoles { get; }
bool HasRole(string role);
IReadOnlyList<NodeId> MembersWithRole(string role);
NodeId? RoleLeader(string role);
event EventHandler? RoleLeaderChanged;
}
public interface IAdminOperationsClient
{
Task<StartDeploymentResult> StartDeploymentAsync(string createdBy, CancellationToken ct);
// … other mutating ops added in later tasks
}
public interface IFleetDiagnosticsClient
{
Task GetDiagnosticsAsync(NodeId nodeId, CancellationToken ct);
}
```
Commit: `feat(commons): add cluster/admin/diagnostics client interfaces`.
---
## Phase 3 — Cluster library
### Task 19: HOCON config
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 20, 21, 22
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/Resources/akka.conf`
- Modify: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ZB.MOM.WW.OtOpcUa.Cluster.csproj` (embed resource)
**Step 1:** Copy `~/Desktop/scadalink-design/src/ScadaLink.Host/Akka/akka.conf` (or equivalent path — check what ScadaLink actually has) as a starting template, then adapt:
- `actor.provider = cluster`
- `remote.dot-netty.tcp { hostname = "0.0.0.0", port = 4053 }`
- `cluster.roles = []` (populated dynamically by Task 21)
- `cluster.split-brain-resolver.active-strategy = keep-oldest`
- `cluster.split-brain-resolver.stable-after = 15s`
- `cluster.down-removal-margin = 15s`
- `cluster.failure-detector.heartbeat-interval = 2s`
- `cluster.failure-detector.threshold = 10.0`
- `cluster.singleton.singleton-name = "singleton"`
- `cluster.singleton-proxy.singleton-identification-interval = 1s`
- Synchronized dispatcher for OPC UA actors (Task 44):
```hocon
opcua-synchronized-dispatcher {
type = "PinnedDispatcher"
executor = "thread-pool-executor"
}
```
If ScadaLink puts HOCON inline in Program.cs rather than a .conf file, embed it the same way — but a separate .conf file is preferred for editability.
**Step 2:** Mark as embedded resource in csproj:
```xml
<ItemGroup>
  <EmbeddedResource Include="Resources\akka.conf" />
</ItemGroup>
```
**Step 3:** Add a loader helper `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/HoconLoader.cs`:
```csharp
public static class HoconLoader
{
public static string LoadBaseConfig()
{
using var stream = typeof(HoconLoader).Assembly
.GetManifestResourceStream("ZB.MOM.WW.OtOpcUa.Cluster.Resources.akka.conf")
?? throw new InvalidOperationException("akka.conf resource not found");
using var reader = new StreamReader(stream);
return reader.ReadToEnd();
}
}
```
Commit: `feat(cluster): embed Akka HOCON config matching ScadaLink tuning`.
---
### Task 20: `AkkaHostedService` implementation
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 19, 21, 22
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/AkkaHostedService.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/AkkaClusterOptions.cs`
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ServiceCollectionExtensions.cs`
**`AkkaClusterOptions.cs`:**
```csharp
public sealed class AkkaClusterOptions
{
public string SystemName { get; set; } = "otopcua";
public string Hostname { get; set; } = "0.0.0.0";
public int Port { get; set; } = 4053;
public string PublicHostname { get; set; } = "127.0.0.1";
public string[] SeedNodes { get; set; } = Array.Empty<string>();
public string[] Roles { get; set; } = Array.Empty<string>();
}
```
**`AkkaHostedService.cs`:** Implements `IHostedService`. On Start, it builds the `ActorSystem` from `HoconLoader.LoadBaseConfig()` plus an overlay derived from `AkkaClusterOptions`, then joins the cluster by calling `Cluster.Get(system).JoinSeedNodes(...)` with the configured seed nodes. On Stop, it calls `CoordinatedShutdown.Get(system).Run(CoordinatedShutdown.ClusterLeavingReason.Instance)` with a 30s timeout.
**`ServiceCollectionExtensions.AddOtOpcUaCluster(IConfiguration)`:** binds `AkkaClusterOptions`, registers `AkkaHostedService` as `IHostedService`, registers `ActorSystem` as a singleton resolved from the hosted service.
Mirror the wiring in `~/Desktop/scadalink-design/src/ScadaLink.Host/Program.cs` Akka block. Don't deviate on tuning.
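
One way to build the options overlay is to emit a small HOCON fragment from the bound values. The `HoconOverlay` helper below is hypothetical. It covers only hostname, public hostname, port, and roles, and leaves seed nodes to the explicit join step.

```csharp
using System;
using System.Linq;

// Hypothetical helper: renders AkkaClusterOptions-style values as HOCON.
public static class HoconOverlay
{
    // Wraps a value in double quotes, escaping backslashes and quotes.
    private static string Quote(string s) =>
        "\"" + s.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";

    public static string Build(string hostname, string publicHostname, int port, string[] roles)
    {
        var roleList = string.Join(", ", roles.Select(Quote));
        return string.Join("\n", new[]
        {
            "akka.remote.dot-netty.tcp {",
            $"  hostname = {Quote(hostname)}",
            $"  public-hostname = {Quote(publicHostname)}",
            $"  port = {port}",
            "}",
            $"akka.cluster.roles = [{roleList}]"
        });
    }
}
```

The result can be layered over the embedded file with `ConfigurationFactory.ParseString(overlay).WithFallback(ConfigurationFactory.ParseString(HoconLoader.LoadBaseConfig()))`, so overlay values win and the base file supplies everything else.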
Commit: `feat(cluster): AkkaHostedService and DI extension`.
---
### Task 21: Role parsing from `OTOPCUA_ROLES` env
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 19, 20, 22
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/RoleParser.cs`
- Create: `tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/RoleParserTests.cs` (also creates the test project — see Task 23 for the csproj)
```csharp
public static class RoleParser
{
public static string[] Parse(string? raw)
{
if (string.IsNullOrWhiteSpace(raw)) return Array.Empty<string>();
var roles = raw.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.Select(r => r.ToLowerInvariant())
.Distinct()
.ToArray();
foreach (var r in roles)
if (r is not ("admin" or "driver" or "dev"))
throw new ArgumentException($"Unknown role '{r}'. Allowed: admin, driver, dev.");
return roles;
}
}
```
Tests cover: empty input → empty; `"admin"` → `["admin"]`; `"admin,driver"` → both; whitespace tolerant; case-insensitive; throws on unknown role.
Commit: `feat(cluster): parse OTOPCUA_ROLES env var with validation`.
---
### Task 22: `IClusterRoleInfo` implementation
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 19, 20, 21
**Files:**
- Create: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ClusterRoleInfo.cs`
Implements `IClusterRoleInfo` (from Task 18). Wraps `Akka.Cluster.Cluster.Get(ActorSystem)`. Subscribes to `ClusterEvent.LeaderChanged`, `ClusterEvent.RoleLeaderChanged`, `ClusterEvent.IMemberEvent` via an internal subscriber actor, raises CLR event.
Commit: `feat(cluster): ClusterRoleInfo wraps Akka.Cluster for app-facing role queries`.
---
### Task 23: Cluster test project + initial tests
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (verification task — depends on Tasks 19-22)
**Files:**
- Create: `tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests.csproj`
- Create: `tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/HoconLoaderTests.cs` — asserts HOCON parses and key values present
- Move: `tests/.../RoleParserTests.cs` if Task 21 dropped it elsewhere
**csproj:** xUnit test project, references `OtOpcUa.Cluster`, `OtOpcUa.Commons`. Packages: `xunit`, `xunit.runner.visualstudio`, `Microsoft.NET.Test.Sdk`, `FluentAssertions`.
**`HoconLoaderTests.cs`:** parses HOCON via `Akka.Configuration.ConfigurationFactory.ParseString(HoconLoader.LoadBaseConfig())`, asserts `actor.provider == "cluster"`, `cluster.split-brain-resolver.active-strategy == "keep-oldest"`, etc.
Run: `dotnet test tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/`. Expected: all green.
Add to solution: `dotnet sln ZB.MOM.WW.OtOpcUa.slnx add tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests.csproj`.
Commit: `test(cluster): HOCON parses, role parser truth table`.
---
## Phase 4 — Security library
### Task 24: Move `LdapAuthService` into `OtOpcUa.Security`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 25 (different file)
**Files:**
- Move: `src/Server/ZB.MOM.WW.OtOpcUa.Admin/Security/LdapAuthService.cs` → `src/Server/ZB.MOM.WW.OtOpcUa.Security/Ldap/LdapAuthService.cs`
- Rename namespace: `ZB.MOM.WW.OtOpcUa.Admin.Security` → `ZB.MOM.WW.OtOpcUa.Security.Ldap`
- Update all callers (use `grep -rl 'OtOpcUa.Admin.Security'` to find them; update with `sed` or by hand)
Commit: `refactor(security): move LdapAuthService into OtOpcUa.Security library`.
---
### Task 25: `JwtTokenService`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 24
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Security/Jwt/JwtTokenService.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Security/Jwt/JwtOptions.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Security.Tests/JwtTokenServiceTests.cs` (also creates test csproj)
Mirror `~/Desktop/scadalink-design/src/ScadaLink.Security/JwtTokenService.cs`. Options: `SigningKey` (HS256, ≥32 bytes), `Issuer`, `Audience`, `ExpiryMinutes` (default 15). `Issue(claims)` → string. `TryValidate(token, out principal)` → bool.
Tests cover: valid token roundtrip; expired token rejected; tampered token rejected; missing required claim rejected.
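The `SigningKey` length rule can be enforced at startup before any token is issued. A minimal sketch, assuming the key is configured as a UTF-8 string and that `JwtOptions` exposes exactly the four properties listed above; `JwtOptionsValidator` is a hypothetical helper, not part of the ScadaLink pattern:

```csharp
using System;
using System.Text;

public sealed class JwtOptions
{
    public string SigningKey { get; set; } = "";
    public string Issuer { get; set; } = "";
    public string Audience { get; set; } = "";
    public int ExpiryMinutes { get; set; } = 15;
}

// Hypothetical helper: fails fast when the options cannot back an HS256 signer.
public static class JwtOptionsValidator
{
    public const int MinKeyBytes = 32; // 256 bits, the minimum stated in the plan

    public static void Validate(JwtOptions o)
    {
        // Assumption: the key is a plain UTF-8 string, not base64.
        if (Encoding.UTF8.GetByteCount(o.SigningKey) < MinKeyBytes)
            throw new InvalidOperationException($"SigningKey must be at least {MinKeyBytes} bytes.");
        if (o.ExpiryMinutes <= 0)
            throw new InvalidOperationException("ExpiryMinutes must be positive.");
    }
}
```

If the real key is stored base64-encoded, the byte count should be taken after decoding instead.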
Commit: `feat(security): JwtTokenService with HS256 + 15-min expiry`.
---
### Task 26: Cookie+JWT hybrid registration extension
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 27, 28
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Security/ServiceCollectionExtensions.cs`
`AddOtOpcUaAuth(IConfiguration)`:
1. Bind `JwtOptions` from `Security:Jwt`, bind `CookieOptions` from `Security:Cookie`.
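2. `services.AddDataProtection().PersistKeysToDbContext<TDbContext>().SetApplicationName("OtOpcUa")`, where `TDbContext` is the existing Configuration `DbContext` that owns the `DataProtectionKeys` table.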
3. `services.AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)`
- `.AddCookie(o => { o.Cookie.Name = "OtOpcUa.Auth"; o.Cookie.HttpOnly = true; o.Cookie.SameSite = SameSiteMode.Strict; o.Cookie.SecurePolicy = CookieSecurePolicy.SameAsRequest; o.SlidingExpiration = true; o.ExpireTimeSpan = TimeSpan.FromMinutes(30); })`
- `.AddJwtBearer(JwtBearerDefaults.AuthenticationScheme, o => { /* HS256 with JwtOptions.SigningKey */ })`.
4. `services.AddAuthorization()` + fallback policy requiring authenticated user.
5. Register `LdapAuthService`, `JwtTokenService`, `RoleMapper`.
Mirror the wiring in `~/Desktop/scadalink-design/src/ScadaLink.Security/ServiceCollectionExtensions.cs` exactly for the cookie/JWT/DataProtection plumbing.
Commit: `feat(security): cookie+JWT hybrid auth via AddOtOpcUaAuth`.
---
### Task 27: `/auth/login`, `/auth/ping`, `/auth/token` endpoints
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 26, 28
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Security/Endpoints/AuthEndpoints.cs`
Mirror `~/Desktop/scadalink-design/src/ScadaLink.Security/Endpoints/AuthEndpoints.cs`. Three minimal-API endpoints:
- `POST /auth/login` — accepts `{username, password}`, calls `LdapAuthService.AuthenticateAsync`, builds claims (sub, roles), issues cookie via `HttpContext.SignInAsync` AND embeds JWT in cookie. Returns 204 on success / 401 on bad creds / 503 on LDAP unreachable.
- `GET /auth/ping` — `[AllowAnonymous]`, returns 200 if `User.Identity.IsAuthenticated`, 401 otherwise.
- `POST /auth/token` — authenticated, returns `{token: "..."}` JWT bearer for external clients.
Extension method `MapOtOpcUaAuth(this IEndpointRouteBuilder)`. Wire in Host Program.cs at Task 53.
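The status-code contract for `POST /auth/login` can be pinned down as a pure mapping that the endpoint and its tests share. This is only a sketch: `LoginOutcome` and `LoginStatusCodes` are hypothetical names, and the real `LdapAuthService` result type may look different.

```csharp
using System;

// Hypothetical outcome type for illustration only.
public enum LoginOutcome { Success, BadCredentials, LdapUnreachable }

public static class LoginStatusCodes
{
    // Maps a login outcome to the HTTP status listed for POST /auth/login.
    public static int For(LoginOutcome outcome) => outcome switch
    {
        LoginOutcome.Success => 204,         // cookie issued, no body
        LoginOutcome.BadCredentials => 401,  // wrong username or password
        LoginOutcome.LdapUnreachable => 503, // directory unavailable
        _ => throw new ArgumentOutOfRangeException(nameof(outcome))
    };
}
```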
Commit: `feat(security): /auth/login, /auth/ping, /auth/token endpoints`.
---
### Task 28: `CookieAuthenticationStateProvider` for Blazor circuits
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 26, 27
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Security/Blazor/CookieAuthenticationStateProvider.cs`
Standard pattern: snapshots `HttpContext.User` at circuit construction, polls `/auth/ping` every 60s to detect expiry, calls `NotifyAuthenticationStateChanged` on transition. Mirror ScadaLink's equivalent — search `~/Desktop/scadalink-design/src/ScadaLink.CentralUI/` for the `*AuthenticationStateProvider*` file.
Commit: `feat(security): CookieAuthenticationStateProvider for Blazor circuit expiry detection`.
---
### Task 29: Security test project + tests
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (verification — depends on Tasks 24-28)
**Files:**
- Create: `tests/ZB.MOM.WW.OtOpcUa.Security.Tests/ZB.MOM.WW.OtOpcUa.Security.Tests.csproj`
- Create: `tests/ZB.MOM.WW.OtOpcUa.Security.Tests/JwtTokenServiceTests.cs` (moved from Task 25 if dropped elsewhere)
- Create: `tests/ZB.MOM.WW.OtOpcUa.Security.Tests/AuthEndpointsTests.cs` — uses `Microsoft.AspNetCore.Mvc.Testing` with a `WebApplicationFactory` against a stubbed LDAP
Tests cover: login happy path issues cookie+JWT; login bad password returns 401; login with LDAP outage returns 503; `/auth/ping` after expired cookie returns 401; `/auth/token` issues a valid JWT for authenticated user.
Add to solution. Run: `dotnet test tests/ZB.MOM.WW.OtOpcUa.Security.Tests/`. Expected: all green.
Commit: `test(security): cookie+JWT roundtrip, login/ping/token endpoint tests`.
---
## Phase 5 — ControlPlane cluster singletons
### Task 30: `ConfigPublishCoordinator` — happy path
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 32 (different files; sibling singletons)
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Coordinators/ConfigPublishCoordinator.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorTests.cs` (also creates test csproj)
**Step 1: Write failing test (`Akka.TestKit.Xunit2`)**
```csharp
[Fact]
public async Task HappyPath_AllNodesAck_SealsDeployment()
{
    using var harness = new ControlPlaneHarness();
    var coord = harness.Sys.ActorOf(ConfigPublishCoordinator.Props(harness.DbFactory));
    var ack1 = new ApplyAck(harness.DeploymentId, NodeId.Of("node-a"), ApplyAckOutcome.Applied, null, CorrelationId.NewId());
    var ack2 = new ApplyAck(harness.DeploymentId, NodeId.Of("node-b"), ApplyAckOutcome.Applied, null, CorrelationId.NewId());
    coord.Tell(new DispatchDeployment(harness.DeploymentId, harness.RevisionHash, CorrelationId.NewId()));
    coord.Tell(ack1);
    coord.Tell(ack2);
    await harness.WaitUntil(() => harness.LoadDeploymentStatus() == DeploymentStatus.Sealed, TimeSpan.FromSeconds(5));
}
```
`ControlPlaneHarness` is a helper that spins up Akka TestKit + in-memory EF Core ConfigDb seeded with a Deployment row in `Dispatching` and two `ClusterNode` rows.
**Step 2: Run test, expect FAIL (class doesn't exist).**
**Step 3: Implement `ConfigPublishCoordinator` minimal:**
```csharp
public sealed class ConfigPublishCoordinator : ReceiveActor
{
    // ConfigDbContext is a placeholder name; use the DbContext type from ZB.MOM.WW.OtOpcUa.Configuration.
    public static Props Props(IDbContextFactory<ConfigDbContext> dbFactory) =>
        Akka.Actor.Props.Create(() => new ConfigPublishCoordinator(dbFactory));

    private readonly IDbContextFactory<ConfigDbContext> _dbFactory;
    private readonly HashSet<NodeId> _expectedAcks = new();
    private readonly Dictionary<NodeId, ApplyAckOutcome> _acks = new();
    private DeploymentId _current;

    public ConfigPublishCoordinator(IDbContextFactory<ConfigDbContext> dbFactory)
    {
        _dbFactory = dbFactory;
        Receive<DispatchDeployment>(HandleDispatch);
        Receive<ApplyAck>(HandleAck);
    }

    private void HandleDispatch(DispatchDeployment msg)
    {
        _current = msg.DeploymentId;
        // Drop bookkeeping left over from any previous deployment.
        _expectedAcks.Clear();
        _acks.Clear();
        using var ctx = _dbFactory.CreateDbContext();
        _expectedAcks.UnionWith(ctx.ClusterNodes
            .Where(n => n.RolesCsv.Contains("driver"))
            .Select(n => n.NodeId)
            .ToList()
            .Select(id => NodeId.Of(id)));
        // Fan the deployment out to every driver node subscribed to the topic.
        DistributedPubSub.Get(Context.System).Mediator.Tell(new Publish("deployments", msg));
    }

    private void HandleAck(ApplyAck msg)
    {
        if (msg.DeploymentId != _current) return; // stale
        _acks[msg.NodeId] = msg.Outcome;
        if (_acks.Count == _expectedAcks.Count && _acks.Values.All(o => o == ApplyAckOutcome.Applied))
            SealDeployment();
    }

    private void SealDeployment()
    {
        using var ctx = _dbFactory.CreateDbContext();
        var d = ctx.Deployments.Single(x => x.DeploymentId == _current.Value);
        d.Status = DeploymentStatus.Sealed;
        d.SealedAtUtc = DateTime.UtcNow;
        ctx.SaveChanges();
    }
}
```
**Step 4: Run test, expect PASS.**
**Step 5: Commit:** `feat(controlplane): ConfigPublishCoordinator happy path`.
---
### Task 31: `ConfigPublishCoordinator` — timeout + failover recovery
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 32
**Files:**
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Coordinators/ConfigPublishCoordinator.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorTimeoutTests.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ConfigPublishCoordinatorRecoveryTests.cs`
**Step 1: Add tests** for:
- Deadline elapses with one node unacked → `Deployment.Status = TimedOut`.
- New Coordinator started with in-flight `Dispatching` deployment recovers state via `PreStart` (queries `Deployment` + `NodeDeploymentState`).
**Step 2: Extend Coordinator** with:
- `Context.System.Scheduler.ScheduleTellOnce(applyMaxDuration, Self, new DeadlineElapsed(_current), Self)` after dispatch.
- `Receive<DeadlineElapsed>` handler that marks `TimedOut` if any node unacked.
- `protected override void PreStart()`: read `Deployment` rows where `Status` ∈ `{Dispatching, AwaitingApplyAcks}`; for each, repopulate `_current`, `_expectedAcks`, `_acks` from `NodeDeploymentState`; schedule remaining deadline.
**Step 3: Run all `ConfigPublishCoordinatorTests`.** Expected: all green.
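For the "schedule remaining deadline" part of `PreStart` recovery, the new coordinator should not grant a full fresh window to a deployment that was already in flight. A minimal sketch of the arithmetic, assuming the `Deployment` row carries a UTC timestamp for when dispatch began (`startedAtUtc` is a hypothetical parameter; `CreatedAtUtc` might serve):

```csharp
using System;

public static class DeadlineRecovery
{
    // Time left before the apply deadline, measured from when the deployment started.
    // Returns TimeSpan.Zero when the deadline has already passed, so the
    // DeadlineElapsed message can be scheduled to fire immediately.
    public static TimeSpan Remaining(DateTime startedAtUtc, TimeSpan applyMaxDuration, DateTime nowUtc)
    {
        var remaining = startedAtUtc + applyMaxDuration - nowUtc;
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }
}
```

Passing `nowUtc` explicitly keeps the function pure, which makes the recovery tests independent of the wall clock.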
Commit: `feat(controlplane): ConfigPublishCoordinator deadline timeout + failover recovery`.
---
### Task 32: `AdminOperationsActor` + `StartDeployment` handler
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 30, 31, 33, 34, 35
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/AdminOperationsActor.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/AdminOperations/ConfigComposer.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AdminOperationsActorTests.cs`
**Responsibilities:**
1. Receive `StartDeployment(createdBy, correlationId)`.
2. `ConfigComposer.SnapshotAndFlatten(dbContext)` → byte[] `ArtifactBlob` (DataContract-serialized or `System.Text.Json` over the flat artifact). Pure function.
3. Compute `RevisionHash = Convert.ToHexString(SHA256.HashData(artifactBlob))`.
4. Insert `Deployment` row (`Status = Dispatching`).
5. Insert one `ConfigEdit` audit row marking the deployment snapshot.
6. `coordinator.Tell(new DispatchDeployment(deploymentId, revisionHash, correlationId))`.
7. Reply `StartDeploymentResult(deploymentId, revisionHash)` to sender.
For now stub CRUD ops as TODO comments — they'll be filled in Task 51 (UI wiring).
Tests: snapshot is deterministic given a fixed seed of equipment rows; hash matches; Deployment row inserted; DispatchDeployment dispatched to mocked coordinator.
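Steps 2 and 3 only produce a stable `RevisionHash` if the artifact bytes do not depend on row order. The sketch below shows one way to get that property with `System.Text.Json`; `EquipmentSnapshotRow` and `RevisionHashSketch` are hypothetical stand-ins, and the real artifact shape is whatever `ConfigComposer` defines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text.Json;

// Hypothetical flat row type used only for this illustration.
public sealed record EquipmentSnapshotRow(string EquipmentId, string DriverInstanceId, string Address);

public static class RevisionHashSketch
{
    // Serializes rows in a stable (ordinal) order so identical ConfigDb content
    // always yields identical bytes, regardless of query order.
    public static byte[] Flatten(IEnumerable<EquipmentSnapshotRow> rows)
    {
        var ordered = rows.OrderBy(r => r.EquipmentId, StringComparer.Ordinal).ToArray();
        return JsonSerializer.SerializeToUtf8Bytes(ordered);
    }

    // SHA-256 over the artifact bytes, rendered as uppercase hex (64 characters).
    public static string ComputeRevisionHash(byte[] artifactBlob) =>
        Convert.ToHexString(SHA256.HashData(artifactBlob));
}
```

The same pair of functions could back the "snapshot is deterministic" and "hash matches" tests by feeding the rows in two different orders.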
Commit: `feat(controlplane): AdminOperationsActor + ConfigComposer + StartDeployment flow`.
---
### Task 33: `AuditWriterActor`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 30, 31, 32, 34, 35
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AuditWriterActorTests.cs`
Receives `AuditEvent` messages, batches into in-memory buffer (cap 500 events / 5s flush window), bulk-inserts to `ConfigAuditLog`. Idempotent on `EventId` (INSERT IF NOT EXISTS or `MERGE`). On `PreRestart` flushes buffer.
Tests: 1000 events with random duplicates → ConfigAuditLog has correct count, no duplicates; PreRestart simulates supervisor restart and verifies buffer is flushed before death.
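The buffering half of this actor can be kept as a plain class so the cap and the duplicate handling are testable without TestKit. A minimal sketch under stated assumptions: the `AuditEvent` shape here is hypothetical (only `EventId` matters), and `AuditBatchBuffer` is an illustrative name. The 5s flush window would still be driven by an actor timer, which is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event shape; the real AuditEvent carries more fields.
public sealed record AuditEvent(Guid EventId, string Message);

public sealed class AuditBatchBuffer
{
    private readonly int _capacity;
    private readonly Dictionary<Guid, AuditEvent> _pending = new();

    public AuditBatchBuffer(int capacity = 500) => _capacity = capacity;

    public int Count => _pending.Count;

    // Adds an event and returns true when the buffer has reached capacity
    // and should be flushed now. Duplicate EventIds are dropped in memory.
    public bool Add(AuditEvent e)
    {
        _pending.TryAdd(e.EventId, e);
        return _pending.Count >= _capacity;
    }

    // Hands the current batch to the caller and clears the buffer.
    public IReadOnlyList<AuditEvent> Drain()
    {
        var batch = new List<AuditEvent>(_pending.Values);
        _pending.Clear();
        return batch;
    }
}
```

In-memory de-duplication only covers events inside one batch; the database-side idempotent insert is still needed for duplicates that arrive in different batches or after a restart.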
Commit: `feat(controlplane): AuditWriterActor with batched idempotent insert`.
---
### Task 34: `FleetStatusBroadcaster`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 30, 31, 32, 33, 35
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Fleet/FleetStatusBroadcaster.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/FleetStatusBroadcasterTests.cs`
Subscribes to `ClusterEvent.MemberUp`, `MemberRemoved`, `UnreachableMember`, `ReachableMember`, `LeaderChanged`, `RoleLeaderChanged`. Receives per-node `DriverHostStatusHeartbeat` Tells. Maintains in-memory `FleetSnapshot`. Pushes diffs via injected `IHubContext<THub>` instances, one per target hub.
Hubs themselves are not built yet, so at this stage inject mock `IHubContext<THub>` instances for tests. Hub rewiring happens in Task 49.
Tests: cluster member up → diff broadcast; heartbeat staleness → unreachable broadcast; full snapshot on `OnConnectedAsync` request.
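"Pushes diffs" implies comparing the previous `FleetSnapshot` with the next one and sending only what changed. The following is a sketch of that comparison over a simplified snapshot, assuming one status value per node; `NodeReachability`, `FleetDiff`, and `FleetSnapshotDiffer` are hypothetical names and the real snapshot carries more fields.

```csharp
using System.Collections.Generic;

// Hypothetical per-node status for illustration.
public enum NodeReachability { Up, Unreachable, Removed }

public sealed record FleetDiff(string NodeId, NodeReachability Status);

public static class FleetSnapshotDiffer
{
    // Emits one entry per node that is new or whose status changed in `next`,
    // plus a Removed entry for every node that disappeared.
    public static List<FleetDiff> Diff(
        IReadOnlyDictionary<string, NodeReachability> previous,
        IReadOnlyDictionary<string, NodeReachability> next)
    {
        var diffs = new List<FleetDiff>();
        foreach (var (nodeId, status) in next)
            if (!previous.TryGetValue(nodeId, out var old) || old != status)
                diffs.Add(new FleetDiff(nodeId, status));
        foreach (var nodeId in previous.Keys)
            if (!next.ContainsKey(nodeId))
                diffs.Add(new FleetDiff(nodeId, NodeReachability.Removed));
        return diffs;
    }
}
```

An empty result means nothing needs to be broadcast, which keeps idle clusters quiet on the hub.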
Commit: `feat(controlplane): FleetStatusBroadcaster push-driven from Akka cluster events`.
---
### Task 35: `RedundancyStateActor`
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 30, 31, 32, 33, 34
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Redundancy/RedundancyStateActor.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Redundancy/ServiceLevelCalculator.cs` (pure function)
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ServiceLevelCalculatorTests.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/RedundancyStateActorTests.cs`
**`ServiceLevelCalculator`:** pure static function per design §6:
```csharp
public static byte Compute(NodeHealthInputs h)
{
    if (h.MemberState is not (MemberStatus.Up or MemberStatus.Joining))
        return 0;
    byte basis = (h.DbReachable, h.OpcUaProbeOk, h.Stale) switch
    {
        (true, true, false) => 240,
        (true, _, true) => 200,
        (false, _, true) => 100,
        _ => 0
    };
    return (byte)Math.Clamp(basis + (h.IsDriverRoleLeader ? 10 : 0), 0, 255);
}
```
**Tests:** every combination of inputs → expected byte (FsCheck or table-driven).
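A table-driven test needs the full input space, which is small enough to enumerate. The sketch below repeats the `Compute` mapping so it compiles in isolation and adds an enumerator for all combinations. The `MemberStatus` enum and the `NodeHealthInputs` record layout are stand-ins assumed for illustration; the real code would use `Akka.Cluster.MemberStatus` and the project's own input type.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for Akka.Cluster.MemberStatus so the sketch has no Akka dependency.
public enum MemberStatus { Joining, Up, Leaving, Exiting, Down, Removed }

// Assumed layout; property names follow the usage in Compute.
public sealed record NodeHealthInputs(
    MemberStatus MemberState, bool DbReachable, bool OpcUaProbeOk, bool Stale, bool IsDriverRoleLeader);

public static class ServiceLevelTable
{
    // Same mapping as ServiceLevelCalculator.Compute, repeated here for isolation.
    public static byte Compute(NodeHealthInputs h)
    {
        if (h.MemberState is not (MemberStatus.Up or MemberStatus.Joining)) return 0;
        byte basis = (h.DbReachable, h.OpcUaProbeOk, h.Stale) switch
        {
            (true, true, false) => 240,
            (true, _, true) => 200,
            (false, _, true) => 100,
            _ => 0
        };
        return (byte)Math.Clamp(basis + (h.IsDriverRoleLeader ? 10 : 0), 0, 255);
    }

    // Enumerates every combination of status and boolean inputs (6 x 16 = 96 cases).
    public static IEnumerable<NodeHealthInputs> AllInputs()
    {
        var flags = new[] { false, true };
        foreach (var status in Enum.GetValues<MemberStatus>())
            foreach (var db in flags)
                foreach (var probe in flags)
                    foreach (var stale in flags)
                        foreach (var leader in flags)
                            yield return new NodeHealthInputs(status, db, probe, stale, leader);
    }
}
```

Each enumerated case can be paired with its expected byte from design §6, or checked against invariants such as "a role leader scores exactly 10 above the same inputs without leadership".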
**`RedundancyStateActor`:** subscribes to cluster events, debounces 250ms, recomputes per-node `NodeRedundancyState`, publishes `RedundancyStateChanged` via `DistributedPubSub` topic `redundancy-state`.
Commit: `feat(controlplane): RedundancyStateActor + pure ServiceLevelCalculator`.
---
### Task 36: Singleton registration extension
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (depends on Tasks 30-35)
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ServiceCollectionExtensions.cs`
`AddOtOpcUaControlPlane()`: registers all five singletons through the `Akka.Cluster.Hosting` singleton extension methods, all pinned to the `admin` role.
Pattern (mirror `~/Desktop/scadalink-design/src/ScadaLink.ManagementService/ServiceCollectionExtensions.cs`):
```csharp
public static IServiceCollection AddOtOpcUaControlPlane(this IServiceCollection services)
{
    services.AddSingleton<IControlPlaneStartup, ControlPlaneStartup>();
    return services;
}

internal sealed class ControlPlaneStartup : IControlPlaneStartup
{
    public void Configure(AkkaConfigurationBuilder cb)
    {
        // One singleton per control-plane actor from Tasks 30-35, all restricted to admin-role nodes.
        cb.WithClusterSingleton<ConfigPublishCoordinator>("config-publish", new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<AdminOperationsActor>("admin-ops", new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<AuditWriterActor>("audit-writer", new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<FleetStatusBroadcaster>("fleet-status", new ClusterSingletonOptions { Role = "admin" });
        cb.WithClusterSingleton<RedundancyStateActor>("redundancy-state", new ClusterSingletonOptions { Role = "admin" });
    }
}
```
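Verify against ScadaLink's actual API surface. Stock `Akka.Cluster.Hosting` names these methods `WithSingleton<TKey>` and `WithSingletonProxy<TKey>`, so `WithClusterSingleton` may be a ScadaLink helper, and the `Akka.Hosting` syntax may differ slightly across versions.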
Commit: `feat(controlplane): singleton registration extension pinned to admin role`.
---
## Phase 6 — Runtime per-node actors
### Task 37: `DriverHostActor` scaffolding + bootstrap
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 41, 42, 43, 44 (different actors)
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorBootstrapTests.cs` (also creates test csproj)
**`DriverHostActor` responsibilities (this task):**
- `PreStart`: read `NodeDeploymentState` for self; if `Applied` → Become `Steady(currentDeployment)`; if `Applying` (orphan) → discard, replay; if no row + ConfigDb unreachable → fall back to LiteDb cache → Become `Stale`.
- Subscribe to `DistributedPubSub` topic `deployments`.
State machine via Become: `Bootstrapping → Steady | Applying(id) | Stale`.
Tests: orphan `Applying` row → re-runs apply on PreStart; missing row + DB unreachable → Stale state.
Commit: `feat(runtime): DriverHostActor scaffolding + PreStart recovery`.
---
### Task 38: `DriverHostActor` `DispatchDeployment` handler
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 41, 42, 43, 44 (different actors)
**Files:**
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorDispatchTests.cs`
Add:
- `Receive<DispatchDeployment>`:
- If `currentRevisionHash == msg.RevisionHash` → reply `ApplyAck(Applied)` immediately.
- Else → write `NodeDeploymentState(Applying)`, Become `Applying(msg.DeploymentId)`, fetch artifact, compute delta, dispatch `ApplyDelta` to children, collect acks, write `NodeDeploymentState(Applied|Failed)`, reply `ApplyAck` to coordinator, Become `Steady`.
For now children dispatch is mocked — actual `DriverInstanceActor` integration in Task 41.
Tests: idempotent dispatch (same hash → ack, no work); successful apply → ack `Applied`; child failure → ack `Failed`.
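The "compute delta" step can be a pure function from the current and desired configuration to a plan, which keeps the idempotency tests simple. A minimal sketch, assuming the configuration can be keyed by driver-instance id with a comparable serialized value; `DeltaPlan` and `DeltaCalculator` are hypothetical names and the real `ApplyDelta` payload may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plan shape: ids of driver instances to add, remove, or reconfigure.
public sealed record DeltaPlan(
    IReadOnlyList<string> Added, IReadOnlyList<string> Removed, IReadOnlyList<string> Changed)
{
    public bool IsEmpty => Added.Count == 0 && Removed.Count == 0 && Changed.Count == 0;
}

public static class DeltaCalculator
{
    // Keys are driver-instance ids; values are a serialized form of each instance's config.
    public static DeltaPlan Compute(
        IReadOnlyDictionary<string, string> current,
        IReadOnlyDictionary<string, string> desired)
    {
        var added = desired.Keys.Where(k => !current.ContainsKey(k))
            .OrderBy(k => k, StringComparer.Ordinal).ToList();
        var removed = current.Keys.Where(k => !desired.ContainsKey(k))
            .OrderBy(k => k, StringComparer.Ordinal).ToList();
        var changed = desired
            .Where(kv => current.TryGetValue(kv.Key, out var old) && old != kv.Value)
            .Select(kv => kv.Key)
            .OrderBy(k => k, StringComparer.Ordinal).ToList();
        return new DeltaPlan(added, removed, changed);
    }
}
```

An empty plan corresponds to the "same hash, no work" branch, so the actor can acknowledge without touching its children.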
Commit: `feat(runtime): DriverHostActor handles DispatchDeployment idempotently`.
---
### Task 39: `DriverHostActor` stale-config fallback
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 41, 42, 43, 44
**Files:**
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverHostActor.cs` (stale fallback + background reconnect)
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverHostActorStaleTests.cs`
Background `Context.System.Scheduler.ScheduleTellRepeatedly(TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(30), Self, RetryConfigDbConnection.Instance, Self)`. On `RetryConfigDbConnection`: try ConfigDb; if it succeeds and the current state is `Stale`, pull the latest sealed deployment, apply it, Become `Steady`, and publish `NodeRedundancyState(Stale=false)` to the `redundancy-state` topic.
Tests: simulated DB outage → Stale published; DB recovery → state advances + Stale=false published.
Commit: `feat(runtime): DriverHostActor stale-config fallback + reconnect`.
---
### Task 40: Runtime test project bootstrap
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (depends on Tasks 37-39)
**Files:**
- Create: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests.csproj`
- Create: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/RuntimeHarness.cs` — TestKit base with EF in-memory + driver mocks
Confirm all DriverHostActor tests from Tasks 37-39 pass. Add to solution.
Commit: `test(runtime): test project scaffold + DriverHostActor tests passing`.
---
### Task 41: `DriverInstanceActor` state machine
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 42, 43, 44 (Tasks 37-39 must already be done)
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/DriverInstanceActorTests.cs`
States via Become: `Connecting → Connected → Reconnecting → Failed`.
- `PreStart` → enter Connecting; call `IDriver.InitializeAsync`.
- On connect success → Become Connected; subscribe tags; publish `OpcUaPublishActor.AttributeValueUpdate`.
- On disconnect → Become Reconnecting; publish bad quality to all subscribed tags; schedule retry at fixed interval (driver.ReconnectIntervalSeconds, default 10).
- On `ApplyDelta(plan)` → idempotent diff against current state; only changed attributes update; reply `ApplyResult` to parent.
- On write request via `Ask` → synchronous; failure returned to caller.
- The parent supervises restarts with exponential backoff.
Reuse existing `IDriver` interface (from current `OtOpcUa.Driver.*` projects).
Tests: connecting transitions to Connected on success; disconnect triggers bad-quality publish + Reconnecting; write failure returned to Ask caller; ApplyDelta diffs correctly.
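The restart backoff applied by the parent can be reasoned about as a delay schedule. The sketch below computes a doubling delay capped at a maximum; `BackoffSchedule` is a hypothetical helper, the minimum and maximum values are assumptions, and Akka.NET's built-in `BackoffSupervisor` may make a hand-rolled schedule unnecessary.

```csharp
using System;

public static class BackoffSchedule
{
    // Delay before restart attempt `attempt` (0-based): min * 2^attempt, capped at max.
    public static TimeSpan DelayFor(int attempt, TimeSpan min, TimeSpan max)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));
        // Cap the exponent so the multiplication stays well inside double range.
        var factor = Math.Pow(2, Math.Min(attempt, 30));
        var ticks = Math.Min(min.Ticks * factor, max.Ticks);
        return TimeSpan.FromTicks((long)ticks);
    }
}
```

With a 1s minimum and a 30s cap, the sequence runs 1s, 2s, 4s, 8s, 16s and then stays at 30s. This is separate from the fixed-interval reconnect inside the `Reconnecting` state, which uses `driver.ReconnectIntervalSeconds`.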
Commit: `feat(runtime): DriverInstanceActor with Connecting/Connected/Reconnecting/Failed`.
---
### Task 42: `VirtualTagActor`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 41, 43, 44
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/VirtualTags/VirtualTagActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/VirtualTagActorTests.cs`
Wraps existing `VirtualTagEngine` from `~/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/`. On subscribe-to-dependencies value update, recomputes expression, publishes result to OpcUaPublishActor.
Restart with backoff; expression compile errors fail the actor (parent restarts with backoff).
Commit: `feat(runtime): VirtualTagActor wrapping VirtualTagEngine`.
---
### Task 43: `ScriptedAlarmActor`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 41, 42, 44
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ScriptedAlarms/ScriptedAlarmActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ScriptedAlarmActorTests.cs`
Wraps existing `AlarmConditionService`. State machine `Inactive → Active → Acknowledged → Inactive`. On state change, emits history row to `HistorianAdapterActor`. `PreRestart` hook serializes current alarm state to the `ScriptedAlarmState` ConfigDb table; `PreStart` rehydrates from it (the default `PostRestart` calls `PreStart`, so this also covers restarts).
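The alarm cycle can be expressed as a pure transition function that the actor calls before deciding whether to emit a history row. This is a sketch only: the trigger names are hypothetical, and transitions the plan does not list (for example, a condition clearing before acknowledgement) are treated as no-ops here, which the real `AlarmConditionService` may handle differently.

```csharp
public enum AlarmState { Inactive, Active, Acknowledged }

// Hypothetical trigger names for illustration.
public enum AlarmTrigger { ConditionRaised, OperatorAcknowledged, ConditionCleared }

public static class AlarmTransitions
{
    // Returns the next state, or the current state when the trigger does not apply.
    public static AlarmState Next(AlarmState current, AlarmTrigger trigger) => (current, trigger) switch
    {
        (AlarmState.Inactive, AlarmTrigger.ConditionRaised) => AlarmState.Active,
        (AlarmState.Active, AlarmTrigger.OperatorAcknowledged) => AlarmState.Acknowledged,
        (AlarmState.Acknowledged, AlarmTrigger.ConditionCleared) => AlarmState.Inactive,
        _ => current
    };
}
```

A history row would be emitted only when `Next` returns a state different from the current one.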
Commit: `feat(runtime): ScriptedAlarmActor with state preservation across restart`.
---
### Task 44: `OpcUaPublishActor` on synchronized dispatcher
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 41, 42, 43
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/OpcUa/OpcUaPublishActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/OpcUaPublishActorTests.cs`
Bridge between Akka messages and OPCFoundation address space. Pinned dispatcher: `opcua-synchronized-dispatcher` (from HOCON, Task 19) — `Props.WithDispatcher("opcua-synchronized-dispatcher")`.
Responsibilities:
- Receive `AttributeValueUpdate(nodeId, value, quality, timestampUtc)` → write to OPC UA address space.
- Receive `AlarmStateUpdate(...)` → write alarm node.
- Subscribe to DistributedPubSub topic `redundancy-state` → on `NodeRedundancyState` for this node, write `ServiceLevel` byte + `ServerUriArray` nodes.
- Receive `RebuildAddressSpace` → marshal address-space rebuild via OPC UA SDK API; bump sequence number.
OPC UA SDK objects are NEVER exposed in message payloads — actor owns them internally.
Tests: receive update → SDK write invoked; ServiceLevel update → ServiceLevel node written with correct byte.
Commit: `feat(runtime): OpcUaPublishActor bridges Akka and OPCFoundation address space`.
---
### Task 45: `HistorianAdapterActor`, `PeerOpcUaProbeActor`, `DbHealthProbeActor`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (last Phase 6 task — combines three small actors)
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Historian/HistorianAdapterActor.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Health/PeerOpcUaProbeActor.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Health/DbHealthProbeActor.cs`
- Test: `tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests/HealthProbeActorTests.cs`
- **`HistorianAdapterActor`**: wraps existing named-pipe IPC to Wonderware sidecar. Buffers writes to SQLite store-and-forward on pipe disconnect. Reuses existing `SqliteStoreAndForwardSink` from current OtOpcUa code (find via `grep -rln SqliteStoreAndForwardSink ~/Desktop/OtOpcUa/src`).
- **`PeerOpcUaProbeActor`**: per-peer-node periodic OPC UA `opc.tcp://peer:4840` ping. Publishes `OpcUaProbeResult(nodeId, ok)` to `redundancy-state` topic input.
- **`DbHealthProbeActor`**: cached DB probe (single-flight) feeding `/health/ready` + `RedundancyStateActor`. Reuses `DbHealthCache` if present.
Wrap all three actors as children under `DriverHostActor`.
Commit: `feat(runtime): HistorianAdapter + PeerOpcUaProbe + DbHealthProbe actors`.
---
## Phase 7 — OpcUaServer extraction
### Task 46: Move `OpcUaApplicationHost` + `Phase7Composer`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (large file move with namespace rename)
**Files:**
- Move: `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs` → `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/OpcUaApplicationHost.cs`
- Move: `src/Server/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs` → `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs`
- Update all callers (namespace rename to `ZB.MOM.WW.OtOpcUa.OpcUaServer`)
Use `grep -rln 'ZB.MOM.WW.OtOpcUa.Server.OpcUa' ~/Desktop/OtOpcUa/src` and `grep -rln 'ZB.MOM.WW.OtOpcUa.Server.Phase7' ~/Desktop/OtOpcUa/src` to find imports; update them.
Build green check.
Commit: `refactor(opcua): extract OpcUaApplicationHost and Phase7Composer to OpcUaServer library`.
---
### Task 47: Make `Phase7Composer` pure + property test
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 48-52 (Phase 8)
**Files:**
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/Phase7Composer.cs` (remove side effects; take inputs as parameters)
- Test: `tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/Phase7ComposerPurityTests.cs`
Refactor: remove static state, remove logging side effects (or make logging optional via injected `ILogger?`), return a `Phase7CompositionResult` record. Same inputs must always produce identical output.
Property test (FsCheck or hand-rolled): generate random `EquipmentRow[]`, `DriverInstanceRow[]`, `ScriptRow[]` arrays; call `ComposeAsync` twice; assert results structurally equal.
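A hand-rolled version of the property test needs only a seeded generator and a double invocation. The sketch below shows that shape with a generic `compose` delegate standing in for `Phase7Composer`; `EquipmentRow` here is a hypothetical two-field stand-in, and `PurityCheck` is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in; the real EquipmentRow has the project's own columns.
public sealed record EquipmentRow(int Id, string Name);

public static class PurityCheck
{
    // Generates a reproducible pseudo-random input set from a seed.
    public static EquipmentRow[] Generate(int seed, int count)
    {
        var rng = new Random(seed);
        return Enumerable.Range(0, count)
            .Select(_ => new EquipmentRow(rng.Next(), "eq-" + rng.Next(1000)))
            .ToArray();
    }

    // Calls `compose` twice on each generated input and reports whether
    // every pair of outputs compared equal under the default comparer.
    public static bool IsDeterministic<TOut>(Func<EquipmentRow[], TOut> compose, int cases)
    {
        var comparer = EqualityComparer<TOut>.Default;
        for (var seed = 0; seed < cases; seed++)
        {
            var input = Generate(seed, count: seed % 20);
            if (!comparer.Equals(compose(input), compose(input))) return false;
        }
        return true;
    }
}
```

Because the default comparer is used, `compose` should return something with value equality, such as a record, a string, or a hash of the composition result.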
Commit: `refactor(opcua): make Phase7Composer pure + property tests`.
---
## Phase 8 — AdminUI library migration
### Task 48: Move Blazor components into AdminUI library
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 47
**Files:**
- Move: `src/Server/ZB.MOM.WW.OtOpcUa.Admin/Components/*` → `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/*`
- Move: `src/Server/ZB.MOM.WW.OtOpcUa.Admin/wwwroot/*` → `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/wwwroot/*`
Namespace rename across all .razor + .razor.cs: `ZB.MOM.WW.OtOpcUa.Admin.Components` → `ZB.MOM.WW.OtOpcUa.AdminUI.Components`.
`MapAdminUI(this IEndpointRouteBuilder, IServiceCollection)` extension method in `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/EndpointRouteBuilderExtensions.cs` that maps Razor components and static assets. Mirror ScadaLink's `MapCentralUI` exactly.
Build green check.
Commit: `refactor(adminui): move Blazor components from Admin into AdminUI Razor class library`.
---
### Task 49: Move SignalR hubs into AdminUI; rewire to FleetStatusBroadcaster
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 50, 51, 52
**Files:**
- Move: `src/Server/ZB.MOM.WW.OtOpcUa.Admin/Hubs/*.cs` → `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Hubs/*.cs`
- Delete: `src/Server/ZB.MOM.WW.OtOpcUa.Admin/Hubs/FleetStatusPoller.cs` (replaced by FleetStatusBroadcaster)
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Fleet/FleetStatusBroadcaster.cs` (inject `IHubContext<THub>` for the fleet-status hub and push diffs to it; do the same for `AlertHub` and `ScriptLogHub`)
Note: hubs in AdminUI reference `ControlPlane` only for telemetry types; `ControlPlane` references hub interfaces via DI'd `IHubContext` — no project-reference cycle.
Build green check.
Commit: `refactor(adminui): SignalR hubs fed by FleetStatusBroadcaster push, no polling`.
---
### Task 50: `IAdminOperationsClient` wrapper
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 49, 51, 52
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Clients/AdminOperationsClient.cs`
Implements `IAdminOperationsClient` (from Task 18) via `ClusterSingletonProxy` to `admin-ops`. Each method does `proxy.Ask<TResponse>(message, timeout, cancellationToken)` with a 10s timeout and the caller's cancellation token.
Register in DI: `services.AddScoped<IAdminOperationsClient, AdminOperationsClient>()` (scoped because per-circuit `HttpContext.User` flows in claims).
Commit: `feat(adminui): IAdminOperationsClient backed by ClusterSingletonProxy`.
---
### Task 51: Replace `DriverDiagnosticsClient` with `IFleetDiagnosticsClient`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 49, 50, 52
**Files:**
- Delete: `src/Server/ZB.MOM.WW.OtOpcUa.Admin/Services/DriverDiagnosticsClient.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Clients/FleetDiagnosticsClient.cs`
- Modify: any Blazor pages that referenced `DriverDiagnosticsClient` (use `grep -rln DriverDiagnosticsClient ~/Desktop/OtOpcUa/src`)
`FleetDiagnosticsClient` uses `ClusterClient` (or `ActorSelection` if same cluster) to send `GetDiagnosticsRequest(nodeId)` to `/user/driver-host` at the target node and await response.
Pages updated to inject `IFleetDiagnosticsClient` instead.
Commit: `refactor(adminui): replace HTTP DriverDiagnosticsClient with actor-based IFleetDiagnosticsClient`.
---
### Task 52: Drift indicator + Deploy button
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 49, 50, 51
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Deployments.razor`
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Layout/MainLayout.razor` (add drift badge if applicable)
**`Deployments.razor`:**
- Table of `Deployment` rows (most recent first), columns: DeploymentId (short), RevisionHash (short), Status, CreatedBy, CreatedAtUtc, SealedAtUtc.
- "Deploy current configuration" button (requires `FleetAdmin` or `ConfigEditor` role) → calls `IAdminOperationsClient.StartDeploymentAsync(User.Identity.Name, ct)` → toast + auto-refresh table.
- Drift badge: green "in sync" if latest sealed Deployment's revision hash matches `ConfigComposer.SnapshotAndFlatten()` of current ConfigDb state; yellow "drift" otherwise.
Use frontend-design skill aesthetic: clean corporate Bootstrap, vertical stacking per `feedback_form_layout.md`.
Commit: `feat(adminui): Deployments page with drift indicator and Deploy button`.
---
## Phase 9 — Host entry point
### Task 53: `Host/Program.cs` role-gated startup
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 54, 55
**Files:**
- Replace: `src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs`
Mirror `~/Desktop/scadalink-design/src/ScadaLink.Host/Program.cs` structure. Pseudocode:
```csharp
var roles = RoleParser.Parse(Environment.GetEnvironmentVariable("OTOPCUA_ROLES"));
var builder = WebApplication.CreateBuilder(args);
builder.Configuration.AddJsonFile($"appsettings.{string.Join('-', roles.OrderBy(r => r))}.json", optional: true);
builder.Host.UseSerilog(...);
if (OperatingSystem.IsWindows()) builder.Host.UseWindowsService();

builder.Services.AddOtOpcUaConfigDb(builder.Configuration);
builder.Services.AddOtOpcUaCluster(builder.Configuration);
builder.Services.AddOtOpcUaAuth(builder.Configuration); // extension from Task 26
builder.Services.AddAkka("otopcua", (ab, sp) =>
{
    ab.AddOtOpcUaClusterConfig(roles);
    if (roles.Contains("admin")) sp.GetRequiredService<IControlPlaneStartup>().Configure(ab);
    // IRuntimeStartup is an assumed name for the driver-role counterpart of IControlPlaneStartup.
    if (roles.Contains("driver")) sp.GetRequiredService<IRuntimeStartup>().Configure(ab);
});
if (roles.Contains("admin"))
{
    builder.Services.AddOtOpcUaControlPlane(); // registers IControlPlaneStartup (Task 36)
    builder.Services.AddRazorComponents().AddInteractiveServerComponents();
    builder.Services.AddSignalR();
    builder.Services.AddOtOpcUaAdminUI();
}

var app = builder.Build();
app.UseSerilogRequestLogging();
if (roles.Contains("admin"))
{
    app.UseAuthentication();
    app.UseAuthorization();
    app.UseAntiforgery();
    app.MapOtOpcUaAuth();
    app.MapAdminUI();
    app.MapHub<FleetStatusHub>("/hubs/fleet"); // hub class name assumed
    app.MapHub<AlertHub>("/hubs/alerts");
    app.MapHub<ScriptLogHub>("/hubs/script-log");
}
app.MapHealthEndpoints();
await app.RunAsync();
```
Reads the roles from the `OTOPCUA_ROLES` environment variable, binds the Akka cluster config, and maps Blazor and the hubs only when the `admin` role is present.
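The per-role settings file name depends only on the sorted role set, so the order of roles in the environment variable should not matter. A minimal sketch of that expression follows; `RoleSettingsFile` is a hypothetical helper name, and the ordinal comparer is an assumption added here to keep the name culture-independent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the file-name expression used in Program.cs.
// Ordinal ordering is an assumption made here so the name does not depend on culture.
static string RoleSettingsFile(IEnumerable<string> roles) =>
    $"appsettings.{string.Join('-', roles.OrderBy(r => r, StringComparer.Ordinal))}.json";

Console.WriteLine(RoleSettingsFile(new[] { "driver", "admin" })); // appsettings.admin-driver.json
Console.WriteLine(RoleSettingsFile(new[] { "admin" }));           // appsettings.admin.json
Console.WriteLine(RoleSettingsFile(new[] { "driver" }));          // appsettings.driver.json
```

These three names match the per-role files created in Task 54. Because the file is added with `optional: true`, a role combination without a matching file simply contributes no overlay.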
Commit: `feat(host): role-gated Program.cs composes all components`.
---
### Task 54: Health endpoints + appsettings layout
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 53, 55
**Files:**
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/DatabaseHealthCheck.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AkkaClusterHealthCheck.cs`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AdminRoleLeaderHealthCheck.cs`
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json` (full Cluster/Security/ConfigDb/OpcUa/Drivers/Historian sections)
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.admin-driver.json` (combined-role default)
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.admin.json`
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.driver.json`
Three endpoints (mirror ScadaLink's pattern):
- `MapHealthChecks("/health/ready", new HealthCheckOptions { Predicate = c => c.Tags.Contains("ready") })`
- `MapHealthChecks("/health/active", new HealthCheckOptions { Predicate = c => c.Tags.Contains("active") })`
- `/healthz` on port 4841 — preserve current OPC UA stack health probe semantics
Commit: `feat(host): health endpoints + per-role appsettings split`.
---
### Task 55: Mac dev mode + dev-stub drivers
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 53, 54
**Files:**
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/Drivers/DriverInstanceActor.cs` (add `Stubbed` Become state)
- Create: `src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.Development.json`
- Modify: `src/Server/ZB.MOM.WW.OtOpcUa.Cluster/RoleParser.cs` (already allows "dev" role per Task 21)
`DriverInstanceActor`: in `PreStart`, if either of these holds:
- `roles.Contains("dev")` AND `driverType is "Galaxy" or "Historian.Wonderware"`
- `!OperatingSystem.IsWindows()` AND `driverType` is Windows-only
→ Become `Stubbed` immediately and log `INFO [DEV-STUB] driver={Name} reason={dev-role|non-windows}`. The `Stubbed` state returns deterministic test values for reads and treats writes as no-ops.
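The stub decision can be kept as a pure function so it is testable without an actor system. The sketch below is illustrative only: `StubReason` is a hypothetical name, the Windows-only set is assumed to contain just the two driver types this task names, and `"Modbus"` is a made-up driver type used for contrast.

```csharp
using System;
using System.Collections.Generic;

// Driver types assumed to be Windows-only; the plan names only Galaxy and
// Historian.Wonderware, so this set is an illustration, not the real registry.
var windowsOnly = new HashSet<string> { "Galaxy", "Historian.Wonderware" };

// Returns the stub reason ("dev-role" or "non-windows"), or null to start the real driver.
static string? StubReason(IReadOnlySet<string> roles, string driverType,
                          bool isWindows, IReadOnlySet<string> windowsOnlyDrivers)
{
    if (roles.Contains("dev") && (driverType is "Galaxy" or "Historian.Wonderware"))
        return "dev-role";
    if (!isWindows && windowsOnlyDrivers.Contains(driverType))
        return "non-windows";
    return null;
}

var devNode = new HashSet<string> { "dev", "driver" };
var prodNode = new HashSet<string> { "driver" };
Console.WriteLine(StubReason(devNode, "Galaxy", true, windowsOnly) ?? "real");   // dev-role
Console.WriteLine(StubReason(prodNode, "Galaxy", false, windowsOnly) ?? "real"); // non-windows
Console.WriteLine(StubReason(prodNode, "Galaxy", true, windowsOnly) ?? "real");  // real
Console.WriteLine(StubReason(devNode, "Modbus", true, windowsOnly) ?? "real");   // real
```

The returned string lines up with the `reason=` field of the `[DEV-STUB]` log line, so the actor could log it directly before switching behavior.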
`appsettings.Development.json` sets `Security:Ldap:DevStubMode = true`.
Commit: `feat(runtime): DEV-STUB mode for Galaxy/Wonderware on non-Windows or dev role`.
---
## Phase 10 — Cleanup & deletions
### Task 56: Delete `OtOpcUa.Server` and `OtOpcUa.Admin` projects
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Tasks 0-55)
**Files:**
- Delete (directory): `src/Server/ZB.MOM.WW.OtOpcUa.Server/`
- Delete (directory): `src/Server/ZB.MOM.WW.OtOpcUa.Admin/`
- Modify: `ZB.MOM.WW.OtOpcUa.slnx` (remove the two project entries)
- Sweep & delete files referenced in design §10 step 12:
- `DriverInstanceBootstrapper.cs` (should be in Server, already deleted)
- `Redundancy/RedundancyCoordinator.cs`
- `Redundancy/RedundancyStatePublisher.cs`
- `Redundancy/ApplyLeaseRegistry.cs`
- `Hosting/PeerHttpProbeLoop.cs`
- `Hosting/PeerUaProbeLoop.cs` — if not yet ported to `PeerOpcUaProbeActor`, port it now
- `Hubs/FleetStatusPoller.cs` (should be moved/deleted in Task 49)
- `Security/HubTokenService.cs`
- Grep sweep: `grep -rln 'RedundancyRole\|ConfigGeneration\|ApplyLeaseRegistry\|PeerHttpProbeLoop\|FleetStatusPoller\|HubTokenService' ~/Desktop/OtOpcUa/src` — if any reference survives, fix it.
- Delete corresponding `tests/ZB.MOM.WW.OtOpcUa.Server.Tests/` and `tests/ZB.MOM.WW.OtOpcUa.Admin.Tests/` (or keep and gut, depending on what's salvageable — recommend full delete and rebuild from Phase 11)
Build green:
```bash
dotnet build ZB.MOM.WW.OtOpcUa.slnx
```
Run all surviving tests:
```bash
dotnet test ZB.MOM.WW.OtOpcUa.slnx --no-build
```
Commit: `chore(cleanup): delete OtOpcUa.Server, OtOpcUa.Admin, and obsoleted v1 services`.
---
### Task 57: Build & test green check
**Classification:** trivial
**Estimated implement time:** ~3 min
**Parallelizable with:** none
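Re-run `dotnet build ZB.MOM.WW.OtOpcUa.slnx` and `dotnet test ZB.MOM.WW.OtOpcUa.slnx` and confirm both are green. No commit unless cleanup is needed.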
---
## Phase 11 — Integration & E2E tests
### Task 58: Host integration test harness (2-node in-process cluster)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (foundational)
**Files:**
- Create: `tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests.csproj`
- Create: `tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/TwoNodeClusterHarness.cs`
- Create: `tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/docker-compose.yml` (SQL Server + OpenLDAP for Mac-friendly local runs)
`TwoNodeClusterHarness` spins up two `WebApplicationFactory<Program>` instances (`Program` being the Host entry point) on different ports, with different Akka ports and a shared SQL Server. They form a 2-member cluster in which both nodes run admin+driver. The harness exposes `AdminA`, `AdminB`, `DriverA`, and `DriverB` references; in this harness `AdminA == DriverA` and `AdminB == DriverB`, since both roles run on both nodes.
Commit: `test(host): 2-node integration test harness`.
---
### Task 59: Deploy happy path + failover integration tests
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 60
**Files:**
- Create: `tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DeployHappyPathTests.cs`
- Create: `tests/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/FailoverDuringDeployTests.cs`
Test cases mirror design §8 "Failover-specific test cases" 1-7. Each test spins up the 2-node harness, performs the scenario, asserts final ConfigDb + actor state.
Commit: `test(host): deploy happy path + failover-during-deploy integration tests`.
---
### Task 60: OPC UA integration tests
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 59
**Files:**
- Create: `tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests.csproj`
- Create: `tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/DualEndpointTests.cs`
- Create: `tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests/ServiceLevelTests.cs`
Tests: real OPCFoundation client → both endpoints visible in ServerUriArray; `ServiceLevel` byte = 250 on leader, 240 on follower (with the +10 leader bonus); write through OpcUaPublishActor returns synchronous failure on driver write error.
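The two expected `ServiceLevel` values follow from a base value plus the leader bonus. A minimal sketch of that arithmetic follows, assuming a healthy node only: `HealthyServiceLevel` is a hypothetical name, the base of 240 is inferred from the follower value quoted above, and degraded states are left out because this section does not list them.

```csharp
using System;

// Healthy-node case only. The constants are taken from the expected test values:
// 240 on a follower, plus a 10-point bonus on the cluster leader.
static byte HealthyServiceLevel(bool isClusterLeader)
{
    const byte healthyBase = 240;
    const byte leaderBonus = 10;
    return (byte)(healthyBase + (isClusterLeader ? leaderBonus : 0));
}

Console.WriteLine(HealthyServiceLevel(isClusterLeader: true));  // 250
Console.WriteLine(HealthyServiceLevel(isClusterLeader: false)); // 240
```

A redundancy-aware client that prefers the endpoint with the higher `ServiceLevel` would pick the leader while both nodes are healthy, which is what `ServiceLevelTests` is meant to confirm.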
Commit: `test(opcua): dual-endpoint visibility + ServiceLevel leader-bonus tests`.
---
### Task 61: E2E test infrastructure + CI
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none
**Files:**
- Create: `tests/ZB.MOM.WW.OtOpcUa.E2ETests/ZB.MOM.WW.OtOpcUa.E2ETests.csproj`
- Create: `tests/ZB.MOM.WW.OtOpcUa.E2ETests/docker-compose.yml` (4 Host processes — 2 admin+driver + 2 driver-only + Traefik + SQL + LDAP)
- Create: `.github/workflows/v2-ci.yml` — unit + integration jobs; nightly E2E job
On every CI run, the workflow executes `dotnet build` and `dotnet test --filter "Category!=E2E"`. The E2E suite (`dotnet test --filter "Category=E2E"`) runs in the nightly job only.
Commit: `ci(v2): integration test workflow + nightly E2E`.
---
## Phase 12 — Deploy scripts & docs
### Task 62: Rewrite `Install-Services.ps1`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 63, 64, 65
**Files:**
- Replace: `scripts/install/Install-Services.ps1`
New script installs a single Windows Service `OtOpcUaHost` per node. It takes a `-Roles` parameter, writes `OTOPCUA_ROLES` to the service environment, and binds to a configurable port (default 9000). It registers the service with `sc.exe create` and configures restart-on-failure with `sc.exe failure`.
Update `Refresh-Services.ps1` and `Uninstall-Services.ps1` to match.
Commit: `feat(install): single-service Install-Services.ps1 with -Roles parameter`.
---
### Task 63: Traefik config + `docker-dev/`
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 62, 64, 65
**Files:**
- Create: `scripts/install/Install-Traefik.ps1`
- Create: `scripts/install/traefik.toml` (or `traefik.yml`)
- Create: `docker-dev/docker-compose.yml`
- Create: `docker-dev/README.md`
- Create: `docker-dev/Dockerfile`
`traefik.toml`: one entrypoint `:80`, one router `host=otopcua.*`, one service load-balancing `admin-a:9000` + `admin-b:9000` with `/health/active` health check (interval 5s, timeout 2s, expected 200).
`docker-dev/` runs four Host containers (admin-a, admin-b, driver-a, driver-b) + SQL Server + OpenLDAP + Traefik. Mac-friendly. README walks through `docker compose up -d` and access at `http://localhost`.
Commit: `feat(deploy): Traefik config + docker-dev Mac dev compose`.
---
### Task 64: Update existing docs
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 62, 63, 65
**Files:**
- Rewrite: `docs/Redundancy.md`
- Rewrite: `docs/ServiceHosting.md`
- Update: `docs/security.md`
- Update: `docs/README.md`
`Redundancy.md`: replace operator-managed `RedundancyRole` story with Akka-leader-driven `ServiceLevel`. Document the `ServiceLevelCalculator` truth table.
`ServiceHosting.md`: single fused service, role gating, Traefik, health endpoints.
`security.md`: cookie+JWT hybrid, DataProtection keys in ConfigDb, `/auth/ping` polling.
Commit: `docs: rewrite Redundancy + ServiceHosting + security for v2`.
---
### Task 65: New v2 architecture docs
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 62, 63, 64
**Files:**
- Create: `docs/Architecture-v2.md` (high-level summary, references design doc)
- Create: `docs/Cluster.md` (Akka HOCON, roles, split-brain, failure detector)
- Create: `docs/ControlPlane.md` (singletons, their state machines, ConfigDb tables)
- Create: `docs/Runtime.md` (per-node actor tree, OPC UA bridge, dev-stub mode)
Each ~1-2 pages. Link to design doc as source of truth.
Commit: `docs: v2 architecture overview + Cluster/ControlPlane/Runtime guides`.
---
## Final verification
After Task 65:
1. `dotnet build ZB.MOM.WW.OtOpcUa.slnx` — green
2. `dotnet test ZB.MOM.WW.OtOpcUa.slnx` — all green (unit + integration)
3. `cd docker-dev && docker compose up -d` — manual smoke: login at `http://localhost`, deploy from UI, verify OPC UA dual endpoint via UaExpert
4. Run `scripts/migration/Migrate-To-V2.ps1` against a copy of a real ConfigDb backup; verify row counts match expectations.
5. Tag `v1.x.x-final` on `master` for backport-only fixes.
6. Open PR `v2-akka-fuse` → `master` titled "v2: Akka.NET cluster + fused hosting alignment".
---
## Task index
| # | Title | Class | Time | Parallel with |
|---|---|---|---|---|
| 0 | Branch + Directory.Packages.props | small | 3m | — |
| 1 | Commons project | small | 3m | 2-8 |
| 2 | Cluster project | small | 3m | 1,3-8 |
| 3 | Security project | small | 3m | 1,2,4-8 |
| 4 | ControlPlane project | small | 3m | 1-3,5-8 |
| 5 | Runtime project | small | 3m | 1-4,6-8 |
| 6 | OpcUaServer project | small | 3m | 1-5,7,8 |
| 7 | AdminUI project | small | 3m | 1-6,8 |
| 8 | Host project | small | 5m | 1-7 |
| 9 | Build green | trivial | 2m | — |
| 10 | Deployment entity | standard | 5m | 11-13 |
| 11 | NodeDeploymentState entity | standard | 5m | 10,12,13 |
| 12 | ConfigEdit entity | small | 4m | 10,11,13 |
| 13 | DataProtection keys | small | 3m | 10-12 |
| 14a | RowVersion on live-edit entities | standard | 10m | — |
| 14b | Drop GenerationId FK from entities | high-risk | 30m | — |
| 14c | Obsolete GenerationApplier/Diff/Cache | high-risk | 20m | — |
| 14d | Drop ClusterNode.RedundancyRole | standard | 5m | — |
| 14e | Delete ConfigGeneration + ClusterNodeGenerationState | small | 5m | — |
| 14f | V2HostingAlignment migration (consolidator) | high-risk | 15m | — |
| 15 | Migrate-To-V2.ps1 | standard | 5m | 16-18 |
| 16 | Common types | standard | 5m | 17,18 |
| 17 | Message contracts | standard | 5m | 16,18 |
| 18 | Common interfaces | small | 4m | 16,17 |
| 19 | HOCON | standard | 5m | 20-22 |
| 20 | AkkaHostedService | standard | 5m | 19,21,22 |
| 21 | Role parser | small | 3m | 19,20,22 |
| 22 | ClusterRoleInfo | standard | 5m | 19-21 |
| 23 | Cluster tests | standard | 5m | — |
| 24 | Move LdapAuthService | standard | 5m | 25 |
| 25 | JwtTokenService | standard | 5m | 24 |
| 26 | AddOtOpcUaAuth | standard | 5m | 27,28 |
| 27 | Auth endpoints | standard | 5m | 26,28 |
| 28 | CookieAuthStateProvider | small | 4m | 26,27 |
| 29 | Security tests | standard | 5m | — |
| 30 | ConfigPublishCoordinator happy | high-risk | 5m | 32-35 |
| 31 | Coordinator timeout/recovery | high-risk | 5m | 32-35 |
| 32 | AdminOperationsActor | standard | 5m | 30,31,33-35 |
| 33 | AuditWriterActor | standard | 5m | 30-32,34,35 |
| 34 | FleetStatusBroadcaster | standard | 5m | 30-33,35 |
| 35 | RedundancyStateActor | high-risk | 5m | 30-34 |
| 36 | Singleton registration | standard | 4m | — |
| 37 | DriverHostActor bootstrap | high-risk | 5m | 41-44 |
| 38 | DriverHostActor dispatch | high-risk | 5m | 41-44 |
| 39 | DriverHostActor stale | standard | 4m | 41-44 |
| 40 | Runtime test scaffold | small | 3m | — |
| 41 | DriverInstanceActor | high-risk | 5m | 42-44 |
| 42 | VirtualTagActor | standard | 5m | 41,43,44 |
| 43 | ScriptedAlarmActor | standard | 5m | 41,42,44 |
| 44 | OpcUaPublishActor | high-risk | 5m | 41-43 |
| 45 | Health probe actors | standard | 5m | — |
| 46 | Extract OpcUaApplicationHost | standard | 5m | — |
| 47 | Phase7Composer purity | standard | 5m | 48-52 |
| 48 | Move Blazor → AdminUI | standard | 5m | 47 |
| 49 | Move hubs, rewire | standard | 5m | 50-52 |
| 50 | IAdminOperationsClient | standard | 5m | 49,51,52 |
| 51 | IFleetDiagnosticsClient | standard | 5m | 49,50,52 |
| 52 | Drift + Deploy UI | standard | 5m | 49-51 |
| 53 | Host Program.cs | high-risk | 5m | 54,55 |
| 54 | Health + appsettings | standard | 5m | 53,55 |
| 55 | DEV-STUB drivers | standard | 5m | 53,54 |
| 56 | Delete Server + Admin | high-risk | 5m | — |
| 57 | Build & test green | trivial | 3m | — |
| 58 | Integration harness | standard | 5m | — |
| 59 | Deploy + failover IT | standard | 5m | 60 |
| 60 | OPC UA IT | standard | 5m | 59 |
| 61 | E2E + CI | standard | 5m | — |
| 62 | Install-Services.ps1 | standard | 5m | 63-65 |
| 63 | Traefik + docker-dev | standard | 5m | 62,64,65 |
| 64 | Update existing docs | standard | 5m | 62,63,65 |
| 65 | New v2 docs | standard | 5m | 62-64 |
**Total estimated subagent time:** the per-task estimates sum to about 6.3 hours (378 minutes) if run serially. Parallel scheduling of independent tasks brings this to roughly 5 hours of focused execution, which suits subagent-driven dispatch.