70 Commits

Author SHA1 Message Date
Joseph Doherty 698709a578 docs(plans): mark Tasks 56+57 complete 2026-05-26 05:39:07 -04:00
Joseph Doherty 76310b8829 chore(cleanup): delete OtOpcUa.Server, OtOpcUa.Admin, and obsolete v1 tests
Task 56: removes the legacy in-process Server + Admin Web project + their test
projects (Server.Tests, Admin.Tests, Admin.E2ETests). The fused OtOpcUa.Host
binary built across Phases 1-9 is now the sole production entry point.

What happened to the 47 legacy Admin Blazor pages: per follow-up F15, the
v1 architecture's draft/publish UX is replaced by v2's live-edit + snapshot-
deploy model, so a 1:1 migration is not meaningful. The mechanical move via
git mv preserves the history; service classes + page bodies that referenced
removed v1 types (ConfigGeneration, RedundancyRole, GenerationId) were
deleted. AdminUI now ships a minimal Home page + the v2 Deployments page.

Per-page rebuild against the v2 surface is tracked as F15. The v2 Deployments
page (Task 52) is the only first-party UI shipping in this PR.

Task 57: solution build green; 84+ tests green across active v2 + legacy
driver test projects.
2026-05-26 05:38:31 -04:00
Joseph Doherty 2b75ce3876 docs(plans): mark Phase 9 tasks 53-55 complete; track F19/F20 follow-ups 2026-05-26 05:23:48 -04:00
Joseph Doherty 8b4de8080b feat(runtime): DEV-STUB mode for Windows-only drivers on non-Windows or dev role 2026-05-26 05:23:02 -04:00
Joseph Doherty fa1d685ccd feat(host): health endpoints + per-environment appsettings layout 2026-05-26 05:23:01 -04:00
Joseph Doherty e2b357f89a feat(host): role-gated Program.cs composes all v2 components 2026-05-26 05:22:59 -04:00
Joseph Doherty eb4280b7eb docs(plans): add F15-F18 follow-ups for Phase 8 deferred scope 2026-05-26 05:19:02 -04:00
Joseph Doherty 8a1f97b27f docs(plans): mark Phase 8 tasks 48-52 complete; track F15-F18 follow-ups 2026-05-26 05:18:37 -04:00
Joseph Doherty f167808a2c feat(adminui): Deployments page with drift indicator and Deploy button 2026-05-26 05:18:00 -04:00
Joseph Doherty b83f099394 feat(adminui): IFleetDiagnosticsClient skeleton (Akka round-trip tracked as F17) 2026-05-26 05:17:59 -04:00
Joseph Doherty f022499e7f feat(adminui): IAdminOperationsClient backed by ClusterSingletonProxy 2026-05-26 05:17:58 -04:00
Joseph Doherty 26d8f2f620 feat(adminui): FleetStatusHub + AlertHub + MapOtOpcUaHubs (broadcaster bridge tracked as F16) 2026-05-26 05:17:56 -04:00
Joseph Doherty 1a067e609c refactor(adminui): MapAdminUI extension + AddAdminUI DI (47-component migration tracked as F15) 2026-05-26 05:17:55 -04:00
Joseph Doherty 5e31449529 docs(plans): mark Phase 7 tasks 46+47 complete; track F13/F14 full-extraction follow-ups 2026-05-26 05:15:18 -04:00
Joseph Doherty b7c117ab31 feat(opcua): pure Phase7Composer + purity tests (side-effects tracked as F14) 2026-05-26 05:14:45 -04:00
Joseph Doherty 2877a883cd feat(opcua): OpcUaApplicationHost facade in OpcUaServer (full extraction tracked as F13) 2026-05-26 05:14:39 -04:00
Joseph Doherty 2e4f1399bb docs(plans): mark Phase 6 tasks 39-45 complete (race-recovered commit) 2026-05-26 05:10:21 -04:00
Joseph Doherty e31547d00e docs(plans): mark Phase 6 tasks 37-45 complete; track F7-F12 engine-wiring follow-ups 2026-05-26 05:09:52 -04:00
Joseph Doherty 28639cb14d feat(runtime): HistorianAdapter + PeerOpcUaProbe + DbHealthProbe actors (engine wiring tracked as F11/F12) 2026-05-26 05:09:06 -04:00
Joseph Doherty e115f13104 feat(runtime): OpcUaPublishActor on synchronized dispatcher (SDK wiring tracked as F10) 2026-05-26 05:09:04 -04:00
Joseph Doherty 95ef533822 feat(runtime): ScriptedAlarmActor state machine (engine wiring tracked as F9) 2026-05-26 05:09:03 -04:00
Joseph Doherty 39729bfe21 feat(runtime): VirtualTagActor skeleton (engine wiring tracked as F8) 2026-05-26 05:09:01 -04:00
Joseph Doherty 64c627f8d6 feat(runtime): DriverInstanceActor state machine with Connecting/Connected/Reconnecting 2026-05-26 05:05:36 -04:00
Joseph Doherty ed130135ca feat(runtime): DriverHostActor state machine with PreStart recovery + DispatchDeployment + stale fallback 2026-05-26 05:02:42 -04:00
Joseph Doherty ea6f972e96 docs(plans): mark Phase 5 tasks 30-36 complete with commit hashes 2026-05-26 04:57:37 -04:00
Joseph Doherty 52bf4b3371 feat(controlplane): WithOtOpcUaControlPlaneSingletons registration extension (admin role) 2026-05-26 04:57:09 -04:00
Joseph Doherty dd122c4ca9 feat(controlplane): FleetStatusBroadcaster push-driven from cluster events + heartbeats 2026-05-26 04:57:07 -04:00
Joseph Doherty f193872891 feat(controlplane): ConfigPublishCoordinator deadline timeout + failover PreStart recovery 2026-05-26 04:57:05 -04:00
Joseph Doherty bad2aef137 docs(plans): track F5 multi-node coordinator test + F6 RedundancyState publisher refactor 2026-05-26 04:53:32 -04:00
Joseph Doherty 6b37f997ad feat(controlplane): RedundancyStateActor with debounced topology publish 2026-05-26 04:53:31 -04:00
Joseph Doherty 62e12dab95 feat(controlplane): ConfigPublishCoordinator happy path with NodeDeploymentState seeding 2026-05-26 04:53:29 -04:00
Joseph Doherty ef683f5073 feat(controlplane): AdminOperationsActor + ConfigComposer + StartDeployment flow 2026-05-26 04:53:28 -04:00
Joseph Doherty 9f61cd5989 test(controlplane): self-join cluster + DistributedPubSub extension in test harness 2026-05-26 04:53:25 -04:00
Joseph Doherty 9582e448d5 docs(plans): track F4 WrapDetails JSON hardening follow-up 2026-05-26 04:46:18 -04:00
Joseph Doherty 1955bc5f4d docs(plans): mark Task 33+35 partial complete; track F3 audit-idempotency follow-up 2026-05-26 04:44:37 -04:00
Joseph Doherty 23f669c376 feat(controlplane): AuditWriterActor with batched in-buffer-dedup insert 2026-05-26 04:44:01 -04:00
Joseph Doherty 14acab5a58 feat(controlplane): ServiceLevelCalculator + ControlPlane.Tests harness 2026-05-26 04:43:59 -04:00
Joseph Doherty 32574b3e4e docs(plans): track JwtBearer DI antipattern as follow-up F2 2026-05-26 04:39:10 -04:00
Joseph Doherty fc22d4f7b6 docs(plans): track AuthEndpoints integration tests as follow-up F1 2026-05-26 04:37:49 -04:00
Joseph Doherty 973a3d1b9a docs(plans): mark Tasks 24-29 complete in tasks.json 2026-05-26 04:36:17 -04:00
Joseph Doherty 38ea0c5086 test(security): cookie+JWT roundtrip, role mapper, LDAP escape/RDN helpers 2026-05-26 04:35:51 -04:00
Joseph Doherty e38f22e3c2 feat(security): CookieAuthenticationStateProvider for Blazor circuit expiry detection 2026-05-26 04:35:50 -04:00
Joseph Doherty 8be84ba27b feat(security): /auth/login, /auth/ping, /auth/token endpoints 2026-05-26 04:35:49 -04:00
Joseph Doherty 207fc6aba9 feat(security): cookie+JWT hybrid auth via AddOtOpcUaAuth 2026-05-26 04:35:48 -04:00
Joseph Doherty 93316e3431 feat(security): JwtTokenService with HS256 + 15-min expiry 2026-05-26 04:35:46 -04:00
Joseph Doherty 567b8cac1d refactor(security): move LdapAuthService into OtOpcUa.Security library 2026-05-26 04:35:42 -04:00
Joseph Doherty f35925b57e docs(plans): mark Tasks 19-23 complete in tasks.json 2026-05-26 04:31:28 -04:00
Joseph Doherty e0b6d5680b test(cluster): HOCON parses, role parser truth table 2026-05-26 04:31:08 -04:00
Joseph Doherty c217c49f69 feat(cluster): ClusterRoleInfo wraps Akka.Cluster for app-facing role queries 2026-05-26 04:31:07 -04:00
Joseph Doherty dfb06368cd feat(cluster): parse OTOPCUA_ROLES env var with validation 2026-05-26 04:31:06 -04:00
Joseph Doherty f184f8ed1b feat(cluster): AkkaHostedService and DI extension 2026-05-26 04:31:05 -04:00
Joseph Doherty 3d0f4dc168 feat(cluster): embed Akka HOCON config matching ScadaLink tuning 2026-05-26 04:31:03 -04:00
Joseph Doherty fdb4ac7051 docs(plans): mark Tasks 15-18 complete in tasks.json 2026-05-26 04:27:41 -04:00
Joseph Doherty 136234e7f2 feat(commons): add cluster/admin/diagnostics client interfaces 2026-05-26 04:27:19 -04:00
Joseph Doherty 5d3a5a40d7 feat(commons): add deploy/admin/audit/redundancy/fleet message contracts 2026-05-26 04:27:18 -04:00
Joseph Doherty fee4a8c008 feat(commons): add correlation/execution/node/deployment/revisionhash types 2026-05-26 04:26:01 -04:00
Joseph Doherty c168c1c9c6 feat(migration): add Migrate-To-V2.ps1 idempotent migration runner 2026-05-26 04:26:01 -04:00
Joseph Doherty 605dbf3dcc feat(configdb): V2HostingAlignment migration consolidating Phase 1a-1e
Phase 1f — the consolidator migration. Closes out the v2 entity-model
rewrite by emitting a single EF migration that captures the cumulative
schema delta from 14a (RowVersion) through 14e (drop generation entities).

Generated: src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/
              20260526081556_V2HostingAlignment.cs           (1562 lines)
              20260526081556_V2HostingAlignment.Designer.cs

Migration shape (per `grep -nE migrationBuilder.\(...)`):

  Drop  12 ForeignKey constraints (one per live-edit entity's GenerationId FK)
  Drop  2  Tables  (ConfigGeneration, ClusterNodeGenerationState)
  Drop  45 Indexes (every UX_*_Generation_* and IX_*_Generation_* across the
                    13 live-edit tables — 1 also dropped the unique-Primary
                    filtered index UX_ClusterNode_Primary_Per_Cluster)
  Drop  13 Columns (12 GenerationId + 1 RedundancyRole)
  Add   12 RowVersion columns (one per live-edit entity)
  Create 4  Tables (Deployment, NodeDeploymentState, ConfigEdit,
                    DataProtectionKeys)
  Create ~45 Indexes (recreated under the new naming pattern
                      UX_<Table>_LogicalId / UX_<Table>_<X> with the
                      GenerationId column stripped from composite keys)

Notable EF quirks accepted:
  Unique-on-required-column indexes (UX_VirtualTag_LogicalId etc.) ship a
  `filter: "[VirtualTagId] IS NOT NULL"` clause that EF auto-inserts for
  SQL Server. Harmless — the column is C#-side `required` so NULL never
  appears.

Verification:
  dotnet build src/Core/ZB.MOM.WW.OtOpcUa.Configuration          -> 0 errors
  dotnet ef migrations script --idempotent (against placeholder DSN)
                                                                 -> 3259-line
                                                                    .sql produced
                                                                    OK
  tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests                -> 0 errors

Live `dotnet ef database update` against a scratch SQL Server deferred to
Task 15 (Migrate-To-V2.ps1) — SSH to the docker host needs a key/password I
don't have, and the always-on SQL at 10.100.0.35,14330 uses Integrated
Security (Windows auth, unreachable from this macOS dev). The migration
itself is structurally correct by construction (EF tooling generated it
against the live DbContext model); the live-DB confidence step is the
PowerShell wrapper's job.

SchemaComplianceTests updates:
  - All_expected_tables_exist: removed ConfigGeneration +
    ClusterNodeGenerationState; added Deployment, NodeDeploymentState,
    ConfigEdit, DataProtectionKeys.
  - Filtered_unique_indexes_match_schema_spec: removed entries for
    UX_ClusterNode_Primary_Per_Cluster (Task 14d) and
    UX_ConfigGeneration_Draft_Per_Cluster (Task 14e). Two filtered uniques
    remain (UX_ClusterNodeCredential_Value, UX_ExternalIdReservation_KindValue_Active).
  - Check_constraints_match_schema_spec: added CK_ConfigEdit_FieldsJson_IsJson.

StoredProceduresTests update:
  - Removed RedundancyRole + 'Primary' from the raw INSERT into ClusterNode
    so the DB-backed test runs against the new schema.
2026-05-26 04:18:50 -04:00
Joseph Doherty e00f46d723 refactor(configdb): delete ConfigGeneration + ClusterNodeGenerationState
Phase 1e of the v2 entity-model rewrite. With the FKs gone (Task 14b) and
the apply pipeline replaced (Task 14c), the v1 draft/publish entities have
no remaining v2 consumers.

Deleted entity classes:
  src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigGeneration.cs
  src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ClusterNodeGenerationState.cs

Deleted enum classes (no v2 consumers):
  src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/GenerationStatus.cs
  src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodeApplyStatus.cs

OtOpcUaConfigDbContext changes:
  - Removed DbSet<ConfigGeneration> ConfigGenerations
  - Removed DbSet<ClusterNodeGenerationState> ClusterNodeGenerationStates
  - Removed ConfigureConfigGeneration(modelBuilder) call + method body
  - Removed ConfigureClusterNodeGenerationState(modelBuilder) call + body
  - Tidied the "v2 deploy-model tables" header comment

Navigation property cleanup:
  - ServerCluster.Generations collection -> removed
  - ClusterNode.GenerationState navigation -> removed

doc-comment cref cleanup (replaced <see cref="X"/> with <c>X</c> for the
deleted types so the C# XML comment compiler doesn't fail with CS1574):
  - Deployment.cs (cref to ConfigGeneration)
  - NodeDeploymentState.cs (cref to ClusterNodeGenerationState)
  - Core/OpcUa/EquipmentNodeWalker.cs (cref to ConfigGeneration in the
    EquipmentNamespaceContent record's doc-comment; while there, removed
    "All four collections are scoped to the same ConfigGeneration" since
    that's no longer true in v2)

Verification:
  src/Core/ZB.MOM.WW.OtOpcUa.Configuration            -> 0 errors
  src/Core/ZB.MOM.WW.OtOpcUa.Core                     -> 0 errors
  tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests    -> 0 errors
  tests/Core/ZB.MOM.WW.OtOpcUa.Core.Tests             -> 0 errors
  whole solution                                       -> 15 errors
    (all in Server/Admin; transitive Server.Tests/Admin.Tests skip per the
    parent's failure, so the per-project count dropped vs Task 14d's 71)
2026-05-26 04:14:55 -04:00
Joseph Doherty 3c915e652e refactor(configdb): drop ClusterNode.RedundancyRole (replaced by Akka leader)
Phase 1d of the v2 entity-model rewrite. The static RedundancyRole column
is replaced by Akka cluster's role-leader-of-"driver" election at runtime
(see RedundancyStateActor + ServiceLevelCalculator in Task 35).

Changes:

  - Removed `public required RedundancyRole RedundancyRole` from
    ClusterNode entity.
  - Removed `e.Property(x => x.RedundancyRole).HasConversion<string>()...`
    mapping from OtOpcUaConfigDbContext.ConfigureClusterNode.
  - Removed the `UX_ClusterNode_Primary_Per_Cluster` filtered unique index
    (filter referenced [RedundancyRole]='Primary').
  - Dropped `using ZB.MOM.WW.OtOpcUa.Configuration.Enums` from ClusterNode.cs
    (no longer needed).
  - Deleted `Enums/RedundancyRole.cs` — the enum is unused in v2-kept code.
  - DraftValidator: dropped the "exactly one Primary per cluster"
    validation block. Comment in place explaining v2 picks primary at
    runtime via Akka.
  - DraftValidatorTests: dropped ValidateClusterTopology_flags_multiple_Primary
    test; reworked BuildNode helper to no longer take a `role` argument.

Untouched (Server + Admin still reference RedundancyRole; accepted broken
per Task 56 policy):

  src/Server/ZB.MOM.WW.OtOpcUa.Server/Redundancy/{ClusterTopologyLoader,
    RedundancyStatePublisher, RedundancyTopology, ServiceLevelCalculator}.cs
  src/Server/ZB.MOM.WW.OtOpcUa.Admin/Services/RedundancyMetrics.cs

DB-runtime tests will fail against the new schema (Task 14f's migration
drops the column) — to be updated in Task 14f's SchemaComplianceTests
update:

  - SchemaComplianceTests.cs:55 (expected filtered index list)
  - StoredProceduresTests.cs:263 (raw INSERT names the column)

Verification:
  src/Core/ZB.MOM.WW.OtOpcUa.Configuration            -> 0 errors
  tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests    -> 0 errors
  whole solution                                       -> 71 errors
    (70 from Task 14b in Server/Admin, +1 new Server/Redundancy reference)
2026-05-26 04:11:57 -04:00
Joseph Doherty 1ddf8bb50e refactor(configdb): delete v1 Apply pipeline (replaced by AdminOperationsActor)
Phase 1c of the v2 entity-model rewrite. Deletes the draft/publish lifecycle
machinery that v2 replaces with AdminOperationsActor + ConfigComposer +
DriverInstanceActor.ApplyDelta.

Deleted (6 files):

  src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Apply/
    IGenerationApplier.cs   — interface for the apply pipeline
    GenerationApplier.cs    — the v1 applier coordinating per-driver hook-back
    GenerationDiff.cs       — typed wrapper over the sp_ComputeGenerationDiff
                              SQL output
    ApplyCallbacks.cs       — per-driver hook surface invoked by the applier
    ChangeKind.cs           — enum {Added, Modified, Removed, Unchanged}

  tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/GenerationApplierTests.cs

The empty Apply/ directory is removed.

Kept (repurposed in Task 39 for stale-config fallback):

  src/Core/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/GenerationSealedCache.cs
  src/Core/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/ResilientConfigReader.cs
  tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/GenerationSealedCacheTests.cs
  tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/ResilientConfigReaderTests.cs

Naming rename (GenerationSealedCache -> DeploymentArtifactCache) deferred
to Task 39 (DriverHostActor stale-config fallback) where the consumer is
written. The type stays available under its v1 name until then.

IDriver.cs doc-comment: replaced the "Used by IGenerationApplier..." sentence
with "Invoked by the v2 DriverInstanceActor when ApplyDelta reports that only
this driver's config changed in the new deployment."

Server/Admin breakage from Task 14b unchanged (70 errors). Configuration +
Core.Tests + Configuration.Tests stay green.

  src/Core/ZB.MOM.WW.OtOpcUa.Configuration  -> 0 errors
  tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests  -> 0 errors
  whole solution  -> 70 errors (all in Server/Admin)
2026-05-26 04:09:17 -04:00
Joseph Doherty 13d3aeab09 refactor(configdb): drop GenerationId FK from live-edit entities
Phase 1b of the v2 entity-model rewrite. The design's live-edit model means
the 12 v2 live-edit entities no longer carry a generation scope — they're
edited directly via AdminOperationsActor, with RowVersion (added in Task 14a)
providing last-write-wins detection.

Entity changes (12 files):

  Equipment, DriverInstance, Device, Tag, PollGroup, Namespace,
  UnsArea, UnsLine, NodeAcl, Script, VirtualTag, ScriptedAlarm

  - Removed: public long GenerationId
  - Removed: public ConfigGeneration? Generation (navigation)

DbContext changes (OtOpcUaConfigDbContext.cs):

  - Removed 12 HasOne(x => x.Generation).WithMany().HasForeignKey... mappings
  - Rewrote ~36 indexes: dropped the GenerationId column from each composite
    key, renamed UX_<Table>_Generation_<X> -> UX_<Table>_<X> and
    IX_<Table>_Generation_<X> -> IX_<Table>_<X>. Logical IDs become globally
    unique (UX_<Table>_LogicalId on the LogicalId column alone).
  - Removed Namespace's redundant UX_Namespace_Generation_LogicalId_Cluster
    index (subsumed by the new UX_Namespace_LogicalId).

Core.Tests fixtures (4 files):

  Removed "GenerationId = 1," lines from:
    - PermissionTrieBuilderTests.cs (NodeAcl Row factory)
    - PermissionTrieTests.cs (NodeAcl Row factory)
    - TriePermissionEvaluatorTests.cs (NodeAcl Row factory + 2 gen{1,5}Row
      mutations that test stale-generation evaluation; the trie itself still
      carries a generation tag via PermissionTrie.GenerationId, fed in via
      PermissionTrieBuilder.Build's generationId parameter, so the tests
      still exercise the production code path)
    - EquipmentNodeWalkerTests.cs (Area/Line/Eq/Tag/VirtualTag/ScriptedAlarm
      builders)

Expected breakage (accepted per Task 56 policy):

  src/Server/ZB.MOM.WW.OtOpcUa.Server   ~25 errors  (DriverInstanceBootstrapper,
                                                     AuthorizationBootstrap,
                                                     EquipmentNamespaceContentLoader,
                                                     Phase7Composer, ...)
  src/Server/ZB.MOM.WW.OtOpcUa.Admin    ~45 errors  (VirtualTags.razor,
                                                     ScriptedAlarms.razor,
                                                     DriverInstanceService,
                                                     EquipmentService,
                                                     EquipmentImportBatchService,
                                                     UnsService,
                                                     FocasDriverDetailService,
                                                     ...)

Server.Tests, Admin.Tests, Admin.E2ETests also break transitively (they
project-reference Server/Admin). All deleted in Task 56.

Verification:
  dotnet build src/Core/ZB.MOM.WW.OtOpcUa.Configuration -> 0 errors
  dotnet build tests/Core/ZB.MOM.WW.OtOpcUa.Core.Tests  -> 0 errors
  dotnet build tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests -> 0 errors
  dotnet build (whole solution) -> 70 errors, all in Server/Admin
2026-05-26 04:06:25 -04:00
Joseph Doherty 4bb4ad8acb feat(configdb): add RowVersion to live-edit entities
Phase 1a of the v2 entity-model rewrite. Adds:

  public byte[] RowVersion { get; set; } = Array.Empty<byte>();

and the EF Core mapping

  e.Property(x => x.RowVersion).IsRowVersion();

to 12 live-edit entities:

  Equipment, DriverInstance, Device, Tag, PollGroup, Namespace,
  UnsArea, UnsLine, NodeAcl, Script, VirtualTag, ScriptedAlarm

These are the entities that v2 admins will edit directly via
AdminOperationsActor (no draft staging). RowVersion enables
last-write-wins detection when two operators race on the same row.

GenerationId FKs are still in place on these entities (removed in Task 14b);
this commit only adds the rowversion column so the migration in Task 14f can
emit ADD COLUMN before DROP FK as a single atomic step.
2026-05-26 03:58:58 -04:00
Joseph Doherty 990ce343fe docs(plans): split Task 14 into 14a-14f (entity-model rewrite)
The original Task 14 (5-min EF migration that "drops ConfigGeneration") was
under-scoped: the design doc (live-edit model, ~line 208) requires removing
GenerationId from 13 entities (Equipment, DriverInstance, Device, Tag,
PollGroup, Namespace, UnsArea, UnsLine, NodeAcl, Script, VirtualTag,
ScriptedAlarm) and adding RowVersion columns for last-write-wins detection.
That cascades into GenerationApplier / GenerationDiff / GenerationSealedCache
and the legacy Server/Admin CRUD services.

New decomposition (~85 min total, replacing the original 5-min estimate):

  14a  standard   10m  Add RowVersion to live-edit entities
  14b  high-risk  30m  Drop GenerationId FK from those entities
  14c  high-risk  20m  Obsolete GenerationApplier/Diff/SealedCache
  14d  standard   5m   Drop ClusterNode.RedundancyRole
  14e  small      5m   Delete ConfigGeneration + ClusterNodeGenerationState
  14f  high-risk  15m  Consolidator: generate V2HostingAlignment migration

Policy decision (recorded with user): OtOpcUa.Server + OtOpcUa.Admin are
allowed to fail-to-compile between 14b and Task 56 - only the new v2 projects
need to stay green. Task 56 deletes the legacy projects.

Plan markdown: replaces the original Task 14 section with the 6-task
decomposition + a header explaining the rewrite. Task index table at the
bottom of the plan updated.

Tasks JSON: replaces the single Task 14 row with 6 string-id rows
("14a", "14b", ..., "14f"). Task 15 (Migrate-To-V2.ps1) and downstream
consumers re-pointed at "14f".

Verification step in 14f rewritten to use the shared docker host at
10.100.0.35 per CLAUDE.md (Docker is not installed on this Mac dev VM).
2026-05-26 03:55:48 -04:00
Joseph Doherty 8e2c4f2835 feat(configdb): add Deployment, NodeDeploymentState, ConfigEdit, DataProtectionKey entities
Phase 1 entities for the v2 live-edit + snapshot-deploy model:

  Deployment           — immutable artifact snapshot (replaces v1 ConfigGeneration row)
                         Status enum {Dispatching, AwaitingApplyAcks, Sealed,
                         PartiallyFailed, TimedOut}; carries the SHA256 RevisionHash and
                         the SnapshotAndFlatten() ArtifactBlob; RowVersion for optimistic
                         concurrency.
  NodeDeploymentState  — per-(node, deployment) apply progress row owned by
                         DriverHostActor (replaces single-row ClusterNodeGenerationState).
                         Composite key (NodeId, DeploymentId) gives the
                         ConfigPublishCoordinator the full history it needs to
                         reconstruct in-flight state after a failover.
  ConfigEdit           — append-only audit row written by AdminOperationsActor on every
                         mutating op; optional ExecutionId correlates edits inside one
                         admin transaction (e.g. an import batch).
  DataProtectionKey    — ASP.NET DataProtection key ring storage via
                         IDataProtectionKeyContext so every admin-role node decrypts
                         the same cookies without sharing a filesystem.

OtOpcUaConfigDbContext now implements IDataProtectionKeyContext and registers four new
DbSets + four new ConfigureXxx mappings.

Central package bumps (forced by Microsoft.AspNetCore.DataProtection.EntityFrameworkCore
10.0.7's transitive dep):

  Microsoft.EntityFrameworkCore.{,Design,InMemory,SqlServer}  10.0.0 -> 10.0.7
  Microsoft.Extensions.{Configuration.Abstractions,Configuration.Json,Hosting,Hosting.WindowsServices,Http}  10.0.0 -> 10.0.7

EF migration generation + the ConfigGeneration drop + RedundancyRole column removal are
deferred to Task 14 (high-risk, non-parallelizable).
2026-05-26 03:49:59 -04:00
Joseph Doherty 30a2104fa5 feat(scaffold): introduce 8 v2 component projects
Adds the empty project skeletons that subsequent v2 tasks fill in:

  src/Core/ZB.MOM.WW.OtOpcUa.Commons      (types, interfaces, message contracts)
  src/Core/ZB.MOM.WW.OtOpcUa.Cluster      (Akka.Hosting + cluster wiring)
  src/Server/ZB.MOM.WW.OtOpcUa.Security   (cookie+JWT auth, LDAP)
  src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane (admin-role cluster singletons)
  src/Server/ZB.MOM.WW.OtOpcUa.Runtime    (per-node driver actors)
  src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer (OPC UA SDK application host)
  src/Server/ZB.MOM.WW.OtOpcUa.AdminUI    (Razor class library)
  src/Server/ZB.MOM.WW.OtOpcUa.Host       (single fused web binary)

Each project sets TreatWarningsAsErrors=true in its own csproj (per the
Directory.Build.props deviation note in the previous commit). NuGetAuditSuppress
entries cover transitive vulnerability advisories the new strictness surfaces:

  - GHSA-g94r-2vxg-569j (OpenTelemetry.Api 1.9.0 via Akka.Cluster.Hosting/Tools)
  - GHSA-h958-fxgg-g7w3 (Opc.Ua.Core 1.5.374.126 via OpcUaServer)
  - GHSA-37gx-xxp4-5rgx + GHSA-w3x6-4m5h-cxqf (legacy advisories already accepted)

OpcUaServer pins OPCFoundation.NetStandard.Opc.Ua.Configuration to 1.5.374.126
via VersionOverride to match Opc.Ua.Server's transitive Opc.Ua.Core (same
constraint as the legacy Server project).

Runtime does NOT project-reference any concrete Driver.* assemblies; drivers
load reflectively at runtime (Phase 6). Runtime gets the IDriver contract
through Core.Abstractions instead.

Host's Microsoft.Extensions.Hosting.WindowsServices is conditional on the
Windows OS so the project builds on macOS dev machines.

Build verification: dotnet build -> 438 warnings (all pre-existing xUnit1051
in legacy Server.Tests/Admin.Tests), 0 errors. Closes Task 9 (build green
smoke check, no separate commit).
2026-05-26 03:44:56 -04:00
Joseph Doherty 2b811477d1 chore(build): introduce central package management for v2
Adds Directory.Packages.props (ManagePackageVersionsCentrally) and
Directory.Build.props (net10.0/nullable/implicit usings/LangVersion latest).
Strips Version attributes from every csproj PackageReference and consolidates
versions into the central file.

Side fixes (necessary to keep the build green on .NET SDK 10.0.105 on macOS):

- Microsoft.CodeAnalysis.CSharp{,.Workspaces}: 5.3.0 -> 5.0.0. The 5.3.0
  analyzer DLL references compiler 5.3.0.0 and the local SDK ships compiler
  5.0.0.0, producing CS9057 on every project that loaded the Analyzers
  output. Master itself was broken on this machine pre-change.
- Server + Server.Tests pin OPCFoundation.NetStandard.Opc.Ua.{Configuration,
  Client} to 1.5.374.126 via VersionOverride, matching Opc.Ua.Server's
  pin. Mixing 1.5.378.106 Opc.Ua.Core transitively with 1.5.374.126
  Opc.Ua.Server breaks CustomNodeManager2 override signatures
  (CS0115 on LoadPredefinedNodes/Browse/HistoryRead*) and CS7069 in
  the tests. The pin disappears when the legacy Server project is
  deleted in Task 56.
- Client.UI + Client.UI.Tests: NuGetAuditSuppress for
  GHSA-xrw6-gwf8-vvr9 (Tmds.DBus.Protocol 0.20.0 reaches both projects
  transitively from Avalonia.Desktop on Linux/macOS only).

Deviation from the plan: TreatWarningsAsErrors=true is NOT set in
Directory.Build.props because the pre-v2 Admin/Server test projects carry
~240 xUnit1051 analyzer warnings that would fail the build. New v2 projects
opt in via their own csproj; the global flag can return once the legacy
projects are deleted in Task 56.
2026-05-26 03:40:24 -04:00
Joseph Doherty fac32ad69b docs(plans): add v2 implementation plan with 66 bite-sized tasks
Converts the akka-hosting-alignment design into an executable plan:
12 phases covering branch/scaffold, ConfigDb schema, Commons,
Cluster, Security, ControlPlane singletons, Runtime per-node actors,
OpcUaServer extraction, AdminUI migration, Host entry point, cleanup,
integration tests, deploy scripts, and docs. Each task has files,
TDD steps, exact commands, classification, time estimate, and
parallelizable list. Co-located .tasks.json drives executing-plans
resume from any session.
2026-05-26 03:17:29 -04:00
Joseph Doherty ef4a70751c docs(plans): add v2 Akka + fused hosting alignment design
Captures the brainstormed design to align OtOpcUa with ScadaLink:
single role-gated binary, Akka.NET cluster with admin/driver roles,
cluster singletons for control plane, per-node actor hierarchy for
OPC UA runtime, dual-endpoint warm redundancy preserved with
ServiceLevel driven by Akka leader, cookie+JWT auth, Traefik routing,
and ScadaLink-style live-edit + deploy model replacing the
draft/publish ConfigGeneration lifecycle.
2026-05-26 03:04:21 -04:00
Joseph Doherty 866dc03fac style(ui): align admin styling with ScadaLink master conventions
- Move CSS into wwwroot/css/ (theme.css, site.css); sidebar 218 -> 220px
- Add hamburger + Bootstrap collapse for <lg viewports
- Add Components/Shared/ with LoadingSpinner, ToastNotification, StatusBadge
- Replace .page-title with flex + <h4 class="mb-0"> across 20 pages
- Convert NewCluster + IdentificationFields forms to card + h6 subsection pattern
2026-05-26 01:12:57 -04:00
488 changed files with 16053 additions and 34839 deletions
+18
View File
@@ -0,0 +1,18 @@
<Project>
<!--
Defaults inherited by every csproj. Individual projects may override.
Deviation from the original v2 plan: TreatWarningsAsErrors is NOT set globally because the
pre-v2 test projects (e.g. Admin.Tests) carry 240+ xUnit1051 analyzer warnings that would
fail the build. New v2 projects (Commons, Cluster, ControlPlane, Runtime, OpcUaServer, AdminUI,
Host, Security) MUST opt in to <TreatWarningsAsErrors>true</TreatWarningsAsErrors> in their
own csproj. Once the legacy Admin/Server projects are deleted (Phase 10, Task 56), this can
be promoted back to a global default.
-->
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<LangVersion>latest</LangVersion>
</PropertyGroup>
</Project>
+103
View File
@@ -0,0 +1,103 @@
<Project>
<PropertyGroup>
<ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
</PropertyGroup>
<ItemGroup>
<PackageVersion Include="Akka" Version="1.5.62" />
<PackageVersion Include="Akka.Cluster" Version="1.5.62" />
<PackageVersion Include="Akka.Cluster.Hosting" Version="1.5.62" />
<PackageVersion Include="Akka.Cluster.Tools" Version="1.5.62" />
<PackageVersion Include="Akka.Hosting" Version="1.5.62" />
<PackageVersion Include="Akka.Remote" Version="1.5.62" />
<PackageVersion Include="Akka.Remote.Hosting" Version="1.5.62" />
<PackageVersion Include="Akka.Streams" Version="1.5.62" />
<PackageVersion Include="Akka.Streams.TestKit" Version="1.5.62" />
<PackageVersion Include="Akka.TestKit.Xunit2" Version="1.5.62" />
<PackageVersion Include="Avalonia" Version="11.2.7" />
<PackageVersion Include="Avalonia.Controls.DataGrid" Version="11.2.7" />
<PackageVersion Include="Avalonia.Desktop" Version="11.2.7" />
<PackageVersion Include="Avalonia.Diagnostics" Version="11.2.7" />
<PackageVersion Include="Avalonia.Fonts.Inter" Version="11.2.7" />
<PackageVersion Include="Avalonia.Headless" Version="11.2.7" />
<PackageVersion Include="Avalonia.Svg.Skia" Version="11.2.0.2" />
<PackageVersion Include="Avalonia.Themes.Fluent" Version="11.2.7" />
<PackageVersion Include="Beckhoff.TwinCAT.Ads" Version="7.0.172" />
<PackageVersion Include="bunit" Version="2.0.33-preview" />
<PackageVersion Include="CliFx" Version="2.3.6" />
<PackageVersion Include="CommunityToolkit.Mvvm" Version="8.4.0" />
<PackageVersion Include="coverlet.collector" Version="6.0.4" />
<PackageVersion Include="FluentAssertions" Version="8.3.0" />
<PackageVersion Include="Google.Protobuf" Version="3.34.1" />
<PackageVersion Include="Grpc.Core.Api" Version="2.76.0" />
<PackageVersion Include="Grpc.Net.Client" Version="2.76.0" />
<PackageVersion Include="libplctag" Version="1.5.2" />
<PackageVersion Include="LiteDB" Version="5.0.21" />
<PackageVersion Include="MessagePack" Version="2.5.187" />
<PackageVersion Include="Microsoft.AspNetCore.Authentication.JwtBearer" Version="10.0.7" />
<PackageVersion Include="Microsoft.AspNetCore.Authorization" Version="10.0.7" />
<PackageVersion Include="Microsoft.AspNetCore.DataProtection" Version="10.0.7" />
<PackageVersion Include="Microsoft.AspNetCore.DataProtection.EntityFrameworkCore" Version="10.0.7" />
<PackageVersion Include="Microsoft.AspNetCore.Mvc.Testing" Version="10.0.0" />
<PackageVersion Include="Microsoft.AspNetCore.SignalR.Client" Version="10.0.0" />
<PackageVersion Include="Microsoft.AspNetCore.SignalR.Core" Version="1.2.0" />
<PackageVersion Include="Microsoft.AspNetCore.TestHost" Version="10.0.7" />
<!--
Roslyn analyzer packages pin to the same major version as the SDK's compiler.
.NET SDK 10.0.105 ships compiler 5.0.0.0. Microsoft.CodeAnalysis.CSharp 5.3.x emits
analyzer DLLs that reference compiler 5.3.0.0 and fail with CS9057 on the local SDK.
Pin to 5.0.0 (matches the compiler the SDK ships) until the SDK rolls to 10.0.110+.
-->
<PackageVersion Include="Microsoft.CodeAnalysis.CSharp" Version="5.0.0" />
<PackageVersion Include="Microsoft.CodeAnalysis.CSharp.Scripting" Version="4.12.0" />
<PackageVersion Include="Microsoft.CodeAnalysis.CSharp.Workspaces" Version="5.0.0" />
<PackageVersion Include="Microsoft.Data.SqlClient" Version="6.1.1" />
<PackageVersion Include="Microsoft.Data.Sqlite" Version="9.0.0" />
<PackageVersion Include="Microsoft.EntityFrameworkCore" Version="10.0.7" />
<PackageVersion Include="Microsoft.EntityFrameworkCore.Design" Version="10.0.7" />
<PackageVersion Include="Microsoft.EntityFrameworkCore.InMemory" Version="10.0.7" />
<PackageVersion Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Configuration.Abstractions" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Configuration.Json" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.DependencyInjection" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.DependencyInjection.Abstractions" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Hosting" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Hosting.Abstractions" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Hosting.WindowsServices" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Http" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Logging" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Options" Version="10.0.7" />
<PackageVersion Include="Microsoft.Extensions.Options.ConfigurationExtensions" Version="10.0.7" />
<PackageVersion Include="Microsoft.IdentityModel.Tokens" Version="8.11.0" />
<PackageVersion Include="Microsoft.NET.Test.Sdk" Version="17.12.0" />
<PackageVersion Include="Microsoft.Playwright" Version="1.51.0" />
<PackageVersion Include="Novell.Directory.Ldap.NETStandard" Version="3.6.0" />
<PackageVersion Include="OPCFoundation.NetStandard.Opc.Ua.Client" Version="1.5.378.106" />
<PackageVersion Include="OPCFoundation.NetStandard.Opc.Ua.Configuration" Version="1.5.378.106" />
<PackageVersion Include="OPCFoundation.NetStandard.Opc.Ua.Server" Version="1.5.374.126" />
<PackageVersion Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.15.3-beta.1" />
<PackageVersion Include="OpenTelemetry.Extensions.Hosting" Version="1.15.3" />
<PackageVersion Include="Polly.Core" Version="8.6.6" />
<PackageVersion Include="S7netplus" Version="0.20.0" />
<PackageVersion Include="Serilog" Version="4.3.0" />
<PackageVersion Include="Serilog.AspNetCore" Version="9.0.0" />
<PackageVersion Include="Serilog.Extensions.Hosting" Version="9.0.0" />
<PackageVersion Include="Serilog.Formatting.Compact" Version="3.0.0" />
<PackageVersion Include="Serilog.Settings.Configuration" Version="9.0.0" />
<PackageVersion Include="Serilog.Sinks.Console" Version="6.0.0" />
<PackageVersion Include="Serilog.Sinks.File" Version="7.0.0" />
<PackageVersion Include="Shouldly" Version="4.3.0" />
<PackageVersion Include="System.CommandLine" Version="2.0.5" />
<PackageVersion Include="System.Data.SqlClient" Version="4.9.0" />
<PackageVersion Include="System.IdentityModel.Tokens.Jwt" Version="8.11.0" />
<PackageVersion Include="System.IO.Pipes.AccessControl" Version="5.0.0" />
<PackageVersion Include="System.Memory" Version="4.5.5" />
<PackageVersion Include="System.Threading.Tasks.Extensions" Version="4.5.4" />
<PackageVersion Include="xunit" Version="2.9.2" />
<PackageVersion Include="xunit.runner.visualstudio" Version="3.0.2" />
<PackageVersion Include="xunit.v3" Version="1.1.0" />
</ItemGroup>
</Project>
+13 -5
View File
@@ -2,6 +2,8 @@
<Folder Name="/src/" />
<Folder Name="/src/Core/">
<Project Path="src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj" />
<Project Path="src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ZB.MOM.WW.OtOpcUa.Cluster.csproj" />
<Project Path="src/Core/ZB.MOM.WW.OtOpcUa.Commons/ZB.MOM.WW.OtOpcUa.Commons.csproj" />
<Project Path="src/Core/ZB.MOM.WW.OtOpcUa.Configuration/ZB.MOM.WW.OtOpcUa.Configuration.csproj" />
<Project Path="src/Core/ZB.MOM.WW.OtOpcUa.Core/ZB.MOM.WW.OtOpcUa.Core.csproj" />
<Project Path="src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting/ZB.MOM.WW.OtOpcUa.Core.Scripting.csproj" />
@@ -10,8 +12,12 @@
<Project Path="src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.csproj" />
</Folder>
<Folder Name="/src/Server/">
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.Server/ZB.MOM.WW.OtOpcUa.Server.csproj" />
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.Admin/ZB.MOM.WW.OtOpcUa.Admin.csproj" />
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/ZB.MOM.WW.OtOpcUa.AdminUI.csproj" />
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ZB.MOM.WW.OtOpcUa.ControlPlane.csproj" />
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj" />
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer/ZB.MOM.WW.OtOpcUa.OpcUaServer.csproj" />
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.Runtime/ZB.MOM.WW.OtOpcUa.Runtime.csproj" />
<Project Path="src/Server/ZB.MOM.WW.OtOpcUa.Security/ZB.MOM.WW.OtOpcUa.Security.csproj" />
</Folder>
<Folder Name="/src/Drivers/">
<Project Path="src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj" />
@@ -46,6 +52,7 @@
<Folder Name="/tests/" />
<Folder Name="/tests/Core/">
<Project Path="tests/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests/ZB.MOM.WW.OtOpcUa.Core.Abstractions.Tests.csproj" />
<Project Path="tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests/ZB.MOM.WW.OtOpcUa.Cluster.Tests.csproj" />
<Project Path="tests/Core/ZB.MOM.WW.OtOpcUa.Configuration.Tests/ZB.MOM.WW.OtOpcUa.Configuration.Tests.csproj" />
<Project Path="tests/Core/ZB.MOM.WW.OtOpcUa.Core.Tests/ZB.MOM.WW.OtOpcUa.Core.Tests.csproj" />
<Project Path="tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests.csproj" />
@@ -54,9 +61,10 @@
<Project Path="tests/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.Tests/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.Tests.csproj" />
</Folder>
<Folder Name="/tests/Server/">
<Project Path="tests/Server/ZB.MOM.WW.OtOpcUa.Server.Tests/ZB.MOM.WW.OtOpcUa.Server.Tests.csproj" />
<Project Path="tests/Server/ZB.MOM.WW.OtOpcUa.Admin.Tests/ZB.MOM.WW.OtOpcUa.Admin.Tests.csproj" />
<Project Path="tests/Server/ZB.MOM.WW.OtOpcUa.Admin.E2ETests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests.csproj" />
<Project Path="tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests.csproj" />
<Project Path="tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests.csproj" />
<Project Path="tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/ZB.MOM.WW.OtOpcUa.Runtime.Tests.csproj" />
<Project Path="tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests/ZB.MOM.WW.OtOpcUa.Security.Tests.csproj" />
</Folder>
<Folder Name="/tests/Drivers/">
<Project Path="tests/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests.csproj" />
@@ -0,0 +1,451 @@
# OtOpcUa v2 — Akka.NET + Fused Hosting Alignment with ScadaLink
**Status:** Design approved, ready for implementation planning
**Date:** 2026-05-26
**Branch:** `v2-akka-fuse`
**Sister project reference:** `~/Desktop/scadalink-design` (ScadaLink)
## 1. Motivation
OtOpcUa today runs as three separate processes (`OtOpcUa.Server` OPC UA host, `OtOpcUa.Admin` Blazor Server web UI, optional `OtOpcUaWonderwareHistorian` Framework sidecar) with manual operator-driven warm-redundancy failover. The sister project ScadaLink — owned by the same developer — solved similar problems with a fused single-binary, role-gated hosting model on top of an Akka.NET cluster.
The motivation for this refactor is twofold:
1. **Consistency.** A single developer (the project owner) moves between OtOpcUa and ScadaLink frequently. Sharing patterns — hosting, auth, actor hierarchy, deployment model — reduces cognitive overhead and makes fixes portable.
2. **Real HA improvements.** Upgrade OtOpcUa's manual operator-driven failover to automatic, Akka-cluster-driven failover with Traefik routing for the web UI. Preserve OPC UA dual-endpoint client-side failover semantics (clients connect to both nodes and pick based on `ServiceLevel`), now driven automatically by Akka cluster leadership.
## 2. Architecture overview
**One binary, role-gated.** `OtOpcUa.Host` (Microsoft.NET.Sdk.Web, .NET 10) replaces `OtOpcUa.Server` and `OtOpcUa.Admin`. Same binary on every node. Role configured via `OTOPCUA_ROLES` environment variable.
**Two Akka roles, single cluster:**
- **`admin`** — hosts Blazor web UI + cluster singletons. Singletons pinned via `ClusterSingletonManagerSettings.WithRole("admin")`. Traefik routes `/` to whichever Admin-role node `/health/active` reports as leader.
- **`driver`** — hosts OPC UA endpoint + per-node `DriverHostActor` hierarchy. Every Driver-role node always serves OPC UA; `ServiceLevel` computed by `RedundancyStateActor` is broadcast back to each Driver node and used to publish to the local OPC UA address space.
Roles are additive: `OTOPCUA_ROLES=admin`, `OTOPCUA_ROLES=driver`, or `OTOPCUA_ROLES=admin,driver`. Small deployments run both roles on both nodes; larger deployments separate them.
**Per-role leadership.** `Cluster.Get(system).State.RoleLeader("driver")` drives OPC UA `ServiceLevel`. `RoleLeader("admin")` drives `/health/active` (Traefik routing). These are independent — admin and driver leadership can land on different nodes if separated.
**Cluster membership.** Both seed nodes; keep-oldest split-brain resolver; `down-if-alone = on`; 15s stable-after; 2s heartbeat / 10s threshold. CoordinatedShutdown for graceful singleton handover. Exact ScadaLink tuning.
**OPC UA dual-endpoint preserved.** Driver-role nodes all bind `opc.tcp://0.0.0.0:4840`. Clients still see N endpoints in `ServerUriArray` and fail over via `ServiceLevel`. OPC UA spec compliance unchanged from today.
**Mac dev:** role `admin,driver,dev``dev` short-circuits Windows-only driver registration (Galaxy, Wonderware) with explicit `[DEV-STUB]` log lines.
## 3. Project & process restructure
Single solution, ScadaLink-style folder layout. Existing OtOpcUa naming convention (`ZB.MOM.WW.OtOpcUa.*`) preserved.
### New entry point & deletions
| Action | Project |
|---|---|
| **New** | `OtOpcUa.Host``Microsoft.NET.Sdk.Web`, single Program.cs, role-gated startup, `AddWindowsService` |
| **Delete** | `OtOpcUa.Server` (content migrates) |
| **Delete** | `OtOpcUa.Admin` (UI moves to library) |
### New libraries
| Project | Owns | ScadaLink analog |
|---|---|---|
| `OtOpcUa.Commons` | Entity POCOs, interfaces, message contracts (`Types/`, `Interfaces/`, `Entities/`, `Messages/`) | `ScadaLink.Commons` |
| `OtOpcUa.ConfigDb` | EF Core `DbContext`, repositories, `IAuditService`, migrations, Data Protection key store | `ScadaLink.ConfigurationDatabase` |
| `OtOpcUa.Cluster` | Akka HOCON, `AkkaHostedService`, split-brain resolver config, role-aware membership helpers, `IClusterRoleInfo` | (split out of ScadaLink Host) |
| `OtOpcUa.Security` | LDAP bind, cookie+JWT hybrid, JWT issuance, role mapping, `/auth/login`, `/auth/ping` endpoints | `ScadaLink.Security` |
| `OtOpcUa.ControlPlane` | Cluster singletons: `ConfigPublishCoordinator`, `AdminOperationsActor`, `AuditWriterActor`, `FleetStatusBroadcaster`, `RedundancyStateActor` | `ScadaLink.ManagementService` |
| `OtOpcUa.Runtime` | Per-node actors: `DriverHostActor`, `DriverInstanceActor`, `VirtualTagActor`, `ScriptedAlarmActor`, `OpcUaPublishActor`, `HistorianAdapterActor`, `PeerOpcUaProbeActor`, `DbHealthProbeActor` | `ScadaLink.SiteRuntime` |
| `OtOpcUa.OpcUaServer` | OPC UA app host, address-space build, `Phase7Composer` extraction | (in ScadaLink.SiteRuntime/DCL) |
| `OtOpcUa.AdminUI` | Blazor components, hubs (`FleetStatusHub`, `AlertHub`, `ScriptLogHub`), auth state provider, `MapAdminUI<TApp>()` | `ScadaLink.CentralUI` |
### Unchanged
- Driver projects (`OtOpcUa.Driver.Galaxy`, `.Modbus`, `.S7`, `.AbCip`, `.AbLegacy`, `.TwinCAT`, `.FOCAS`, `.OpcUaClient`) — still implement `IDriver`, now consumed by `DriverInstanceActor` instead of `DriverInstanceBootstrapper`.
- `OtOpcUa.Driver.Historian.Wonderware` — .NET Framework 4.8 sidecar, named-pipe IPC, wrapped by a `HistorianAdapterActor` in `OtOpcUa.Runtime`.
- `mxaccessgw` sibling repo — unchanged; Galaxy driver still talks gRPC to it.
### Tests
- `tests/OtOpcUa.Cluster.Tests` — split-brain, leadership transitions
- `tests/OtOpcUa.ControlPlane.Tests` — singleton actor unit tests via Akka.TestKit
- `tests/OtOpcUa.Runtime.Tests` — per-node actor tests, driver lifecycle
- `tests/OtOpcUa.Security.Tests` — LDAP, cookie+JWT roundtrip
- `tests/OtOpcUa.Host.IntegrationTests` — 2-node in-process cluster, deployment flow, failover, Mac-safe
- `tests/OtOpcUa.OpcUa.IntegrationTests` — real OPCFoundation client against stubbed Host
- `tests/OtOpcUa.E2E.Tests` — full stack with Traefik (nightly CI)
### Deploy
- `deploy/Install-Services.ps1` — installs one Windows Service per node (`OtOpcUaHost`), passes role via env var. Old script replaced.
- `deploy/traefik/` — Windows Traefik config + service registration for the leader-routed `/health/active`.
- `docker-dev/` (new, optional) — 2-node Mac dev compose with stubbed drivers + LDAP + SQL Server + Traefik.
Solution file: `OtOpcUa.slnx` (matches ScadaLink convention; switch from current `.sln`).
## 4. Actor hierarchy
### Per-node tree
Rooted under `OtOpcUa.Runtime`, one tree per Driver-role node:
```
DriverHostActor (per-node coordinator, started by Host)
├─ DriverInstanceActor (per DriverInstance row)
│ └─ children = pooled or per-subscription work
├─ VirtualTagActor (per VirtualTag row)
├─ ScriptedAlarmActor (per ScriptedAlarm row)
├─ OpcUaPublishActor (per-node bridge to OPCFoundation address space)
├─ HistorianAdapterActor (per-node, wraps Wonderware named-pipe sidecar)
├─ PeerOpcUaProbeActor (per-node, tests peer OPC UA stack health)
└─ DbHealthProbeActor (per-node, cached DB health probe)
```
### Cluster singletons
Pinned to `admin` role via `ClusterSingletonManagerSettings.WithRole("admin")`:
| Actor | Owns | Notes |
|---|---|---|
| `ConfigPublishCoordinator` | The deploy protocol. Writes `Deployment` row, broadcasts `DispatchDeployment(deploymentId)` via `DistributedPubSub` to every `DriverHostActor`, tracks apply ACKs per node. | Replaces `ApplyLeaseRegistry`. Resumes after failover by re-reading ConfigDb state — no Akka.Persistence. |
| `AdminOperationsActor` | All mutating admin ops (CRUD on equipment, drivers, scripts, namespaces, ACLs). Wraps each in an audit envelope. | UI calls via `ClusterSingletonProxy` (in-process when UI is on Admin node). |
| `AuditWriterActor` | Receives `AuditEvent` telemetry from any node, batch-inserts into `ConfigAuditLog`. | Idempotent on `EventId`. |
| `FleetStatusBroadcaster` | Aggregates Akka cluster member events + per-node `DriverHostStatus` heartbeats. Publishes diffs to `IHubContext<FleetStatusHub>` and `IHubContext<AlertHub>`. | Push-driven; replaces today's 5s `FleetStatusPoller`. |
| `RedundancyStateActor` | Subscribes to `ClusterEvent.IMemberEvent` + `ClusterEvent.LeaderChanged` + per-node health probes. Computes `ServiceLevel` byte + `ServerUriArray` per Driver node. Publishes to `DistributedPubSub` topic `redundancy-state`. | Source of truth for OPC UA redundancy. Local `OpcUaPublishActor` subscribes and writes to its OPCFoundation stack. |
### Supervision
| Actor | Strategy |
|---|---|
| `DriverHostActor` | `Resume` |
| `DriverInstanceActor` | `Restart` with backoff (1s → 30s, ×1.5, jitter) |
| `VirtualTagActor` | `Restart` with backoff |
| `ScriptedAlarmActor` | `Restart` with backoff; preserve alarm state via `PreRestart` hook |
| `OpcUaPublishActor` | `Resume` |
| `HistorianAdapterActor` | `Restart` with backoff; SQLite store-and-forward buffers during pipe outage |
| All singletons | `Resume`; resumable state in ConfigDb |
| Script execution actors (short-lived) | `Stop` on failure |
### State machines
- `DriverInstanceActor` — Become/Stash for `Connecting → Connected → Reconnecting → Failed`. Bad-quality publish on disconnect; transparent re-subscribe on reconnect. Write failures returned synchronously via `Ask` from `OpcUaPublishActor`.
- `ConfigPublishCoordinator``Idle → Publishing → AwaitingApplyAcks → Sealed`, with timeout-driven escalation if a node fails to ack within `ApplyMaxDuration` (default 10 min).
- `RedundancyStateActor` — recomputes on every membership event, debounced 250ms to coalesce bursts.
### Communication conventions
- **Tell** for hot-path internal traffic (driver values, alarm state changes, publish broadcasts).
- **Ask** only at system boundaries (UI controller → `AdminOperationsActor`, with explicit timeout + cancellation token).
- **DistributedPubSub** for cluster-wide broadcasts (`DispatchDeployment`, `RedundancyStateChanged`, `FleetStatusChanged`).
- Application-level **correlation IDs** on every request/response message.
- Messages live in `OtOpcUa.Commons.Messages.{Drivers,Deploy,Admin,Audit,Redundancy}` — additive-only evolution.
### Singleton persistence
No Akka.Persistence. Each singleton reads its resumable state from `ConfigDb` on `PreStart` (e.g., `ConfigPublishCoordinator` reads the current in-flight `Deployment` row + per-node `NodeDeploymentState`) and writes on every state transition.
### Mac-dev stubs
`DevNode` role short-circuits driver registration. `DriverInstanceActor` for any Galaxy/Wonderware row enters a `Stubbed` Become state that returns deterministic test values. Logged at INFO with `[DEV-STUB] driver={Name} reason=windows-only`.
## 5. Web hosting, auth, and SignalR
### Kestrel startup gated by `admin` role
`Program.cs` builds `WebApplicationBuilder`, registers all services, but only calls `app.MapBlazor<App>()`, `app.MapHub<...>()`, `app.MapStaticAssets()`, and auth endpoints when `admin ∈ roles`. Driver-only nodes still bind Kestrel for `/healthz` on `:4841` and nothing else.
### Authentication — cookie+JWT hybrid
| Layer | Config |
|---|---|
| Cookie scheme | `OtOpcUa.Auth`, HttpOnly, SameSite=Strict, Secure (prod) / SameAsRequest (dev). Sliding 30-min idle timeout. |
| Embedded JWT | HMAC-SHA256, 15-min expiry, claims = `sub`, `roles`, `nodeAcls`. |
| LDAP bind | `LdapAuthService.AuthenticateAsync(user, pw)` at `/auth/login` POST — preserved from current `OtOpcUa.Admin/Security`. |
| Role mapping | `RoleMapper.MapGroupsToRolesAsync()` — LDAP groups → `FleetAdmin` / `ConfigEditor` / `ReadOnly`. Stays as-is. |
| Token issuance | `/auth/token` returns bearer for external clients (CLI, automation). |
| Circuit expiry probe | `/auth/ping` returns 200/401, polled by `CookieAuthenticationStateProvider` to detect expiry from inside a SignalR circuit. |
| Failure mode | LDAP unreachable → new logins fail, active sessions continue. |
### Data Protection keys
`services.AddDataProtection().PersistKeysToDbContext<OtOpcUaConfigDbContext>().SetApplicationName("OtOpcUa")` — keys live in `ConfigDb` so a circuit started on Admin-node A survives if Traefik fails over to Admin-node B mid-session.
### SignalR hubs
Three existing hubs preserved (`/hubs/fleet`, `/hubs/alerts`, `/hubs/script-log`):
- **Today:** `FleetStatusPoller` polls SQL every 5s.
- **New:** `FleetStatusBroadcaster` singleton receives Akka cluster events + per-node telemetry, pushes diffs via `IHubContext<FleetStatusHub>`. No polling.
- `HubTokenService` bearer-token fallback retired — hubs are circuit-local, cookie auth flows through SignalR natively. External hub consumers use the bearer token from `/auth/token` with a `JwtBearer` authentication scheme declaration on the hub.
### UI → backend wiring
- **Reads:** Blazor components inject scoped repositories from DI and read directly from `ConfigDb`. No change from today.
- **Writes / mutating ops:** Components inject `IAdminOperationsClient` — a thin wrapper around `ClusterSingletonProxy` to `AdminOperationsActor`. Mutations are `Ask` with a 10s timeout + correlation ID. Audit envelope built UI-side, completed singleton-side.
- **Driver diagnostics:** Today's `DriverDiagnosticsClient` HTTP round-trip retires. UI components ask `IFleetDiagnosticsClient` which delegates to `ClusterClientReceptionist`-published actor messages.
### Health endpoints
| Endpoint | Returns | Used by |
|---|---|---|
| `/health/ready` | 200 once Akka member is `Up` + ConfigDb reachable + DataProtection key ring loaded | Service supervisor readiness gate |
| `/health/active` | 200 only on the Admin-role leader; 503 elsewhere | Traefik — routes browser traffic to leader |
| `/healthz` (existing) | 200 when Driver-role actor system is up + at least one driver registered (preserved on `:4841`) | Ops probes, OPC UA monitoring tools |
### Traefik
Windows Service (or external box). One route: `host=otopcua.*` → load-balance to `{admin-node-a:9000, admin-node-b:9000}` with `/health/active` health check, sticky sessions disabled (DataProtection key sharing handles continuity).
### appsettings structure
Mirrors ScadaLink's per-component options pattern: `Cluster:`, `Security:`, `ConfigDb:`, `OpcUa:`, `Drivers:`, `Historian:` sections, bound to options classes owned by their respective component projects.
## 6. Edit + Deploy flow (replaces draft/publish generations)
The single most consequential domain change: **drop the draft/publish `ConfigGeneration` lifecycle**. Edits are live; deploy is a snapshot+push, ScadaLink-style.
### Edit model
- `Equipment`, `Driver`, `DriverInstance`, `Namespace`, `UnsItem`, `Script`, `VirtualTag`, `ScriptedAlarm`, `NodeAcl` are edited **directly** via `AdminOperationsActor`. No draft staging, no `ConfigGeneration` lifecycle. Last-write-wins per row (rowversion column for stale-write detection only).
- Live edits do **not** affect running Driver-role nodes — running stacks reflect the *last-deployed* state. The UI shows a "drift" indicator when live ConfigDb state differs from last sealed deployment.
- Validation runs on edit (semantic checks: driver tag-path validity, script syntax, namespace name uniqueness) — pulled forward from deploy-time to edit-time.
### Deploy model
```
Admin UI "Deploy" → AdminOperationsActor.Ask(StartDeployment)
AdminOperationsActor:
→ snapshot ConfigDb current state
→ ConfigComposer.Flatten() → DeploymentArtifact
→ compute RevisionHash = SHA256(canonical-serialized artifact)
→ write Deployment row (DeploymentId GUID, RevisionHash, CreatedBy, CreatedAtUtc, Status=Dispatching)
→ Ask ConfigPublishCoordinator.DispatchDeployment(deploymentId)
ConfigPublishCoordinator (cluster singleton, admin role):
→ write Deployment.Status = Dispatching
→ DistributedPubSub Publish to "deployments" topic: DispatchDeployment(deploymentId, revisionHash)
→ schedule ApplyDeadline timer (ApplyMaxDuration, default 10 min)
DriverHostActor (per node, subscribed to "deployments"):
receive DispatchDeployment(deploymentId, revisionHash):
→ if currentDeploymentRevision == revisionHash → ack Applied (idempotent)
→ else:
→ acquire per-node ApplyLock (Become Applying(deploymentId))
→ write NodeDeploymentState row (NodeId, DeploymentId, StartedAtUtc)
→ fetch artifact: read DeploymentArtifact blob from ConfigDb by deploymentId
→ diff against current applied artifact → per-instance ApplyDelta plans
→ dispatch ApplyDelta to DriverInstanceActor / VirtualTagActor / ScriptedAlarmActor children
→ collect per-instance acks (all-or-nothing per node)
→ on full success: write GenerationSealedCache (LiteDb local), update NodeDeploymentState.AppliedAtUtc
→ on any instance Failure: rollback to previous deployment, mark NodeDeploymentState=Failed
→ Tell Coordinator: ApplyAck(deploymentId, nodeId, Applied | Failed(reason))
→ Become Steady
ConfigPublishCoordinator: collect ApplyAcks
→ all Driver nodes Applied → Deployment.Status = Sealed → DistributedPubSub PublishDeploymentSealed
→ any Failed → Deployment.Status = PartiallyFailed → broadcast DeploymentFailed
→ deadline elapsed before all acks → Deployment.Status = TimedOut → broadcast DeploymentTimedOut
```
### Per-instance operation lock
All mutating commands (deploy, disable, enable, delete) on a `DriverInstance` go through `DriverInstanceActor`, which serializes them via the actor mailbox — single-threaded by construction.
### Idempotency
- `DeploymentId` + `RevisionHash` together identify a deployment.
- `DriverHostActor` seeing a `DispatchDeployment` whose `RevisionHash` matches current applied state → immediate ack `Applied`, no work. Safe to redeliver.
- `Phase7Composer.ComposeAsync(artifact)` is pure; same artifact → same delta plan.
- `DriverInstanceActor.ApplyDelta(plan)` compares against current state, applies only diffs.
### Concurrency control
- Last-write-wins on edits (no optimistic concurrency on `Equipment`, `Driver`, `Script`, etc.) — matches ScadaLink template behavior.
- **Optimistic concurrency on `Deployment` and `NodeDeploymentState` rows** (rowversion column) — prevents two concurrent Coordinator instances (during failover) from corrupting state.
### Singleton failover during deploy
1. Old Coordinator wrote `Deployment.Status = Dispatching` + `NodeDeploymentState` rows before broadcast.
2. New Coordinator on takeover queries `Deployment` rows with non-terminal `Status`.
3. For each in-flight deployment, `Ask` every `DriverHostActor` (via cluster-aware actor selection) for current `NodeDeploymentState`.
4. Recompute outstanding-ack set; resume the deadline timer with the remaining time.
5. If apply deadline already passed → mark `Deployment.Status = TimedOut` for any unack'd nodes.
### Crash recovery on Driver node restart
- `DriverHostActor.PreStart` reads `NodeDeploymentState` for self.
- If row says `Applied` for some `DeploymentId` and matches last sealed cache → Become Steady on that artifact.
- If row says `Applying` (didn't reach Applied) → discard partial state, re-fetch the artifact, replay apply (idempotent).
- If ConfigDb unreachable → fall back to local LiteDb sealed cache, Become `Stale` (drops ServiceLevel via `RedundancyStateActor`). Background reconnect retries every 30s.
### Schema migration from today
| Today | New |
|---|---|
| `ConfigGeneration` (Draft/Published/Sealed lifecycle) | **Dropped** |
| `ClusterNodeGenerationState` | Renamed → `NodeDeploymentState` with `(NodeId, DeploymentId, Status, StartedAtUtc, AppliedAtUtc, RowVersion)` |
| `ClusterNode.RedundancyRole` column | **Dropped** (Akka leader-of-driver-role is source of truth) |
| `ConfigAuditLog` | Kept; deploy events added as new event types |
| (new) `Deployment` | `(DeploymentId, RevisionHash, Status, CreatedBy, CreatedAtUtc, ArtifactBlob varbinary(max), RowVersion)` |
| (new) `ConfigEdit` audit row per Equipment/Driver/Script edit | Live-edit history |
| (new) `DataProtectionKeys` | DataProtection key ring storage |
No more `ApplyLeaseRegistry` table or watchdog actor. Apply state lives in `NodeDeploymentState`; watchdog is a Coordinator-side scheduled message keyed by `DeploymentId`.
### Stale-config fallback
Preserved from today's `GenerationSealedCache`: local LiteDb cache holds last-applied `DeploymentArtifact`. On Host boot with ConfigDb unreachable, `DriverHostActor` boots from cache → Become `Stale``RedundancyStateActor` drops `ServiceLevel` for that node.
### Peer probes consolidated
| Today | New |
|---|---|
| `PeerHttpProbeLoop` (HTTP `/healthz`) | Retired — Akka failure detector replaces it |
| `PeerUaProbeLoop` (OPC UA `opc.tcp://peer:4840`) | **Retained** as `PeerOpcUaProbeActor` — tests whether the OPC UA stack itself (not just the process) is up. Feeds `RedundancyStateActor`. |
| `DbHealthCache` (cached DB probe) | Retained as `DbHealthProbeActor` per-node. Feeds `RedundancyStateActor` + `/health/ready`. |
### ServiceLevel computation in `RedundancyStateActor`
```
serviceLevel(node) =
base 240 if (cluster member Up AND db reachable AND not stale AND opc ua probe ok)
base 200 if (member Up AND db reachable AND stale)
base 100 if (member Up AND db unreachable AND stale)
base 0 if (member Down / Unreachable)
+10 bonus if Akka driver-role leader is this node
```
ServiceLevel bands match the existing `RedundancyStatePublisher` so OPC UA client behavior is unchanged from today. The leader-bonus replaces today's operator-managed `RedundancyRole = Primary`.
## 7. Error handling & failure modes
### Akka cluster failure modes
| Scenario | Behavior |
|---|---|
| Network partition (split-brain) | Keep-oldest resolver downs the smaller side after 15s stable-after. `down-if-alone = on` covers isolated nodes. |
| Admin leader process crash | Failure detector trips after 10s, downs the member, new singleton instance starts on remaining Admin node. Traefik `/health/active` probe fails over within 1 polling interval (~5s). |
| Driver-role node crash | RedundancyStateActor sees member Down → drops that node's ServiceLevel to 0 → OPC UA clients reconnect to surviving node. Both nodes were already running their own copy; no in-cluster recovery needed for that node's work. |
| Both Admin nodes down simultaneously | Web UI unavailable. Driver nodes continue serving OPC UA from last-sealed cache. No new deployments possible until Admin node recovers. |
| All Driver nodes down | OPC UA endpoints unavailable. Clients reconnect when any Driver node returns. ServiceLevel back to 240 once member Up + DB reachable + apply sealed. |
| Singleton handover during deploy | Coordinator state survives in `Deployment` + `NodeDeploymentState` ConfigDb rows. New Coordinator queries DriverHostActors via cluster-aware actor selection. Resume remaining deadline. |
### ConfigDb unavailability
- **At edit time:** AdminUI returns user-visible error. No retries — operator decides.
- **At deploy time:** Coordinator refuses to start dispatch if it can't write the `Deployment` row.
- **At Driver node boot:** Fall back to local LiteDb sealed cache. RedundancyStateActor drops `ServiceLevel`.
- **At singleton failover:** New Coordinator's `PreStart` retries via Polly (5 attempts, exponential backoff). If exhausted → singleton crashes → cluster restarts singleton on next viable Admin node.
### Driver / equipment failures
- Driver connection loss → `DriverInstanceActor` enters `Reconnecting` Become state, publishes bad-quality to OPC UA address space immediately, retries at fixed interval.
- Tag-path-resolution failure → retried periodically.
- Write failure to driver → returned synchronously to caller via `Ask` from `OpcUaPublishActor`.
- Driver process unresponsive (Galaxy gateway down) → `IDriver.HealthCheck` returns degraded → `DriverInstanceActor` reports to `DriverHostActor``RedundancyStateActor` factors into ServiceLevel.
### Wonderware historian sidecar
- Named-pipe disconnect → `HistorianAdapterActor` enters `Reconnecting`; alarm history rows buffered to local SQLite store-and-forward.
- Sidecar process crash → no in-cluster recovery (external process); operator restarts via Windows Service control.
### Auth failures
- LDAP unreachable → `/auth/login` returns 503. Active sessions continue with cached claims.
- JWT signature failure (key ring drift) → 401, session terminates. DataProtection keys in ConfigDb prevent this in the happy path.
- Cookie expired (sliding 30-min idle) → `/auth/ping` returns 401 → `CookieAuthenticationStateProvider` triggers UI logout.
### SignalR / circuit drops
- Blazor circuit dropped → `App.razor` reload script reconnects (preserved from today).
- Hub message loss during reconnect → `FleetStatusBroadcaster` resends current state to the reconnecting client on `OnConnectedAsync` (full snapshot, not just diffs).
### OPC UA stack failures
- Address-space corruption → `OpcUaPublishActor` logs ERROR, sends `RebuildAddressSpace` to itself; sequence number bump notifies clients to resubscribe.
- OPC UA listener bind failure (port collision) → Host fails readiness probe, supervisor restarts service.
### Audit invariants
- Audit write failures **never abort** the user-facing action. `AuditWriterActor` buffer overflow → log WARN, drop oldest (with counter metric). The action's success/failure path is authoritative.
- All deploy + edit events carry `ExecutionId` (per-request correlation) so audit rows for one operator action share an ID.
## 8. Testing strategy
Test projects mirror the new layering. Test infrastructure stays Mac-friendly: stubbed Windows-only drivers, ephemeral SQL Server (LocalDB on Windows / `mcr.microsoft.com/mssql/server` container on Mac), `OpenLDAP` container, all spun up via `tests/docker-compose.yml`.
### Layered test pyramid
| Layer | Project | What it covers |
|---|---|---|
| **Unit** | `OtOpcUa.Runtime.Tests` | Per-actor logic via `Akka.TestKit.Xunit2`. `DriverInstanceActor` state-machine transitions, `Phase7Composer` purity, `ScriptedAlarmActor` state machine, `VirtualTagActor` expression eval. Drivers mocked via `IDriver` test doubles. |
| **Unit** | `OtOpcUa.ControlPlane.Tests` | Singleton actor logic. `ConfigPublishCoordinator` happy path + timeout + concurrent ack ordering. `RedundancyStateActor` ServiceLevel computation truth table. `AuditWriterActor` batch flush + idempotency on duplicate `EventId`. |
| **Unit** | `OtOpcUa.Cluster.Tests` | Split-brain resolver config validation, role-aware membership helpers, HOCON parses. |
| **Unit** | `OtOpcUa.Security.Tests` | LDAP role mapping, JWT issuance, cookie+JWT roundtrip, `/auth/ping` expiry semantics. |
| **Integration** | `OtOpcUa.Host.IntegrationTests` | 2-node in-process Akka cluster. Real SQL Server, stubbed drivers. Tests: deploy happy path, deploy timeout, deploy with one node down, singleton failover mid-deploy, ConfigDb outage + stale-config fallback, edit-then-deploy roundtrip, audit row emission. |
| **Integration** | `OtOpcUa.OpcUa.IntegrationTests` | Real OPCFoundation client connects to a running stubbed Host. Asserts: dual endpoint visible, ServerUriArray populated, ServiceLevel reflects leader status, browse + read + write through `OpcUaPublishActor`, write failures returned synchronously. |
| **End-to-end** | `OtOpcUa.E2E.Tests` | Full Host with Traefik in front, two Admin nodes + two Driver nodes (4 processes via Docker). Verifies: web UI login via LDAP, deploy from UI flows to OPC UA stack, kill admin leader → Traefik fails over within 25s, kill driver node → OPC UA clients reconnect with correct ServiceLevel. CI nightly. |
### Failover-specific test cases
1. Kill Admin leader during `Dispatching` phase → new Coordinator resumes, deployment seals.
2. Kill Admin leader during `AwaitingApplyAcks` → new Coordinator queries DriverHostActors, completes ack collection.
3. Kill Driver node during `Applying` → Coordinator marks that node's `NodeDeploymentState=Failed` after deadline; surviving Driver nodes complete their apply.
4. Restart Driver node mid-deploy → on restart, replays apply (idempotent).
5. Akka split-brain (network partition between 2 admin nodes) → keep-oldest wins, smaller side downs itself within 15s.
6. Both Admin nodes restart simultaneously → deployments in `Dispatching` resume cleanly after cluster reforms.
7. Concurrent edits to the same `DriverInstance` from two UI sessions → last write wins, both audit rows present, no row corruption.
### Deploy idempotency tests
- Replay `DispatchDeployment` with same `DeploymentId/RevisionHash` → no work, ack `Applied`.
- Apply same `DeploymentArtifact` twice in a row → second application is a no-op.
- Crash DriverHostActor mid-apply, restart → resumes from `NodeDeploymentState`, completes idempotently.
### Property tests
- `Phase7Composer.ComposeAsync` is pure: same artifact → same plan, no side effects.
- `RedundancyStateActor` ServiceLevel computation: every combination of (member-state, db-ok, stale, opc-ok, is-leader) produces expected byte.
- Audit envelope generation: every mutating op produces exactly one audit row with stable `ExecutionId` correlation.
### Mac-dev test invariants
- All unit + integration tests run on macOS without Windows-only assemblies.
- Cluster tests use in-process Akka.Remote on 127.0.0.1.
- LDAP tests use `OpenLDAP` container or `Security:Ldap:DevStubMode=true`.
### Retired tests
Anything touching `ConfigGeneration` lifecycle, `ApplyLeaseRegistry`, `PeerHttpProbeLoop`, `FleetStatusPoller`, `RedundancyCoordinator` peer-probe loops, `RedundancyStatePublisher`.
## 9. Risks & open questions
1. **Akka.NET on .NET 10.** Verify Akka.NET 1.5+ targets .NET 10 cleanly.
2. **OPCFoundation SDK threading.** The OPC UA stack runs its own threadpool. `OpcUaPublishActor` must marshal writes via thread-safe wrappers; use a dedicated `synchronized-dispatcher` for actors that touch the OPC UA address space.
3. **Failure detector tuning.** ScadaLink's 2s/10s is tuned for site-to-central RTT. Benchmark before locking. Aggressive tuning + GC pauses → spurious singleton handover.
4. **ServiceLevel = Akka leader removes operator control.** No escape hatch in v1. If a customer needs one later, add a `PinnedPrimary` column to `ClusterNode` and an override path in `RedundancyStateActor`. Out of scope now.
5. **Long-lived v2 branch drift.** Monthly rebase from main, CI runs on v2 from day one.
6. **Schema migration is destructive.** Dropping `ConfigGeneration` + `ClusterNode.RedundancyRole` is one-way. Cutover must run on a quiesced system. Provide a `Migrate-To-V2.ps1` script that backs up ConfigDb, runs EF migrations, validates row counts, prints a summary.
7. **Wonderware + mxaccessgw still external processes.** Both untouched by this refactor. Future actorization would be a second refactor.
8. **Audit row volume.** Edit-heavy install ≈ 5k rows/day. Need monthly partition + 365-day retention same as ScadaLink #23.
## 10. Migration plan
Big-bang on `v2-akka-fuse` branch:
1. Branch `v2-akka-fuse` off `main`.
2. Add new projects: `OtOpcUa.Host`, `.Cluster`, `.Security`, `.ControlPlane`, `.Runtime`, `.ConfigDb`, `.Commons`, `.AdminUI`, `.OpcUaServer`. Convert to `OtOpcUa.slnx`.
3. Move ConfigDb access (EF context, repos, migrations) out of `Server` and `Admin` into `OtOpcUa.ConfigDb`. Add DataProtection key store table.
4. Move LDAP + cookie + JWT out of `Admin/Security` into `OtOpcUa.Security`. Adopt 15-min JWT / 30-min sliding cookie / `/auth/ping`.
5. Build `OtOpcUa.Cluster`: HOCON, `AkkaHostedService`, role-aware membership helpers, split-brain resolver.
6. Build `OtOpcUa.ControlPlane`: `ConfigPublishCoordinator`, `AdminOperationsActor`, `AuditWriterActor`, `FleetStatusBroadcaster`, `RedundancyStateActor`.
7. Build `OtOpcUa.Runtime`: `DriverHostActor`, `DriverInstanceActor`, `VirtualTagActor`, `ScriptedAlarmActor`, `OpcUaPublishActor`, `HistorianAdapterActor`, `PeerOpcUaProbeActor`, `DbHealthProbeActor`.
8. Migrate `Phase7Composer` to `OtOpcUa.OpcUaServer`; make it pure and unit-tested.
9. Move Blazor components from `Admin` into `OtOpcUa.AdminUI` library; replace `DriverDiagnosticsClient` HTTP with in-process actor calls; rewire `FleetStatusHub` / `AlertHub` / `ScriptLogHub` to be fed by `FleetStatusBroadcaster` `IHubContext`.
10. Build `OtOpcUa.Host` `Program.cs`: role-gated startup, health endpoints (`/health/ready`, `/health/active`, `/healthz`), `AddWindowsService`.
11. ConfigDb migration: add `Deployment`, `ConfigEdit`, `DataProtectionKeys` tables; rename `ClusterNodeGenerationState``NodeDeploymentState`; drop `ConfigGeneration`; drop `ClusterNode.RedundancyRole`. EF migration + idempotent SQL script + `Migrate-To-V2.ps1`.
12. Delete `OtOpcUa.Server`, `OtOpcUa.Admin`, `DriverInstanceBootstrapper`, `RedundancyCoordinator`, `RedundancyStatePublisher`, `ApplyLeaseRegistry`, `FleetStatusPoller`, `PeerHttpProbeLoop`, `HubTokenService`. Sweep any `*RedundancyRole*` references.
13. Update `deploy/Install-Services.ps1`: single Windows Service per node, role via env var, Traefik service registration.
14. Update docs in `docs/`: rewrite `Redundancy.md`, `ServiceHosting.md`; add `Cluster.md`, `ControlPlane.md`, `Runtime.md`. Add top-level `Architecture-v2.md` summary.
15. CI: add integration test job for the 2-node cluster + OPC UA roundtrip.
16. Tag the last v1 release on `main` for backport-only fixes. Merge `v2-akka-fuse``main` when GA.
File diff suppressed because it is too large Load Diff
@@ -0,0 +1,99 @@
{
"planPath": "docs/plans/2026-05-26-akka-hosting-alignment-plan.md",
"branch": "v2-akka-fuse",
"designDoc": "docs/plans/2026-05-26-akka-hosting-alignment-design.md",
"lastUpdated": "2026-05-26T00:00:00Z",
"tasks": [
{"id": 0, "subject": "Task 0: Create branch and central package management", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [], "commit": "2b81147"},
{"id": 1, "subject": "Task 1: Create OtOpcUa.Commons project", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [2,3,4,5,6,7,8], "blockedBy": [0], "commit": "30a2104"},
{"id": 2, "subject": "Task 2: Create OtOpcUa.Cluster project", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [1,3,4,5,6,7,8], "blockedBy": [0], "commit": "30a2104"},
{"id": 3, "subject": "Task 3: Create OtOpcUa.Security project", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [1,2,4,5,6,7,8], "blockedBy": [0], "commit": "30a2104"},
{"id": 4, "subject": "Task 4: Create OtOpcUa.ControlPlane project", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [1,2,3,5,6,7,8], "blockedBy": [0], "commit": "30a2104"},
{"id": 5, "subject": "Task 5: Create OtOpcUa.Runtime project", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [1,2,3,4,6,7,8], "blockedBy": [0], "commit": "30a2104"},
{"id": 6, "subject": "Task 6: Create OtOpcUa.OpcUaServer project", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [1,2,3,4,5,7,8], "blockedBy": [0], "commit": "30a2104"},
{"id": 7, "subject": "Task 7: Create OtOpcUa.AdminUI Razor class library", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [1,2,3,4,5,6,8], "blockedBy": [0], "commit": "30a2104"},
{"id": 8, "subject": "Task 8: Create OtOpcUa.Host Web SDK project", "status": "completed", "classification": "small", "estMinutes": 5, "parallelizableWith": [1,2,3,4,5,6,7], "blockedBy": [0], "commit": "30a2104"},
{"id": 9, "subject": "Task 9: Build green smoke check", "status": "completed", "classification": "trivial", "estMinutes": 2, "parallelizableWith": [], "blockedBy": [1,2,3,4,5,6,7,8], "commit": "30a2104"},
{"id": 10, "subject": "Task 10: Add Deployment entity", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [11,12,13], "blockedBy": [9], "commit": "8e2c4f2"},
{"id": 11, "subject": "Task 11: Add NodeDeploymentState entity", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [10,12,13], "blockedBy": [9], "commit": "8e2c4f2"},
{"id": 12, "subject": "Task 12: Add ConfigEdit audit entity", "status": "completed", "classification": "small", "estMinutes": 4, "parallelizableWith": [10,11,13], "blockedBy": [9], "commit": "8e2c4f2"},
{"id": 13, "subject": "Task 13: Add DataProtection keys table", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [10,11,12], "blockedBy": [9], "commit": "8e2c4f2"},
{"id": "14a", "subject": "Task 14a: Add RowVersion to live-edit entities", "status": "completed", "classification": "standard", "estMinutes": 10, "parallelizableWith": [], "blockedBy": [13], "commit": "4bb4ad8"},
{"id": "14b", "subject": "Task 14b: Decouple live-edit entities from ConfigGeneration", "status": "completed", "classification": "high-risk", "estMinutes": 30, "parallelizableWith": [], "blockedBy": ["14a"], "commit": "13d3aea"},
{"id": "14c", "subject": "Task 14c: Obsolete GenerationApplier/Diff/SealedCache", "status": "completed", "classification": "high-risk", "estMinutes": 20, "parallelizableWith": [], "blockedBy": ["14b"], "commit": "1ddf8bb"},
{"id": "14d", "subject": "Task 14d: Drop ClusterNode.RedundancyRole", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": ["14a","14b","14c"], "blockedBy": [13], "commit": "3c915e6"},
{"id": "14e", "subject": "Task 14e: Delete ConfigGeneration + ClusterNodeGenerationState", "status": "completed", "classification": "small", "estMinutes": 5, "parallelizableWith": [], "blockedBy": ["14b","14c"], "commit": "e00f46d"},
{"id": "14f", "subject": "Task 14f: V2HostingAlignment EF migration (consolidator)", "status": "completed", "classification": "high-risk", "estMinutes": 15, "parallelizableWith": [], "blockedBy": ["14a","14b","14c","14d","14e"], "commit": "605dbf3"},
{"id": 15, "subject": "Task 15: Migrate-To-V2.ps1 idempotent script", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [16,17,18], "blockedBy": ["14f"], "commit": "c168c1c"},
{"id": 16, "subject": "Task 16: Common types (CorrelationId, ExecutionId, NodeId, ...)", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [17,18], "blockedBy": [9], "commit": "fee4a8c"},
{"id": 17, "subject": "Task 17: Akka message contracts", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [16,18], "blockedBy": [16], "commit": "5d3a5a4"},
{"id": 18, "subject": "Task 18: Common interfaces", "status": "completed", "classification": "small", "estMinutes": 4, "parallelizableWith": [16,17], "blockedBy": [16], "commit": "136234e"},
{"id": 19, "subject": "Task 19: HOCON config", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [20,21,22], "blockedBy": [2], "commit": "3d0f4dc"},
{"id": 20, "subject": "Task 20: AkkaHostedService implementation", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [19,21,22], "blockedBy": [2,18], "commit": "f184f8e"},
{"id": 21, "subject": "Task 21: Role parser from OTOPCUA_ROLES env", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [19,20,22], "blockedBy": [2], "commit": "dfb0636"},
{"id": 22, "subject": "Task 22: ClusterRoleInfo implementation", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [19,20,21], "blockedBy": [18,20], "commit": "c217c49"},
{"id": 23, "subject": "Task 23: Cluster test project + tests", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [19,20,21,22], "commit": "e0b6d56"},
{"id": 24, "subject": "Task 24: Move LdapAuthService into OtOpcUa.Security", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [25], "blockedBy": [3], "commit": "567b8ca"},
{"id": 25, "subject": "Task 25: JwtTokenService", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [24], "blockedBy": [3], "commit": "93316e3"},
{"id": 26, "subject": "Task 26: Cookie+JWT hybrid AddOtOpcUaAuth extension", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [27,28], "blockedBy": [13,24,25], "commit": "207fc6a"},
{"id": 27, "subject": "Task 27: /auth/login, /auth/ping, /auth/token endpoints", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [26,28], "blockedBy": [24,25], "commit": "8be84ba"},
{"id": 28, "subject": "Task 28: CookieAuthenticationStateProvider for Blazor", "status": "completed", "classification": "small", "estMinutes": 4, "parallelizableWith": [26,27], "blockedBy": [25], "commit": "e38f22e"},
{"id": 29, "subject": "Task 29: Security test project + tests", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [24,25,26,27,28], "commit": "38ea0c5"},
{"id": 30, "subject": "Task 30: ConfigPublishCoordinator happy path", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [32,33,34,35], "blockedBy": [4,17,18,10,11], "commit": "62e12da"},
{"id": 31, "subject": "Task 31: Coordinator timeout + failover recovery", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [32,33,34,35], "blockedBy": [30], "commit": "f193872"},
{"id": 32, "subject": "Task 32: AdminOperationsActor + StartDeployment", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [30,31,33,34,35], "blockedBy": [4,17,18,10,12], "commit": "ef683f5"},
{"id": 33, "subject": "Task 33: AuditWriterActor batched idempotent insert", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [30,31,32,34,35], "blockedBy": [4,17], "commit": "23f669c"},
{"id": 34, "subject": "Task 34: FleetStatusBroadcaster", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [30,31,32,33,35], "blockedBy": [4,17], "commit": "dd122c4"},
{"id": 35, "subject": "Task 35: RedundancyStateActor + ServiceLevelCalculator", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [30,31,32,33,34], "blockedBy": [4,17,18], "commit": "6b37f99"},
{"id": 36, "subject": "Task 36: Singleton registration extension (admin role)", "status": "completed", "classification": "standard", "estMinutes": 4, "parallelizableWith": [], "blockedBy": [30,31,32,33,34,35], "commit": "52bf4b3"},
{"id": 37, "subject": "Task 37: DriverHostActor scaffolding + PreStart recovery", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [41,42,43,44], "blockedBy": [5,17,18,11], "commit": "ed13013"},
{"id": 38, "subject": "Task 38: DriverHostActor DispatchDeployment handler", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [41,42,43,44], "blockedBy": [37], "commit": "ed13013"},
{"id": 39, "subject": "Task 39: DriverHostActor stale-config fallback", "status": "completed", "classification": "standard", "estMinutes": 4, "parallelizableWith": [41,42,43,44], "blockedBy": [38], "commit": "ed13013"},
{"id": 40, "subject": "Task 40: Runtime test project bootstrap", "status": "completed", "classification": "small", "estMinutes": 3, "parallelizableWith": [], "blockedBy": [37,38,39], "commit": "ed13013"},
{"id": 41, "subject": "Task 41: DriverInstanceActor state machine", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [42,43,44], "blockedBy": [5,17,40], "commit": "64c627f"},
{"id": 42, "subject": "Task 42: VirtualTagActor", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [41,43,44], "blockedBy": [5,17,40], "commit": "39729bf"},
{"id": 43, "subject": "Task 43: ScriptedAlarmActor", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [41,42,44], "blockedBy": [5,17,40], "commit": "95ef533"},
{"id": 44, "subject": "Task 44: OpcUaPublishActor on synchronized dispatcher", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [41,42,43], "blockedBy": [5,6,17,19,40], "commit": "e115f13"},
{"id": 45, "subject": "Task 45: HistorianAdapter + PeerOpcUaProbe + DbHealthProbe actors", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [37,40], "commit": "28639cb"},
{"id": 46, "subject": "Task 46: Extract OpcUaApplicationHost + Phase7Composer", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [6], "commit": "2877a88"},
{"id": 47, "subject": "Task 47: Phase7Composer purity + property tests", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [48,49,50,51,52], "blockedBy": [46], "commit": "b7c117a"},
{"id": 48, "subject": "Task 48: Move Blazor components into AdminUI library", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [47], "blockedBy": [7], "commit": "1a067e6"},
{"id": 49, "subject": "Task 49: Move SignalR hubs and rewire to FleetStatusBroadcaster", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [50,51,52], "blockedBy": [34,48], "commit": "26d8f2f"},
{"id": 50, "subject": "Task 50: IAdminOperationsClient via ClusterSingletonProxy", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [49,51,52], "blockedBy": [18,32,48], "commit": "f022499"},
{"id": 51, "subject": "Task 51: Replace DriverDiagnosticsClient with IFleetDiagnosticsClient", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [49,50,52], "blockedBy": [18,48], "commit": "b83f099"},
{"id": 52, "subject": "Task 52: Drift indicator + Deploy button UI", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [49,50,51], "blockedBy": [50,48], "commit": "f167808"},
{"id": 53, "subject": "Task 53: Host Program.cs role-gated startup", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [54,55], "blockedBy": [8,15,20,21,22,26,36,40,45,46,48,49], "commit": "e2b357f"},
{"id": 54, "subject": "Task 54: Health endpoints + appsettings layout", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [53,55], "blockedBy": [8,22], "commit": "fa1d685"},
{"id": 55, "subject": "Task 55: Mac dev mode + DEV-STUB drivers", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [53,54], "blockedBy": [41], "commit": "8b4de80"},
{"id": 56, "subject": "Task 56: Delete OtOpcUa.Server + OtOpcUa.Admin projects", "status": "completed", "classification": "high-risk", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [53,54,55], "commit": "76310b8"},
{"id": 57, "subject": "Task 57: Build & test green check", "status": "completed", "classification": "trivial", "estMinutes": 3, "parallelizableWith": [], "blockedBy": [56], "commit": "76310b8"},
{"id": 58, "subject": "Task 58: 2-node integration test harness", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [57]},
{"id": 59, "subject": "Task 59: Deploy + failover integration tests", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [60], "blockedBy": [58]},
{"id": 60, "subject": "Task 60: OPC UA dual-endpoint + ServiceLevel tests", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [59], "blockedBy": [58]},
{"id": 61, "subject": "Task 61: E2E test infrastructure + GitHub Actions CI", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [59,60]},
{"id": 62, "subject": "Task 62: Rewrite Install-Services.ps1", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [63,64,65], "blockedBy": [53]},
{"id": 63, "subject": "Task 63: Traefik config + docker-dev compose", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,64,65], "blockedBy": [53]},
{"id": 64, "subject": "Task 64: Update existing docs (Redundancy, ServiceHosting, security)", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,63,65], "blockedBy": [57]},
{"id": 65, "subject": "Task 65: New v2 docs (Architecture-v2, Cluster, ControlPlane, Runtime)", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,63,64], "blockedBy": [57]},
{"id": "F1", "subject": "Follow-up: AuthEndpoints integration tests against fused Host", "status": "pending", "classification": "small", "estMinutes": 10, "parallelizableWith": ["F2"], "blockedBy": [53], "origin": "Deviation from Task 29 (commit 38ea0c5) — deferred until Task 53 wires AddOtOpcUaAuth/MapOtOpcUaAuth in Program. Add WebApplicationFactory<OtOpcUa.Host.Program> tests for /auth/login (204/401/503), /auth/ping (401/200), /auth/token (200+JWT), /auth/logout (204+cookie clear) using a stub ILdapAuthService."},
{"id": "F2", "subject": "Follow-up: Replace JwtBearer BuildServiceProvider antipattern with IPostConfigureOptions", "status": "pending", "classification": "small", "estMinutes": 5, "parallelizableWith": ["F1"], "blockedBy": [], "origin": "Deviation from Task 26 (commit 207fc6a) — AddOtOpcUaAuth uses services.BuildServiceProvider().CreateScope() inside .AddJwtBearer lambda (ASP0000). Refactor to IPostConfigureOptions<JwtBearerOptions> so validation parameters resolve lazily from the real request provider."},
{"id": "F3", "subject": "Follow-up: Add EventId unique column to ConfigAuditLog for cross-restart audit idempotency", "status": "pending", "classification": "small", "estMinutes": 15, "parallelizableWith": ["F4"], "blockedBy": [], "origin": "Deviation from Task 33 — AuditWriterActor only dedups in-buffer; ConfigAuditLog lacks EventId column so a duplicate AuditEvent that arrives after a flush becomes a duplicate row. Add nullable EventId Guid + filtered unique index, migration, and refactor AuditWriterActor.WrapDetails away."},
{"id": "F4", "subject": "Follow-up: Harden AuditWriterActor.WrapDetails JSON synthesis with System.Text.Json", "status": "pending", "classification": "small", "estMinutes": 5, "parallelizableWith": ["F3"], "blockedBy": [], "origin": "Self-review of Task 33 — WrapDetails uses string concat; malformed caller DetailsJson would produce invalid JSON and trip the CK_ConfigAuditLog_DetailsJson_IsJson constraint, killing the entire flush batch. Discard this task if F3 lands first (F3 removes WrapDetails entirely)."},
{"id": "F5", "subject": "Follow-up: ConfigPublishCoordinator multi-node happy-path test", "status": "pending", "classification": "standard", "estMinutes": 30, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 30 — single-ActorSystem TestKit can't simulate the plan's 'dispatch to N driver nodes, all ack, seals' happy path because DiscoverDriverNodes() needs real cluster membership. Add a multi-system test (two ActorSystems joined into one cluster, driver-role on the second)."},
{"id": "F6", "subject": "Follow-up: RedundancyStateActor publisher abstraction so tests don't need DPS bootstrap", "status": "pending", "classification": "small", "estMinutes": 10, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 35 — RedundancyStateActorTests are skipped because single-node DistributedPubSub bootstrap is unreliable in TestKit. Inject an Action<object> broadcast so tests can replace it with a probe; un-skip both tests."},
{"id": "F7", "subject": "Follow-up: DriverInstanceActor full engine wiring (subscriptions, writes, ApplyDelta diff)", "status": "pending", "classification": "standard", "estMinutes": 45, "parallelizableWith": [], "blockedBy": [44], "origin": "Self-review of Task 41 — subscription publishing, ApplyDelta diffing, bad-quality-on-disconnect, write path, and supervisor backoff are stubbed. Wire after OpcUaPublishActor lands."},
{"id": "F8", "subject": "Follow-up: VirtualTagActor engine wiring (compile expression, subscribe deps, publish result)", "status": "pending", "classification": "standard", "estMinutes": 30, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 42 — VirtualTagEngine.Evaluate not called; DependencyValueChanged just buffers."},
{"id": "F9", "subject": "Follow-up: ScriptedAlarmActor engine wiring + state persistence", "status": "pending", "classification": "standard", "estMinutes": 30, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 43 — AlarmConditionService not called; PreRestart persistence to ScriptedAlarmState DB not wired; HistorianAdapter rows not emitted."},
{"id": "F10", "subject": "Follow-up: OpcUaPublishActor SDK integration (address-space writes + ServiceLevel + RebuildAddressSpace)", "status": "pending", "classification": "high-risk", "estMinutes": 60, "parallelizableWith": [], "blockedBy": [47], "origin": "Self-review of Task 44 — SDK calls stubbed; counters only. Wire after Phase 7 OpcUaServer extraction."},
{"id": "F11", "subject": "Follow-up: HistorianAdapterActor named-pipe IPC + SqliteStoreAndForwardSink wiring", "status": "pending", "classification": "standard", "estMinutes": 30, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 45 — stub buffers in-memory; named-pipe + SQLite store-and-forward not wired."},
{"id": "F12", "subject": "Follow-up: PeerOpcUaProbeActor real opc.tcp ping (replace Ok=true stub)", "status": "pending", "classification": "small", "estMinutes": 20, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 45 — RunProbe always returns Ok=true; replace with OPC UA Client connect."},
{"id": "F13", "subject": "Follow-up: Full OpcUaApplicationHost extraction (security/alarms/history/observability)", "status": "pending", "classification": "high-risk", "estMinutes": 120, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 46 — facade only boots ApplicationInstance + StandardServer. Legacy 391-line file pulls Server.Security/Alarms/History/Observability. Pull those into thin OpcUaServer interfaces."},
{"id": "F14", "subject": "Follow-up: Migrate side-effecting Phase7Composer (EquipmentNodeWalker, trace logs, node cache)", "status": "pending", "classification": "standard", "estMinutes": 60, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 47 — pure version covers the projection. Walker + alarm sink registration + cache mutation stay in legacy until split into Phase7Applier."},
{"id": "F15", "subject": "Follow-up: Migrate 47 legacy Admin Blazor components into AdminUI library", "status": "pending", "classification": "high-risk", "estMinutes": 180, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 48 — only MapAdminUI scaffold + 1 new page (Deployments). 47 pages stay in legacy Admin (accepted-broken) until Task 56."},
{"id": "F16", "subject": "Follow-up: Bridge FleetStatusBroadcaster → SignalR hubs (FleetStatusHub / AlertHub / ScriptLogHub)", "status": "pending", "classification": "standard", "estMinutes": 30, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 49 — hubs are passive Hub subclasses; the bridge from FleetStatusBroadcaster.broadcast → IHubContext is not wired."},
{"id": "F17", "subject": "Follow-up: FleetDiagnosticsClient real Akka ActorSelection round-trip (GetDiagnosticsRequest)", "status": "pending", "classification": "standard", "estMinutes": 30, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 51 — client returns an empty snapshot stub. Add GetDiagnosticsRequest contract + DriverHostActor handler + real Ask/Reply."},
{"id": "F18", "subject": "Follow-up: Thread HttpContext.User.Identity.Name into Deployments page (createdBy)", "status": "pending", "classification": "small", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 52 — Deployments.razor hardcodes createdBy=\"(current user)\"; needs @inject AuthenticationStateProvider."},
{"id": "F19", "subject": "Follow-up: RuntimeStartup extension for driver-role node-actor spawn", "status": "pending", "classification": "standard", "estMinutes": 20, "parallelizableWith": [], "blockedBy": [], "origin": "Self-review of Task 53 — only admin-role singletons are wired via WithOtOpcUaControlPlaneSingletons. Driver-role nodes need a parallel WithOtOpcUaRuntimeActors that spawns DriverHostActor."},
{"id": "F20", "subject": "Follow-up: Wire DriverInstanceActor.ShouldStub() into DriverHostActor child spawn", "status": "pending", "classification": "small", "estMinutes": 10, "parallelizableWith": ["F7"], "blockedBy": [], "origin": "Self-review of Task 55 — ShouldStub helper exists but nothing calls it. Folds into F7 when DriverHostActor learns to spawn DriverInstanceActor children."}
]
}
+59
View File
@@ -0,0 +1,59 @@
<#
.SYNOPSIS
Idempotent migration runner that takes the OtOpcUaConfig database from the v1 schema
(with ConfigGeneration / ClusterNodeGenerationState) to the v2 hosting-aligned schema
(with Deployment / NodeDeploymentState / ConfigEdit / DataProtectionKeys).
.DESCRIPTION
Backs the database up, applies the idempotent EF migration script, then validates that
expected tables exist and legacy tables are gone. Safe to re-run — the EF script itself
is idempotent, and the backup picks a unique filename per invocation.
.PARAMETER ConnectionString
Mandatory. Full ADO.NET connection string with permissions to BACKUP DATABASE and
apply DDL on the target ConfigDb.
.PARAMETER BackupPath
Optional. Full path for the backup file. Defaults to a timestamped path under $env:TEMP.
.EXAMPLE
.\Migrate-To-V2.ps1 -ConnectionString "Server=sql01;Database=OtOpcUaConfig;Trusted_Connection=True;TrustServerCertificate=True"
#>
[CmdletBinding()]
param(
[Parameter(Mandatory)][string] $ConnectionString,
[string] $BackupPath = "$env:TEMP\OtOpcUa-V1-Backup-$(Get-Date -Format yyyyMMddHHmmss).bak"
)
$ErrorActionPreference = 'Stop'
if (-not (Get-Command Invoke-Sqlcmd -ErrorAction SilentlyContinue)) {
throw "Invoke-Sqlcmd not available. Install module: Install-Module SqlServer -Scope CurrentUser"
}
Write-Host "Step 1/4 — Backup ConfigDb to $BackupPath" -ForegroundColor Cyan
Invoke-Sqlcmd -ConnectionString $ConnectionString `
-Query "BACKUP DATABASE [OtOpcUaConfig] TO DISK = '$BackupPath' WITH FORMAT, COMPRESSION"
Write-Host "Step 2/4 — Row counts (before)" -ForegroundColor Cyan
$beforeCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$beforeCounts | Format-Table
Write-Host "Step 3/4 — Apply Migrate-To-V2.sql" -ForegroundColor Cyan
Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\Migrate-To-V2.sql" -QueryTimeout 1800
Write-Host "Step 4/4 — Row counts (after) + validation" -ForegroundColor Cyan
$afterCounts = Invoke-Sqlcmd -ConnectionString $ConnectionString -InputFile "$PSScriptRoot\count-rows.sql"
$afterCounts | Format-Table
$tablesNow = (Invoke-Sqlcmd -ConnectionString $ConnectionString `
-Query "SELECT name FROM sys.tables ORDER BY name").name
foreach ($t in 'Deployment','NodeDeploymentState','ConfigEdit','DataProtectionKeys') {
if ($tablesNow -notcontains $t) { throw "Expected v2 table $t missing." }
}
foreach ($t in 'ConfigGeneration','ClusterNodeGenerationState') {
if ($tablesNow -contains $t) { throw "Legacy v1 table $t still present." }
}
Write-Host "Migration complete. Backup at $BackupPath" -ForegroundColor Green
File diff suppressed because it is too large Load Diff
+26
View File
@@ -0,0 +1,26 @@
-- Per-table row counts for pre/post-migration audit.
-- Covers every table relevant to the v1 -> v2 transition so the operator can confirm
-- live-edit data was preserved and v2 tables came up empty.
SELECT TableName = t.name, [Rows] = SUM(p.[rows])
FROM sys.tables t
JOIN sys.partitions p ON p.object_id = t.object_id AND p.index_id IN (0,1)
WHERE t.name IN (
-- Live-edit configuration (rows must survive)
'ServerCluster','ClusterNode','ClusterNodeCredential',
'Namespace','UnsArea','UnsLine',
'DriverInstance','Device','Equipment','Tag','PollGroup','VirtualTag',
'NodeAcl','ExternalIdReservation',
'Script','ScriptedAlarm','ScriptedAlarmState',
'LdapGroupRoleMapping',
'EquipmentImportBatch','EquipmentImportRow',
-- Status tables (rebuilt at runtime; counts informational)
'DriverHostStatus','DriverInstanceResilienceStatus',
-- Audit (preserved)
'ConfigAuditLog',
-- v2 deploy model (empty pre-migration, populated post)
'Deployment','NodeDeploymentState','ConfigEdit','DataProtectionKeys'
)
GROUP BY t.name
ORDER BY t.name;
GO
@@ -9,9 +9,9 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog.Sinks.Console" Version="6.0.0"/>
<PackageReference Include="CliFx"/>
<PackageReference Include="Serilog"/>
<PackageReference Include="Serilog.Sinks.Console"/>
</ItemGroup>
<ItemGroup>
@@ -8,8 +8,8 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Client" Version="1.5.378.106"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Client"/>
<PackageReference Include="Serilog"/>
</ItemGroup>
<ItemGroup>
@@ -9,17 +9,17 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Avalonia" Version="11.2.7"/>
<PackageReference Include="Avalonia.Desktop" Version="11.2.7"/>
<PackageReference Include="Avalonia.Svg.Skia" Version="11.2.0.2"/>
<PackageReference Include="Avalonia.Themes.Fluent" Version="11.2.7"/>
<PackageReference Include="Avalonia.Fonts.Inter" Version="11.2.7"/>
<PackageReference Include="Avalonia.Controls.DataGrid" Version="11.2.7"/>
<PackageReference Include="Avalonia.Diagnostics" Version="11.2.7"/>
<PackageReference Include="CommunityToolkit.Mvvm" Version="8.4.0"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog.Sinks.Console" Version="6.0.0"/>
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/>
<PackageReference Include="Avalonia"/>
<PackageReference Include="Avalonia.Desktop"/>
<PackageReference Include="Avalonia.Svg.Skia"/>
<PackageReference Include="Avalonia.Themes.Fluent"/>
<PackageReference Include="Avalonia.Fonts.Inter"/>
<PackageReference Include="Avalonia.Controls.DataGrid"/>
<PackageReference Include="Avalonia.Diagnostics"/>
<PackageReference Include="CommunityToolkit.Mvvm"/>
<PackageReference Include="Serilog"/>
<PackageReference Include="Serilog.Sinks.Console"/>
<PackageReference Include="Serilog.Sinks.File"/>
</ItemGroup>
<ItemGroup>
@@ -35,4 +35,11 @@
<EmbeddedResource Include="Assets\app-icon.svg" />
</ItemGroup>
<ItemGroup>
<!-- Tmds.DBus.Protocol 0.20.0 reaches this project transitively from Avalonia.Desktop on
Linux/macOS only. We do not ship Linux/macOS builds of the Client.UI to end users;
this advisory affects dev-tooling code paths only. -->
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-xrw6-gwf8-vvr9"/>
</ItemGroup>
</Project>
@@ -0,0 +1,26 @@
namespace ZB.MOM.WW.OtOpcUa.Cluster;
public sealed class AkkaClusterOptions
{
public const string SectionName = "Cluster";
public string SystemName { get; set; } = "otopcua";
public string Hostname { get; set; } = "0.0.0.0";
public int Port { get; set; } = 4053;
/// <summary>
/// Hostname advertised in cluster gossip. Must be reachable by other nodes.
/// In docker-compose this is the container DNS name; in bare metal it's the
/// host's stable LAN address.
/// </summary>
public string PublicHostname { get; set; } = "127.0.0.1";
public string[] SeedNodes { get; set; } = Array.Empty<string>();
/// <summary>
/// Cluster roles for this node. When empty the role list comes from
/// <c>OTOPCUA_ROLES</c> via <see cref="RoleParser"/>. Allowed values:
/// <c>admin</c>, <c>driver</c>, <c>dev</c>.
/// </summary>
public string[] Roles { get; set; } = Array.Empty<string>();
}
@@ -0,0 +1,97 @@
using Akka.Actor;
using Akka.Configuration;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
namespace ZB.MOM.WW.OtOpcUa.Cluster;
/// <summary>
/// Starts the local <see cref="ActorSystem"/>, applies the embedded HOCON plus an overlay
/// generated from <see cref="AkkaClusterOptions"/>, and joins the cluster. On shutdown,
/// runs <c>CoordinatedShutdown</c> with the <c>ClusterLeavingReason</c> so the local node
/// leaves the cluster cleanly before the process exits.
/// </summary>
public sealed class AkkaHostedService : IHostedService
{
private readonly AkkaClusterOptions _options;
private readonly ILogger<AkkaHostedService> _logger;
private ActorSystem? _actorSystem;
public AkkaHostedService(IOptions<AkkaClusterOptions> options, ILogger<AkkaHostedService> logger)
{
_options = options.Value;
_logger = logger;
}
public ActorSystem ActorSystem =>
_actorSystem ?? throw new InvalidOperationException(
"ActorSystem requested before AkkaHostedService.StartAsync ran.");
public Task StartAsync(CancellationToken cancellationToken)
{
var overlay = BuildOverlay(_options);
var baseConfig = ConfigurationFactory.ParseString(HoconLoader.LoadBaseConfig());
var config = ConfigurationFactory.ParseString(overlay).WithFallback(baseConfig);
_logger.LogInformation(
"Starting ActorSystem '{System}' on {Host}:{Port} with roles=[{Roles}]",
_options.SystemName, _options.PublicHostname, _options.Port,
string.Join(",", _options.Roles));
_actorSystem = ActorSystem.Create(_options.SystemName, config);
if (_options.SeedNodes.Length > 0)
{
var seeds = _options.SeedNodes.Select(Address.Parse).ToList();
Akka.Cluster.Cluster.Get(_actorSystem).JoinSeedNodes(seeds);
}
return Task.CompletedTask;
}
public async Task StopAsync(CancellationToken cancellationToken)
{
if (_actorSystem is null) return;
_logger.LogInformation("Initiating cluster-leave CoordinatedShutdown");
var shutdown = CoordinatedShutdown.Get(_actorSystem);
using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
cts.CancelAfter(TimeSpan.FromSeconds(30));
try
{
await shutdown.Run(CoordinatedShutdown.ClusterLeavingReason.Instance)
.WaitAsync(cts.Token).ConfigureAwait(false);
}
catch (OperationCanceledException)
{
_logger.LogWarning("Cluster leave timed out after 30s; forcing terminate");
await _actorSystem.Terminate().ConfigureAwait(false);
}
}
private static string BuildOverlay(AkkaClusterOptions o)
{
var seeds = string.Join(",", o.SeedNodes.Select(Quote));
var roles = string.Join(",", o.Roles.Select(Quote));
return $@"
akka {{
remote.dot-netty.tcp {{
hostname = {Quote(o.Hostname)}
port = {o.Port}
public-hostname = {Quote(o.PublicHostname)}
}}
cluster {{
seed-nodes = [{seeds}]
roles = [{roles}]
}}
}}";
}
private static string Quote(string? value)
{
var escaped = (value ?? string.Empty).Replace("\\", "\\\\").Replace("\"", "\\\"");
return $"\"{escaped}\"";
}
}
@@ -0,0 +1,182 @@
using Akka.Actor;
using Akka.Cluster;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
using CommonsNodeId = ZB.MOM.WW.OtOpcUa.Commons.Types.NodeId;
namespace ZB.MOM.WW.OtOpcUa.Cluster;
/// <summary>
/// Thread-safe live view of cluster membership and role topology. Subscribes to
/// <see cref="ClusterEvent.IMemberEvent"/>, <see cref="ClusterEvent.RoleLeaderChanged"/>, and
/// <see cref="ClusterEvent.LeaderChanged"/> through an internal subscriber actor and keeps
/// a snapshot of role-to-members + role-to-leader. The CLR-facing event surface is
/// <see cref="IClusterRoleInfo.RoleLeaderChanged"/>.
/// </summary>
public sealed class ClusterRoleInfo : IClusterRoleInfo, IDisposable
{
private readonly Akka.Cluster.Cluster _cluster;
private readonly ILogger<ClusterRoleInfo> _logger;
private readonly CommonsNodeId _localNode;
private readonly HashSet<string> _localRoles;
private readonly object _lock = new();
private readonly Dictionary<string, Member?> _roleLeaders = new(StringComparer.Ordinal);
private readonly Dictionary<string, HashSet<Member>> _membersByRole = new(StringComparer.Ordinal);
private IActorRef? _subscriber;
public ClusterRoleInfo(ActorSystem system, IOptions<AkkaClusterOptions> options, ILogger<ClusterRoleInfo> logger)
{
_cluster = Akka.Cluster.Cluster.Get(system);
_logger = logger;
_localNode = CommonsNodeId.Parse(options.Value.PublicHostname);
_localRoles = new HashSet<string>(options.Value.Roles, StringComparer.Ordinal);
SeedFromCurrentState();
_subscriber = system.ActorOf(Props.Create(() => new SubscriberActor(this)), "clusterroleinfo-subscriber");
}
public CommonsNodeId LocalNode => _localNode;
public IReadOnlySet<string> LocalRoles => _localRoles;
public bool HasRole(string role) => _localRoles.Contains(role);
public IReadOnlyList<CommonsNodeId> MembersWithRole(string role)
{
lock (_lock)
{
if (!_membersByRole.TryGetValue(role, out var members)) return Array.Empty<CommonsNodeId>();
return members
.Select(m => CommonsNodeId.Parse(m.Address.Host ?? string.Empty))
.ToArray();
}
}
public CommonsNodeId? RoleLeader(string role)
{
lock (_lock)
{
return _roleLeaders.TryGetValue(role, out var leader) && leader is not null
? CommonsNodeId.Parse(leader.Address.Host ?? string.Empty)
: (CommonsNodeId?)null;
}
}
public event EventHandler<RoleLeaderChangedEventArgs>? RoleLeaderChanged;
private void SeedFromCurrentState()
{
var snapshot = _cluster.State;
lock (_lock)
{
foreach (var member in snapshot.Members)
foreach (var role in member.Roles)
{
if (!_membersByRole.TryGetValue(role, out var set))
_membersByRole[role] = set = new HashSet<Member>();
set.Add(member);
}
foreach (var role in snapshot.Members.SelectMany(m => m.Roles).Distinct())
{
var leaderAddr = _cluster.State.RoleLeader(role);
_roleLeaders[role] = leaderAddr is not null
? snapshot.Members.FirstOrDefault(m => m.Address == leaderAddr)
: null;
}
}
}
internal void HandleMemberEvent(ClusterEvent.IMemberEvent evt)
{
lock (_lock)
{
switch (evt)
{
case ClusterEvent.MemberUp up:
foreach (var role in up.Member.Roles)
{
if (!_membersByRole.TryGetValue(role, out var set))
_membersByRole[role] = set = new HashSet<Member>();
set.Add(up.Member);
}
break;
case ClusterEvent.MemberRemoved removed:
foreach (var role in removed.Member.Roles)
if (_membersByRole.TryGetValue(role, out var set))
set.Remove(removed.Member);
break;
}
}
}
internal void HandleRoleLeaderChanged(ClusterEvent.RoleLeaderChanged evt)
{
CommonsNodeId? previous = null;
CommonsNodeId? next = null;
var raise = false;
lock (_lock)
{
_roleLeaders.TryGetValue(evt.Role, out var prevMember);
if (prevMember is not null)
previous = CommonsNodeId.Parse(prevMember.Address.Host ?? string.Empty);
var nextMember = evt.Leader is null
? null
: _cluster.State.Members.FirstOrDefault(m => m.Address == evt.Leader);
_roleLeaders[evt.Role] = nextMember;
if (nextMember is not null)
next = CommonsNodeId.Parse(nextMember.Address.Host ?? string.Empty);
raise = !Nullable.Equals(previous, next);
}
if (!raise) return;
try
{
RoleLeaderChanged?.Invoke(this, new RoleLeaderChangedEventArgs
{
Role = evt.Role,
PreviousLeader = previous,
NewLeader = next,
});
}
catch (Exception ex)
{
_logger.LogWarning(ex, "RoleLeaderChanged subscriber threw for role {Role}", evt.Role);
}
}
public void Dispose()
{
_subscriber?.Tell(PoisonPill.Instance);
_subscriber = null;
}
private sealed class SubscriberActor : ReceiveActor
{
public SubscriberActor(ClusterRoleInfo owner)
{
Receive<ClusterEvent.IMemberEvent>(e => owner.HandleMemberEvent(e));
Receive<ClusterEvent.RoleLeaderChanged>(e => owner.HandleRoleLeaderChanged(e));
Receive<ClusterEvent.LeaderChanged>(_ => { /* no-op for now; reserved for ServiceLevel calc */ });
Receive<ClusterEvent.CurrentClusterState>(_ => { /* seeded from initial snapshot */ });
}
protected override void PreStart()
{
Akka.Cluster.Cluster.Get(Context.System).Subscribe(
Self,
ClusterEvent.InitialStateAsEvents,
typeof(ClusterEvent.IMemberEvent),
typeof(ClusterEvent.LeaderChanged),
typeof(ClusterEvent.RoleLeaderChanged));
}
protected override void PostStop() =>
Akka.Cluster.Cluster.Get(Context.System).Unsubscribe(Self);
}
}
@@ -0,0 +1,15 @@
namespace ZB.MOM.WW.OtOpcUa.Cluster;
public static class HoconLoader
{
private const string ResourceName = "ZB.MOM.WW.OtOpcUa.Cluster.Resources.akka.conf";
public static string LoadBaseConfig()
{
using var stream = typeof(HoconLoader).Assembly.GetManifestResourceStream(ResourceName)
?? throw new InvalidOperationException(
$"Embedded resource '{ResourceName}' not found. Verify EmbeddedResource glob in csproj.");
using var reader = new StreamReader(stream);
return reader.ReadToEnd();
}
}
@@ -0,0 +1,73 @@
# Base Akka.NET cluster configuration for OtOpcUa fused-host nodes.
#
# Roles, seed nodes, public hostname/port, and the actor system name are overlaid
# at runtime by AkkaHostedService — see ZB.MOM.WW.OtOpcUa.Cluster/AkkaHostedService.cs.
# Everything else here is the cluster-wide tuning that should match across nodes.
#
# Tuning sourced from ScadaLink (ScadaLink.Host/Actors/AkkaHostedService.BuildHocon);
# any divergence must be deliberate and recorded in docs/v2/Architecture.md.
akka {
extensions = [
"Akka.Cluster.Tools.PublishSubscribe.DistributedPubSubExtensionProvider, Akka.Cluster.Tools"
]
actor {
provider = cluster
}
remote {
dot-netty.tcp {
hostname = "0.0.0.0"
port = 4053
}
transport-failure-detector {
heartbeat-interval = 2s
acceptable-heartbeat-pause = 10s
}
}
cluster {
seed-nodes = []
roles = []
min-nr-of-members = 1
split-brain-resolver {
active-strategy = "keep-oldest"
stable-after = 15s
keep-oldest {
down-if-alone = on
}
}
failure-detector {
heartbeat-interval = 2s
threshold = 10.0
acceptable-heartbeat-pause = 10s
}
down-removal-margin = 15s
run-coordinated-shutdown-when-down = on
singleton {
singleton-name = "singleton"
}
singleton-proxy {
singleton-identification-interval = 1s
}
}
coordinated-shutdown {
run-by-clr-shutdown-hook = on
default-phase-timeout = 30s
}
}
# Pinned dispatcher used by OpcUaPublishActor (Task 44) so the OPC UA SDK sees
# only one thread per actor instance — its session/subscription locks expect
# strict single-threaded access.
opcua-synchronized-dispatcher {
type = "PinnedDispatcher"
executor = "thread-pool-executor"
throughput = 1
}
@@ -0,0 +1,29 @@
namespace ZB.MOM.WW.OtOpcUa.Cluster;
public static class RoleParser
{
private static readonly HashSet<string> Allowed = new(StringComparer.Ordinal)
{
"admin", "driver", "dev",
};
public static string[] Parse(string? raw)
{
if (string.IsNullOrWhiteSpace(raw)) return Array.Empty<string>();
var roles = raw
.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.Select(r => r.ToLowerInvariant())
.Distinct()
.ToArray();
foreach (var r in roles)
{
if (!Allowed.Contains(r))
throw new ArgumentException(
$"Unknown role '{r}'. Allowed: {string.Join(", ", Allowed)}.", nameof(raw));
}
return roles;
}
}
@@ -0,0 +1,28 @@
using Akka.Actor;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
namespace ZB.MOM.WW.OtOpcUa.Cluster;
public static class ServiceCollectionExtensions
{
/// <summary>
/// Registers the Akka cluster hosted service and exposes <see cref="ActorSystem"/> and
/// <see cref="IClusterRoleInfo"/> as singletons resolved from it. Call after binding
/// <c>OTOPCUA_ROLES</c> into <c>AkkaClusterOptions.Roles</c> via the calling Program.cs.
/// </summary>
public static IServiceCollection AddOtOpcUaCluster(this IServiceCollection services, IConfiguration configuration)
{
services.AddOptions<AkkaClusterOptions>()
.Bind(configuration.GetSection(AkkaClusterOptions.SectionName));
services.AddSingleton<AkkaHostedService>();
services.AddHostedService(sp => sp.GetRequiredService<AkkaHostedService>());
services.AddSingleton<ActorSystem>(sp => sp.GetRequiredService<AkkaHostedService>().ActorSystem);
services.AddSingleton<IClusterRoleInfo, ClusterRoleInfo>();
return services;
}
}
@@ -0,0 +1,32 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Cluster</RootNamespace>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Akka.Hosting"/>
<PackageReference Include="Akka.Cluster"/>
<PackageReference Include="Akka.Cluster.Hosting"/>
<PackageReference Include="Akka.Cluster.Tools"/>
<PackageReference Include="Akka.Remote.Hosting"/>
<PackageReference Include="Microsoft.Extensions.Hosting"/>
<PackageReference Include="Microsoft.Extensions.Options.ConfigurationExtensions"/>
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Commons\ZB.MOM.WW.OtOpcUa.Commons.csproj"/>
</ItemGroup>
<ItemGroup>
<EmbeddedResource Include="Resources\akka.conf"/>
</ItemGroup>
<ItemGroup>
<!-- OpenTelemetry.Api 1.9.0 reaches this project transitively from Akka.Cluster.Hosting.
Bump arrives when Akka updates its OTel dependency; tracked separately. -->
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-g94r-2vxg-569j"/>
</ItemGroup>
</Project>
@@ -0,0 +1,13 @@
using ZB.MOM.WW.OtOpcUa.Commons.Messages.Admin;
namespace ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
/// <summary>
/// Cluster-singleton-proxy client for the <c>AdminOperationsActor</c>. The Blazor UI calls
/// this from any host (admin or driver role); the proxy routes the request to whichever node
/// holds the admin singleton.
/// </summary>
public interface IAdminOperationsClient
{
Task<StartDeploymentResult> StartDeploymentAsync(string createdBy, CancellationToken ct);
}
@@ -0,0 +1,20 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
/// <summary>
/// Live view of the local node's identity and the cluster's role topology. Implemented by
/// <c>ClusterRoleInfo</c> in <c>OtOpcUa.Cluster</c>; consumed by everything that needs to
/// distinguish admin-role vs driver-role members or react to role-leader changes (e.g. OPC UA
/// ServiceLevel computation).
/// </summary>
public interface IClusterRoleInfo
{
NodeId LocalNode { get; }
IReadOnlySet<string> LocalRoles { get; }
bool HasRole(string role);
IReadOnlyList<NodeId> MembersWithRole(string role);
NodeId? RoleLeader(string role);
event EventHandler<RoleLeaderChangedEventArgs>? RoleLeaderChanged;
}
@@ -0,0 +1,12 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
/// <summary>
/// Per-node diagnostics fetched on demand. Implemented in Phase 8 (AdminUI/Runtime wiring)
/// over an Akka request/response — the diagnostics actor lives on the target driver node.
/// </summary>
public interface IFleetDiagnosticsClient
{
Task<NodeDiagnosticsSnapshot> GetDiagnosticsAsync(NodeId nodeId, CancellationToken ct);
}
@@ -0,0 +1,21 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
public sealed record DriverInstanceDiagnostics(
Guid DriverInstanceId,
string Name,
string State,
int ConnectedDevices,
int FaultedDevices,
DateTime LastChangeUtc);
/// <summary>
/// Per-node diagnostics returned by <c>IFleetDiagnosticsClient</c>. Populated by the node's
/// local <c>DriverHostActor</c> via a request/response over Akka.
/// </summary>
public sealed record NodeDiagnosticsSnapshot(
NodeId NodeId,
RevisionHash? CurrentRevision,
IReadOnlyList<DriverInstanceDiagnostics> Drivers,
DateTime AsOfUtc);
@@ -0,0 +1,10 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Interfaces;
public sealed class RoleLeaderChangedEventArgs : EventArgs
{
public required string Role { get; init; }
public required NodeId? PreviousLeader { get; init; }
public required NodeId? NewLeader { get; init; }
}
@@ -0,0 +1,11 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Admin;
/// <summary>
/// Request from the admin UI to the <c>AdminOperationsActor</c> singleton asking it to snapshot
/// the current live-edit state and start a deployment.
/// </summary>
public sealed record StartDeployment(
string CreatedBy,
CorrelationId CorrelationId);
@@ -0,0 +1,23 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Admin;
public enum StartDeploymentOutcome
{
Accepted,
NoChanges,
AnotherDeploymentInFlight,
Rejected,
}
/// <summary>
/// Reply from the <c>AdminOperationsActor</c> singleton. <c>Accepted</c> means the snapshot
/// was sealed and a <c>Deployment</c> row was created; the in-flight deployment can be
/// tracked through fleet-status broadcasts.
/// </summary>
public sealed record StartDeploymentResult(
StartDeploymentOutcome Outcome,
DeploymentId? DeploymentId,
RevisionHash? RevisionHash,
string? Message,
CorrelationId CorrelationId);
@@ -0,0 +1,17 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Audit;
/// <summary>
/// Cluster-broadcast audit event consumed by the <c>AuditWriterActor</c> singleton, which
/// batches and idempotently inserts into <c>ConfigAuditLog</c>.
/// </summary>
public sealed record AuditEvent(
Guid EventId,
string Category,
string Action,
string Actor,
DateTime OccurredAtUtc,
string? DetailsJson,
NodeId SourceNode,
CorrelationId CorrelationId);
@@ -0,0 +1,15 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Deploy;
public enum ApplyAckOutcome { Applied, Failed }
/// <summary>
/// Per-node acknowledgment returned by <c>DriverHostActor</c> to the dispatching coordinator.
/// </summary>
public sealed record ApplyAck(
DeploymentId DeploymentId,
NodeId NodeId,
ApplyAckOutcome Outcome,
string? FailureReason,
CorrelationId CorrelationId);
@@ -0,0 +1,14 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Deploy;
/// <summary>
/// Coordinator-published event indicating that the deployment failed and was rolled back.
/// Includes the set of nodes that NACKed or timed out so the admin UI can surface which
/// node(s) are sticky on the prior good revision.
/// </summary>
public sealed record DeploymentFailed(
DeploymentId DeploymentId,
string FailureReason,
IReadOnlyList<NodeId> FailedNodes,
CorrelationId CorrelationId);
@@ -0,0 +1,13 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Deploy;
/// <summary>
/// Coordinator-published event indicating that every active driver node successfully applied
/// the deployment and the row in <c>Deployment</c> has been transitioned to <c>Sealed</c>.
/// </summary>
public sealed record DeploymentSealed(
DeploymentId DeploymentId,
RevisionHash RevisionHash,
DateTime SealedAtUtc,
CorrelationId CorrelationId);
@@ -0,0 +1,13 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Deploy;
/// <summary>
/// Sent from the admin-role <c>ConfigPublishCoordinator</c> singleton to each driver node's
/// <c>DriverHostActor</c>. Tells the node to fetch the deployment artifact identified by
/// <paramref name="DeploymentId"/> + <paramref name="RevisionHash"/> and apply it.
/// </summary>
public sealed record DispatchDeployment(
DeploymentId DeploymentId,
RevisionHash RevisionHash,
CorrelationId CorrelationId);
@@ -0,0 +1,21 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Fleet;
public enum FleetNodeHealth { Healthy, Degraded, Unreachable }
public sealed record FleetNodeStatus(
NodeId NodeId,
FleetNodeHealth Health,
RevisionHash? CurrentRevision,
DateTime LastSeenUtc);
/// <summary>
/// Periodic fleet-wide status broadcast pushed by <c>FleetStatusBroadcaster</c> to admin UI
/// subscribers via SignalR.
/// </summary>
public sealed record FleetStatusChanged(
IReadOnlyList<FleetNodeStatus> Nodes,
DeploymentId? CurrentDeployment,
DateTime AsOfUtc,
CorrelationId CorrelationId);
@@ -0,0 +1,16 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Redundancy;
public enum RedundancyRole { Primary, Secondary, Detached }
/// <summary>
/// Snapshot of a single node's redundancy state. Aggregated by <c>RedundancyStateActor</c>
/// to compute fleet-wide ServiceLevel.
/// </summary>
public sealed record NodeRedundancyState(
NodeId NodeId,
RedundancyRole Role,
bool IsClusterLeader,
bool IsRoleLeaderForDriver,
DateTime AsOfUtc);
@@ -0,0 +1,11 @@
using ZB.MOM.WW.OtOpcUa.Commons.Types;
namespace ZB.MOM.WW.OtOpcUa.Commons.Messages.Redundancy;
/// <summary>
/// Broadcast whenever the cluster's redundancy topology changes (node up/down, role-leader
/// change, partition heal). Subscribers compute their local OPC UA ServiceLevel from this.
/// </summary>
public sealed record RedundancyStateChanged(
IReadOnlyList<NodeRedundancyState> Nodes,
CorrelationId CorrelationId);
@@ -0,0 +1,13 @@
namespace ZB.MOM.WW.OtOpcUa.Commons.Types;
public readonly record struct CorrelationId(Guid Value)
{
public static CorrelationId NewId() => new(Guid.NewGuid());
public override string ToString() => Value.ToString("N");
public static CorrelationId Parse(string s) => new(Guid.ParseExact(s, "N"));
public static bool TryParse(string? s, out CorrelationId id)
{
if (Guid.TryParseExact(s, "N", out var g)) { id = new CorrelationId(g); return true; }
id = default; return false;
}
}
@@ -0,0 +1,13 @@
namespace ZB.MOM.WW.OtOpcUa.Commons.Types;
public readonly record struct DeploymentId(Guid Value)
{
public static DeploymentId NewId() => new(Guid.NewGuid());
public override string ToString() => Value.ToString("N");
public static DeploymentId Parse(string s) => new(Guid.ParseExact(s, "N"));
public static bool TryParse(string? s, out DeploymentId id)
{
if (Guid.TryParseExact(s, "N", out var g)) { id = new DeploymentId(g); return true; }
id = default; return false;
}
}
@@ -0,0 +1,13 @@
namespace ZB.MOM.WW.OtOpcUa.Commons.Types;
public readonly record struct ExecutionId(Guid Value)
{
public static ExecutionId NewId() => new(Guid.NewGuid());
public override string ToString() => Value.ToString("N");
public static ExecutionId Parse(string s) => new(Guid.ParseExact(s, "N"));
public static bool TryParse(string? s, out ExecutionId id)
{
if (Guid.TryParseExact(s, "N", out var g)) { id = new ExecutionId(g); return true; }
id = default; return false;
}
}
@@ -0,0 +1,20 @@
namespace ZB.MOM.WW.OtOpcUa.Commons.Types;
/// <summary>
/// Logical cluster node identifier — typically the host name configured on a fused
/// <c>OtOpcUa.Host</c> instance. NOT to be confused with OPC UA <c>NodeId</c> from the
/// Opc.Ua.Core stack.
/// </summary>
public readonly record struct NodeId(string Value)
{
public override string ToString() => Value;
public static NodeId Parse(string s) =>
string.IsNullOrWhiteSpace(s)
? throw new ArgumentException("NodeId value cannot be empty.", nameof(s))
: new NodeId(s);
public static bool TryParse(string? s, out NodeId id)
{
if (!string.IsNullOrWhiteSpace(s)) { id = new NodeId(s); return true; }
id = default; return false;
}
}
@@ -0,0 +1,19 @@
namespace ZB.MOM.WW.OtOpcUa.Commons.Types;
/// <summary>
/// SHA-256 hex digest identifying a config snapshot revision. Storage form is lowercase
/// 64-char hex (no <c>0x</c> prefix). Empty hash is invalid.
/// </summary>
public readonly record struct RevisionHash(string Value)
{
public override string ToString() => Value;
public static RevisionHash Parse(string s) =>
string.IsNullOrWhiteSpace(s)
? throw new ArgumentException("RevisionHash value cannot be empty.", nameof(s))
: new RevisionHash(s);
public static bool TryParse(string? s, out RevisionHash hash)
{
if (!string.IsNullOrWhiteSpace(s)) { hash = new RevisionHash(s); return true; }
hash = default; return false;
}
}
@@ -0,0 +1,12 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Commons</RootNamespace>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Akka"/>
</ItemGroup>
</Project>
@@ -1,19 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
/// <summary>
/// Host-supplied callbacks invoked as the applier walks the diff. Callbacks are idempotent on
/// retry (the applier may re-invoke with the same inputs if a later stage fails — nodes
/// register-applied to the central DB only after success). Order: namespace → driver → device →
/// equipment → poll group → tag, with Removed before Added/Modified.
/// </summary>
public sealed class ApplyCallbacks
{
public Func<EntityChange<Namespace>, CancellationToken, Task>? OnNamespace { get; init; }
public Func<EntityChange<DriverInstance>, CancellationToken, Task>? OnDriver { get; init; }
public Func<EntityChange<Device>, CancellationToken, Task>? OnDevice { get; init; }
public Func<EntityChange<Equipment>, CancellationToken, Task>? OnEquipment { get; init; }
public Func<EntityChange<PollGroup>, CancellationToken, Task>? OnPollGroup { get; init; }
public Func<EntityChange<Tag>, CancellationToken, Task>? OnTag { get; init; }
}
@@ -1,8 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
public enum ChangeKind
{
Added,
Removed,
Modified,
}
@@ -1,58 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
public sealed class GenerationApplier(ApplyCallbacks callbacks) : IGenerationApplier
{
public async Task<ApplyResult> ApplyAsync(DraftSnapshot? from, DraftSnapshot to, CancellationToken ct)
{
var diff = GenerationDiffer.Compute(from, to);
var errors = new List<string>();
// Removed first, then Added/Modified — prevents FK dangling while cascades settle.
await ApplyPass(diff.Tags, ChangeKind.Removed, callbacks.OnTag, errors, ct);
await ApplyPass(diff.PollGroups, ChangeKind.Removed, callbacks.OnPollGroup, errors, ct);
await ApplyPass(diff.Equipment, ChangeKind.Removed, callbacks.OnEquipment, errors, ct);
await ApplyPass(diff.Devices, ChangeKind.Removed, callbacks.OnDevice, errors, ct);
await ApplyPass(diff.Drivers, ChangeKind.Removed, callbacks.OnDriver, errors, ct);
await ApplyPass(diff.Namespaces, ChangeKind.Removed, callbacks.OnNamespace, errors, ct);
foreach (var kind in new[] { ChangeKind.Added, ChangeKind.Modified })
{
// Honour cancellation between passes — a caller can abort the apply between Removed
// and Added phases even if individual callbacks don't observe the token themselves
// (Configuration-007).
ct.ThrowIfCancellationRequested();
await ApplyPass(diff.Namespaces, kind, callbacks.OnNamespace, errors, ct);
await ApplyPass(diff.Drivers, kind, callbacks.OnDriver, errors, ct);
await ApplyPass(diff.Devices, kind, callbacks.OnDevice, errors, ct);
await ApplyPass(diff.Equipment, kind, callbacks.OnEquipment, errors, ct);
await ApplyPass(diff.PollGroups, kind, callbacks.OnPollGroup, errors, ct);
await ApplyPass(diff.Tags, kind, callbacks.OnTag, errors, ct);
}
return errors.Count == 0 ? ApplyResult.Ok(diff) : ApplyResult.Fail(diff, errors);
}
private static async Task ApplyPass<T>(
IReadOnlyList<EntityChange<T>> changes,
ChangeKind kind,
Func<EntityChange<T>, CancellationToken, Task>? callback,
List<string> errors,
CancellationToken ct)
{
if (callback is null) return;
foreach (var change in changes.Where(c => c.Kind == kind))
{
try { await callback(change, ct); }
// Configuration-007: cancellation must propagate, not be silently recorded as an
// entity error. Distinguish caller cancellation (token signalled) from any
// OperationCanceledException raised independently of the caller's token, which we
// still want to surface as an entity error so a single misbehaving callback does
// not crash the entire apply.
catch (OperationCanceledException) when (ct.IsCancellationRequested) { throw; }
catch (Exception ex) { errors.Add($"{typeof(T).Name} {change.Kind} '{change.LogicalId}': {ex.Message}"); }
}
}
}
@@ -1,70 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
/// <summary>
/// Per-entity diff computed locally on the node. The enumerable order matches the dependency
/// order expected by <see cref="IGenerationApplier"/>: namespace → driver → device → equipment →
/// poll group → tag → ACL, with Removed processed before Added inside each bucket so cascades
/// settle before new rows appear.
/// </summary>
public sealed record GenerationDiff(
IReadOnlyList<EntityChange<Namespace>> Namespaces,
IReadOnlyList<EntityChange<DriverInstance>> Drivers,
IReadOnlyList<EntityChange<Device>> Devices,
IReadOnlyList<EntityChange<Equipment>> Equipment,
IReadOnlyList<EntityChange<PollGroup>> PollGroups,
IReadOnlyList<EntityChange<Tag>> Tags);
public sealed record EntityChange<T>(ChangeKind Kind, string LogicalId, T? From, T? To);
public static class GenerationDiffer
{
public static GenerationDiff Compute(DraftSnapshot? from, DraftSnapshot to)
{
from ??= new DraftSnapshot { GenerationId = 0, ClusterId = to.ClusterId };
return new GenerationDiff(
Namespaces: DiffById(from.Namespaces, to.Namespaces, x => x.NamespaceId,
(a, b) => (a.ClusterId, a.NamespaceUri, a.Kind, a.Enabled, a.Notes)
== (b.ClusterId, b.NamespaceUri, b.Kind, b.Enabled, b.Notes)),
Drivers: DiffById(from.DriverInstances, to.DriverInstances, x => x.DriverInstanceId,
(a, b) => (a.ClusterId, a.NamespaceId, a.Name, a.DriverType, a.Enabled, a.DriverConfig)
== (b.ClusterId, b.NamespaceId, b.Name, b.DriverType, b.Enabled, b.DriverConfig)),
Devices: DiffById(from.Devices, to.Devices, x => x.DeviceId,
(a, b) => (a.DriverInstanceId, a.Name, a.Enabled, a.DeviceConfig)
== (b.DriverInstanceId, b.Name, b.Enabled, b.DeviceConfig)),
Equipment: DiffById(from.Equipment, to.Equipment, x => x.EquipmentId,
(a, b) => (a.EquipmentUuid, a.DriverInstanceId, a.UnsLineId, a.Name, a.MachineCode, a.ZTag, a.SAPID, a.Enabled)
== (b.EquipmentUuid, b.DriverInstanceId, b.UnsLineId, b.Name, b.MachineCode, b.ZTag, b.SAPID, b.Enabled)),
PollGroups: DiffById(from.PollGroups, to.PollGroups, x => x.PollGroupId,
(a, b) => (a.DriverInstanceId, a.Name, a.IntervalMs)
== (b.DriverInstanceId, b.Name, b.IntervalMs)),
Tags: DiffById(from.Tags, to.Tags, x => x.TagId,
(a, b) => (a.DriverInstanceId, a.DeviceId, a.EquipmentId, a.PollGroupId, a.FolderPath, a.Name, a.DataType, a.AccessLevel, a.WriteIdempotent, a.TagConfig)
== (b.DriverInstanceId, b.DeviceId, b.EquipmentId, b.PollGroupId, b.FolderPath, b.Name, b.DataType, b.AccessLevel, b.WriteIdempotent, b.TagConfig)));
}
private static List<EntityChange<T>> DiffById<T>(
IReadOnlyList<T> from, IReadOnlyList<T> to,
Func<T, string> id, Func<T, T, bool> equal)
{
var fromById = from.ToDictionary(id);
var toById = to.ToDictionary(id);
var result = new List<EntityChange<T>>();
foreach (var (logicalId, src) in fromById.Where(kv => !toById.ContainsKey(kv.Key)))
result.Add(new(ChangeKind.Removed, logicalId, src, default));
foreach (var (logicalId, dst) in toById)
{
if (!fromById.TryGetValue(logicalId, out var src))
result.Add(new(ChangeKind.Added, logicalId, default, dst));
else if (!equal(src, dst))
result.Add(new(ChangeKind.Modified, logicalId, src, dst));
}
return result;
}
}
@@ -1,23 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Apply;
/// <summary>
/// Applies a <see cref="GenerationDiff"/> to whatever backing runtime the node owns: the OPC UA
/// address space, driver subscriptions, the local cache, etc. The Core project wires concrete
/// callbacks into this via <see cref="ApplyCallbacks"/> so the Configuration project stays free
/// of a Core/Server dependency (interface independence per decision #59).
/// </summary>
public interface IGenerationApplier
{
Task<ApplyResult> ApplyAsync(DraftSnapshot? from, DraftSnapshot to, CancellationToken ct);
}
public sealed record ApplyResult(
bool Succeeded,
GenerationDiff Diff,
IReadOnlyList<string> Errors)
{
public static ApplyResult Ok(GenerationDiff diff) => new(true, diff, []);
public static ApplyResult Fail(GenerationDiff diff, IReadOnlyList<string> errors) => new(false, diff, errors);
}
@@ -1,5 +1,3 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>Physical OPC UA server node within a <see cref="ServerCluster"/>.</summary>
@@ -10,8 +8,6 @@ public sealed class ClusterNode
public required string ClusterId { get; set; }
public required RedundancyRole RedundancyRole { get; set; }
/// <summary>Machine hostname / IP.</summary>
public required string Host { get; set; }
@@ -47,5 +43,4 @@ public sealed class ClusterNode
// Navigation
public ServerCluster? Cluster { get; set; }
public ICollection<ClusterNodeCredential> Credentials { get; set; } = [];
public ClusterNodeGenerationState? GenerationState { get; set; }
}
@@ -1,26 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Tracks which generation each node has applied. Per-node (not per-cluster) — both nodes of a
/// 2-node cluster track independently per decision #84.
/// </summary>
public sealed class ClusterNodeGenerationState
{
public required string NodeId { get; set; }
public long? CurrentGenerationId { get; set; }
public DateTime? LastAppliedAt { get; set; }
public NodeApplyStatus? LastAppliedStatus { get; set; }
public string? LastAppliedError { get; set; }
/// <summary>Updated on every poll for liveness detection.</summary>
public DateTime? LastSeenAt { get; set; }
public ClusterNode? Node { get; set; }
public ConfigGeneration? CurrentGeneration { get; set; }
}
@@ -0,0 +1,27 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Append-only audit row written by AdminOperationsActor on every mutating live-edit
/// operation. The ExecutionId optionally correlates a sequence of edits that ran inside one
/// admin transaction (e.g. an import batch that updates many Equipment rows).
/// </summary>
public sealed class ConfigEdit
{
public Guid EditId { get; init; } = Guid.NewGuid();
public required string EntityType { get; init; }
public Guid EntityId { get; init; }
/// <summary>JSON payload of the column-name → new-value pairs touched by this edit.</summary>
public required string FieldsJson { get; init; }
/// <summary>Optional correlation across edits inside a single admin operation.</summary>
public Guid? ExecutionId { get; init; }
public required string EditedBy { get; init; }
public DateTime EditedAtUtc { get; init; } = DateTime.UtcNow;
public required string SourceNode { get; init; }
}
@@ -1,32 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Atomic, immutable snapshot of one cluster's configuration.
/// Per decision #82 — cluster-scoped, not fleet-scoped.
/// </summary>
public sealed class ConfigGeneration
{
/// <summary>Monotonically increasing ID, generated by <c>IDENTITY(1, 1)</c>.</summary>
public long GenerationId { get; set; }
public required string ClusterId { get; set; }
public required GenerationStatus Status { get; set; }
public long? ParentGenerationId { get; set; }
public DateTime? PublishedAt { get; set; }
public string? PublishedBy { get; set; }
public string? Notes { get; set; }
public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
public required string CreatedBy { get; set; }
public ServerCluster? Cluster { get; set; }
public ConfigGeneration? Parent { get; set; }
}
@@ -0,0 +1,30 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Immutable snapshot of a config artifact dispatched to every driver-role node by the
/// ConfigPublishCoordinator. Replaces the v1 <c>ConfigGeneration</c> draft/publish
/// row; the ArtifactBlob carries the SnapshotAndFlatten() output produced by
/// AdminOperationsActor.
/// </summary>
public sealed class Deployment
{
public Guid DeploymentId { get; init; } = Guid.NewGuid();
public required string RevisionHash { get; init; }
public DeploymentStatus Status { get; set; } = DeploymentStatus.Dispatching;
public required string CreatedBy { get; init; }
public DateTime CreatedAtUtc { get; init; } = DateTime.UtcNow;
public byte[] ArtifactBlob { get; init; } = Array.Empty<byte>();
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
public string? FailureReason { get; set; }
public DateTime? SealedAtUtc { get; set; }
}
@@ -5,8 +5,6 @@ public sealed class Device
{
public Guid DeviceRowId { get; set; }
public long GenerationId { get; set; }
public required string DeviceId { get; set; }
/// <summary>Logical FK to <see cref="DriverInstance.DriverInstanceId"/>.</summary>
@@ -19,5 +17,6 @@ public sealed class Device
/// <summary>Schemaless per-driver-type device config (host, port, unit ID, slot, etc.).</summary>
public required string DeviceConfig { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -5,8 +5,6 @@ public sealed class DriverInstance
{
public Guid DriverInstanceRowId { get; set; }
public long GenerationId { get; set; }
public required string DriverInstanceId { get; set; }
public required string ClusterId { get; set; }
@@ -45,6 +43,8 @@ public sealed class DriverInstance
/// </summary>
public string? ResilienceConfig { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
public ServerCluster? Cluster { get; set; }
}
@@ -9,8 +9,6 @@ public sealed class Equipment
{
public Guid EquipmentRowId { get; set; }
public long GenerationId { get; set; }
/// <summary>
/// System-generated stable internal logical ID. Format: <c>'EQ-' + first 12 hex chars of EquipmentUuid</c>.
/// NEVER operator-supplied, NEVER in CSV imports, NEVER editable in Admin UI (decision #125).
@@ -60,5 +58,6 @@ public sealed class Equipment
public bool Enabled { get; set; } = true;
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -10,9 +10,7 @@ public sealed class Namespace
{
public Guid NamespaceRowId { get; set; }
public long GenerationId { get; set; }
/// <summary>Stable logical ID across generations, e.g. "LINE3-OPCUA-equipment".</summary>
/// <summary>Stable logical ID, e.g. "LINE3-OPCUA-equipment". Globally unique in v2.</summary>
public required string NamespaceId { get; set; }
public required string ClusterId { get; set; }
@@ -26,6 +24,8 @@ public sealed class Namespace
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
public ServerCluster? Cluster { get; set; }
}
@@ -10,8 +10,6 @@ public sealed class NodeAcl
{
public Guid NodeAclRowId { get; set; }
public long GenerationId { get; set; }
public required string NodeAclId { get; set; }
public required string ClusterId { get; set; }
@@ -28,5 +26,6 @@ public sealed class NodeAcl
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -0,0 +1,29 @@
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
/// <summary>
/// Per-(node, deployment) apply progress row owned by the DriverHostActor. Replaces the
/// v1 <c>ClusterNodeGenerationState</c> single-row-per-node model with a history
/// of every apply attempt so the ConfigPublishCoordinator can reconstruct in-flight state
/// after a failover.
/// </summary>
public sealed class NodeDeploymentState
{
public required string NodeId { get; init; }
public Guid DeploymentId { get; init; }
public NodeDeploymentStatus Status { get; set; } = NodeDeploymentStatus.Applying;
public DateTime StartedAtUtc { get; set; } = DateTime.UtcNow;
public DateTime? AppliedAtUtc { get; set; }
public string? FailureReason { get; set; }
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
public ClusterNode? Node { get; set; }
public Deployment? Deployment { get; set; }
}
@@ -5,8 +5,6 @@ public sealed class PollGroup
{
public Guid PollGroupRowId { get; set; }
public long GenerationId { get; set; }
public required string PollGroupId { get; set; }
public required string DriverInstanceId { get; set; }
@@ -15,5 +13,6 @@ public sealed class PollGroup
public int IntervalMs { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -17,9 +17,8 @@ namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
public sealed class Script
{
public Guid ScriptRowId { get; set; }
public long GenerationId { get; set; }
/// <summary>Stable logical id. Carries across generations.</summary>
/// <summary>Stable logical id. Globally unique in v2.</summary>
public required string ScriptId { get; set; }
/// <summary>Operator-friendly name for log filtering + Admin UI list view.</summary>
@@ -34,5 +33,6 @@ public sealed class Script
/// <summary>Language — always "CSharp" today; placeholder for future engines (Python/Lua).</summary>
public string Language { get; set; } = "CSharp";
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -17,9 +17,8 @@ namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
public sealed class ScriptedAlarm
{
public Guid ScriptedAlarmRowId { get; set; }
public long GenerationId { get; set; }
/// <summary>Stable logical id — drives <c>AlarmConditionType.ConditionName</c>.</summary>
/// <summary>Stable logical id — drives <c>AlarmConditionType.ConditionName</c>. Globally unique in v2.</summary>
public required string ScriptedAlarmId { get; set; }
/// <summary>Logical FK to <see cref="Equipment.EquipmentId"/> — owner of this alarm.</summary>
@@ -55,5 +54,6 @@ public sealed class ScriptedAlarm
public bool Enabled { get; set; } = true;
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -38,5 +38,4 @@ public sealed class ServerCluster
// Navigation
public ICollection<ClusterNode> Nodes { get; set; } = [];
public ICollection<Namespace> Namespaces { get; set; } = [];
public ICollection<ConfigGeneration> Generations { get; set; } = [];
}
@@ -11,8 +11,6 @@ public sealed class Tag
{
public Guid TagRowId { get; set; }
public long GenerationId { get; set; }
public required string TagId { get; set; }
public required string DriverInstanceId { get; set; }
@@ -43,5 +41,6 @@ public sealed class Tag
/// <summary>Register address / scaling / poll group / byte-order / etc. — schemaless per driver type.</summary>
public required string TagConfig { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -5,8 +5,6 @@ public sealed class UnsArea
{
public Guid UnsAreaRowId { get; set; }
public long GenerationId { get; set; }
public required string UnsAreaId { get; set; }
public required string ClusterId { get; set; }
@@ -16,6 +14,8 @@ public sealed class UnsArea
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
public ServerCluster? Cluster { get; set; }
}
@@ -5,11 +5,9 @@ public sealed class UnsLine
{
public Guid UnsLineRowId { get; set; }
public long GenerationId { get; set; }
public required string UnsLineId { get; set; }
/// <summary>Logical FK to <see cref="UnsArea.UnsAreaId"/>; resolved within the same generation.</summary>
/// <summary>Logical FK to <see cref="UnsArea.UnsAreaId"/>.</summary>
public required string UnsAreaId { get; set; }
/// <summary>UNS level 4 segment: matches <c>^[a-z0-9-]{1,32}$</c> OR equals literal <c>_default</c>.</summary>
@@ -17,5 +15,6 @@ public sealed class UnsLine
public string? Notes { get; set; }
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -21,9 +21,8 @@ namespace ZB.MOM.WW.OtOpcUa.Configuration.Entities;
public sealed class VirtualTag
{
public Guid VirtualTagRowId { get; set; }
public long GenerationId { get; set; }
/// <summary>Stable logical id.</summary>
/// <summary>Stable logical id. Globally unique in v2.</summary>
public required string VirtualTagId { get; set; }
/// <summary>Logical FK to <see cref="Equipment.EquipmentId"/> — owner of this virtual tag.</summary>
@@ -49,5 +48,6 @@ public sealed class VirtualTag
public bool Enabled { get; set; } = true;
public ConfigGeneration? Generation { get; set; }
/// <summary>Optimistic concurrency token for last-write-wins detection in the v2 live-edit model.</summary>
public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}
@@ -0,0 +1,15 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>
/// Lifecycle of a deployment artifact dispatched by the v2 ConfigPublishCoordinator.
/// Replaces the v1 ConfigGeneration draft/publish lifecycle (decision tracked in the
/// v2 hosting-alignment design doc).
/// </summary>
public enum DeploymentStatus
{
Dispatching = 0,
AwaitingApplyAcks = 1,
Sealed = 2,
PartiallyFailed = 3,
TimedOut = 4,
}
@@ -1,10 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Generation lifecycle state. Draft → Published → Superseded | RolledBack.</summary>
public enum GenerationStatus
{
Draft,
Published,
Superseded,
RolledBack,
}
@@ -1,10 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Status tracked per node in <see cref="Entities.ClusterNodeGenerationState"/>.</summary>
public enum NodeApplyStatus
{
Applied,
RolledBack,
Failed,
InProgress,
}
@@ -0,0 +1,12 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>
/// Per-node deployment apply state. Replaces the v1 NodeApplyStatus that was attached to
/// ClusterNodeGenerationState.
/// </summary>
public enum NodeDeploymentStatus
{
Applying = 0,
Applied = 1,
Failed = 2,
}
@@ -1,9 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Configuration.Enums;
/// <summary>Per-node redundancy role within a cluster. Per decision #84.</summary>
public enum RedundancyRole
{
Primary,
Secondary,
Standalone,
}
@@ -1,3 +1,4 @@
using Microsoft.AspNetCore.DataProtection.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
@@ -9,12 +10,11 @@ namespace ZB.MOM.WW.OtOpcUa.Configuration;
/// any divergence is a defect caught by the SchemaComplianceTests introspection check.
/// </summary>
public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbContext> options)
: DbContext(options)
: DbContext(options), IDataProtectionKeyContext
{
public DbSet<ServerCluster> ServerClusters => Set<ServerCluster>();
public DbSet<ClusterNode> ClusterNodes => Set<ClusterNode>();
public DbSet<ClusterNodeCredential> ClusterNodeCredentials => Set<ClusterNodeCredential>();
public DbSet<ConfigGeneration> ConfigGenerations => Set<ConfigGeneration>();
public DbSet<Namespace> Namespaces => Set<Namespace>();
public DbSet<UnsArea> UnsAreas => Set<UnsArea>();
public DbSet<UnsLine> UnsLines => Set<UnsLine>();
@@ -24,7 +24,6 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
public DbSet<Tag> Tags => Set<Tag>();
public DbSet<PollGroup> PollGroups => Set<PollGroup>();
public DbSet<NodeAcl> NodeAcls => Set<NodeAcl>();
public DbSet<ClusterNodeGenerationState> ClusterNodeGenerationStates => Set<ClusterNodeGenerationState>();
public DbSet<ConfigAuditLog> ConfigAuditLogs => Set<ConfigAuditLog>();
public DbSet<ExternalIdReservation> ExternalIdReservations => Set<ExternalIdReservation>();
public DbSet<DriverHostStatus> DriverHostStatuses => Set<DriverHostStatus>();
@@ -37,13 +36,21 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
public DbSet<ScriptedAlarm> ScriptedAlarms => Set<ScriptedAlarm>();
public DbSet<ScriptedAlarmState> ScriptedAlarmStates => Set<ScriptedAlarmState>();
// v2 deploy-model tables (Phase 1 of the Akka + fused-hosting alignment).
public DbSet<Deployment> Deployments => Set<Deployment>();
public DbSet<NodeDeploymentState> NodeDeploymentStates => Set<NodeDeploymentState>();
public DbSet<ConfigEdit> ConfigEdits => Set<ConfigEdit>();
// ASP.NET DataProtection key ring storage (decision: keys persisted in ConfigDb so every
// admin-role node decrypts the same cookies without sharing a filesystem).
public DbSet<DataProtectionKey> DataProtectionKeys => Set<DataProtectionKey>();
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
base.OnModelCreating(modelBuilder);
ConfigureServerCluster(modelBuilder);
ConfigureClusterNode(modelBuilder);
ConfigureClusterNodeCredential(modelBuilder);
ConfigureConfigGeneration(modelBuilder);
ConfigureNamespace(modelBuilder);
ConfigureUnsArea(modelBuilder);
ConfigureUnsLine(modelBuilder);
@@ -53,7 +60,6 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
ConfigureTag(modelBuilder);
ConfigurePollGroup(modelBuilder);
ConfigureNodeAcl(modelBuilder);
ConfigureClusterNodeGenerationState(modelBuilder);
ConfigureConfigAuditLog(modelBuilder);
ConfigureExternalIdReservation(modelBuilder);
ConfigureDriverHostStatus(modelBuilder);
@@ -64,6 +70,10 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
ConfigureVirtualTag(modelBuilder);
ConfigureScriptedAlarm(modelBuilder);
ConfigureScriptedAlarmState(modelBuilder);
ConfigureDeployment(modelBuilder);
ConfigureNodeDeploymentState(modelBuilder);
ConfigureConfigEdit(modelBuilder);
ConfigureDataProtectionKey(modelBuilder);
}
private static void ConfigureServerCluster(ModelBuilder modelBuilder)
@@ -100,7 +110,6 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.HasKey(x => x.NodeId);
e.Property(x => x.NodeId).HasMaxLength(64);
e.Property(x => x.ClusterId).HasMaxLength(64);
e.Property(x => x.RedundancyRole).HasConversion<string>().HasMaxLength(16);
e.Property(x => x.Host).HasMaxLength(255);
e.Property(x => x.ApplicationUri).HasMaxLength(256);
e.Property(x => x.DriverConfigOverridesJson).HasColumnType("nvarchar(max)");
@@ -115,10 +124,10 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
// Fleet-wide unique per decision #86
e.HasIndex(x => x.ApplicationUri).IsUnique().HasDatabaseName("UX_ClusterNode_ApplicationUri");
e.HasIndex(x => x.ClusterId).HasDatabaseName("IX_ClusterNode_ClusterId");
// At most one Primary per cluster
e.HasIndex(x => x.ClusterId).IsUnique()
.HasFilter("[RedundancyRole] = 'Primary'")
.HasDatabaseName("UX_ClusterNode_Primary_Per_Cluster");
// v2: the "one Primary per cluster" filtered unique index (and the RedundancyRole
// column it filtered on) are gone. Akka cluster leader-of-driver-role is the
// authoritative primary signal (see RedundancyStateActor + ServiceLevelCalculator,
// Task 35).
});
}
@@ -147,39 +156,6 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
});
}
private static void ConfigureConfigGeneration(ModelBuilder modelBuilder)
{
modelBuilder.Entity<ConfigGeneration>(e =>
{
e.ToTable("ConfigGeneration");
e.HasKey(x => x.GenerationId);
e.Property(x => x.GenerationId).UseIdentityColumn(seed: 1, increment: 1);
e.Property(x => x.ClusterId).HasMaxLength(64);
e.Property(x => x.Status).HasConversion<string>().HasMaxLength(16);
e.Property(x => x.PublishedAt).HasColumnType("datetime2(3)");
e.Property(x => x.PublishedBy).HasMaxLength(128);
e.Property(x => x.Notes).HasMaxLength(1024);
e.Property(x => x.CreatedAt).HasColumnType("datetime2(3)").HasDefaultValueSql("SYSUTCDATETIME()");
e.Property(x => x.CreatedBy).HasMaxLength(128);
e.HasOne(x => x.Cluster).WithMany(c => c.Generations)
.HasForeignKey(x => x.ClusterId)
.OnDelete(DeleteBehavior.Restrict);
e.HasOne(x => x.Parent).WithMany()
.HasForeignKey(x => x.ParentGenerationId)
.OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.ClusterId, x.Status, x.GenerationId })
.IsDescending(false, false, true)
.IncludeProperties(x => x.PublishedAt)
.HasDatabaseName("IX_ConfigGeneration_Cluster_Published");
// One Draft per cluster at a time
e.HasIndex(x => x.ClusterId).IsUnique()
.HasFilter("[Status] = 'Draft'")
.HasDatabaseName("UX_ConfigGeneration_Draft_Per_Cluster");
});
}
private static void ConfigureNamespace(ModelBuilder modelBuilder)
{
modelBuilder.Entity<Namespace>(e =>
@@ -192,24 +168,20 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.Kind).HasConversion<string>().HasMaxLength(32);
e.Property(x => x.NamespaceUri).HasMaxLength(256);
e.Property(x => x.Notes).HasMaxLength(1024);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany()
.HasForeignKey(x => x.GenerationId)
.OnDelete(DeleteBehavior.Restrict);
e.HasOne(x => x.Cluster).WithMany(c => c.Namespaces)
.HasForeignKey(x => x.ClusterId)
.OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.ClusterId, x.Kind }).IsUnique()
.HasDatabaseName("UX_Namespace_Generation_Cluster_Kind");
e.HasIndex(x => new { x.GenerationId, x.NamespaceUri }).IsUnique()
.HasDatabaseName("UX_Namespace_Generation_NamespaceUri");
e.HasIndex(x => new { x.GenerationId, x.NamespaceId }).IsUnique()
.HasDatabaseName("UX_Namespace_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.NamespaceId, x.ClusterId }).IsUnique()
.HasDatabaseName("UX_Namespace_Generation_LogicalId_Cluster");
e.HasIndex(x => new { x.GenerationId, x.ClusterId })
.HasDatabaseName("IX_Namespace_Generation_Cluster");
e.HasIndex(x => new { x.ClusterId, x.Kind }).IsUnique()
.HasDatabaseName("UX_Namespace_Cluster_Kind");
e.HasIndex(x => x.NamespaceUri).IsUnique()
.HasDatabaseName("UX_Namespace_NamespaceUri");
e.HasIndex(x => x.NamespaceId).IsUnique()
.HasDatabaseName("UX_Namespace_LogicalId");
e.HasIndex(x => x.ClusterId)
.HasDatabaseName("IX_Namespace_Cluster");
});
}
@@ -224,13 +196,13 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.ClusterId).HasMaxLength(64);
e.Property(x => x.Name).HasMaxLength(32);
e.Property(x => x.Notes).HasMaxLength(512);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasOne(x => x.Cluster).WithMany().HasForeignKey(x => x.ClusterId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.ClusterId }).HasDatabaseName("IX_UnsArea_Generation_Cluster");
e.HasIndex(x => new { x.GenerationId, x.UnsAreaId }).IsUnique().HasDatabaseName("UX_UnsArea_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.ClusterId, x.Name }).IsUnique().HasDatabaseName("UX_UnsArea_Generation_ClusterName");
e.HasIndex(x => x.ClusterId).HasDatabaseName("IX_UnsArea_Cluster");
e.HasIndex(x => x.UnsAreaId).IsUnique().HasDatabaseName("UX_UnsArea_LogicalId");
e.HasIndex(x => new { x.ClusterId, x.Name }).IsUnique().HasDatabaseName("UX_UnsArea_ClusterName");
});
}
@@ -245,12 +217,11 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.UnsAreaId).HasMaxLength(64);
e.Property(x => x.Name).HasMaxLength(32);
e.Property(x => x.Notes).HasMaxLength(512);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.UnsAreaId }).HasDatabaseName("IX_UnsLine_Generation_Area");
e.HasIndex(x => new { x.GenerationId, x.UnsLineId }).IsUnique().HasDatabaseName("UX_UnsLine_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.UnsAreaId, x.Name }).IsUnique().HasDatabaseName("UX_UnsLine_Generation_AreaName");
e.HasIndex(x => x.UnsAreaId).HasDatabaseName("IX_UnsLine_Area");
e.HasIndex(x => x.UnsLineId).IsUnique().HasDatabaseName("UX_UnsLine_LogicalId");
e.HasIndex(x => new { x.UnsAreaId, x.Name }).IsUnique().HasDatabaseName("UX_UnsLine_AreaName");
});
}
@@ -274,13 +245,13 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.DriverType).HasMaxLength(32);
e.Property(x => x.DriverConfig).HasColumnType("nvarchar(max)");
e.Property(x => x.ResilienceConfig).HasColumnType("nvarchar(max)");
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasOne(x => x.Cluster).WithMany().HasForeignKey(x => x.ClusterId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.ClusterId }).HasDatabaseName("IX_DriverInstance_Generation_Cluster");
e.HasIndex(x => new { x.GenerationId, x.NamespaceId }).HasDatabaseName("IX_DriverInstance_Generation_Namespace");
e.HasIndex(x => new { x.GenerationId, x.DriverInstanceId }).IsUnique().HasDatabaseName("UX_DriverInstance_Generation_LogicalId");
e.HasIndex(x => x.ClusterId).HasDatabaseName("IX_DriverInstance_Cluster");
e.HasIndex(x => x.NamespaceId).HasDatabaseName("IX_DriverInstance_Namespace");
e.HasIndex(x => x.DriverInstanceId).IsUnique().HasDatabaseName("UX_DriverInstance_LogicalId");
});
}
@@ -298,11 +269,10 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.DriverInstanceId).HasMaxLength(64);
e.Property(x => x.Name).HasMaxLength(128);
e.Property(x => x.DeviceConfig).HasColumnType("nvarchar(max)");
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.DriverInstanceId }).HasDatabaseName("IX_Device_Generation_Driver");
e.HasIndex(x => new { x.GenerationId, x.DeviceId }).IsUnique().HasDatabaseName("UX_Device_Generation_LogicalId");
e.HasIndex(x => x.DriverInstanceId).HasDatabaseName("IX_Device_Driver");
e.HasIndex(x => x.DeviceId).IsUnique().HasDatabaseName("UX_Device_LogicalId");
});
}
@@ -330,17 +300,16 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.ManufacturerUri).HasMaxLength(512);
e.Property(x => x.DeviceManualUri).HasMaxLength(512);
e.Property(x => x.EquipmentClassRef).HasMaxLength(128);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.DriverInstanceId }).HasDatabaseName("IX_Equipment_Generation_Driver");
e.HasIndex(x => new { x.GenerationId, x.UnsLineId }).HasDatabaseName("IX_Equipment_Generation_Line");
e.HasIndex(x => new { x.GenerationId, x.EquipmentId }).IsUnique().HasDatabaseName("UX_Equipment_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.UnsLineId, x.Name }).IsUnique().HasDatabaseName("UX_Equipment_Generation_LinePath");
e.HasIndex(x => new { x.GenerationId, x.EquipmentUuid }).IsUnique().HasDatabaseName("UX_Equipment_Generation_Uuid");
e.HasIndex(x => new { x.GenerationId, x.ZTag }).HasFilter("[ZTag] IS NOT NULL").HasDatabaseName("IX_Equipment_Generation_ZTag");
e.HasIndex(x => new { x.GenerationId, x.SAPID }).HasFilter("[SAPID] IS NOT NULL").HasDatabaseName("IX_Equipment_Generation_SAPID");
e.HasIndex(x => new { x.GenerationId, x.MachineCode }).HasDatabaseName("IX_Equipment_Generation_MachineCode");
e.HasIndex(x => x.DriverInstanceId).HasDatabaseName("IX_Equipment_Driver");
e.HasIndex(x => x.UnsLineId).HasDatabaseName("IX_Equipment_Line");
e.HasIndex(x => x.EquipmentId).IsUnique().HasDatabaseName("UX_Equipment_LogicalId");
e.HasIndex(x => new { x.UnsLineId, x.Name }).IsUnique().HasDatabaseName("UX_Equipment_LinePath");
e.HasIndex(x => x.EquipmentUuid).IsUnique().HasDatabaseName("UX_Equipment_Uuid");
e.HasIndex(x => x.ZTag).HasFilter("[ZTag] IS NOT NULL").HasDatabaseName("IX_Equipment_ZTag");
e.HasIndex(x => x.SAPID).HasFilter("[SAPID] IS NOT NULL").HasDatabaseName("IX_Equipment_SAPID");
e.HasIndex(x => x.MachineCode).HasDatabaseName("IX_Equipment_MachineCode");
});
}
@@ -364,20 +333,19 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.AccessLevel).HasConversion<string>().HasMaxLength(16);
e.Property(x => x.PollGroupId).HasMaxLength(64);
e.Property(x => x.TagConfig).HasColumnType("nvarchar(max)");
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.DriverInstanceId, x.DeviceId }).HasDatabaseName("IX_Tag_Generation_Driver_Device");
e.HasIndex(x => new { x.GenerationId, x.EquipmentId })
e.HasIndex(x => new { x.DriverInstanceId, x.DeviceId }).HasDatabaseName("IX_Tag_Driver_Device");
e.HasIndex(x => x.EquipmentId)
.HasFilter("[EquipmentId] IS NOT NULL")
.HasDatabaseName("IX_Tag_Generation_Equipment");
e.HasIndex(x => new { x.GenerationId, x.TagId }).IsUnique().HasDatabaseName("UX_Tag_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.EquipmentId, x.Name }).IsUnique()
.HasDatabaseName("IX_Tag_Equipment");
e.HasIndex(x => x.TagId).IsUnique().HasDatabaseName("UX_Tag_LogicalId");
e.HasIndex(x => new { x.EquipmentId, x.Name }).IsUnique()
.HasFilter("[EquipmentId] IS NOT NULL")
.HasDatabaseName("UX_Tag_Generation_EquipmentPath");
e.HasIndex(x => new { x.GenerationId, x.DriverInstanceId, x.FolderPath, x.Name }).IsUnique()
.HasDatabaseName("UX_Tag_EquipmentPath");
e.HasIndex(x => new { x.DriverInstanceId, x.FolderPath, x.Name }).IsUnique()
.HasFilter("[EquipmentId] IS NULL")
.HasDatabaseName("UX_Tag_Generation_FolderPath");
.HasDatabaseName("UX_Tag_FolderPath");
});
}
@@ -394,11 +362,10 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.PollGroupId).HasMaxLength(64);
e.Property(x => x.DriverInstanceId).HasMaxLength(64);
e.Property(x => x.Name).HasMaxLength(128);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.DriverInstanceId }).HasDatabaseName("IX_PollGroup_Generation_Driver");
e.HasIndex(x => new { x.GenerationId, x.PollGroupId }).IsUnique().HasDatabaseName("UX_PollGroup_Generation_LogicalId");
e.HasIndex(x => x.DriverInstanceId).HasDatabaseName("IX_PollGroup_Driver");
e.HasIndex(x => x.PollGroupId).IsUnique().HasDatabaseName("UX_PollGroup_LogicalId");
});
}
@@ -416,36 +383,16 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.ScopeId).HasMaxLength(64);
e.Property(x => x.PermissionFlags).HasConversion<int>();
e.Property(x => x.Notes).HasMaxLength(512);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.ClusterId }).HasDatabaseName("IX_NodeAcl_Generation_Cluster");
e.HasIndex(x => new { x.GenerationId, x.LdapGroup }).HasDatabaseName("IX_NodeAcl_Generation_Group");
e.HasIndex(x => new { x.GenerationId, x.ScopeKind, x.ScopeId })
e.HasIndex(x => x.ClusterId).HasDatabaseName("IX_NodeAcl_Cluster");
e.HasIndex(x => x.LdapGroup).HasDatabaseName("IX_NodeAcl_Group");
e.HasIndex(x => new { x.ScopeKind, x.ScopeId })
.HasFilter("[ScopeId] IS NOT NULL")
.HasDatabaseName("IX_NodeAcl_Generation_Scope");
e.HasIndex(x => new { x.GenerationId, x.NodeAclId }).IsUnique().HasDatabaseName("UX_NodeAcl_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.ClusterId, x.LdapGroup, x.ScopeKind, x.ScopeId }).IsUnique()
.HasDatabaseName("UX_NodeAcl_Generation_GroupScope");
});
}
private static void ConfigureClusterNodeGenerationState(ModelBuilder modelBuilder)
{
modelBuilder.Entity<ClusterNodeGenerationState>(e =>
{
e.ToTable("ClusterNodeGenerationState");
e.HasKey(x => x.NodeId);
e.Property(x => x.NodeId).HasMaxLength(64);
e.Property(x => x.LastAppliedAt).HasColumnType("datetime2(3)");
e.Property(x => x.LastAppliedStatus).HasConversion<string>().HasMaxLength(16);
e.Property(x => x.LastAppliedError).HasMaxLength(2048);
e.Property(x => x.LastSeenAt).HasColumnType("datetime2(3)");
e.HasOne(x => x.Node).WithOne(n => n.GenerationState).HasForeignKey<ClusterNodeGenerationState>(x => x.NodeId).OnDelete(DeleteBehavior.Restrict);
e.HasOne(x => x.CurrentGeneration).WithMany().HasForeignKey(x => x.CurrentGenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => x.CurrentGenerationId).HasDatabaseName("IX_ClusterNodeGenerationState_Generation");
.HasDatabaseName("IX_NodeAcl_Scope");
e.HasIndex(x => x.NodeAclId).IsUnique().HasDatabaseName("UX_NodeAcl_LogicalId");
e.HasIndex(x => new { x.ClusterId, x.LdapGroup, x.ScopeKind, x.ScopeId }).IsUnique()
.HasDatabaseName("UX_NodeAcl_GroupScope");
});
}
@@ -640,11 +587,10 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.SourceCode).HasColumnType("nvarchar(max)");
e.Property(x => x.SourceHash).HasMaxLength(64);
e.Property(x => x.Language).HasMaxLength(16);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.ScriptId }).IsUnique().HasDatabaseName("UX_Script_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.SourceHash }).HasDatabaseName("IX_Script_Generation_SourceHash");
e.HasIndex(x => x.ScriptId).IsUnique().HasDatabaseName("UX_Script_LogicalId");
e.HasIndex(x => x.SourceHash).HasDatabaseName("IX_Script_SourceHash");
});
}
@@ -666,12 +612,11 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.Name).HasMaxLength(128);
e.Property(x => x.DataType).HasMaxLength(32);
e.Property(x => x.ScriptId).HasMaxLength(64);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.VirtualTagId }).IsUnique().HasDatabaseName("UX_VirtualTag_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.EquipmentId, x.Name }).IsUnique().HasDatabaseName("UX_VirtualTag_Generation_EquipmentPath");
e.HasIndex(x => new { x.GenerationId, x.ScriptId }).HasDatabaseName("IX_VirtualTag_Generation_Script");
e.HasIndex(x => x.VirtualTagId).IsUnique().HasDatabaseName("UX_VirtualTag_LogicalId");
e.HasIndex(x => new { x.EquipmentId, x.Name }).IsUnique().HasDatabaseName("UX_VirtualTag_EquipmentPath");
e.HasIndex(x => x.ScriptId).HasDatabaseName("IX_VirtualTag_Script");
});
}
@@ -693,12 +638,11 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.AlarmType).HasMaxLength(32);
e.Property(x => x.MessageTemplate).HasMaxLength(1024);
e.Property(x => x.PredicateScriptId).HasMaxLength(64);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Generation).WithMany().HasForeignKey(x => x.GenerationId).OnDelete(DeleteBehavior.Restrict);
e.HasIndex(x => new { x.GenerationId, x.ScriptedAlarmId }).IsUnique().HasDatabaseName("UX_ScriptedAlarm_Generation_LogicalId");
e.HasIndex(x => new { x.GenerationId, x.EquipmentId, x.Name }).IsUnique().HasDatabaseName("UX_ScriptedAlarm_Generation_EquipmentPath");
e.HasIndex(x => new { x.GenerationId, x.PredicateScriptId }).HasDatabaseName("IX_ScriptedAlarm_Generation_Script");
e.HasIndex(x => x.ScriptedAlarmId).IsUnique().HasDatabaseName("UX_ScriptedAlarm_LogicalId");
e.HasIndex(x => new { x.EquipmentId, x.Name }).IsUnique().HasDatabaseName("UX_ScriptedAlarm_EquipmentPath");
e.HasIndex(x => x.PredicateScriptId).HasDatabaseName("IX_ScriptedAlarm_Script");
});
}
@@ -729,4 +673,79 @@ public sealed class OtOpcUaConfigDbContext(DbContextOptions<OtOpcUaConfigDbConte
e.Property(x => x.UpdatedAtUtc).HasColumnType("datetime2(3)").HasDefaultValueSql("SYSUTCDATETIME()");
});
}
private static void ConfigureDeployment(ModelBuilder modelBuilder)
{
modelBuilder.Entity<Deployment>(e =>
{
e.ToTable("Deployment");
e.HasKey(x => x.DeploymentId);
e.Property(x => x.DeploymentId).HasDefaultValueSql("NEWSEQUENTIALID()");
e.Property(x => x.RevisionHash).HasMaxLength(64).IsRequired();
e.Property(x => x.Status).HasConversion<int>();
e.Property(x => x.CreatedBy).HasMaxLength(128).IsRequired();
e.Property(x => x.CreatedAtUtc).HasColumnType("datetime2(3)").HasDefaultValueSql("SYSUTCDATETIME()");
e.Property(x => x.ArtifactBlob).HasColumnType("varbinary(max)");
e.Property(x => x.RowVersion).IsRowVersion();
e.Property(x => x.FailureReason).HasMaxLength(2048);
e.Property(x => x.SealedAtUtc).HasColumnType("datetime2(3)");
e.HasIndex(x => x.Status).HasDatabaseName("IX_Deployment_Status");
e.HasIndex(x => x.CreatedAtUtc).HasDatabaseName("IX_Deployment_CreatedAt");
});
}
private static void ConfigureNodeDeploymentState(ModelBuilder modelBuilder)
{
modelBuilder.Entity<NodeDeploymentState>(e =>
{
e.ToTable("NodeDeploymentState");
e.HasKey(x => new { x.NodeId, x.DeploymentId });
e.Property(x => x.NodeId).HasMaxLength(64);
e.Property(x => x.Status).HasConversion<int>();
e.Property(x => x.StartedAtUtc).HasColumnType("datetime2(3)").HasDefaultValueSql("SYSUTCDATETIME()");
e.Property(x => x.AppliedAtUtc).HasColumnType("datetime2(3)");
e.Property(x => x.FailureReason).HasMaxLength(2048);
e.Property(x => x.RowVersion).IsRowVersion();
e.HasOne(x => x.Node).WithMany().HasForeignKey(x => x.NodeId).OnDelete(DeleteBehavior.Restrict);
e.HasOne(x => x.Deployment).WithMany().HasForeignKey(x => x.DeploymentId).OnDelete(DeleteBehavior.Cascade);
e.HasIndex(x => x.DeploymentId).HasDatabaseName("IX_NodeDeploymentState_Deployment");
e.HasIndex(x => x.Status).HasDatabaseName("IX_NodeDeploymentState_Status");
});
}
private static void ConfigureConfigEdit(ModelBuilder modelBuilder)
{
modelBuilder.Entity<ConfigEdit>(e =>
{
e.ToTable("ConfigEdit", t =>
{
t.HasCheckConstraint("CK_ConfigEdit_FieldsJson_IsJson", "ISJSON(FieldsJson) = 1");
});
e.HasKey(x => x.EditId);
e.Property(x => x.EditId).HasDefaultValueSql("NEWSEQUENTIALID()");
e.Property(x => x.EntityType).HasMaxLength(64).IsRequired();
e.Property(x => x.FieldsJson).HasColumnType("nvarchar(max)").IsRequired();
e.Property(x => x.EditedBy).HasMaxLength(128).IsRequired();
e.Property(x => x.EditedAtUtc).HasColumnType("datetime2(3)").HasDefaultValueSql("SYSUTCDATETIME()");
e.Property(x => x.SourceNode).HasMaxLength(64).IsRequired();
// Replays of admin operations group rows by ExecutionId, then by time.
e.HasIndex(x => new { x.EntityType, x.EntityId }).HasDatabaseName("IX_ConfigEdit_Entity");
e.HasIndex(x => x.ExecutionId).HasFilter("[ExecutionId] IS NOT NULL").HasDatabaseName("IX_ConfigEdit_Execution");
e.HasIndex(x => x.EditedAtUtc).HasDatabaseName("IX_ConfigEdit_EditedAt");
});
}
private static void ConfigureDataProtectionKey(ModelBuilder modelBuilder)
{
// ASP.NET DataProtection ships its own EF mapping; override only the table name so it
// matches the rest of the schema's PascalCase convention.
modelBuilder.Entity<DataProtectionKey>(e =>
{
e.ToTable("DataProtectionKeys");
});
}
}
@@ -0,0 +1,24 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
namespace ZB.MOM.WW.OtOpcUa.Configuration;
public static class ServiceCollectionExtensions
{
public const string ConnectionStringName = "ConfigDb";
/// <summary>
/// Registers <see cref="IDbContextFactory{TContext}"/> for <see cref="OtOpcUaConfigDbContext"/>
/// using the connection string named <c>ConfigDb</c> from <see cref="IConfiguration"/>.
/// </summary>
public static IServiceCollection AddOtOpcUaConfigDb(this IServiceCollection services, IConfiguration configuration)
{
var connectionString = configuration.GetConnectionString(ConnectionStringName)
?? throw new InvalidOperationException(
$"Connection string '{ConnectionStringName}' is required. Add it to appsettings.json or the OTOPCUA_CONFIG_CONNECTION env var.");
services.AddDbContextFactory<OtOpcUaConfigDbContext>(opt => opt.UseSqlServer(connectionString));
return services;
}
}
@@ -228,14 +228,9 @@ public static class DraftValidator
$"Toggle the missing node(s) back on or change RedundancyMode/NodeCount to match.",
cluster.ClusterId));
// Primary uniqueness — decision #84. Two Primary nodes is always an invariant violation
// regardless of mode; catch it here so publish fails loud rather than the runtime
// demoting both to ServiceLevelBand.InvalidTopology at boot.
var primaryCount = clusterNodes.Count(n => n.Enabled && n.RedundancyRole == RedundancyRole.Primary);
if (primaryCount > 1)
errors.Add(new("ClusterMultiplePrimary",
$"Cluster '{cluster.ClusterId}' has {primaryCount} Enabled Primary nodes. At most one Primary per cluster.",
cluster.ClusterId));
// v2: the v1 "exactly one Primary per cluster" invariant is gone. RedundancyRole was
// dropped in Task 14d; in v2 the Akka cluster's role-leader-of-"driver" elects the
// primary at runtime, so there is no static configuration to validate here.
return errors;
}
@@ -12,16 +12,17 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.EntityFrameworkCore" Version="10.0.0"/>
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.0"/>
<PackageReference Include="Microsoft.EntityFrameworkCore.Design" Version="10.0.0">
<PackageReference Include="Microsoft.EntityFrameworkCore"/>
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer"/>
<PackageReference Include="Microsoft.EntityFrameworkCore.Design">
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
</PackageReference>
<PackageReference Include="Microsoft.Extensions.Configuration.Abstractions" Version="10.0.0"/>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.0"/>
<PackageReference Include="LiteDB" Version="5.0.21"/>
<PackageReference Include="Polly.Core" Version="8.6.6"/>
<PackageReference Include="Microsoft.AspNetCore.DataProtection.EntityFrameworkCore"/>
<PackageReference Include="Microsoft.Extensions.Configuration.Abstractions"/>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions"/>
<PackageReference Include="LiteDB"/>
<PackageReference Include="Polly.Core"/>
</ItemGroup>
<ItemGroup>
@@ -25,7 +25,8 @@ public interface IDriver
/// <summary>
/// Apply a config change in place without tearing down the driver process.
/// Used by <c>IGenerationApplier</c> when only this driver's config changed in the new generation.
/// Invoked by the v2 <c>DriverInstanceActor</c> when ApplyDelta reports that only this
/// driver's config changed in the new deployment.
/// </summary>
/// <remarks>
/// Per <c>docs/v2/driver-stability.md</c> §"In-process only (Tier A/B)" — Reinitialize is the
@@ -12,8 +12,8 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Data.Sqlite" Version="9.0.0"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Microsoft.Data.Sqlite"/>
<PackageReference Include="Serilog"/>
</ItemGroup>
<ItemGroup>
@@ -12,7 +12,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog"/>
</ItemGroup>
<ItemGroup>
@@ -15,8 +15,8 @@
<!-- Roslyn scripting API — compiles user C# snippets with a constrained ScriptOptions
allow-list so scripts can't reach Process/File/HttpClient/reflection. Per Phase 7
plan decisions #1 + #6. -->
<PackageReference Include="Microsoft.CodeAnalysis.CSharp.Scripting" Version="4.12.0"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Microsoft.CodeAnalysis.CSharp.Scripting"/>
<PackageReference Include="Serilog"/>
</ItemGroup>
<ItemGroup>
@@ -12,7 +12,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog"/>
</ItemGroup>
<ItemGroup>
@@ -262,9 +262,8 @@ public static class EquipmentNodeWalker
/// <summary>
/// Pre-loaded + pre-filtered snapshot of one Equipment-kind namespace's worth of Config
/// DB rows. All four collections are scoped to the same
/// <see cref="Configuration.Entities.ConfigGeneration"/> + the same
/// <see cref="Configuration.Entities.Namespace"/> row. The walker assumes this filter
/// was applied by the caller + does no cross-generation or cross-namespace validation.
/// was applied by the caller + does no cross-namespace validation.
/// </summary>
public sealed record EquipmentNamespaceContent(
IReadOnlyList<UnsArea> Areas,
@@ -17,8 +17,8 @@
</ItemGroup>
<ItemGroup>
<PackageReference Include="Polly.Core" Version="8.6.6"/>
<PackageReference Include="Serilog" Version="4.3.0"/>
<PackageReference Include="Polly.Core"/>
<PackageReference Include="Serilog"/>
</ItemGroup>
<ItemGroup>
@@ -14,7 +14,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="CliFx"/>
</ItemGroup>
<ItemGroup>
@@ -14,7 +14,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="CliFx"/>
</ItemGroup>
<ItemGroup>
@@ -12,9 +12,9 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog.Sinks.Console" Version="6.0.0"/>
<PackageReference Include="CliFx"/>
<PackageReference Include="Serilog"/>
<PackageReference Include="Serilog.Sinks.Console"/>
</ItemGroup>
<ItemGroup>
@@ -14,7 +14,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="CliFx"/>
</ItemGroup>
<ItemGroup>
@@ -14,7 +14,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="CliFx"/>
</ItemGroup>
<ItemGroup>
@@ -14,7 +14,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="CliFx"/>
</ItemGroup>
<ItemGroup>
@@ -14,7 +14,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="CliFx" Version="2.3.6"/>
<PackageReference Include="CliFx"/>
</ItemGroup>
<ItemGroup>
@@ -21,7 +21,7 @@
<!-- libplctag managed wrapper (pulls in libplctag.NativeImport transitively).
Decision #11 — EtherNet/IP + CIP + Logix symbolic against ControlLogix / CompactLogix /
Micro800 / SLC500 / PLC-5. -->
<PackageReference Include="libplctag" Version="1.5.2"/>
<PackageReference Include="libplctag"/>
</ItemGroup>
<ItemGroup>
@@ -21,7 +21,7 @@
<!-- libplctag — ab_pccc protocol for SLC 500 / MicroLogix / PLC-5 / LogixPccc.
Decision #41 — AbLegacy split from AbCip since PCCC addressing (file-based N7:0) and
Logix addressing (symbolic Motor1.Speed) pull the abstraction in incompatible directions. -->
<PackageReference Include="libplctag" Version="1.5.2"/>
<PackageReference Include="libplctag"/>
</ItemGroup>
<ItemGroup>
@@ -18,7 +18,7 @@
</ItemGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.0"/>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions"/>
</ItemGroup>
<ItemGroup>
@@ -46,11 +46,11 @@
built against) — a package-name mistake, not just a version skew —
which would surface as a runtime MissingMethodException the first
time the client's retry pipeline ran. -->
<PackageReference Include="Google.Protobuf" Version="3.34.1" />
<PackageReference Include="Grpc.Core.Api" Version="2.76.0" />
<PackageReference Include="Grpc.Net.Client" Version="2.76.0" />
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.7" />
<PackageReference Include="Polly.Core" Version="8.6.6" />
<PackageReference Include="Google.Protobuf" />
<PackageReference Include="Grpc.Core.Api" />
<PackageReference Include="Grpc.Net.Client" />
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" />
<PackageReference Include="Polly.Core" />
</ItemGroup>
<ItemGroup>
@@ -13,8 +13,8 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="MessagePack" Version="2.5.187"/>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.3"/>
<PackageReference Include="MessagePack"/>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions"/>
</ItemGroup>
<ItemGroup>
@@ -20,13 +20,13 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="MessagePack" Version="2.5.187"/>
<PackageReference Include="System.IO.Pipes.AccessControl" Version="5.0.0"/>
<PackageReference Include="System.Memory" Version="4.5.5"/>
<PackageReference Include="System.Threading.Tasks.Extensions" Version="4.5.4"/>
<PackageReference Include="System.Data.SqlClient" Version="4.9.0"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/>
<PackageReference Include="MessagePack"/>
<PackageReference Include="System.IO.Pipes.AccessControl"/>
<PackageReference Include="System.Memory"/>
<PackageReference Include="System.Threading.Tasks.Extensions"/>
<PackageReference Include="System.Data.SqlClient"/>
<PackageReference Include="Serilog"/>
<PackageReference Include="Serilog.Sinks.File"/>
</ItemGroup>
<ItemGroup>

Some files were not shown because too many files have changed in this diff Show More