docs(v2): Architecture-v2 + Cluster + ControlPlane + Runtime overviews (Task 65)
Four new docs at docs/v2/ giving a single-page tour of each v2 piece: - Architecture-v2.md: top-level mental model (fused Host + roles + cluster + live-edit) - Cluster.md: AkkaClusterOptions + IClusterRoleInfo + WithOtOpcUaClusterBootstrap - ControlPlane.md: 5 admin singletons + DPS topics + deploy flow + failover recovery - Runtime.md: per-node actor tree + state machines + engine-wiring follow-up map Each links back to the design doc for depth. Architecture-v2 cross-references the other three + ServiceHosting + Redundancy + security.
This commit is contained in:
127
docs/v2/Architecture-v2.md
Normal file
127
docs/v2/Architecture-v2.md
Normal file
@@ -0,0 +1,127 @@
|
||||
# OtOpcUa v2 Architecture
|
||||
|
||||
Single-page tour of the v2 layout. For decision history + tradeoffs, see [`2026-05-26-akka-hosting-alignment-design.md`](../plans/2026-05-26-akka-hosting-alignment-design.md).
|
||||
|
||||
## Big picture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────┐
|
||||
│ OtOpcUa.Host │ (fused binary)
|
||||
│ │
|
||||
│ reads OTOPCUA_ROLES env, mounts: │
|
||||
│ ┌─────────────────────────────────────┐ │
|
||||
│ │ admin → Blazor + auth + control- │ │
|
||||
│ │ plane singletons │ │
|
||||
│ │ driver → OPC UA endpoint + │ │
|
||||
│ │ per-node actors │ │
|
||||
│ └─────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────┘
|
||||
│
|
||||
│ joins
|
||||
▼
|
||||
┌─────────────────────────────────────────────┐
|
||||
│ Akka.NET cluster │
|
||||
│ (split-brain resolver: keep-oldest, 15s) │
|
||||
└─────────────────────────────────────────────┘
|
||||
|
||||
shared by every node: ┌─────────────────┐
|
||||
│ ConfigDb (SQL) │ live-edit + Deployment artifacts + audit
|
||||
└─────────────────┘
|
||||
```
|
||||
|
||||
The v1 setup was two separate Windows services (`OtOpcUa.Server` + `OtOpcUa.Admin`) talking through the DB. v2 collapses them into one binary with role gating, and adds an Akka cluster so admin singletons can drive deploys and the redundancy story is automatic.
|
||||
|
||||
## Project layout
|
||||
|
||||
```
|
||||
src/Core/ shared abstractions, no Server deps
|
||||
ZB.MOM.WW.OtOpcUa.Commons types + Akka message contracts + interfaces
|
||||
ZB.MOM.WW.OtOpcUa.Cluster HOCON, AkkaClusterOptions, IClusterRoleInfo
|
||||
ZB.MOM.WW.OtOpcUa.Configuration EF Core DbContext + entities
|
||||
|
||||
src/Server/ server-side projects
|
||||
ZB.MOM.WW.OtOpcUa.Security cookie+JWT auth, LDAP, JwtTokenService
|
||||
ZB.MOM.WW.OtOpcUa.ControlPlane admin-role cluster singletons
|
||||
ZB.MOM.WW.OtOpcUa.Runtime driver-role per-node actors
|
||||
ZB.MOM.WW.OtOpcUa.OpcUaServer OPC UA endpoint facade + Phase7Composer
|
||||
ZB.MOM.WW.OtOpcUa.AdminUI Blazor Razor class library
|
||||
ZB.MOM.WW.OtOpcUa.Host fused binary (Program.cs)
|
||||
```
|
||||
|
||||
| Project | Role | Doc |
|
||||
|---|---|---|
|
||||
| Cluster | Bootstrap + cluster topology view | [Cluster.md](Cluster.md) |
|
||||
| ControlPlane | Admin singletons (deploy, audit, fleet, redundancy) | [ControlPlane.md](ControlPlane.md) |
|
||||
| Runtime | Driver-role actor tree | [Runtime.md](Runtime.md) |
|
||||
| Security | Cookie+JWT auth, LDAP, /auth/* endpoints | [../security.md](../security.md) |
|
||||
| OpcUaServer | OPC UA endpoint host + composer | [../OpcUaServer.md](../OpcUaServer.md) |
|
||||
| Host | Role-gated DI graph + Program.cs | [../ServiceHosting.md](../ServiceHosting.md) |
|
||||
|
||||
## Role gating
|
||||
|
||||
`Program.cs` reads `OTOPCUA_ROLES` once (per process) and decides what to wire:
|
||||
|
||||
```csharp
|
||||
var roles = RoleParser.Parse(Environment.GetEnvironmentVariable("OTOPCUA_ROLES"));
|
||||
var hasAdmin = roles.Contains("admin");
|
||||
var hasDriver = roles.Contains("driver");
|
||||
|
||||
builder.Services.AddOtOpcUaConfigDb(builder.Configuration);
|
||||
builder.Services.AddOtOpcUaCluster(builder.Configuration);
|
||||
|
||||
builder.Services.AddAkka("otopcua", (ab, sp) =>
|
||||
{
|
||||
ab.WithOtOpcUaClusterBootstrap(sp); // HOCON + remote + cluster options
|
||||
if (hasAdmin) ab.WithOtOpcUaControlPlaneSingletons();
|
||||
if (hasDriver) ab.WithOtOpcUaRuntimeActors();
|
||||
});
|
||||
|
||||
if (hasAdmin)
|
||||
{
|
||||
builder.Services.AddOtOpcUaAuth(builder.Configuration);
|
||||
builder.Services.AddAdminUI();
|
||||
// SignalR, AdminOpsClient, etc.
|
||||
}
|
||||
|
||||
builder.Services.AddOtOpcUaHealth();
|
||||
```
|
||||
|
||||
There is a **single** ActorSystem. Cluster singletons + per-node actors share it via the `Akka.Hosting` registry. This was a v2 fix (the initial Phase 9 wiring ran two ActorSystems by mistake; see commit `d6fac2d`).
|
||||
|
||||
## Live-edit vs draft/publish
|
||||
|
||||
v1 had `ConfigGeneration(Draft|Published)` with every live-edit entity FK'd to a generation. Edits accumulated in a Draft until Publish promoted them.
|
||||
|
||||
v2 removes that entirely:
|
||||
|
||||
- No `ConfigGeneration` table, no `GenerationId` columns.
|
||||
- Every live-edit entity has a `RowVersion` (`IsRowVersion()`) for last-write-wins.
|
||||
- Audit goes to `ConfigEdit` (per-row delta) and `ConfigAuditLog` (event-level).
|
||||
- Deploys snapshot the *current* DB state into an immutable `Deployment.ArtifactBlob` + its `RevisionHash`. That artifact is what driver nodes apply.
|
||||
|
||||
See [ControlPlane.md § Deploy flow](ControlPlane.md#deploy-flow) for the end-to-end dispatch + ACK + seal sequence.
|
||||
|
||||
## NodeId
|
||||
|
||||
Each cluster member has a `NodeId` derived as `{PublicHostname}:{Port}` of the Akka remote endpoint. `ClusterRoleInfo.LocalNode` + `ConfigPublishCoordinator.DiscoverDriverNodes()` use the same formula so they always agree. The port suffix makes loopback test deployments distinguishable (commit `5cfbe8b`); in production the hostname alone is already unique.
|
||||
|
||||
## Health endpoints
|
||||
|
||||
| Path | Returns 200 when… |
|
||||
|---|---|
|
||||
| `/healthz` | Process is alive (no checks). |
|
||||
| `/health/ready` | DB reachable + this node is `Up` in the cluster. |
|
||||
| `/health/active` | This node is the admin role-leader (used by Traefik/HA-LB to pin traffic). |
|
||||
|
||||
## What lives where (quick map)
|
||||
|
||||
| Concern | Project | Entry point |
|
||||
|---|---|---|
|
||||
| Read OTOPCUA_ROLES | `Cluster.RoleParser` | static `Parse(string?)` |
|
||||
| Cluster lifecycle | `Cluster.WithOtOpcUaClusterBootstrap` | extension on `AkkaConfigurationBuilder` |
|
||||
| Local node identity | `Cluster.IClusterRoleInfo.LocalNode` | DI singleton |
|
||||
| Admin singletons | `ControlPlane.WithOtOpcUaControlPlaneSingletons` | extension on `AkkaConfigurationBuilder` |
|
||||
| Driver actors | `Runtime.WithOtOpcUaRuntimeActors` | extension on `AkkaConfigurationBuilder` |
|
||||
| Auth pipeline | `Security.AddOtOpcUaAuth` + `MapOtOpcUaAuth` | extensions on `IServiceCollection` / `IEndpointRouteBuilder` |
|
||||
| OPC UA facade | `OpcUaServer.OpcUaApplicationHost` | runtime host, started by driver-role startup |
|
||||
| Health endpoints | `Host.Health.AddOtOpcUaHealth` + `MapOtOpcUaHealth` | extensions on `IServiceCollection` / `IEndpointRouteBuilder` |
|
||||
102
docs/v2/Cluster.md
Normal file
102
docs/v2/Cluster.md
Normal file
@@ -0,0 +1,102 @@
|
||||
# OtOpcUa.Cluster
|
||||
|
||||
Akka.NET cluster bootstrap + topology view. Used by every other server-side project to talk to the live cluster.
|
||||
|
||||
Path: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/`
|
||||
|
||||
## Public surface
|
||||
|
||||
| Type | Role |
|
||||
|---|---|
|
||||
| `AkkaClusterOptions` | DI-bound options from `appsettings.json::Cluster`. Hostname/Port/PublicHostname/SeedNodes/Roles. |
|
||||
| `IClusterRoleInfo` (interface in Commons) | Live view of cluster membership + role-leader topology. Thread-safe + event-raising. |
|
||||
| `ClusterRoleInfo` | Implementation. Subscribes to `ClusterEvent.IMemberEvent` + `RoleLeaderChanged` + `LeaderChanged`. |
|
||||
| `HoconLoader.LoadBaseConfig()` | Reads the embedded `Resources/akka.conf`. |
|
||||
| `RoleParser.Parse(string?)` | Parses `OTOPCUA_ROLES` env var into a deduped `string[]`. |
|
||||
| `ServiceCollectionExtensions.AddOtOpcUaCluster(configuration)` | Binds options + registers `IClusterRoleInfo` singleton. **Does not** start an ActorSystem. |
|
||||
| `WithOtOpcUaClusterBootstrap(serviceProvider)` | Extension on `AkkaConfigurationBuilder`. Loads embedded HOCON + applies `WithRemoting(...)` + `WithClustering(...)` from options. |
|
||||
|
||||
## Bootstrap flow
|
||||
|
||||
```csharp
|
||||
// Program.cs
|
||||
builder.Services.AddOtOpcUaCluster(builder.Configuration);
|
||||
|
||||
builder.Services.AddAkka("otopcua", (ab, sp) =>
|
||||
{
|
||||
ab.WithOtOpcUaClusterBootstrap(sp); // HOCON + remote + cluster
|
||||
// …singletons + node actors layered on
|
||||
});
|
||||
```
|
||||
|
||||
Order matters: `AddOtOpcUaCluster` must come before `AddAkka` so the options binding has run by the time the `AddAkka` lambda fires. Inside the lambda, `WithOtOpcUaClusterBootstrap` resolves `IOptions<AkkaClusterOptions>` from `sp` and writes them into the Akka builder.
|
||||
|
||||
The single ActorSystem this produces is what every other v2 piece runs on. There is no second Akka instance — that was a Phase 9 bug (commit `d6fac2d` consolidated).
|
||||
|
||||
## Embedded HOCON
|
||||
|
||||
`src/Core/ZB.MOM.WW.OtOpcUa.Cluster/Resources/akka.conf` contains:
|
||||
|
||||
| Setting | Value | Why |
|
||||
|---|---|---|
|
||||
| `akka.actor.provider` | `cluster` | Required for `Cluster.Get(system)` to work. |
|
||||
| `akka.cluster.split-brain-resolver.active-strategy` | `keep-oldest` | Smaller/younger side downs itself on partition. |
|
||||
| `akka.cluster.split-brain-resolver.stable-after` | `15s` | Time before SBR acts. |
|
||||
| `akka.cluster.failure-detector.threshold` | `10.0` | Higher than default (8.0) for GC-pause tolerance. |
|
||||
| `opcua-synchronized-dispatcher.type` | `PinnedDispatcher` | Dedicated thread for `OpcUaPublishActor` so SDK calls stay marshalled. |
|
||||
|
||||
The Cluster.Tests project verifies these key values stay correct (`HoconLoaderTests`).
|
||||
|
||||
## Configuration
|
||||
|
||||
```json
|
||||
{
|
||||
"Cluster": {
|
||||
"Hostname": "0.0.0.0",
|
||||
"Port": 4053,
|
||||
"PublicHostname": "node-a.lan",
|
||||
"SeedNodes": ["akka.tcp://otopcua@node-a.lan:4053"],
|
||||
"Roles": ["admin", "driver"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
- `Hostname`: interface to bind. `0.0.0.0` listens on every interface.
|
||||
- `Port`: TCP port for cluster gossip. Default 4053.
|
||||
- `PublicHostname`: address advertised in cluster gossip. Must be reachable by every other node.
|
||||
- `SeedNodes`: where new nodes go to join. List one (or two) stable nodes. First node bootstraps the cluster from its own address.
|
||||
- `Roles`: free-form tags Akka gossip propagates. v2 uses `admin` + `driver`; per-role wiring in `Program.cs` reads `OTOPCUA_ROLES` env var, not this list — these two should stay in sync.
|
||||
|
||||
## IClusterRoleInfo
|
||||
|
||||
Anywhere in the host that needs the local node's identity or a view of who-else-is-in-the-cluster, inject `IClusterRoleInfo`:
|
||||
|
||||
```csharp
|
||||
public sealed class MyService(IClusterRoleInfo cluster)
|
||||
{
|
||||
public NodeId Self => cluster.LocalNode;
|
||||
public IReadOnlyList<NodeId> Drivers => cluster.MembersWithRole("driver");
|
||||
public NodeId? AdminLeader => cluster.RoleLeader("admin");
|
||||
|
||||
public MyService(IClusterRoleInfo cluster)
|
||||
{
|
||||
cluster.RoleLeaderChanged += (_, e) =>
|
||||
Console.WriteLine($"role={e.Role}: {e.PreviousLeader} → {e.NewLeader}");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
`LocalNode` is `{PublicHostname}:{Port}` (the port suffix lets loopback test deployments stay distinct; production hostnames are already unique). `ConfigPublishCoordinator` uses the same `{host}:{port}` formula so the expected-ack set and the driver self-identification agree (commit `5cfbe8b`).
|
||||
|
||||
## Lifecycle
|
||||
|
||||
Akka.Hosting owns the lifecycle: `IHostedService` starts the ActorSystem at host start, runs `CoordinatedShutdown.ClusterLeavingReason` on host stop. The Cluster project does not register its own `IHostedService` (the v1 `AkkaHostedService` was deleted in commit `d6fac2d`).
|
||||
|
||||
## Tests
|
||||
|
||||
`tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests/` covers:
|
||||
|
||||
- `HoconLoaderTests` — embedded resource loads + key settings parse correctly.
|
||||
- `RoleParserTests` — comma-split + dedup + trim semantics.
|
||||
|
||||
Cross-project integration is in `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/` (cluster formation, deploy round-trip).
|
||||
99
docs/v2/ControlPlane.md
Normal file
99
docs/v2/ControlPlane.md
Normal file
@@ -0,0 +1,99 @@
|
||||
# OtOpcUa.ControlPlane
|
||||
|
||||
Five admin-role cluster singletons that drive the v2 deploy, audit, fleet, and redundancy stories. Path: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/`.
|
||||
|
||||
## Singletons
|
||||
|
||||
| Actor | File | Marker key | Role |
|
||||
|---|---|---|---|
|
||||
| `ConfigPublishCoordinator` | `Coordinators/ConfigPublishCoordinator.cs` | `ConfigPublishCoordinatorKey` | Dispatches `DispatchDeployment`, collects `ApplyAck`s, seals/fails/times-out. |
|
||||
| `AdminOperationsActor` | `AdminOperations/AdminOperationsActor.cs` | `AdminOperationsActorKey` | Receives `StartDeployment` from the UI, snapshots ConfigDb via `ConfigComposer`, persists `Deployment` row + `ConfigEdit` marker, tells the coordinator to dispatch. |
|
||||
| `AuditWriterActor` | `Audit/AuditWriterActor.cs` | `AuditWriterActorKey` | Batched `ConfigAuditLog` writer. Flushes every 500 events or 5 s. In-buffer dedup; cross-restart dedup tracked as F3. |
|
||||
| `FleetStatusBroadcaster` | `Fleet/FleetStatusBroadcaster.cs` | `FleetStatusBroadcasterKey` | Aggregates per-node `FleetNodeStatus` heartbeats; publishes `FleetStatusChanged` on the `fleet-status` DPS topic (SignalR bridge tracked as F16). |
|
||||
| `RedundancyStateActor` | `Redundancy/RedundancyStateActor.cs` | `RedundancyStateActorKey` | Cluster-event subscriber; debounces 250 ms; publishes `RedundancyStateChanged` on the `redundancy-state` DPS topic. |
|
||||
|
||||
All five register via `WithOtOpcUaControlPlaneSingletons()` (extension on `AkkaConfigurationBuilder`). Each uses `ClusterSingletonOptions { Role = "admin" }` so the singleton runs on the admin role-leader and migrates to the next admin node on failover.
|
||||
|
||||
```csharp
|
||||
// Program.cs (admin role only)
|
||||
builder.Services.AddAkka("otopcua", (ab, sp) =>
|
||||
{
|
||||
ab.WithOtOpcUaClusterBootstrap(sp);
|
||||
if (hasAdmin) ab.WithOtOpcUaControlPlaneSingletons();
|
||||
if (hasDriver) ab.WithOtOpcUaRuntimeActors();
|
||||
});
|
||||
```
|
||||
|
||||
Resolve from anywhere via `IRequiredActor<T>` or the `ActorRegistry`:
|
||||
|
||||
```csharp
|
||||
public sealed class AdminOperationsClient(ActorRegistry registry) : IAdminOperationsClient
|
||||
{
|
||||
private readonly IActorRef _proxy = registry.Get<AdminOperationsActorKey>();
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
## Deploy flow
|
||||
|
||||
```
|
||||
UI → IAdminOperationsClient.StartDeploymentAsync(createdBy)
|
||||
│ Ask the AdminOperationsActor singleton proxy
|
||||
▼
|
||||
AdminOperationsActor
|
||||
│ ConfigComposer.SnapshotAndFlattenAsync(db) → ConfigArtifact(blob, revHash)
|
||||
│ insert Deployment(Dispatching) + ConfigEdit marker
|
||||
│ Tell coordinator → DispatchDeployment
|
||||
▼
|
||||
ConfigPublishCoordinator
|
||||
│ DiscoverDriverNodes() → expected ACK set (host:port per member)
|
||||
│ insert NodeDeploymentState(Applying) per driver
|
||||
│ Publish DispatchDeployment on "deployments" topic
|
||||
│ Start apply-deadline timer (2 min default)
|
||||
▼ DistributedPubSub
|
||||
DriverHostActor (on each driver node — subscribed to "deployments")
|
||||
│ PreStart subscribed; current state Steady(rev)
|
||||
│ if currentRev == msg.rev → immediate ApplyAck(Applied) (idempotent)
|
||||
│ else Become(Applying) → write NodeDeploymentStatus → ApplyAck
|
||||
▼ via "deployment-acks" topic
|
||||
ConfigPublishCoordinator (subscribed to "deployment-acks" in PreStart)
|
||||
│ PersistNodeAck + collect
|
||||
│ all-Applied → Sealed
|
||||
│ any-Failed → PartiallyFailed
|
||||
│ deadline → TimedOut
|
||||
```
|
||||
|
||||
The dedicated `deployment-acks` topic + coordinator subscription was added in commit `5cfbe8b`. Before that, ACKs were published back on `deployments` and the coordinator (not subscribed) silently dropped them — deployments hung at `AwaitingApplyAcks` forever in multi-node tests.
|
||||
|
||||
### Failover recovery
|
||||
|
||||
If the admin singleton fails over mid-deploy, the new instance's `PreStart` queries `NodeDeploymentState` for any `Dispatching`/`AwaitingApplyAcks` row, rebuilds `_expectedAcks` + `_acks` from persisted state, and resumes the deadline timer. See `Coordinators/ConfigPublishCoordinator.cs::PreStart`.
|
||||
|
||||
## ConfigComposer
|
||||
|
||||
Pure function `SnapshotAndFlattenAsync(db) → ConfigArtifact(byte[], string)`:
|
||||
|
||||
1. Reads every live-edit table.
|
||||
2. Serialises to a stable byte[] (deterministic ordering).
|
||||
3. Computes SHA-256 over the bytes → 64-hex `RevisionHash`.
|
||||
|
||||
Same DB state → same artifact + same hash. That's what makes the `NoChanges` outcome work (AdminOperations compares the proposed hash to the last sealed deployment's hash).
|
||||
|
||||
## ServiceLevelCalculator
|
||||
|
||||
Pure function exposed at `Redundancy/ServiceLevelCalculator.Compute(NodeHealthInputs)`. Returns the OPC UA `ServiceLevel` byte per the truth table in [Redundancy.md](../Redundancy.md#servicelevel-tiers-part-5-65). No side effects; trivially unit-testable.
|
||||
|
||||
## DPS topics
|
||||
|
||||
| Topic | Publisher | Subscribers |
|
||||
|---|---|---|
|
||||
| `deployments` | ConfigPublishCoordinator | DriverHostActor (per-node) |
|
||||
| `deployment-acks` | DriverHostActor | ConfigPublishCoordinator |
|
||||
| `fleet-status` | FleetStatusBroadcaster | (SignalR bridge — F16) |
|
||||
| `redundancy-state` | RedundancyStateActor | (per-node ServiceLevel calc — F10) |
|
||||
|
||||
## Tests
|
||||
|
||||
`tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/` — 29 tests covering coordinator (happy path, timeout, failover recovery), AdminOps (StartDeployment outcomes), AuditWriter (batching, dedup), FleetStatusBroadcaster (heartbeat staleness), RedundancyStateActor (debounce, snapshot), ConfigComposer (purity), ServiceLevelCalculator (truth table).
|
||||
|
||||
Multi-node tests (cross-ActorSystem) are in `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/`.
|
||||
126
docs/v2/Runtime.md
Normal file
126
docs/v2/Runtime.md
Normal file
@@ -0,0 +1,126 @@
|
||||
# OtOpcUa.Runtime
|
||||
|
||||
Driver-role actor tree — one set per node. Path: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/`.
|
||||
|
||||
## Actor tree
|
||||
|
||||
```
|
||||
DriverHostActor (per node)
|
||||
│ state machine: Steady ⇄ Applying ⇄ Stale
|
||||
│
|
||||
├──▶ DriverInstanceActor (per configured DriverInstance row)
|
||||
│ state: Connecting → Connected → Reconnecting (or Stubbed)
|
||||
│
|
||||
├──▶ VirtualTagActor (per VirtualTag row)
|
||||
│ compiles + evaluates expression, publishes derived value
|
||||
│
|
||||
├──▶ ScriptedAlarmActor (per ScriptedAlarm row)
|
||||
│ state: Inactive ⇄ Active ⇄ Acknowledged
|
||||
│
|
||||
├──▶ OpcUaPublishActor (per node, pinned dispatcher)
|
||||
│ marshalled OPC UA SDK writes + RebuildAddressSpace
|
||||
│
|
||||
├──▶ HistorianAdapterActor (per node)
|
||||
│ pipe IPC to Wonderware historian sidecar
|
||||
│
|
||||
├──▶ PeerOpcUaProbeActor (per peer node)
|
||||
│ opc.tcp ping → redundancy-state DPS topic
|
||||
│
|
||||
└──▶ DbHealthProbeActor (per node)
|
||||
cached SELECT 1; consumed by /health/ready + redundancy calc
|
||||
```
|
||||
|
||||
## Public surface
|
||||
|
||||
| Type | File |
|
||||
|---|---|
|
||||
| `WithOtOpcUaRuntimeActors()` | `ServiceCollectionExtensions.cs` — extension on `AkkaConfigurationBuilder`. Spawns `DriverHostActor` + `DbHealthProbeActor` on the host's ActorSystem. |
|
||||
| `DriverHostActor` | `Drivers/DriverHostActor.cs` |
|
||||
| `DriverInstanceActor` | `Drivers/DriverInstanceActor.cs` |
|
||||
| `VirtualTagActor` | `VirtualTags/VirtualTagActor.cs` |
|
||||
| `ScriptedAlarmActor` | `ScriptedAlarms/ScriptedAlarmActor.cs` |
|
||||
| `OpcUaPublishActor` | `OpcUa/OpcUaPublishActor.cs` |
|
||||
| `HistorianAdapterActor` | `Historian/HistorianAdapterActor.cs` |
|
||||
| `PeerOpcUaProbeActor` | `Health/PeerOpcUaProbeActor.cs` |
|
||||
| `DbHealthProbeActor` | `Health/DbHealthProbeActor.cs` |
|
||||
|
||||
Marker keys for registry lookup: `DriverHostActorKey`, `DbHealthProbeActorKey`.
|
||||
|
||||
## DriverHostActor
|
||||
|
||||
Per-node supervisor with three Become states:
|
||||
|
||||
| State | Meaning |
|
||||
|---|---|
|
||||
| `Steady(rev)` | Caught up. `DispatchDeployment` with `msg.rev == currentRev` → immediate `ApplyAck(Applied)` (idempotent). New rev → `Become(Applying)`. |
|
||||
| `Applying(id)` | Apply in progress. Further `DispatchDeployment` for in-flight ID → debug-log + ignore. For new ID → defer via `Self.Forward`. |
|
||||
| `Stale` | ConfigDb unreachable on bootstrap. Periodic `RetryConfigDbConnection` tries to advance to `Steady`. |
|
||||
|
||||
`PreStart`:
|
||||
|
||||
1. Subscribe to `deployments` DPS topic.
|
||||
2. Read most-recent `NodeDeploymentState` for this node from ConfigDb.
|
||||
3. If `Applied` → restore `_currentRevision`, `Become(Steady)`.
|
||||
4. If `Applying` (orphan from crash) → replay apply (idempotent).
|
||||
5. If `Failed` → `Become(Steady)` at last known rev.
|
||||
6. DB unreachable → `Become(Stale)`, start retry timer.
|
||||
|
||||
ACK publishing: when no `_coordinatorOverride` is set (production), `SendAck` publishes on the dedicated `deployment-acks` DPS topic which the coordinator subscribes to (commit `5cfbe8b`).
|
||||
|
||||
## DriverInstanceActor
|
||||
|
||||
Per-driver-instance child. State machine:
|
||||
|
||||
- `Connecting` → first attempt to reach the underlying driver
|
||||
- `Connected` → subscriptions active, reads/writes flow
|
||||
- `Reconnecting` → temporary disconnect; backoff retry
|
||||
- `Stubbed` → DEV-STUB mode for Windows-only drivers (Galaxy, Wonderware Historian) on non-Windows or when `roles` contains `dev`
|
||||
|
||||
`ShouldStub(driverType, roles)` returns `true` for `"Galaxy" | "Historian.Wonderware"` on non-Windows; the actor goes straight to `Stubbed` and returns deterministic success without touching real hardware. Wiring this into the DriverHost child-spawn path is follow-up F20 (folds into F7).
|
||||
|
||||
Engine wiring (subscription publishing, ApplyDelta diff, bad-quality-on-disconnect, write path, supervisor backoff) is stubbed — tracked as F7. Tests exercise message contracts, not engine behaviour.
|
||||
|
||||
## VirtualTagActor / ScriptedAlarmActor
|
||||
|
||||
Skeleton state machines + message handlers. Engine work:
|
||||
|
||||
- `VirtualTagEngine.Evaluate()` not yet called from `VirtualTagActor.DependencyValueChanged` (F8).
|
||||
- `AlarmConditionService` not yet called from `ScriptedAlarmActor` (F9).
|
||||
- `ScriptedAlarmState` DB persistence on `PreRestart` not wired (F9).
|
||||
|
||||
## OpcUaPublishActor
|
||||
|
||||
The only actor on the **pinned dispatcher** (`opcua-synchronized-dispatcher` from `akka.conf`). All OPC UA SDK address-space writes go through it so the SDK's threading model isn't violated.
|
||||
|
||||
Message contracts are defined; actual SDK calls are stubbed (counters only). Real address-space writes + `ServiceLevel` Variable updates + `RebuildAddressSpace` after a deploy land in F10 (gated on F13 — full `OpcUaApplicationHost` extraction).
|
||||
|
||||
## HistorianAdapterActor, PeerOpcUaProbeActor
|
||||
|
||||
Both have message contracts wired. Engine integration deferred:
|
||||
|
||||
- `HistorianAdapterActor` — named-pipe IPC to the Wonderware historian sidecar + `SqliteStoreAndForwardSink` (F11).
|
||||
- `PeerOpcUaProbeActor` — real `opc.tcp://peer:4840` ping (F12). Current stub always returns `Ok=true`.
|
||||
|
||||
## DbHealthProbeActor
|
||||
|
||||
`Ask<DbHealthStatus>` returns cached state (refreshed every 5 s by an internal `SELECT 1`). Consumed by `/health/ready` and `RedundancyStateActor`.
|
||||
|
||||
## Lifecycle wiring
|
||||
|
||||
```csharp
|
||||
// Program.cs (driver role only)
|
||||
builder.Services.AddAkka("otopcua", (ab, sp) =>
|
||||
{
|
||||
ab.WithOtOpcUaClusterBootstrap(sp);
|
||||
if (hasAdmin) ab.WithOtOpcUaControlPlaneSingletons();
|
||||
if (hasDriver) ab.WithOtOpcUaRuntimeActors();
|
||||
});
|
||||
```
|
||||
|
||||
`WithOtOpcUaRuntimeActors` resolves `IDbContextFactory<OtOpcUaConfigDbContext>` + `IClusterRoleInfo` from DI, then spawns `DbHealthProbeActor` and `DriverHostActor` as top-level `/user/` actors. Both register marker keys in `ActorRegistry` so the registry lookup works from anywhere.
|
||||
|
||||
## Tests
|
||||
|
||||
`tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/` — 16 tests covering DriverHostActor (Steady ack, Applying transitions, Stale recovery), DriverInstanceActor (state machine, stub mode), VirtualTagActor + ScriptedAlarmActor (message contracts), OpcUaPublishActor (props + message acceptance), DbHealthProbe + PeerOpcUaProbe (probe loop), and the `WithOtOpcUaRuntimeActors` registration round-trip.
|
||||
|
||||
End-to-end deploy from admin → driver via the cluster is in `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DeployHappyPathTests.cs`.
|
||||
Reference in New Issue
Block a user