Files
lmxopcua/docs/v2/Cluster.md
Joseph Doherty 1689901c0e docs(v2): Architecture-v2 + Cluster + ControlPlane + Runtime overviews (Task 65)
Four new docs at docs/v2/ giving a single-page tour of each v2 piece:
- Architecture-v2.md: top-level mental model (fused Host + roles + cluster + live-edit)
- Cluster.md: AkkaClusterOptions + IClusterRoleInfo + WithOtOpcUaClusterBootstrap
- ControlPlane.md: 5 admin singletons + DPS topics + deploy flow + failover recovery
- Runtime.md: per-node actor tree + state machines + engine-wiring follow-up map

Each links back to the design doc for depth. Architecture-v2 cross-references
the other three + ServiceHosting + Redundancy + security.
2026-05-26 06:41:48 -04:00

103 lines
4.8 KiB
Markdown

# OtOpcUa.Cluster
Akka.NET cluster bootstrap + topology view. Used by every other server-side project to talk to the live cluster.
Path: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/`
## Public surface
| Type | Role |
|---|---|
| `AkkaClusterOptions` | DI-bound options from `appsettings.json::Cluster`. Hostname/Port/PublicHostname/SeedNodes/Roles. |
| `IClusterRoleInfo` (interface in Commons) | Live view of cluster membership + role-leader topology. Thread-safe + event-raising. |
| `ClusterRoleInfo` | Implementation. Subscribes to `ClusterEvent.IMemberEvent` + `RoleLeaderChanged` + `LeaderChanged`. |
| `HoconLoader.LoadBaseConfig()` | Reads the embedded `Resources/akka.conf`. |
| `RoleParser.Parse(string?)` | Parses `OTOPCUA_ROLES` env var into a deduped `string[]`. |
| `ServiceCollectionExtensions.AddOtOpcUaCluster(configuration)` | Binds options + registers `IClusterRoleInfo` singleton. **Does not** start an ActorSystem. |
| `WithOtOpcUaClusterBootstrap(serviceProvider)` | Extension on `AkkaConfigurationBuilder`. Loads embedded HOCON + applies `WithRemoting(...)` + `WithClustering(...)` from options. |
## Bootstrap flow
```csharp
// Program.cs
builder.Services.AddOtOpcUaCluster(builder.Configuration);
builder.Services.AddAkka("otopcua", (ab, sp) =>
{
ab.WithOtOpcUaClusterBootstrap(sp); // HOCON + remote + cluster
// …singletons + node actors layered on
});
```
Order matters: `AddOtOpcUaCluster` must come before `AddAkka` so the options binding has run by the time the `AddAkka` lambda fires. Inside the lambda, `WithOtOpcUaClusterBootstrap` resolves `IOptions<AkkaClusterOptions>` from `sp` and writes them into the Akka builder.
The single ActorSystem this produces is what every other v2 piece runs on. There is no second Akka instance — that was a Phase 9 bug (commit `d6fac2d` consolidated).
## Embedded HOCON
`src/Core/ZB.MOM.WW.OtOpcUa.Cluster/Resources/akka.conf` contains:
| Setting | Value | Why |
|---|---|---|
| `akka.actor.provider` | `cluster` | Required for `Cluster.Get(system)` to work. |
| `akka.cluster.split-brain-resolver.active-strategy` | `keep-oldest` | Smaller/younger side downs itself on partition. |
| `akka.cluster.split-brain-resolver.stable-after` | `15s` | Time before SBR acts. |
| `akka.cluster.failure-detector.threshold` | `10.0` | Higher than default (8.0) for GC-pause tolerance. |
| `opcua-synchronized-dispatcher.type` | `PinnedDispatcher` | Dedicated thread for `OpcUaPublishActor` so SDK calls stay marshalled. |
The Cluster.Tests project verifies these key values stay correct (`HoconLoaderTests`).
## Configuration
```json
{
"Cluster": {
"Hostname": "0.0.0.0",
"Port": 4053,
"PublicHostname": "node-a.lan",
"SeedNodes": ["akka.tcp://otopcua@node-a.lan:4053"],
"Roles": ["admin", "driver"]
}
}
```
- `Hostname`: interface to bind. `0.0.0.0` listens on every interface.
- `Port`: TCP port for cluster gossip. Default 4053.
- `PublicHostname`: address advertised in cluster gossip. Must be reachable by every other node.
- `SeedNodes`: where new nodes go to join. List one (or two) stable nodes. First node bootstraps the cluster from its own address.
- `Roles`: free-form tags Akka gossip propagates. v2 uses `admin` + `driver`; per-role wiring in `Program.cs` reads `OTOPCUA_ROLES` env var, not this list — these two should stay in sync.
## IClusterRoleInfo
Anywhere in the host that needs the local node's identity or a view of who-else-is-in-the-cluster, inject `IClusterRoleInfo`:
```csharp
public sealed class MyService(IClusterRoleInfo cluster)
{
public NodeId Self => cluster.LocalNode;
public IReadOnlyList<NodeId> Drivers => cluster.MembersWithRole("driver");
public NodeId? AdminLeader => cluster.RoleLeader("admin");
public MyService(IClusterRoleInfo cluster)
{
cluster.RoleLeaderChanged += (_, e) =>
Console.WriteLine($"role={e.Role}: {e.PreviousLeader} → {e.NewLeader}");
}
}
```
`LocalNode` is `{PublicHostname}:{Port}` (the port suffix lets loopback test deployments stay distinct; production hostnames are already unique). `ConfigPublishCoordinator` uses the same `{host}:{port}` formula so the expected-ack set and the driver self-identification agree (commit `5cfbe8b`).
## Lifecycle
Akka.Hosting owns the lifecycle: `IHostedService` starts the ActorSystem at host start, runs `CoordinatedShutdown.ClusterLeavingReason` on host stop. The Cluster project does not register its own `IHostedService` (the v1 `AkkaHostedService` was deleted in commit `d6fac2d`).
## Tests
`tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests/` covers:
- `HoconLoaderTests` — embedded resource loads + key settings parse correctly.
- `RoleParserTests` — comma-split + dedup + trim semantics.
Cross-project integration is in `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/` (cluster formation, deploy round-trip).