From 393b746d9be8dd658d8c44f505786f46fe406633 Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Mon, 15 Jun 2026 13:30:55 -0400 Subject: [PATCH] docs(redundancy): sync component table with the wired calculator + PeerProbeSupervisor --- docs/Redundancy.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/docs/Redundancy.md b/docs/Redundancy.md index 06f0f2f4..7a2252c5 100644 --- a/docs/Redundancy.md +++ b/docs/Redundancy.md @@ -13,11 +13,12 @@ The runtime pieces live in: | Component | Project | Role | |---|---|---| | `RedundancyStateActor` | `OtOpcUa.ControlPlane.Redundancy` | Admin-role cluster singleton; subscribes to cluster topology events, debounces 250ms, broadcasts `RedundancyStateChanged` on the `redundancy-state` DPS topic. | -| `OpcUaPublishActor` | `OtOpcUa.Runtime.OpcUa` | Per-driver-node; subscribes to the `redundancy-state` topic, maps the local node's role to a ServiceLevel byte (see below), and forwards it to `IServiceLevelPublisher`. | +| `OpcUaPublishActor` | `OtOpcUa.Runtime.OpcUa` | Per-driver-node; subscribes to the `redundancy-state` topic, computes a health-aware ServiceLevel byte via `ServiceLevelCalculator` (see below), and forwards it to `IServiceLevelPublisher`. | | `IServiceLevelPublisher` / `SdkServiceLevelPublisher` | `OtOpcUa.Commons.OpcUa` / `OtOpcUa.OpcUaServer` | Writes the byte into the SDK's `Server.ServiceLevel` Variable. Production binds `DeferredServiceLevelPublisher`, which swaps in the real `SdkServiceLevelPublisher` once the SDK is up (it needs `IServerInternal`, available only after `StandardServer.Start`); until then writes route through `NullServiceLevelPublisher`. | | `ServiceLevelCalculator` | `OtOpcUa.Cluster.Redundancy` (`Core.Cluster`) | Pure function `(NodeHealthInputs) → byte` — the DB/probe-aware tiering (see truth table below). Covered by `ServiceLevelCalculatorTests`. **Now the live publish path** — `OpcUaPublishActor` calls it on every `HealthTick` and `RedundancyStateChanged` event. Moved to `Core.Cluster` so Runtime can reach it without a Runtime→ControlPlane reference. | -| `DbHealthProbeActor` | `OtOpcUa.Runtime.Health` | Per-node; runs `SELECT 1` against ConfigDb every 5s. Read by health endpoint. | -| `PeerOpcUaProbeActor` | `OtOpcUa.Runtime.Health` | Per-node; pings peer `opc.tcp://peer:4840` with a TCP connect (2s timeout) and publishes the result on the `redundancy-state` topic. A full secure-channel Hello handshake is a possible future upgrade; the TCP connect is the current real probe. | +| `DbHealthProbeActor` | `OtOpcUa.Runtime.Health` | Per-node; runs `SELECT 1` against ConfigDb every 5s. Read by the health endpoint AND by `OpcUaPublishActor` (the `DbReachable` ServiceLevel input). | +| `PeerProbeSupervisor` | `OtOpcUa.Runtime.Health` | Per-node; subscribes to the `redundancy-state` topic and maintains one `PeerOpcUaProbeActor` child per OTHER driver-role peer (spawn on join, stop on departure), so every node is continuously probed by its peers. | +| `PeerOpcUaProbeActor` | `OtOpcUa.Runtime.Health` | Spawned by `PeerProbeSupervisor`; pings a peer `opc.tcp://peer:4840` with a TCP connect (2s timeout) and publishes `OpcUaProbeResult` on the `redundancy-state` topic. A full secure-channel Hello handshake is a possible future upgrade; the TCP connect is the current real probe. | | `ClusterRoleInfo` | `OtOpcUa.Cluster` | Live view of cluster membership + role-leader; exposes `IClusterRoleInfo` to the rest of the host. | ## ServiceLevel tiers