# Code Review: Cluster

| Field | Value |
|---|---|
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Cluster` |
| Reviewer | Claude Code |
| Review date | 2026-06-19 |
| Commit reviewed | `7286d320` |
| Status | Reviewed |
| Open findings | 0 |

## Checklist coverage

| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Cluster-001 (double-snapshot in seed; resolved), Cluster-002 (stale-member lag; deferred) |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found: all mutable state is protected by `_lock`; the event is raised outside the lock |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | Cluster-003 (redundant double-dispose in `HoconLoader`; resolved) |
| 7 | Design-document adherence | Cluster-004 (stale `akka.conf` header comments; resolved) |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Cluster-005 (`ServiceLevelCalculatorTests` in wrong project; resolved) |
| 10 | Documentation & comments | Cluster-004 (stale `akka.conf` header comments; resolved) |

## Findings

### Cluster-001

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ClusterRoleInfo.cs:100-105` |
| Status | Resolved |

**Description:** `SeedFromCurrentState` takes a snapshot of `_cluster.State` into `snapshot` at line 89, then iterates `snapshot.Members` for roles but calls `_cluster.State.RoleLeader(role)` at line 102, which is a second, live read of the cluster state. Between these two reads the cluster state could advance. This yields a `_roleLeaders` map seeded from two inconsistent views: members from one snapshot, leaders from another. In the worst case the leader address is not in `snapshot.Members`, so `snapshot.Members.FirstOrDefault(m => m.Address == leaderAddr)` returns `null` and the role is seeded as leaderless.
The first `RoleLeaderChanged` event self-corrects.

**Recommendation:** Replace `_cluster.State.RoleLeader(role)` with `snapshot.RoleLeader(role)`. `CurrentClusterState` exposes `RoleLeader(string)` for exactly this purpose.

**Resolution:** Fixed 2026-06-19. Changed line 102 to call `snapshot.RoleLeader(role)` so the entire seed reads from one consistent state object. A unit-level regression test requires a live Akka cluster (TestKit); the fix is verified by code inspection and a clean module build. Cross-cluster regression coverage lives in `Host.IntegrationTests`.

---

### Cluster-002

| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/ClusterRoleInfo.cs:110-133` |
| Status | Deferred |

**Description:** `HandleMemberEvent` only adds members on `MemberUp` and removes them on `MemberRemoved`. `MemberLeft`, `MemberExited`, `MemberWeaklyUp`, and `MemberDowned` (all `IMemberEvent` implementors) are silently ignored by the `switch` default case. A member that is gracefully leaving stays in `_membersByRole` until the `Removed` transition (several seconds). `MembersWithRole` currently has no production callers, so this causes no observable misbehaviour, but a future caller that needs leaving-aware membership would silently receive stale data.

**Recommendation:** Add cases for `MemberLeft`, `MemberExited`, and `MemberDowned` that remove the member from `_membersByRole` immediately.

**Resolution:** Deferred. `MembersWithRole` has no production callers today; the stale window is bounded and self-corrects on `MemberRemoved`. Revisit when a caller is added.

---

### Cluster-003

| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/HoconLoader.cs:12-16` |
| Status | Resolved |

**Description:** `LoadBaseConfig` has two nested `using` scopes: one for `stream` and one for `reader`.
`StreamReader.Dispose` closes its underlying stream; the outer `using var stream` then disposes the same `UnmanagedMemoryStream` a second time. The second dispose is harmless for .NET streams, but the pattern is confusing and misrepresents ownership: `StreamReader` should exclusively own the stream.

**Recommendation:** Remove the outer `using var stream`; inline the null-guard into the `StreamReader` constructor call.

**Resolution:** Fixed 2026-06-19. Removed the redundant outer `using var stream`; the `StreamReader` now exclusively owns the stream lifetime. Existing `HoconLoaderTests` remain green.

---

### Cluster-004

| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/Resources/akka.conf:4,7` |
| Status | Resolved |

**Description:** The `akka.conf` header contains two stale references:

1. Line 4 references `AkkaHostedService.cs`, which was deleted in commit `d6fac2d` when the bootstrap migrated to `ServiceCollectionExtensions.WithOtOpcUaClusterBootstrap` + Akka.Hosting.
2. Line 7 attributes the tuning to `ScadaLink`, a different project unrelated to this codebase.

**Recommendation:** Update the header to reference `WithOtOpcUaClusterBootstrap` and remove the `ScadaLink` attribution.

**Resolution:** Fixed 2026-06-19. Updated the `akka.conf` header to reference `ServiceCollectionExtensions.WithOtOpcUaClusterBootstrap`; removed the stale `ScadaLink` attribution.

---

### Cluster-005

| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/ServiceLevelCalculatorTests.cs` |
| Status | Resolved |

**Description:** `ServiceLevelCalculator` was moved from `ControlPlane` to `Cluster.Redundancy` (per `docs/Redundancy.md`), but `ServiceLevelCalculatorTests` remained in `ControlPlane.Tests`. As a result, the `Cluster.Tests` project has no direct coverage of the module's most complex logic.

**Recommendation:** Move the tests to `Cluster.Tests`. As the `ControlPlane.Tests` file is outside this review scope, add equivalent tests directly in `Cluster.Tests`.

**Resolution:** Fixed 2026-06-19. Added `ServiceLevelCalculatorTests` to `Cluster.Tests` covering all tiers. The `ControlPlane.Tests` copy is left in place (out of scope).