fix(cluster-infrastructure): resolve ClusterInfrastructure-002..006 — options validation, DI registration, down-if-alone
@@ -8,7 +8,7 @@
 | Last reviewed | 2026-05-16 |
 | Reviewer | claude-agent |
 | Commit reviewed | `9c60592` |
-| Open findings | 7 |
+| Open findings | 3 |

 ## Summary

@@ -144,7 +144,7 @@ module-ownership claim was wrong. Module test suite green (3 passed).
 |--|--|
 | Severity | Medium |
 | Category | Correctness & logic bugs |
-| Status | Open |
+| Status | Resolved |
 | Location | `src/ScadaLink.ClusterInfrastructure/ServiceCollectionExtensions.cs:7-17` |

 **Description**
@@ -167,7 +167,23 @@ with the genuine registration when CI-001 is addressed.

 **Resolution**

-_Unresolved._
+Confirmed against the source: both methods returned the `IServiceCollection`
+unchanged. Verified the consumers — `ScadaLink.Host` calls `AddClusterInfrastructure()`
+(`Program.cs:68`, `SiteServiceRegistration.cs:24`); `AddClusterInfrastructureActors`
+is dead — it is called nowhere in the solution.
+
+**Resolved** — fixing commit `commit pending`, date 2026-05-16.
+`AddClusterInfrastructure` now does real work: it registers the
+`ClusterOptionsValidator` (CI-004) via `TryAddEnumerable`, so the method is no longer a
+no-op and a misconfigured `ScadaLink:Cluster` section fails fast on the first
+`IOptions<ClusterOptions>` resolution. `AddClusterInfrastructureActors` — which this
+component never had any actors to register, as CI-001 established the Akka bootstrap
+lives in `ScadaLink.Host` — now throws `NotImplementedException` with a message
+pointing the caller to the Host, rather than masquerading as a completed registration.
+Covered by `ServiceCollectionExtensionsTests`
+(`AddClusterInfrastructure_RegistersOptionsValidator`,
+`AddClusterInfrastructure_ValidatorRejectsBadOptionsAtResolution`,
+`AddClusterInfrastructureActors_ThrowsRatherThanSilentlySucceeding`).

 ### ClusterInfrastructure-003 — ClusterOptions omits several documented node-configuration settings

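The CI-002 resolution in the hunk above describes two behaviours: registering the validator through `TryAddEnumerable`, and making the actor-registration method throw instead of silently succeeding. A minimal sketch of that shape follows. It assumes the extension methods live in a static `ServiceCollectionExtensions` class; the stub `ClusterOptions`, the one-rule validator body, and the exception message are placeholders for illustration, not the committed code.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Options;

// Placeholder stand-ins so the sketch compiles on its own; the real types carry more members.
public sealed class ClusterOptions
{
    public int MinNrOfMembers { get; set; } = 1;
}

public sealed class ClusterOptionsValidator : IValidateOptions<ClusterOptions>
{
    public ValidateOptionsResult Validate(string? name, ClusterOptions options) =>
        options.MinNrOfMembers == 1
            ? ValidateOptionsResult.Success
            : ValidateOptionsResult.Fail("MinNrOfMembers must be 1.");
}

public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddClusterInfrastructure(this IServiceCollection services)
    {
        // TryAddEnumerable skips the registration when the same service/implementation
        // pair is already present, so a repeated call does not duplicate the validator.
        services.TryAddEnumerable(
            ServiceDescriptor.Singleton<IValidateOptions<ClusterOptions>, ClusterOptionsValidator>());
        return services;
    }

    public static IServiceCollection AddClusterInfrastructureActors(this IServiceCollection services)
    {
        // Assumed wording; the review text only says the message points the caller to the Host.
        throw new NotImplementedException(
            "ClusterInfrastructure registers no actors; the Akka bootstrap lives in ScadaLink.Host.");
    }
}
```

Because `TryAddEnumerable` is idempotent for a given service and implementation pair, calling `AddClusterInfrastructure()` more than once on the same collection should leave a single validator registered.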
@@ -175,7 +191,7 @@ _Unresolved._
 |--|--|
 | Severity | Medium |
 | Category | Design-document adherence |
-| Status | Open |
+| Status | Resolved |
 | Location | `src/ScadaLink.ClusterInfrastructure/ClusterOptions.cs:3-11` |

 **Description**
@@ -202,7 +218,27 @@ agree on where each value lives.

 **Resolution**

-_Unresolved._
+Partially re-triaged. Verified against the source: most of the "missing" settings are
+**deliberately owned by `ScadaLink.Host.NodeOptions`** — `NodeOptions` already carries
+`Role`, `NodeHostname`, `SiteId`, `RemotingPort` and `GrpcPort`, and `AkkaHostedService`
+builds the HOCON from `NodeOptions` for exactly those values. Local SQLite storage paths
+live in the database / store-and-forward options. This is the ownership split CI-001
+established (the Host owns node identity and bootstrap; this project owns the
+cluster-formation contract), so those settings do **not** belong in `ClusterOptions`.
+
+The one genuine gap the finding identifies is `down-if-alone`, which the design doc
+puts with the split-brain settings.
+
+**Resolved** — fixing commit `commit pending`, date 2026-05-16. Added the
+`DownIfAlone` boolean (default `true`) to `ClusterOptions` so the split-brain
+configuration contract is complete, and added a class-level XML doc that records the
+deliberate ownership split — node identity/remoting/gRPC in `Host.NodeOptions`, storage
+paths in the database options, cluster-formation settings here — so the design doc and
+the options classes now agree on where each value lives. (`AkkaHostedService` currently
+hard-codes `down-if-alone = on` in HOCON; wiring it to read `DownIfAlone` is a one-line
+`ScadaLink.Host` change, outside this module's permitted edit scope, and is noted for
+the Host's review.) Covered by `ClusterOptionsTests.DefaultValues_AreCorrect` and
+`ClusterOptionsTests.DownIfAlone_CanBeSet`.

 ### ClusterInfrastructure-004 — ClusterOptions has no validation despite safety-critical values

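The CI-003 resolution above adds `DownIfAlone` to `ClusterOptions` and notes that the Host still hard-codes `down-if-alone = on`. The sketch below illustrates how the flag could be rendered into the corresponding HOCON key. The trimmed `ClusterOptions` class and the `SplitBrainHocon.Render` helper are hypothetical names for illustration; nothing here is taken from `AkkaHostedService`.

```csharp
using System;

// Hypothetical, trimmed-down stand-in for ClusterOptions: only the property added by CI-003.
public sealed class ClusterOptions
{
    /// <summary>Maps to the keep-oldest resolver's down-if-alone flag (default true).</summary>
    public bool DownIfAlone { get; set; } = true;
}

public static class SplitBrainHocon
{
    // Hypothetical helper: renders the keep-oldest fragment from the bound options
    // instead of hard-coding "on".
    public static string Render(ClusterOptions options)
    {
        var flag = options.DownIfAlone ? "on" : "off";
        return "akka.cluster.split-brain-resolver.keep-oldest.down-if-alone = " + flag;
    }
}
```

With the default options this yields the same `down-if-alone = on` setting the Host is described as hard-coding today, so a change along these lines would alter behaviour only when the option is explicitly set to `false`.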
@@ -210,7 +246,7 @@ _Unresolved._
 |--|--|
 | Severity | Medium |
 | Category | Code organization & conventions |
-| Status | Open |
+| Status | Resolved |
 | Location | `src/ScadaLink.ClusterInfrastructure/ClusterOptions.cs:3-11` |

 **Description**
@@ -239,7 +275,26 @@ FailureDetectionThreshold` and positive `StableAfter`. Register it with

 **Resolution**

-_Unresolved._
+Confirmed: `ClusterOptions` had no validation of any kind, and the design doc's
+catastrophic-misconfiguration values (`MinNrOfMembers: 2`, a quorum split-brain
+strategy) would have been bound silently.
+
+**Resolved** — fixing commit `commit pending`, date 2026-05-16. Added
+`ClusterOptionsValidator : IValidateOptions<ClusterOptions>`, which enforces
+`MinNrOfMembers == 1`, restricts `SplitBrainResolverStrategy` to the
+`keep-oldest`-only allowed set, requires a non-empty `SeedNodes`, requires positive
+`StableAfter` / `HeartbeatInterval` / `FailureDetectionThreshold`, and asserts
+`HeartbeatInterval < FailureDetectionThreshold`. It accumulates every failure into one
+result. It is registered by `AddClusterInfrastructure()` (CI-002) as a singleton
+`IValidateOptions<ClusterOptions>`, so a misconfigured section throws
+`OptionsValidationException` on the first `IOptions<ClusterOptions>.Value` resolution
+— which `AkkaHostedService` performs during startup, giving the fail-fast-at-boot
+behaviour the recommendation asks for without the src project taking a dependency on
+the full `Microsoft.Extensions.DependencyInjection` package needed for the
+`ValidateOnStart()` overload. Data annotations were not used — a single
+`IValidateOptions` implementation expresses the interdependent timing rules that
+attributes cannot. Covered by `ClusterOptionsValidatorTests` (8 cases) and
+`ServiceCollectionExtensionsTests.AddClusterInfrastructure_ValidatorRejectsBadOptionsAtResolution`.

 ### ClusterInfrastructure-005 — No configuration section name constant for the Options pattern binding

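The rules listed in the CI-004 resolution above can be sketched as a single accumulating `IValidateOptions<ClusterOptions>` implementation. This is an illustration under stated assumptions, not the committed validator: the property names come from the review text, but the property types (`TimeSpan`, `List<string>`), the default values other than `MinNrOfMembers`, the ordinal string comparison, and every failure message are assumed.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Options;

// Assumed shape: names follow the review text; types and timing defaults are placeholders.
public sealed class ClusterOptions
{
    public List<string> SeedNodes { get; set; } = new List<string>();
    public string SplitBrainResolverStrategy { get; set; } = "keep-oldest";
    public TimeSpan StableAfter { get; set; } = TimeSpan.FromSeconds(15);
    public TimeSpan HeartbeatInterval { get; set; } = TimeSpan.FromSeconds(2);
    public TimeSpan FailureDetectionThreshold { get; set; } = TimeSpan.FromSeconds(10);
    public int MinNrOfMembers { get; set; } = 1;
}

public sealed class ClusterOptionsValidator : IValidateOptions<ClusterOptions>
{
    // Assumption: the allowed set is matched case-sensitively.
    private static readonly HashSet<string> AllowedStrategies =
        new HashSet<string>(StringComparer.Ordinal) { "keep-oldest" };

    public ValidateOptionsResult Validate(string? name, ClusterOptions options)
    {
        // Collect every failure so one exception reports all problems at once.
        var failures = new List<string>();

        if (options.MinNrOfMembers != 1)
            failures.Add("MinNrOfMembers must be 1.");

        if (options.SplitBrainResolverStrategy is null ||
            !AllowedStrategies.Contains(options.SplitBrainResolverStrategy))
            failures.Add("SplitBrainResolverStrategy must be 'keep-oldest'.");

        if (options.SeedNodes is null || options.SeedNodes.Count == 0)
            failures.Add("SeedNodes must contain at least one entry.");

        if (options.StableAfter <= TimeSpan.Zero)
            failures.Add("StableAfter must be positive.");
        if (options.HeartbeatInterval <= TimeSpan.Zero)
            failures.Add("HeartbeatInterval must be positive.");
        if (options.FailureDetectionThreshold <= TimeSpan.Zero)
            failures.Add("FailureDetectionThreshold must be positive.");

        // Interdependent rule: this is the kind of check single-property attributes cannot express.
        if (options.HeartbeatInterval >= options.FailureDetectionThreshold)
            failures.Add("HeartbeatInterval must be less than FailureDetectionThreshold.");

        return failures.Count == 0
            ? ValidateOptionsResult.Success
            : ValidateOptionsResult.Fail(failures);
    }
}
```

Once a validator like this is registered as `IValidateOptions<ClusterOptions>`, the options infrastructure runs it when `IOptions<ClusterOptions>.Value` is first read and throws `OptionsValidationException` carrying the accumulated failures, which matches the fail-at-first-resolution behaviour the resolution describes.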
@@ -276,7 +331,7 @@ _Unresolved._
 |--|--|
 | Severity | Medium |
 | Category | Testing coverage |
-| Status | Open |
+| Status | Resolved |
 | Location | `tests/ScadaLink.ClusterInfrastructure.Tests/ClusterOptionsTests.cs:1-51` |

 **Description**
@@ -301,7 +356,28 @@ from `ClusterOptions` and for the options validation from CI-004.

 **Resolution**

-_Unresolved._
+Re-triaged in light of CI-001's resolution. The Akka bootstrap, HOCON generation,
+cluster formation, failover and singleton handover are owned by `ScadaLink.Host`, not
+this project — multi-node `Akka.Cluster.TestKit` tests for that behaviour belong in the
+Host's test suite, outside this module's scope. What this module legitimately owns is
+`ClusterOptions`, its validator, and the DI registration, and the testing gap there is
+now closed.
+
+**Resolved** — fixing commit `commit pending`, date 2026-05-16. Added two test classes
+to `tests/ScadaLink.ClusterInfrastructure.Tests`: `ClusterOptionsValidatorTests`
+(8 cases — valid defaults pass; `MinNrOfMembers != 1`, unsupported split-brain
+strategies, empty seed nodes, heartbeat not below the failure threshold, non-positive
+`StableAfter` all fail; and a multi-failure accumulation case) and
+`ServiceCollectionExtensionsTests` (3 cases — `AddClusterInfrastructure` registers the
+validator, the validator rejects bad options at `IOptions` resolution, and
+`AddClusterInfrastructureActors` throws). The pre-existing `ClusterOptionsTests` was
+extended with `DownIfAlone` coverage. The test project gained references to
+`Microsoft.Extensions.DependencyInjection` and `Microsoft.Extensions.Options`. Module
+test suite green: 16 passed (was 3). Note: the `keep-majority` value used in the
+pre-existing `ClusterOptionsTests.Properties_CanBeSetToCustomValues` is intentionally
+left — that test exercises the POCO's property setter (the POCO accepts any string by
+design); `ClusterOptionsValidator` is the layer that now rejects `keep-majority`, and
+`UnsupportedSplitBrainStrategy_FailsValidation` proves it.

 ### ClusterInfrastructure-007 — ClusterOptions lacks XML documentation comments
