# Configuration validation: normalized target spec

Status: **Draft**. The single design the sister projects converge on for **startup
configuration validation**. Derived from the three code-verified current-state docs
(`../current-state/`). Goal is *path to shared code*
(`../shared-contract/ZB.MOM.WW.Configuration.md`), so each normalized section maps to a shared
library seam. The library is **already built** at
[`../../../ZB.MOM.WW.Configuration/`](../../../ZB.MOM.WW.Configuration/) (`0.1.0`, 27 tests).

## 0. Scope

The common concern is **fail-fast validation of configuration at process startup**: bind an
`appsettings.json` / environment section to a typed options object (or read raw keys before the
host exists), check every field, and refuse to start when anything is wrong, surfacing **all**
problems at once so an operator fixes them in one edit rather than one boot-loop per typo. All
three apps already do this; they do it with three private copies of the same plumbing.

**Normalized here** (goes in the shared `ZB.MOM.WW.Configuration` library):

- **The `IValidateOptions<T>` failure-accumulation convention.** Every app hand-rolls a
  `List<string> failures`, a pile of `if (...) failures.Add(...)`, and the
  `failures.Count == 0 ? Success : Fail(failures)` tail. That plumbing becomes
  `OptionsValidatorBase<TOptions>`: override `protected void Validate(ValidationBuilder, TOptions)`,
  record failures on the builder, and the base aggregates them and returns a single
  `ValidateOptionsResult` (Success only when the builder is clean).
- **Reusable rule primitives.** The same checks recur across apps: required-string, TCP port
  range, `host:port` endpoint, positive `TimeSpan`, one-of-a-set, minimum collection count. They
  become `ValidationBuilder` primitives (`Required`, `Port`, `HostPort`, `PositiveTimeSpan`,
  `OneOf`, `MinCount`) plus `RequireThat(bool, message)` / `Add(message)` escape hatches for
  custom and cross-field rules. Wording is centralized in an internal `Checks` seam so a
  given rule reads identically everywhere.
- **`AddValidatedOptions<TOptions, TValidator>(IConfiguration, sectionPath)`**: one DI call that
  binds the section, registers the validator as the options' `IValidateOptions<TOptions>`, and
  enables `ValidateOnStart()`. Replaces the per-module `AddOptions().Bind(...).ValidateOnStart()` +
  `AddSingleton<IValidateOptions<...>, ...>()` pair that each app open-codes.
- **The pre-host `ConfigPreflight` aggregator**: a fluent checker over raw `IConfiguration` for
  the keys that must be valid *before* the host / DI container / actor system is built (node
  role, remoting port, site id). Generalizes ScadaBridge's `StartupValidator`. Fluent surface:
  `For(config)`, `.Require(key, predicate, reason)`, `.RequireValue(key)`, `.RequirePort(key)`,
  `.When(condition, block)` (role-conditional rules), `.ThrowIfInvalid()`.

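As a rough illustration of the fluent preflight shape listed above, the sketch below accumulates `"<field> <reason>"` failures over a plain dictionary. It is not the library's code: `SketchPreflight` and its members are hypothetical stand-ins, the dictionary stands in for `IConfiguration`, and the `"is required"` wording for a missing key is an assumption borrowed from the `Required` primitive.

```csharp
using System;
using System.Collections.Generic;

// Sample raw configuration; a dictionary stands in for IConfiguration here.
var config = new Dictionary<string, string?>
{
    ["ScadaBridge:Node:Role"] = "Site",
    ["ScadaBridge:Node:RemotingPort"] = "0",
};
var role = config["ScadaBridge:Node:Role"];

var preflight = SketchPreflight.For(config)
    .RequireValue("ScadaBridge:Node:NodeHostname")
    .Require("ScadaBridge:Node:RemotingPort",
        raw => int.TryParse(raw, out var port) && port is >= 1 and <= 65535,
        "must be between 1 and 65535")
    // Role-conditional rule: only Site nodes need a SiteId.
    .When(role == "Site", pf => pf.RequireValue("ScadaBridge:Node:SiteId"));

foreach (var failure in preflight.Failures)
    Console.WriteLine(failure);

// Hypothetical stand-in for ConfigPreflight; every rule runs, none short-circuits.
public sealed class SketchPreflight
{
    private readonly IReadOnlyDictionary<string, string?> _config;
    private readonly List<string> _failures = new();

    private SketchPreflight(IReadOnlyDictionary<string, string?> config) => _config = config;

    public static SketchPreflight For(IReadOnlyDictionary<string, string?> config) => new(config);

    public IReadOnlyList<string> Failures => _failures;

    public SketchPreflight Require(string key, Func<string?, bool> predicate, string reason)
    {
        _config.TryGetValue(key, out var value);   // a missing key yields null
        if (!predicate(value)) _failures.Add($"{key} {reason}");
        return this;
    }

    public SketchPreflight RequireValue(string key) =>
        Require(key, v => !string.IsNullOrWhiteSpace(v), "is required");

    public SketchPreflight When(bool condition, Action<SketchPreflight> block)
    {
        if (condition) block(this);
        return this;
    }
}
```

With the sample dictionary, three failures are recorded in one pass (missing hostname, out-of-range port, missing site id). Switching the role to `Central` skips the `When` block and leaves two.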
**The error-handling contract** (shared across both front-ends):

- **Accumulate ALL failures.** Never short-circuit on the first failure: collect every problem
  and surface them together. (`OptionsValidatorBase` and `ConfigPreflight` both do this; it is
  the behaviour every app already wanted.)
- **Two surfacing paths**, by where validation runs:
  1. **Options bound through DI** → `ValidateOnStart()` raises an
     **`OptionsValidationException`** at host start (the .NET options pipeline aggregates the
     failures). This is the `AddValidatedOptions` path.
  2. **Raw config, pre-host** → `ConfigPreflight.ThrowIfInvalid()` throws an
     **`InvalidOperationException`** listing all failures.
- **Message format `"<field> <reason>"`** for each individual failure, produced by the shared
  `Checks` primitives (e.g. `"ScadaBridge:Node:RemotingPort must be between 1 and 65535 (was '0')"`).
  `ConfigPreflight.ThrowIfInvalid()` wraps the accumulated lines in the exact envelope
  ScadaBridge's `StartupValidator` uses today (§4) so the migration is byte-compatible.

**Explicitly NOT normalized** (domain-specific, stays per project):

- **Each app's options classes and their domain rules.** `GatewayOptions` (worker exe path,
  heartbeat grace ≥ interval, TLS validity years), `ClusterOptions` (split-brain strategy,
  `MinNrOfMembers == 1`, heartbeat ≪ failure-detection), `SecurityOptions` (LDAP server /
  search base), `HealthMonitoringOptions` (positive `PeriodicTimer` intervals),
  `AuditLogOptions` (payload caps, retention bounds), and ScadaBridge's `Node` topology rules
  (gRPC port ≠ remoting port, seed nodes must not target the gRPC port) all stay where they
  live. Only the *plumbing they sit on* is shared; the *rules* are theirs.
- **OtOpcUa's draft/generation-content validation** (the dormant C# `DraftValidator` /
  `DraftSnapshot`, plus the live DB stored procedure `sp_ValidateDraft` it was designed to
  complement). This is **not** options/config validation at all; it is pre-publish validation of an
  operator's *configuration draft content* (UNS segment regex, EquipmentId derivation, cross-cluster
  namespace binding, reservation pre-flight) against database rows, not against `IConfiguration`.
  Enforcement lives DB-side in `sp_ValidateDraft` and the managed `DraftValidator` has **no live
  caller** in `src/` today. It shares only a *philosophy* (return every failure in one pass) with
  this component and is **out of scope** for the shared library. It stays entirely in OtOpcUa.

## 1. `IValidateOptions` base: `OptionsValidatorBase<TOptions>`

The headline plumbing fix. Today each validator re-implements: the `Validate(string?, TOptions)`
signature, a local `List<string>`, the `failures.Count == 0 ? Success : Fail(failures)` tail,
and (in several) private `AddIfBlank` / `AddIfNotPositive` helpers. The base owns all of that:

```csharp
public sealed class ClusterOptionsValidator : OptionsValidatorBase<ClusterOptions>
{
    protected override void Validate(ValidationBuilder v, ClusterOptions o)
    {
        v.MinCount(o.SeedNodes, 2, "ClusterOptions.SeedNodes");
        v.OneOf(o.SplitBrainResolverStrategy, ["keep-oldest"], "ClusterOptions.SplitBrainResolverStrategy");
        v.PositiveTimeSpan(o.StableAfter, "ClusterOptions.StableAfter");
        v.RequireThat(o.MinNrOfMembers == 1,
            $"ClusterOptions.MinNrOfMembers must be 1 (was {o.MinNrOfMembers})");
        // cross-field rule:
        v.RequireThat(o.HeartbeatInterval < o.FailureDetectionThreshold,
            "ClusterOptions.HeartbeatInterval must be below FailureDetectionThreshold");
    }
}
```

`OptionsValidatorBase<TOptions>.Validate(string?, TOptions)` guards null, creates a
`ValidationBuilder`, calls the override, and returns `Success` only when `builder.IsValid`.
**Accumulation is automatic**: the override never returns early; it records everything.

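The template-method shape described above can be sketched without any package references. This is only an illustration under stated assumptions: `SketchValidatorBase`, `DemoOptions` and `DemoValidator` are hypothetical names, a plain `List<string>` stands in for `ValidationBuilder`, a tuple stands in for `ValidateOptionsResult`, and throwing on a null options instance is a guess at what "guards null" means in the real base.

```csharp
using System;
using System.Collections.Generic;

var result = new DemoValidator().Validate(null, new DemoOptions { Name = "", Retries = -1 });
Console.WriteLine(result.Succeeded);
foreach (var failure in result.Failures)
    Console.WriteLine(failure);

public sealed class DemoOptions
{
    public string? Name { get; set; }
    public int Retries { get; set; }
}

// Hypothetical stand-in for OptionsValidatorBase<TOptions>.
public abstract class SketchValidatorBase<TOptions> where TOptions : class
{
    // Public entry point: owns the accumulator and the success/fail tail.
    public (bool Succeeded, IReadOnlyList<string> Failures) Validate(string? name, TOptions options)
    {
        // Assumption: the null guard throws; the real base may report a failure instead.
        if (options is null) throw new ArgumentNullException(nameof(options));

        var failures = new List<string>();
        Validate(failures, options);            // subclass records every problem
        return (failures.Count == 0, failures);
    }

    // Subclasses only record failures; they never build the result themselves.
    protected abstract void Validate(List<string> failures, TOptions options);
}

public sealed class DemoValidator : SketchValidatorBase<DemoOptions>
{
    protected override void Validate(List<string> failures, DemoOptions o)
    {
        if (string.IsNullOrWhiteSpace(o.Name)) failures.Add("DemoOptions.Name is required");
        if (o.Retries < 0) failures.Add($"DemoOptions.Retries must not be negative (was {o.Retries})");
    }
}
```

The point of the split is that the subclass has no way to return early with a partial result: it can only add to the accumulator, and the base decides success once the override has finished.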
## 2. Rule primitives: `ValidationBuilder`

`ValidationBuilder` is the accumulator passed into the override. Primitives both check a value
and append a consistently-worded `"<field> <reason>"` message on failure; escape hatches cover
the rest:

| Primitive | Checks | Failure wording (from `Checks`) |
|---|---|---|
| `Required(value, field)` | non-null, non-whitespace string | `"<field> is required"` |
| `Port(value, field)` | int in 1–65535 | `"<field> must be between 1 and 65535 (was <value>)"` |
| `HostPort(value, field)` | `host:port` with port 1–65535 | `"<field> must be 'host:port' with port 1-65535 (was '<value>')"` |
| `PositiveTimeSpan(value, field)` | `> TimeSpan.Zero` | `"<field> must be a positive duration (was <value>)"` |
| `OneOf(value, allowed, field)` | case-insensitive membership | `"<field> must be one of [<allowed>] (was '<value>')"` |
| `MinCount(value, min, field)` | collection ≥ `min` items | `"<field> must contain at least <min> item(s) (had <n>)"` |
| `RequireThat(ok, message)` | arbitrary boolean (cross-field, custom) | caller-supplied |
| `Add(message)` | unconditional failure | caller-supplied |

Properties: `Failures` (read-only accumulated list) and `IsValid`. Every method returns the
builder for chaining. `Add`/`RequireThat` carry the rules that are genuinely app-specific (e.g.
MxGateway's "ExecutablePath must point to a .exe", ScadaBridge's heartbeat-vs-threshold
ordering) without forcing them into a primitive.

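To make the accumulate-and-chain behaviour concrete, here is a minimal sketch of a builder with two primitives and one escape hatch. `SketchBuilder` is a hypothetical stand-in, not the library's `ValidationBuilder`; only the failure wording is taken from the table above, and the field names passed in are illustrative.

```csharp
using System;
using System.Collections.Generic;

// Every call runs and returns the builder, so nothing short-circuits.
var v = new SketchBuilder()
    .Required("", "Security.LdapServer")
    .Port(0, "ScadaBridge:Node:RemotingPort")
    .RequireThat(1 < 2, "never recorded, the condition holds");

Console.WriteLine(v.IsValid);
foreach (var failure in v.Failures)
    Console.WriteLine(failure);

// Hypothetical stand-in for ValidationBuilder.
public sealed class SketchBuilder
{
    private readonly List<string> _failures = new();

    public IReadOnlyList<string> Failures => _failures;
    public bool IsValid => _failures.Count == 0;

    // Primitive: non-null, non-whitespace string.
    public SketchBuilder Required(string? value, string field) =>
        RequireThat(!string.IsNullOrWhiteSpace(value), $"{field} is required");

    // Primitive: TCP port range.
    public SketchBuilder Port(int value, string field) =>
        RequireThat(value is >= 1 and <= 65535,
            $"{field} must be between 1 and 65535 (was {value})");

    // Escape hatch: records the caller's message when the condition is false.
    public SketchBuilder RequireThat(bool ok, string message)
    {
        if (!ok) _failures.Add(message);
        return this;
    }
}
```

Routing each primitive through `RequireThat` keeps a single place where failures are appended, which is one plausible way to keep wording and accumulation consistent.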
## 3. DI wiring: `AddValidatedOptions`

```csharp
builder.Services.AddValidatedOptions<ClusterOptions, ClusterOptionsValidator>(
    builder.Configuration, "ScadaBridge:Cluster");
```

Binds `ScadaBridge:Cluster` → `ClusterOptions`, registers `ClusterOptionsValidator` as a
singleton `IValidateOptions<ClusterOptions>`, and calls `ValidateOnStart()`. Returns the
`OptionsBuilder<TOptions>` for further chaining (e.g. `.PostConfigure(...)`). This collapses the
three-line idiom every module repeats (`AddOptions().Bind(...).ValidateOnStart()` +
`AddSingleton<IValidateOptions<...>, ...>()`) into one call.

> The validator is registered as a **singleton** (it is consumed when the singleton
> `IOptions<TOptions>` / `IOptionsMonitor<TOptions>` values are built). It
> must be singleton-safe: no scoped dependencies. All current validators are stateless, so this
> holds.

When a section bound this way fails, the .NET options pipeline raises **`OptionsValidationException`**
at host start (because of `ValidateOnStart()`), with all accumulated messages.

## 4. Pre-host preflight: `ConfigPreflight`

For keys that must be valid **before** the host / DI / actor system exists, `ConfigPreflight`
reads raw `IConfiguration` and accumulates failures the same way:

```csharp
// Read the role up front so the role-conditional rule below can branch on it.
var role = configuration["ScadaBridge:Node:Role"];

ConfigPreflight.For(configuration)
    .Require("ScadaBridge:Node:Role", v => v is "Central" or "Site", "must be 'Central' or 'Site'")
    .RequireValue("ScadaBridge:Node:NodeHostname")
    .RequirePort("ScadaBridge:Node:RemotingPort")
    .When(role == "Site", p => p.RequireValue("ScadaBridge:Node:SiteId"))
    .ThrowIfInvalid();
```

`.ThrowIfInvalid()` throws **`InvalidOperationException`** when any failure was recorded, with
this exact envelope:

```
Configuration validation failed:
 - <field> <reason>
 - <field> <reason>
```

> **Byte-compatibility with ScadaBridge's `StartupValidator`.** ScadaBridge's
> `StartupValidator.Validate` throws
> `$"Configuration validation failed:\n{string.Join("\n", errors.Select(e => $" - {e}"))}"`.
> `ConfigPreflight.ThrowIfInvalid()` produces the **identical** string
> (`"Configuration validation failed:\n"` followed by the same `" - <field> <reason>"` lines,
> `\n`-joined).
> The migration is a behaviour-preserving swap: same exception type
> (`InvalidOperationException`), same message bytes. This is verified in the library's
> `ConfigPreflightTests` and is the reason the message format is pinned in §0.

`.When(condition, block)` carries role-conditional rules (ScadaBridge only validates database /
security / gRPC-port keys when the node is `Central` or `Site` respectively) without an `if` ladder.

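The envelope quoted above is small enough to reproduce in a few lines. The sketch below follows the `StartupValidator` expression as quoted in this section; the `Envelope` class and its method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var errors = new List<string>
{
    "ScadaBridge:Node:Role must be 'Central' or 'Site'",
    "ScadaBridge:Node:RemotingPort must be between 1 and 65535 (was '0')",
};

try
{
    Envelope.ThrowIfAny(errors);
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);
}

// Hypothetical helper mirroring the envelope described in this section.
public static class Envelope
{
    // Header line, then one " - <field> <reason>" line per failure, joined with "\n".
    public static string Format(IEnumerable<string> errors)
    {
        var lines = errors.Select(e => " - " + e);
        return "Configuration validation failed:\n" + string.Join("\n", lines);
    }

    // Throws only when at least one failure was recorded.
    public static void ThrowIfAny(IReadOnlyCollection<string> errors)
    {
        if (errors.Count > 0)
            throw new InvalidOperationException(Format(errors));
    }
}
```

Because the separator is a literal `"\n"` rather than `Environment.NewLine`, the message bytes do not vary by platform, which is what makes a byte-for-byte comparison in tests meaningful.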
## 5. Per-project migration

| Project | Current state | Primary gaps | What normalizes |
|---|---|---|---|
| **OtOpcUa** | **No options validation at all**: options bound with bare `.Bind()` (`LdapOptions`, `OpcUa`); zero `IValidateOptions` / `ValidateOnStart` in the repo. The only validation-shaped type is the dormant C# `DraftValidator` (draft/generation content; real enforcement is DB-side `sp_ValidateDraft`), which is **out of scope**. | No startup validation of `Ldap` / `OpcUa` sections; a bad value fails opaquely on first use. | *Optional* adoption: add `OptionsValidatorBase` subclasses + `AddValidatedOptions` for the sections worth guarding. `DraftValidator`/`DraftSnapshot` stay per-project untouched. Lightest consumer. |
| **MxGateway** | One large `GatewayOptionsValidator : IValidateOptions<GatewayOptions>` (~360 LOC, 9 sub-validators, private `AddIfBlank`/`AddIfNotPositive`/`AddIfInvalidPath` helpers); wired via `AddGatewayConfiguration` (`AddOptions().BindConfiguration().ValidateOnStart()`). | Hand-rolled accumulation + helpers duplicate the base; bespoke DI wiring duplicates `AddValidatedOptions`. | `GatewayOptionsValidator` → `OptionsValidatorBase<GatewayOptions>` (delete the `List<string>`/tail/helpers; keep the domain rules); `AddGatewayConfiguration` → `AddValidatedOptions<GatewayOptions, GatewayOptionsValidator>`. Domain rules unchanged. |
| **ScadaBridge** | **Heaviest.** Four per-module `*OptionsValidator : IValidateOptions<T>` (Cluster / Security / HealthMonitoring / AuditLog) each with their own `List<string>` accumulation, wired through bespoke `AddXxx` extensions; **plus** a raw-config pre-Akka `StartupValidator`. | Four copies of the accumulation plumbing + bespoke DI wiring; `StartupValidator` open-codes the preflight envelope. | Each `*OptionsValidator` → `OptionsValidatorBase<T>`; each module's `AddXxx` → `AddValidatedOptions`; `StartupValidator` → `ConfigPreflight` (byte-compatible message, §4). Domain rules unchanged. |

> No sister-repo adoption is in scope for this release: the library is built; adoption is the
> follow-on tracked in [`../GAPS.md`](../GAPS.md). (Unlike the observability pass, which carried
> one in-pass MxGateway adoption, this pass is library-only.)

## 6. Acceptance (what "converged" means)

A project is converged when: (a) every options validator it owns derives from
`OptionsValidatorBase<TOptions>` and records failures on the supplied `ValidationBuilder` (no
private `List<string>` plumbing, no early return); (b) every bind-and-validate registration goes
through `AddValidatedOptions<TOptions, TValidator>(config, sectionPath)`; (c) any pre-host raw-config
checks go through `ConfigPreflight` and surface via `ThrowIfInvalid()`; (d) all validation
**accumulates every failure** and surfaces them together (`OptionsValidationException` at host
start, or `InvalidOperationException` from `ConfigPreflight`); and (e) failure wording for the
shared primitives comes from the library's `Checks` seam, identical across the fleet. Each app's
**options classes and domain rules stay its own**; only the plumbing is shared. OtOpcUa's
`DraftValidator` is explicitly exempt; it is not part of the converged surface.