# Configuration validation: normalized target spec

Status: **Draft**. The single design the sister projects converge on for **startup configuration validation**. Derived from the three code-verified current-state docs (`../current-state/`). The goal is a *path to shared code* (`../shared-contract/ZB.MOM.WW.Configuration.md`), so each normalized section maps to a shared library seam. The library is **already built** at [`../../../ZB.MOM.WW.Configuration/`](../../../ZB.MOM.WW.Configuration/) (`0.1.0`, 27 tests).

## 0. Scope

The common concern is **fail-fast validation of configuration at process startup**: bind an `appsettings.json` / environment section to a typed options object (or read raw keys before the host exists), check every field, and refuse to start when anything is wrong, surfacing **all** problems at once so an operator fixes them in one edit rather than one boot-loop per typo. All three apps already do this; they do it with three private copies of the same plumbing.

**Normalized here** (goes in the shared `ZB.MOM.WW.Configuration` library):

- **The `IValidateOptions<TOptions>` failure-accumulation convention.** Every app hand-rolls a `List<string> failures`, a pile of `if (...) failures.Add(...)`, and the `failures.Count == 0 ? Success : Fail(failures)` tail. That plumbing becomes `OptionsValidatorBase<TOptions>`: override `protected void Validate(ValidationBuilder, TOptions)`, record failures on the builder, and the base aggregates them and returns a single `ValidateOptionsResult` (Success only when the builder is clean).
- **Reusable rule primitives.** The same checks recur across apps: required-string, TCP port range, `host:port` endpoint, positive `TimeSpan`, one-of-a-set, minimum collection count. They become `ValidationBuilder` primitives (`Required`, `Port`, `HostPort`, `PositiveTimeSpan`, `OneOf`, `MinCount`) plus `RequireThat(bool, message)` / `Add(message)` escape hatches for custom and cross-field rules. Wording is centralized in an internal `Checks` seam so a given rule reads identically everywhere.
- **`AddValidatedOptions<TOptions, TValidator>(IConfiguration, sectionPath)`**: one DI call that binds the section, registers the validator as the options' `IValidateOptions<TOptions>`, and enables `ValidateOnStart()`. Replaces the per-module `AddOptions<T>().Bind(...).ValidateOnStart()` + `AddSingleton<IValidateOptions<T>, ...>()` pair that each app open-codes.
- **The pre-host `ConfigPreflight` aggregator**: a fluent checker over raw `IConfiguration` for the keys that must be valid *before* the host / DI container / actor system is built (node role, remoting port, site id). Generalizes ScadaBridge's `StartupValidator`. Fluent surface: `For(config)`, `.Require(key, predicate, reason)`, `.RequireValue(key)`, `.RequirePort(key)`, `.When(condition, block)` (role-conditional rules), `.ThrowIfInvalid()`.

**The error-handling contract** (shared across both front-ends):

- **Accumulate ALL failures.** Never short-circuit on the first failure; collect every problem and surface them together. (`OptionsValidatorBase<TOptions>` and `ConfigPreflight` both do this; it is the behaviour every app already wanted.)
- **Two surfacing paths**, by where validation runs:
  1. **Options bound through DI** → `ValidateOnStart()` raises an **`OptionsValidationException`** at host start (the .NET options pipeline aggregates the failures). This is the `AddValidatedOptions` path.
  2. **Raw config, pre-host** → `ConfigPreflight.ThrowIfInvalid()` throws an **`InvalidOperationException`** listing all failures.
- **Message format `"<field> <reason>"`** for each individual failure, produced by the shared `Checks` primitives (e.g. `"ScadaBridge:Node:RemotingPort must be between 1 and 65535 (was '0')"`). `ConfigPreflight.ThrowIfInvalid()` wraps the accumulated lines in the exact envelope ScadaBridge's `StartupValidator` uses today (§4) so the migration is byte-compatible.
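The accumulation convention listed above can be pictured with a small, dependency-free sketch. Everything here is a hypothetical stand-in: `BuilderSketch`, `ValidatorBaseSketch<TOptions>`, and the `Node*Sketch` types are illustration names, not the library's `ValidationBuilder` or `OptionsValidatorBase<TOptions>`, and throwing on a null options object is an assumption about what "guards null" means. The point is only the shape: the override records failures, never returns early, and the base hands back all of them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the shared builder: records failures, never throws.
public sealed class BuilderSketch
{
    private readonly List<string> _failures = new();

    public IReadOnlyList<string> Failures => _failures;
    public bool IsValid => _failures.Count == 0;

    // "<field> <reason>" wording, mirroring the message format in the contract.
    public BuilderSketch Required(string? value, string field) =>
        RequireThat(!string.IsNullOrWhiteSpace(value), $"{field} is required");

    public BuilderSketch Port(int value, string field) =>
        RequireThat(value is >= 1 and <= 65535,
            $"{field} must be between 1 and 65535 (was {value})");

    public BuilderSketch RequireThat(bool ok, string message)
    {
        if (!ok) _failures.Add(message);
        return this; // always continue, so later rules still run
    }
}

// Hypothetical stand-in for the base class: owns the builder and the aggregation.
public abstract class ValidatorBaseSketch<TOptions> where TOptions : class
{
    // Returns every recorded failure; an empty list means success.
    public IReadOnlyList<string> Run(TOptions options)
    {
        ArgumentNullException.ThrowIfNull(options); // assumption: null is rejected up front
        var builder = new BuilderSketch();
        Validate(builder, options);
        return builder.Failures;
    }

    protected abstract void Validate(BuilderSketch v, TOptions o);
}

// Hypothetical options type and validator, used only to exercise the sketch.
public sealed class NodeOptionsSketch
{
    public string? Hostname { get; set; }
    public int RemotingPort { get; set; }
}

public sealed class NodeValidatorSketch : ValidatorBaseSketch<NodeOptionsSketch>
{
    protected override void Validate(BuilderSketch v, NodeOptionsSketch o) =>
        v.Required(o.Hostname, "Node.Hostname")
         .Port(o.RemotingPort, "Node.RemotingPort");
}
```

Running `NodeValidatorSketch` against an options object with a blank hostname and port `0` yields two failure lines rather than stopping at the first, which is the behaviour the contract asks for.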
**Explicitly NOT normalized** (domain-specific, stays per project):

- **Each app's options classes and their domain rules.** `GatewayOptions` (worker exe path, heartbeat grace ≥ interval, TLS validity years), `ClusterOptions` (split-brain strategy, `MinNrOfMembers == 1`, heartbeat ≪ failure-detection), `SecurityOptions` (LDAP server / search base), `HealthMonitoringOptions` (positive `PeriodicTimer` intervals), `AuditLogOptions` (payload caps, retention bounds), and ScadaBridge's `Node` topology rules (gRPC port ≠ remoting port, seed nodes must not target the gRPC port) all stay where they live. Only the *plumbing they sit on* is shared; the *rules* are theirs.
- **OtOpcUa's draft/generation-content validation** (the dormant C# `DraftValidator` / `DraftSnapshot`, plus the live DB stored procedure `sp_ValidateDraft` it was designed to complement). This is **not** options/config validation at all; it is pre-publish validation of an operator's *configuration draft content* (UNS segment regex, EquipmentId derivation, cross-cluster namespace binding, reservation pre-flight) against database rows, not against `IConfiguration`. Enforcement lives DB-side in `sp_ValidateDraft`, and the managed `DraftValidator` has **no live caller** in `src/` today. It shares only a *philosophy* (return every failure in one pass) with this component and is **out of scope** for the shared library. It stays entirely in OtOpcUa.

## 1. `IValidateOptions<TOptions>` base: `OptionsValidatorBase<TOptions>`

The headline plumbing fix. Today each validator re-implements: the `Validate(string?, TOptions)` signature, a local `List<string>`, the `failures.Count == 0 ? Success : Fail(failures)` tail, and (in several) private `AddIfBlank` / `AddIfNotPositive` helpers. The base owns all of that:

```csharp
public sealed class ClusterOptionsValidator : OptionsValidatorBase<ClusterOptions>
{
    protected override void Validate(ValidationBuilder v, ClusterOptions o)
    {
        v.MinCount(o.SeedNodes, 2, "ClusterOptions.SeedNodes");
        v.OneOf(o.SplitBrainResolverStrategy, ["keep-oldest"],
            "ClusterOptions.SplitBrainResolverStrategy");
        v.PositiveTimeSpan(o.StableAfter, "ClusterOptions.StableAfter");
        v.RequireThat(o.MinNrOfMembers == 1,
            $"ClusterOptions.MinNrOfMembers must be 1 (was {o.MinNrOfMembers})");
        // cross-field rule:
        v.RequireThat(o.HeartbeatInterval < o.FailureDetectionThreshold,
            "ClusterOptions.HeartbeatInterval must be below FailureDetectionThreshold");
    }
}
```

`OptionsValidatorBase<TOptions>.Validate(string?, TOptions)` guards null, creates a `ValidationBuilder`, calls the override, and returns `Success` only when `builder.IsValid`. **Accumulation is automatic**: the override never returns early; it records everything.

## 2. Rule primitives: `ValidationBuilder`

`ValidationBuilder` is the accumulator passed into the override. Primitives both check a value and append a consistently worded `"<field> <reason>"` message on failure; escape hatches cover the rest:

| Primitive | Checks | Failure wording (from `Checks`) |
|---|---|---|
| `Required(value, field)` | non-null, non-whitespace string | `"<field> is required"` |
| `Port(value, field)` | int in 1–65535 | `"<field> must be between 1 and 65535 (was <value>)"` |
| `HostPort(value, field)` | `host:port` with port 1–65535 | `"<field> must be 'host:port' with port 1-65535 (was '<value>')"` |
| `PositiveTimeSpan(value, field)` | `> TimeSpan.Zero` | `"<field> must be a positive duration (was <value>)"` |
| `OneOf(value, allowed, field)` | case-insensitive membership | `"<field> must be one of [<allowed>] (was '<value>')"` |
| `MinCount(value, min, field)` | collection ≥ `min` items | `"<field> must contain at least <min> item(s) (had <count>)"` |
| `RequireThat(ok, message)` | arbitrary boolean (cross-field, custom) | caller-supplied |
| `Add(message)` | unconditional failure | caller-supplied |

Properties: `Failures` (read-only accumulated list) and `IsValid`. Every method returns the builder for chaining. `Add`/`RequireThat` carry the rules that are genuinely app-specific (e.g. MxGateway's "ExecutablePath must point to a .exe", ScadaBridge's heartbeat-vs-threshold ordering) without forcing them into a primitive.

## 3. DI wiring: `AddValidatedOptions`

```csharp
builder.Services.AddValidatedOptions<ClusterOptions, ClusterOptionsValidator>(
    builder.Configuration, "ScadaBridge:Cluster");
```

Binds `ScadaBridge:Cluster` → `ClusterOptions`, registers `ClusterOptionsValidator` as a singleton `IValidateOptions<ClusterOptions>`, and calls `ValidateOnStart()`. Returns the `OptionsBuilder<ClusterOptions>` for further chaining (e.g. `.PostConfigure(...)`). This collapses the three-line idiom every module repeats (`AddOptions<T>().Bind(...).ValidateOnStart()` + `AddSingleton<IValidateOptions<T>, ...>()`) into one call.

> The validator is registered as a **singleton** (it backs the singleton options factory). It
> must be singleton-safe: no scoped dependencies. All current validators are stateless, so this
> holds.
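For readers who want to see what the one-call wiring in §3 stands in for, here is a minimal sketch built only from the public `Microsoft.Extensions.*` APIs. The extension name `AddValidatedOptionsSketch`, its generic parameter order, and the `Demo*` types are assumptions made for illustration; the library's real signature and internals may differ. It needs the `Microsoft.Extensions.Options.ConfigurationExtensions` and `Microsoft.Extensions.Hosting` packages (or an ASP.NET Core / Worker SDK project that already references them).

```csharp
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

public static class ValidatedOptionsSketch
{
    // Hypothetical name and shape; not the library's actual extension method.
    public static OptionsBuilder<TOptions> AddValidatedOptionsSketch<TOptions, TValidator>(
        this IServiceCollection services, IConfiguration configuration, string sectionPath)
        where TOptions : class
        where TValidator : class, IValidateOptions<TOptions>
    {
        // Register the validator so the options factory runs it when the options are created.
        services.AddSingleton<IValidateOptions<TOptions>, TValidator>();

        // Bind the section and ask the host to validate eagerly at startup.
        return services.AddOptions<TOptions>()
            .Bind(configuration.GetSection(sectionPath))
            .ValidateOnStart();
    }
}

// Hypothetical demo types used only to exercise the sketch.
public sealed class DemoOptions
{
    public int Port { get; set; }
}

public sealed class DemoOptionsValidator : IValidateOptions<DemoOptions>
{
    public ValidateOptionsResult Validate(string? name, DemoOptions options) =>
        options.Port is >= 1 and <= 65535
            ? ValidateOptionsResult.Success
            : ValidateOptionsResult.Fail(
                $"Demo:Port must be between 1 and 65535 (was {options.Port})");
}
```

Outside a host, the same validator still runs the first time `IOptions<DemoOptions>.Value` is read, so a bad section surfaces as `OptionsValidationException` either way; `ValidateOnStart()` only moves that moment to host start.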
When a section bound this way fails, the .NET options pipeline raises **`OptionsValidationException`** at host start (because of `ValidateOnStart()`), with all accumulated messages.

## 4. Pre-host preflight: `ConfigPreflight`

For keys that must be valid **before** the host / DI / actor system exists, `ConfigPreflight` reads raw `IConfiguration` and accumulates failures the same way:

```csharp
// Read the role once so the conditional rule below can branch on it.
var role = configuration["ScadaBridge:Node:Role"];

ConfigPreflight.For(configuration)
    .Require("ScadaBridge:Node:Role", v => v is "Central" or "Site", "must be 'Central' or 'Site'")
    .RequireValue("ScadaBridge:Node:NodeHostname")
    .RequirePort("ScadaBridge:Node:RemotingPort")
    .When(role == "Site", p => p.RequireValue("ScadaBridge:Node:SiteId"))
    .ThrowIfInvalid();
```

`.ThrowIfInvalid()` throws **`InvalidOperationException`** when any failure was recorded, with this exact envelope:

```
Configuration validation failed:
 - <failure 1>
 - <failure 2>
```

> **Byte-compatibility with ScadaBridge's `StartupValidator`.** ScadaBridge's
> `StartupValidator.Validate` throws
> `$"Configuration validation failed:\n{string.Join("\n", errors.Select(e => $" - {e}"))}"`.
> `ConfigPreflight.ThrowIfInvalid()` produces the **identical** string
> (`"Configuration validation failed:\n"` followed by the same `" - <error>"` lines, `\n`-joined).
> The migration is a behaviour-preserving swap: same exception type
> (`InvalidOperationException`), same message bytes. This is verified in the library's
> `ConfigPreflightTests` and is the reason the message format is pinned in §0.

`.When(condition, block)` carries role-conditional rules (ScadaBridge only validates database / security / gRPC-port keys when the node is `Central` or `Site` respectively) without an `if` ladder.

## 5. Per-project migration

| Project | Current state | Primary gaps | What normalizes |
|---|---|---|---|
| **OtOpcUa** | **No options validation at all**: options bound with bare `.Bind()` (`LdapOptions`, `OpcUa`); zero `IValidateOptions` / `ValidateOnStart` in the repo. The only validation-shaped type is the dormant C# `DraftValidator` (draft/generation content; real enforcement is DB-side `sp_ValidateDraft`), which is **out of scope**. | No startup validation of `Ldap` / `OpcUa` sections; a bad value fails opaquely on first use. | *Optional* adoption: add `OptionsValidatorBase<TOptions>` subclasses + `AddValidatedOptions` for the sections worth guarding. `DraftValidator`/`DraftSnapshot` stay per-project untouched. Lightest consumer. |
| **MxGateway** | One large `GatewayOptionsValidator : IValidateOptions<GatewayOptions>` (~360 LOC, 9 sub-validators, private `AddIfBlank`/`AddIfNotPositive`/`AddIfInvalidPath` helpers); wired via `AddGatewayConfiguration` (`AddOptions<GatewayOptions>().BindConfiguration().ValidateOnStart()`). | Hand-rolled accumulation + helpers duplicate the base; bespoke DI wiring duplicates `AddValidatedOptions`. | `GatewayOptionsValidator` → `OptionsValidatorBase<GatewayOptions>` (delete the `List<string>`/tail/helpers; keep the domain rules); `AddGatewayConfiguration` → `AddValidatedOptions`. Domain rules unchanged. |
| **ScadaBridge** | **Heaviest.** Four per-module `*OptionsValidator : IValidateOptions<T>` (Cluster / Security / HealthMonitoring / AuditLog), each with its own `List<string>` accumulation, wired through bespoke `AddXxx` extensions; **plus** a raw-config pre-Akka `StartupValidator`. | Four copies of the accumulation plumbing + bespoke DI wiring; `StartupValidator` open-codes the preflight envelope. | Each `*OptionsValidator` → `OptionsValidatorBase<T>`; each module's `AddXxx` → `AddValidatedOptions`; `StartupValidator` → `ConfigPreflight` (byte-compatible message, §4). Domain rules unchanged. |

> No sister-repo adoption is in scope for this release: the library is built; adoption is the
> follow-on tracked in [`../GAPS.md`](../GAPS.md). (Unlike the observability pass, which carried
> one in-pass MxGateway adoption, this pass is library-only.)

## 6. Acceptance (what "converged" means)

A project is converged when: (a) every options validator it owns derives from `OptionsValidatorBase<TOptions>` and records failures on the supplied `ValidationBuilder` (no private `List<string>` plumbing, no early return); (b) every bind-and-validate registration goes through `AddValidatedOptions(config, sectionPath)`; (c) any pre-host raw-config checks go through `ConfigPreflight` and surface via `ThrowIfInvalid()`; (d) all validation **accumulates every failure** and surfaces them together (`OptionsValidationException` at host start, or `InvalidOperationException` from `ConfigPreflight`); and (e) failure wording for the shared primitives comes from the library's `Checks` seam, identical across the fleet. Each app's **options classes and domain rules stay its own**; only the plumbing is shared. OtOpcUa's `DraftValidator` is explicitly exempt: it is not part of the converged surface.
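Criteria (c) and (d) are easier to picture with a small stand-in for the preflight aggregator. The sketch below is not `ConfigPreflight`: `PreflightSketch` is a hypothetical name, it reads a plain dictionary instead of `IConfiguration`, and the `"is required"` wording for a missing key is borrowed from the `Required` primitive as an assumption. The envelope follows the `StartupValidator` string quoted in §4 as it appears in this document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ConfigPreflight; reads a dictionary instead of IConfiguration.
public sealed class PreflightSketch
{
    private readonly IReadOnlyDictionary<string, string?> _config;
    private readonly List<string> _errors = new();

    private PreflightSketch(IReadOnlyDictionary<string, string?> config) => _config = config;

    public static PreflightSketch For(IReadOnlyDictionary<string, string?> config) => new(config);

    // Records "<key> <reason>" when the predicate rejects the value; never throws here.
    public PreflightSketch Require(string key, Func<string?, bool> predicate, string reason)
    {
        _config.TryGetValue(key, out var value); // a missing key is treated as null
        if (!predicate(value)) _errors.Add($"{key} {reason}");
        return this;
    }

    public PreflightSketch RequireValue(string key) =>
        Require(key, v => !string.IsNullOrWhiteSpace(v), "is required"); // assumed wording

    // Runs the nested rules only when the condition holds (role-conditional checks).
    public PreflightSketch When(bool condition, Action<PreflightSketch> block)
    {
        if (condition) block(this);
        return this;
    }

    // Throws once, at the end, with every accumulated failure in the envelope.
    public void ThrowIfInvalid()
    {
        if (_errors.Count == 0) return;
        var lines = string.Join("\n", _errors.Select(e => $" - {e}"));
        throw new InvalidOperationException("Configuration validation failed:\n" + lines);
    }
}
```

A caller chains the rules exactly as in §4 and calls `ThrowIfInvalid()` last; with an invalid role and a missing hostname, the single exception message carries both lines.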