Configuration validation — normalized target spec
Status: Draft. The single design the sister projects converge on for startup
configuration validation. Derived from the three code-verified current-state docs
(../current-state/). The goal is a path to shared code
(../shared-contract/ZB.MOM.WW.Configuration.md), so each normalized section maps to a shared
library seam. The library is already built at
../../../ZB.MOM.WW.Configuration/ (0.1.0, 27 tests).
0. Scope
The common concern is fail-fast validation of configuration at process startup: bind an
appsettings.json / environment section to a typed options object (or read raw keys before the
host exists), check every field, and refuse to start when anything is wrong — surfacing all
problems at once so an operator fixes them in one edit rather than one boot-loop per typo. All
three apps already do this; they do it with three private copies of the same plumbing.
Normalized here (goes in the shared ZB.MOM.WW.Configuration library):
- The IValidateOptions<T> failure-accumulation convention. Every app hand-rolls a List<string> failures, a pile of if (...) failures.Add(...), and the failures.Count == 0 ? Success : Fail(failures) tail. That plumbing becomes OptionsValidatorBase<TOptions>: override protected void Validate(ValidationBuilder, TOptions), record failures on the builder, and the base aggregates them and returns a single ValidateOptionsResult (Success only when the builder is clean).
- Reusable rule primitives. The same checks recur across apps: required-string, TCP port range, host:port endpoint, positive TimeSpan, one-of-a-set, minimum collection count. They become ValidationBuilder primitives (Required, Port, HostPort, PositiveTimeSpan, OneOf, MinCount) plus RequireThat(bool, message) / Add(message) escape hatches for custom and cross-field rules. Wording is centralized in an internal Checks seam so a given rule reads identically everywhere.
- AddValidatedOptions<TOptions, TValidator>(IConfiguration, sectionPath): one DI call that binds the section, registers the validator as the options' IValidateOptions<TOptions>, and enables ValidateOnStart(). Replaces the per-module AddOptions().Bind(...).ValidateOnStart() + AddSingleton<IValidateOptions<...>, ...>() pair that each app open-codes.
- The pre-host ConfigPreflight aggregator: a fluent checker over raw IConfiguration for the keys that must be valid before the host / DI container / actor system is built (node role, remoting port, site id). Generalizes ScadaBridge's StartupValidator. Fluent surface: For(config), .Require(key, predicate, reason), .RequireValue(key), .RequirePort(key), .When(condition, block) (role-conditional rules), .ThrowIfInvalid().
The error-handling contract (shared across both front-ends):
- Accumulate ALL failures. Never short-circuit on the first failure: collect every problem and surface them together. (OptionsValidatorBase and ConfigPreflight both do this; it is the behaviour every app already wanted.)
- Two surfacing paths, by where validation runs:
  - Options bound through DI → ValidateOnStart() raises an OptionsValidationException at host start (the .NET options pipeline aggregates the failures). This is the AddValidatedOptions path.
  - Raw config, pre-host → ConfigPreflight.ThrowIfInvalid() throws an InvalidOperationException listing all failures.
- Message format "<field> <reason>" for each individual failure, produced by the shared Checks primitives (e.g. "ScadaBridge:Node:RemotingPort must be between 1 and 65535 (was '0')"). ConfigPreflight.ThrowIfInvalid() wraps the accumulated lines in the exact envelope ScadaBridge's StartupValidator uses today (§4) so the migration is byte-compatible.
Explicitly NOT normalized (domain-specific — stays per project):
- Each app's options classes and their domain rules. GatewayOptions (worker exe path, heartbeat grace ≥ interval, TLS validity years), ClusterOptions (split-brain strategy, MinNrOfMembers == 1, heartbeat ≪ failure-detection), SecurityOptions (LDAP server / search base), HealthMonitoringOptions (positive PeriodicTimer intervals), AuditLogOptions (payload caps, retention bounds), and ScadaBridge's Node topology rules (gRPC port ≠ remoting port, seed nodes must not target the gRPC port) all stay where they live. Only the plumbing they sit on is shared; the rules are theirs.
- OtOpcUa's draft/generation-content validation (the dormant C# DraftValidator/DraftSnapshot, plus the live DB stored procedure sp_ValidateDraft it was designed to complement). This is not options/config validation at all: it is pre-publish validation of an operator's configuration draft content (UNS segment regex, EquipmentId derivation, cross-cluster namespace binding, reservation pre-flight) against database rows, not against IConfiguration; enforcement lives DB-side in sp_ValidateDraft and the managed DraftValidator has no live caller in src/ today. It shares only a philosophy (return every failure in one pass) with this component and is out of scope for the shared library. It stays entirely in OtOpcUa.
1. IValidateOptions base — OptionsValidatorBase<TOptions>
The headline plumbing fix. Today each validator re-implements: the Validate(string?, TOptions)
signature, a local List<string>, the failures.Count == 0 ? Success : Fail(failures) tail,
and (in several) private AddIfBlank / AddIfNotPositive helpers. The base owns all of that:
public sealed class ClusterOptionsValidator : OptionsValidatorBase<ClusterOptions>
{
    // Record every failure on the builder; the base class aggregates them.
    protected override void Validate(ValidationBuilder v, ClusterOptions o)
    {
        // Shared primitives: wording comes from the library.
        v.MinCount(o.SeedNodes, 2, "ClusterOptions.SeedNodes");
        v.OneOf(o.SplitBrainResolverStrategy, ["keep-oldest"], "ClusterOptions.SplitBrainResolverStrategy");
        v.PositiveTimeSpan(o.StableAfter, "ClusterOptions.StableAfter");
        // Domain rule via the RequireThat escape hatch:
        v.RequireThat(o.MinNrOfMembers == 1,
            $"ClusterOptions.MinNrOfMembers must be 1 (was {o.MinNrOfMembers})");
        // Cross-field rule:
        v.RequireThat(o.HeartbeatInterval < o.FailureDetectionThreshold,
            "ClusterOptions.HeartbeatInterval must be below FailureDetectionThreshold");
    }
}
OptionsValidatorBase<TOptions>.Validate(string?, TOptions) guards null, creates a
ValidationBuilder, calls the override, and returns Success only when builder.IsValid.
Accumulation is automatic — the override never returns early; it records everything.
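To make the division of labour concrete, here is a minimal sketch of what such a base could look like. It is not the library's source: the trimmed ValidationBuilder stand-in, the wording of the null-guard failure, and the EndpointOptions / EndpointOptionsValidator demo types are all assumptions for illustration. Only IValidateOptions<TOptions> and ValidateOptionsResult are real Microsoft.Extensions.Options types (the package must be referenced).

```csharp
#nullable enable
using System.Collections.Generic;
using Microsoft.Extensions.Options;

// Trimmed stand-in for the library's ValidationBuilder (assumption: only
// the members the base needs are shown here).
public sealed class ValidationBuilder
{
    private readonly List<string> _failures = new();
    public IReadOnlyList<string> Failures => _failures;
    public bool IsValid => _failures.Count == 0;

    public ValidationBuilder Add(string message) { _failures.Add(message); return this; }
    public ValidationBuilder RequireThat(bool ok, string message) => ok ? this : Add(message);
}

// Sketch of the base: guard null, run the override, aggregate the result.
public abstract class OptionsValidatorBase<TOptions> : IValidateOptions<TOptions>
    where TOptions : class
{
    public ValidateOptionsResult Validate(string? name, TOptions options)
    {
        // Assumption: the real null-guard message may be worded differently.
        if (options is null)
            return ValidateOptionsResult.Fail($"{typeof(TOptions).Name} is required");

        var builder = new ValidationBuilder();
        Validate(builder, options); // the override records, it never returns early
        return builder.IsValid
            ? ValidateOptionsResult.Success
            : ValidateOptionsResult.Fail(builder.Failures);
    }

    protected abstract void Validate(ValidationBuilder builder, TOptions options);
}

// Hypothetical demo types, only to exercise the sketch.
public sealed class EndpointOptions
{
    public string? Host { get; set; }
    public int Port { get; set; }
}

public sealed class EndpointOptionsValidator : OptionsValidatorBase<EndpointOptions>
{
    protected override void Validate(ValidationBuilder v, EndpointOptions o)
    {
        v.RequireThat(!string.IsNullOrWhiteSpace(o.Host), "EndpointOptions.Host is required");
        v.RequireThat(o.Port is >= 1 and <= 65535,
            $"EndpointOptions.Port must be between 1 and 65535 (was {o.Port})");
    }
}
```

With both fields invalid, the returned ValidateOptionsResult carries both messages rather than stopping at the first, which is the behaviour the hand-rolled List<string> plumbing was reproducing in each app.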
2. Rule primitives — ValidationBuilder
ValidationBuilder is the accumulator passed into the override. Primitives both check a value
and append a consistently-worded "<field> <reason>" message on failure; escape hatches cover
the rest:
| Primitive | Checks | Failure wording (from Checks) |
|---|---|---|
| Required(value, field) | non-null, non-whitespace string | "<field> is required" |
| Port(value, field) | int in 1–65535 | "<field> must be between 1 and 65535 (was <value>)" |
| HostPort(value, field) | host:port with port 1–65535 | "<field> must be 'host:port' with port 1-65535 (was '<value>')" |
| PositiveTimeSpan(value, field) | > TimeSpan.Zero | "<field> must be a positive duration (was <value>)" |
| OneOf(value, allowed, field) | case-insensitive membership | "<field> must be one of [<allowed>] (was '<value>')" |
| MinCount(value, min, field) | collection ≥ min items | "<field> must contain at least <min> item(s) (had <n>)" |
| RequireThat(ok, message) | arbitrary boolean (cross-field, custom) | caller-supplied |
| Add(message) | unconditional failure | caller-supplied |
Properties: Failures (read-only accumulated list) and IsValid. Every method returns the
builder for chaining. Add/RequireThat carry the rules that are genuinely app-specific (e.g.
MxGateway's "ExecutablePath must point to a .exe", ScadaBridge's heartbeat-vs-threshold
ordering) without forcing them into a primitive.
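The accumulator pattern described in this section can be sketched in a few dozen lines. This is an illustrative reimplementation, not the library's code: parameter types, the ", " separator used to print the allowed set, and the inlined wording (the real library routes wording through its internal Checks seam) are assumptions. HostPort is omitted for brevity.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of a chaining accumulator: every primitive checks a value,
// appends "<field> <reason>" on failure, and returns the builder.
public sealed class ValidationBuilder
{
    private readonly List<string> _failures = new();

    public IReadOnlyList<string> Failures => _failures;
    public bool IsValid => _failures.Count == 0;

    // Escape hatches for custom and cross-field rules.
    public ValidationBuilder Add(string message) { _failures.Add(message); return this; }
    public ValidationBuilder RequireThat(bool ok, string message) => ok ? this : Add(message);

    public ValidationBuilder Required(string? value, string field) =>
        RequireThat(!string.IsNullOrWhiteSpace(value), $"{field} is required");

    public ValidationBuilder Port(int value, string field) =>
        RequireThat(value is >= 1 and <= 65535,
            $"{field} must be between 1 and 65535 (was {value})");

    public ValidationBuilder PositiveTimeSpan(TimeSpan value, string field) =>
        RequireThat(value > TimeSpan.Zero,
            $"{field} must be a positive duration (was {value})");

    // Case-insensitive membership; the ", " separator is an assumption.
    public ValidationBuilder OneOf(string? value, IReadOnlyCollection<string> allowed, string field) =>
        RequireThat(value is not null && allowed.Contains(value, StringComparer.OrdinalIgnoreCase),
            $"{field} must be one of [{string.Join(", ", allowed)}] (was '{value}')");

    public ValidationBuilder MinCount<T>(IReadOnlyCollection<T>? value, int min, string field)
    {
        var n = value?.Count ?? 0; // a null collection counts as empty
        return RequireThat(n >= min, $"{field} must contain at least {min} item(s) (had {n})");
    }
}
```

Because each primitive returns the builder and never throws, a chain such as Required(...).Port(...).MinCount(...) always runs to the end, and Failures holds one line per violated rule in call order.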
3. DI wiring — AddValidatedOptions
builder.Services.AddValidatedOptions<ClusterOptions, ClusterOptionsValidator>(
    builder.Configuration, "ScadaBridge:Cluster");
Binds ScadaBridge:Cluster → ClusterOptions, registers ClusterOptionsValidator as a
singleton IValidateOptions<ClusterOptions>, and calls ValidateOnStart(). Returns the
OptionsBuilder<TOptions> for further chaining (e.g. .PostConfigure(...)). This collapses the
three-line idiom every module repeats (AddOptions().Bind(...).ValidateOnStart() +
AddSingleton<IValidateOptions<...>, ...>()) into one call.
The validator is registered as a singleton (it backs the singleton options factory). It must be singleton-safe — no scoped dependencies. All current validators are stateless, so this holds.
When a section bound this way fails, the .NET options pipeline raises OptionsValidationException
at host start (because of ValidateOnStart()), with all accumulated messages.
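For orientation, the one-call registration can be approximated with stock Microsoft.Extensions APIs. The sketch below is an assumption about the shape, not the library's implementation: the extension is deliberately named AddValidatedOptionsSketch, and DemoOptions / DemoOptionsValidator are hypothetical types included only so the wiring can be exercised. It assumes the Microsoft.Extensions.Options, Options.ConfigurationExtensions, Configuration, and DependencyInjection packages are referenced.

```csharp
#nullable enable
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

public static class ValidatedOptionsSketch
{
    // Bind the section, register the validator, and validate at host start.
    public static OptionsBuilder<TOptions> AddValidatedOptionsSketch<TOptions, TValidator>(
        this IServiceCollection services, IConfiguration configuration, string sectionPath)
        where TOptions : class
        where TValidator : class, IValidateOptions<TOptions>
    {
        // Singleton validator: it must not depend on scoped services.
        services.AddSingleton<IValidateOptions<TOptions>, TValidator>();

        return services.AddOptions<TOptions>()
            .Bind(configuration.GetSection(sectionPath))
            .ValidateOnStart();
    }
}

// Hypothetical demo types, only to exercise the sketch.
public sealed class DemoOptions
{
    public int Port { get; set; }
}

public sealed class DemoOptionsValidator : IValidateOptions<DemoOptions>
{
    public ValidateOptionsResult Validate(string? name, DemoOptions o) =>
        o.Port is >= 1 and <= 65535
            ? ValidateOptionsResult.Success
            : ValidateOptionsResult.Fail($"Demo:Port must be between 1 and 65535 (was {o.Port})");
}
```

ValidateOnStart() only takes effect when a host starts; without a host, the same validator still runs the first time IOptions<DemoOptions>.Value is read, and a failing section surfaces as OptionsValidationException either way.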
4. Pre-host preflight — ConfigPreflight
For keys that must be valid before the host / DI / actor system exists, ConfigPreflight
reads raw IConfiguration and accumulates failures the same way:
// Assumption: `role` was read earlier from raw config (ScadaBridge:Node:Role); not shown here.
ConfigPreflight.For(configuration)
    .Require("ScadaBridge:Node:Role", v => v is "Central" or "Site", "must be 'Central' or 'Site'")
    .RequireValue("ScadaBridge:Node:NodeHostname")
    .RequirePort("ScadaBridge:Node:RemotingPort")
    .When(role == "Site", p => p.RequireValue("ScadaBridge:Node:SiteId"))
    .ThrowIfInvalid();
.ThrowIfInvalid() throws InvalidOperationException when any failure was recorded, with
this exact envelope:
Configuration validation failed:
- <field> <reason>
- <field> <reason>
Byte-compatibility with ScadaBridge's StartupValidator. ScadaBridge's StartupValidator.Validate throws $"Configuration validation failed:\n{string.Join("\n", errors.Select(e => $" - {e}"))}". ConfigPreflight.ThrowIfInvalid() produces the identical string ("Configuration validation failed:\n" + the same " - " lines, \n-joined). The migration is a behaviour-preserving swap: same exception type (InvalidOperationException), same message bytes. This is verified in the library's ConfigPreflightTests and is the reason the message format is pinned in §0.
.When(condition, block) carries role-conditional rules (ScadaBridge only validates database /
security / gRPC-port keys when the node is Central or Site respectively) without an if ladder.
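The fluent surface and the envelope can be sketched as follows. This is an illustrative stand-in named ConfigPreflightSketch, not the library's ConfigPreflight: the predicate signature, the "is required" wording for RequireValue, and the quoted '(was ...)' form for RequirePort are assumptions inferred from the examples in this document. The " - " line prefix is copied as it appears in the text above; the exact leading whitespace should be checked against StartupValidator before relying on byte-compatibility. The only external dependency is IConfiguration from Microsoft.Extensions.Configuration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.Configuration;

// Sketch of a pre-host aggregator over raw configuration keys.
public sealed class ConfigPreflightSketch
{
    private readonly IConfiguration _config;
    private readonly List<string> _errors = new();

    private ConfigPreflightSketch(IConfiguration config) => _config = config;

    public static ConfigPreflightSketch For(IConfiguration config) => new(config);

    // Records "<key> <reason>" when the predicate rejects the raw value.
    public ConfigPreflightSketch Require(string key, Func<string?, bool> predicate, string reason)
    {
        if (!predicate(_config[key])) _errors.Add($"{key} {reason}");
        return this;
    }

    public ConfigPreflightSketch RequireValue(string key) =>
        Require(key, v => !string.IsNullOrWhiteSpace(v), "is required");

    public ConfigPreflightSketch RequirePort(string key) =>
        Require(key, v => int.TryParse(v, out var p) && p is >= 1 and <= 65535,
            $"must be between 1 and 65535 (was '{_config[key]}')");

    // Role-conditional rules: the block runs only when the condition holds.
    public ConfigPreflightSketch When(bool condition, Action<ConfigPreflightSketch> block)
    {
        if (condition) block(this);
        return this;
    }

    // Throws once, listing every recorded failure in the shared envelope.
    public void ThrowIfInvalid()
    {
        if (_errors.Count == 0) return;
        throw new InvalidOperationException(
            "Configuration validation failed:\n"
            + string.Join("\n", _errors.Select(e => $" - {e}")));
    }
}
```

Nothing throws until ThrowIfInvalid() is called, so every Require* in the chain gets a chance to record its failure, and the When block simply contributes zero or more extra lines to the same list.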
5. Per-project migration
| Project | Current state | Primary gaps | What normalizes |
|---|---|---|---|
| OtOpcUa | No options validation at all: options bound with bare .Bind() (LdapOptions, OpcUa); zero IValidateOptions / ValidateOnStart in the repo. The only validation-shaped type is the dormant C# DraftValidator (draft/generation content; real enforcement is DB-side sp_ValidateDraft), which is out of scope. | No startup validation of Ldap / OpcUa sections; a bad value fails opaquely on first use. | Optional adoption: add OptionsValidatorBase subclasses + AddValidatedOptions for the sections worth guarding. DraftValidator/DraftSnapshot stay per-project untouched. Lightest consumer. |
| MxGateway | One large GatewayOptionsValidator : IValidateOptions<GatewayOptions> (~360 LOC, 9 sub-validators, private AddIfBlank/AddIfNotPositive/AddIfInvalidPath helpers); wired via AddGatewayConfiguration (AddOptions().BindConfiguration().ValidateOnStart()). | Hand-rolled accumulation + helpers duplicate the base; bespoke DI wiring duplicates AddValidatedOptions. | GatewayOptionsValidator → OptionsValidatorBase<GatewayOptions> (delete the List<string>/tail/helpers; keep the domain rules); AddGatewayConfiguration → AddValidatedOptions<GatewayOptions, GatewayOptionsValidator>. Domain rules unchanged. |
| ScadaBridge | Heaviest. Four per-module *OptionsValidator : IValidateOptions<T> (Cluster / Security / HealthMonitoring / AuditLog) each with their own List<string> accumulation, wired through bespoke AddXxx extensions; plus a raw-config pre-Akka StartupValidator. | Four copies of the accumulation plumbing + bespoke DI wiring; StartupValidator open-codes the preflight envelope. | Each *OptionsValidator → OptionsValidatorBase<T>; each module's AddXxx → AddValidatedOptions; StartupValidator → ConfigPreflight (byte-compatible message, §4). Domain rules unchanged. |
No sister-repo adoption is in scope for this release — the library is built; adoption is the follow-on tracked in
../GAPS.md. (Unlike the observability pass, which carried one in-pass MxGateway adoption, this pass is library-only.)
6. Acceptance (what "converged" means)
A project is converged when: (a) every options validator it owns derives from
OptionsValidatorBase<TOptions> and records failures on the supplied ValidationBuilder (no
private List<string> plumbing, no early return); (b) every bind-and-validate registration goes
through AddValidatedOptions<TOptions, TValidator>(config, sectionPath); (c) any pre-host raw-config
checks go through ConfigPreflight and surface via ThrowIfInvalid(); (d) all validation
accumulates every failure and surfaces them together (OptionsValidationException at host
start, or InvalidOperationException from ConfigPreflight); and (e) failure wording for the
shared primitives comes from the library's Checks seam, identical across the fleet. Each app's
options classes and domain rules stay its own; only the plumbing is shared. OtOpcUa's
DraftValidator is explicitly exempt — it is not part of the converged surface.