scadaproj/docs/plans/2026-06-01-deploy-zb-configuration-design.md
Joseph Doherty c3ab37523a docs: record ZB.MOM.WW.Configuration fleet-wide adoption + add design/plan
Configuration is now adopted across all three sister apps (local branches),
so flip the status lines in CLAUDE.md, components/configuration/GAPS.md, and the
lib README/CLAUDE.md from 'not adopted' to adopted (also corrects 27->42 tests).
Adds the brainstorm design doc + bite-sized implementation plan (+tasks.json)
under docs/plans/ that drove the adoption.
2026-06-01 23:18:02 -04:00


# Design: Deploy `ZB.MOM.WW.Configuration` fleet-wide
**Date:** 2026-06-01
**Status:** Approved — ready for implementation planning (writing-plans).
**Scope:** Adopt the shared `ZB.MOM.WW.Configuration` library into all three sister apps
(OtOpcUa, MxAccessGateway, ScadaBridge).
> Every state claim below was **code-verified on 2026-06-01**, not taken from the
> `components/*/GAPS.md` prose — those docs proved unreliable in both directions (they
> claimed Health was un-adopted when it is fully adopted, and claimed Telemetry was
> adopted before it was). See memory `component-status-claims-are-optimistic`.
---
## 0. Why this module
Verified fleet-wide adoption state (real `PackageReference` + usage scan of the three
sister-app `src/` trees, plus Gitea-feed `curl`):
| Module | OtOpcUa | MxAccessGateway | ScadaBridge | Status |
|---|---|---|---|---|
| Health | ✅ | ✅ | ✅ | already deployed fleet-wide |
| Telemetry (observability) | ✅ | ✅ | ✅ | already deployed fleet-wide |
| **Configuration** | — | — | — | **chosen: not adopted anywhere** |
| Auth | — | — | — | not adopted |
| UI Theme | — | — | — | not adopted |
| Audit | — | — | — | not adopted |
Configuration was chosen as the next fleet-wide adoption because it is the same
cross-cutting-infra flavour as the already-done Health + Telemetry, it is the
lowest-risk (behaviour-preserving for the two heavy consumers), and it still delivers
real new value (OtOpcUa gains fail-fast startup validation it lacks entirely today).
### Decisions locked during brainstorming
- **Module:** Configuration.
- **OtOpcUa depth:** add **real** validators (net-new `Ldap`/`OpcUa` startup validation),
not just a package reference.
- **Rollout:** per-repo **sequential**, increasing risk order: Foundation → MxGateway →
OtOpcUa → ScadaBridge; each repo on its own branch, verified green before the next.
- **ScadaBridge `StartupValidator` → `ConfigPreflight`:** included in this pass.
---
## 1. Goal & scope
Move the config-validation **plumbing** (failure accumulation, the bind+validate+
`ValidateOnStart` triple, the pre-host raw-config aggregator) into the shared library so
it is written once; leave every **domain rule and failure message** per-project.
**Out of scope:**
- OtOpcUa's `DraftValidator` / `sp_ValidateDraft` — domain *content* validation over
database draft rows, dormant in `src/`, not the host-config concern this library owns.
- Any change to rule wording or validation semantics (behaviour-preserving except the
*additive* OtOpcUa validators).
---
## 2. The contract being adopted (verified public API)
From `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/`:
- **`OptionsValidatorBase<TOptions>`** — abstract `IValidateOptions<TOptions>`. Override
`protected abstract void Validate(ValidationBuilder, TOptions)`; the base creates the
builder, runs the override, and returns `Success` only when no failures were recorded
(else `Fail(builder.Failures)`).
- **`ValidationBuilder`** — rule primitives `Required`, `Port`, `HostPort`,
`PositiveTimeSpan`, `OneOf`, `MinCount`, plus `RequireThat(bool, message)` and
`Add(message)` for custom / cross-field rules. `Failures` / `IsValid` expose state.
- **`ServiceCollectionExtensions.AddValidatedOptions<TOptions, TValidator>(config, sectionPath)`**:
registers the validator as a singleton via `TryAddEnumerable` and calls
`AddOptions().Bind(section).ValidateOnStart()` in one call; returns the `OptionsBuilder`
for chaining.
- **`ConfigPreflight.For(IConfiguration)`** — fluent pre-host checker for raw config
before the DI container exists: `RequireValue(key)`, `RequirePort(key)`,
`Require(key, predicate, reason)`, `When(condition, block)`, terminating in
`ThrowIfInvalid()` (throws `InvalidOperationException` listing all failures).
Library health: `dotnet test` → **42 passed, 0 failed** (the `CLAUDE.md` "27 tests" line
is stale-low; the suite passes regardless).
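
The accumulate-then-decide behaviour described for `OptionsValidatorBase<TOptions>` can be sketched without the library. The types below are dependency-free stand-ins written for illustration, not the library's source: `SketchValidationBuilder`, `SketchValidatorBase<TOptions>`, `DemoOptions`, the messages, and the parameter shapes of `Required` and `Port` are all assumptions. Only the method names and the "succeed only when no failure was recorded" rule come from the contract above.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the library's ValidationBuilder: collects every failure
// instead of stopping at the first one. Parameter shapes are assumed.
public sealed class SketchValidationBuilder
{
    private readonly List<string> _failures = new();

    public IReadOnlyList<string> Failures => _failures;
    public bool IsValid => _failures.Count == 0;

    // Records a custom failure message unconditionally.
    public void Add(string message) => _failures.Add(message);

    // Records the message only when the condition does not hold.
    public void RequireThat(bool condition, string message)
    {
        if (!condition) _failures.Add(message);
    }

    // Fails when the value is null, empty, or whitespace.
    public void Required(string? value, string message) =>
        RequireThat(!string.IsNullOrWhiteSpace(value), message);

    // Fails when the value is outside 1..65535 (range chosen for this sketch).
    public void Port(int value, string message) =>
        RequireThat(value is >= 1 and <= 65535, message);
}

// Stand-in for OptionsValidatorBase<TOptions>: owns the builder and the
// final decision, so derived validators contain rules only.
public abstract class SketchValidatorBase<TOptions>
{
    protected abstract void Validate(SketchValidationBuilder builder, TOptions options);

    // Returns all recorded failures; an empty list means success.
    public IReadOnlyList<string> Run(TOptions options)
    {
        var builder = new SketchValidationBuilder();
        Validate(builder, options);
        return builder.Failures;
    }
}

// Hypothetical options type and validator, for illustration only.
public sealed record DemoOptions(string? Server, int Port, int GraceSeconds, int IntervalSeconds);

public sealed class DemoOptionsValidator : SketchValidatorBase<DemoOptions>
{
    protected override void Validate(SketchValidationBuilder builder, DemoOptions options)
    {
        builder.Required(options.Server, "Demo:Server is required.");
        builder.Port(options.Port, "Demo:Port must be between 1 and 65535.");
        // Cross-field rule expressed through RequireThat.
        builder.RequireThat(
            options.GraceSeconds >= options.IntervalSeconds,
            "Demo:GraceSeconds must be >= Demo:IntervalSeconds.");
    }
}
```

With these stand-ins, `new DemoOptionsValidator().Run(new DemoOptions(null, 0, 1, 5))` returns three failures in rule order. That is the point of accumulating: one startup attempt reports every misconfigured field.
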
---
## 3. Foundation phase (must land before any repo adopts)
This is the part the status docs hide. Verified 2026-06-01:
1. **Pack + push the package.** `ZB.MOM.WW.Configuration` is **404 on the Gitea feed**
(`registration/zb.mom.ww.configuration/index.json`), while the known-adopted Health
package returns 200. Run `dotnet pack -c Release`, then push the `.nupkg` to
`https://gitea.dohertylan.com/api/packages/dohertj2/nuget`.
2. **Per-app feed wiring** (all three `nuget.config` files): the `dohertj2-gitea`
`packageSourceMapping` currently routes only `ZB.MOM.WW.MxGateway.*`,
`ZB.MOM.WW.Health*`, `ZB.MOM.WW.Telemetry*`. Add
`<package pattern="ZB.MOM.WW.Configuration" />`. Without this, restore fails even with
the package on the feed.
3. **Central version pin** in each app's `Directory.Packages.props`:
`<PackageVersion Include="ZB.MOM.WW.Configuration" Version="0.1.0" />`.
4. **Verify gate:** `curl` the registration index → **200** before any repo work begins.
---
## 4. Per-repo adoption (sequential)
Each repo: branch `feat/adopt-zb-configuration`, `PackageReference` (no version — central
package management), migrate, `dotnet build` + `dotnet test` green, then move on.
### Repo 1 — MxAccessGateway (medium; pure refactor)
- `PackageReference Include="ZB.MOM.WW.Configuration"` in
`src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj`.
- `GatewayOptionsValidator : IValidateOptions<GatewayOptions>` →
`: OptionsValidatorBase<GatewayOptions>`. Drop the private `List<string>` and the
`Count == 0 ? Success : Fail` tail (now the base's job). Map private helpers:
`AddIfBlank` → `Required`; `AddIfNotPositive` / `AddIfNegative` → `RequireThat(... , msg)`.
Keep `AddIfInvalidPath`, the `.exe`-extension rule, the cross-field
`HeartbeatGraceSeconds >= HeartbeatIntervalSeconds`, range checks, and all nine
sub-validators as `RequireThat`/`Add` custom rules. **Every message string unchanged.**
- `AddGatewayConfiguration`'s `AddOptions().BindConfiguration(SectionName).ValidateOnStart()` +
`AddSingleton<IValidateOptions<GatewayOptions>, GatewayOptionsValidator>()` →
`services.AddValidatedOptions<GatewayOptions, GatewayOptionsValidator>(config, GatewayOptions.SectionName)`.
Keep the separate `IGatewayConfigurationProvider` registration.
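
The registration change in the last bullet replaces a hand-written bind + validate + `ValidateOnStart` sequence with a single call. A rough sketch of an equivalent helper, built only from `Microsoft.Extensions.*` calls, is shown here. It is not the library's implementation: `AddValidatedOptionsSketch`, `DemoGatewayOptions`, `DemoGatewayOptionsValidator`, and the message text are hypothetical names for illustration, and the snippet assumes the project references the `Microsoft.Extensions.*` options, configuration, and dependency-injection packages (for example via `Microsoft.Extensions.Hosting`).

```csharp
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Options;

public static class ValidatedOptionsSketch
{
    // Hypothetical helper mirroring the behaviour the design describes:
    // register the validator once, bind the section, validate at host start.
    public static OptionsBuilder<TOptions> AddValidatedOptionsSketch<TOptions, TValidator>(
        this IServiceCollection services, IConfiguration config, string sectionPath)
        where TOptions : class
        where TValidator : class, IValidateOptions<TOptions>
    {
        // TryAddEnumerable skips the registration when the same
        // service/implementation pair is already present.
        services.TryAddEnumerable(
            ServiceDescriptor.Singleton<IValidateOptions<TOptions>, TValidator>());

        return services.AddOptions<TOptions>()
            .Bind(config.GetSection(sectionPath))
            .ValidateOnStart();
    }
}

// Hypothetical options and validator used only to exercise the helper.
public sealed class DemoGatewayOptions
{
    public string? PipeName { get; set; }
}

public sealed class DemoGatewayOptionsValidator : IValidateOptions<DemoGatewayOptions>
{
    public ValidateOptionsResult Validate(string? name, DemoGatewayOptions options) =>
        string.IsNullOrWhiteSpace(options.PipeName)
            ? ValidateOptionsResult.Fail("Gateway:PipeName is required.")
            : ValidateOptionsResult.Success;
}
```

Because the validator goes in through `TryAddEnumerable`, calling the helper twice for the same options/validator pair leaves a single validator registration. Entry points that may run more than once rely on exactly this property.
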
### Repo 2 — OtOpcUa (lightest base, but net-new validation added)
- `PackageReference` in
`src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj`.
- New `LdapOptionsValidator : OptionsValidatorBase<LdapOptions>`
(`LdapOptions` lives in `ZB.MOM.WW.OtOpcUa.Security/Ldap/`): `Required` on Server /
SearchBase (and other non-optional fields). At `Program.cs:99`,
`AddOptions<LdapOptions>().Bind(GetSection("Ldap"))` →
`AddValidatedOptions<LdapOptions, LdapOptionsValidator>(config, "Ldap")`.
- New validator for the `OpcUa` section; replace the imperative
`GetSection("OpcUa").Bind(options)` at `OtOpcUaServerHostedService.cs:63` with validated
options resolved from DI. Exact rule list finalized in the implementation plan from the
real `OpcUaOptions` fields (ports → `Port`, endpoints → `HostPort`, required strings →
`Required`, durations → `PositiveTimeSpan`).
- New unit tests for both validators (valid config passes; each missing/invalid field
produces its message).
### Repo 3 — ScadaBridge (heaviest; refactor + preflight)
- `PackageReference` in `src/ZB.MOM.WW.ScadaBridge.Host/...csproj` and the module projects
that own validators (ClusterInfrastructure, Security, HealthMonitoring, AuditLog).
- Four `*OptionsValidator` → `OptionsValidatorBase<T>`:
  - `ClusterOptionsValidator`: `SeedNodes` ≥ 2 → `MinCount`; strategy ∈ set → `OneOf`;
    three positive `TimeSpan` → `PositiveTimeSpan`; cross-field heartbeat/threshold and
    `DownIfAlone`/`MinNrOfMembers` → `RequireThat`.
  - `SecurityOptionsValidator`: `Required` LdapServer / LdapSearchBase (JwtSigningKey stays
    validated in the `JwtTokenService` ctor, unchanged).
  - `HealthMonitoringOptionsValidator`: three `PositiveTimeSpan` + cross-field
    `CentralOfflineTimeout >= OfflineTimeout` → `RequireThat`. Preserve the idempotent
    registration called from all three `Add*HealthMonitoring` entry points.
  - `AuditLogOptionsValidator`: positive/`>=`/range checks → `RequireThat`.
- Each module `AddXxx` → `AddValidatedOptions<T, TValidator>` where the section binding
shape allows (preserve `ValidateOnStart` + `TryAddEnumerable` semantics).
- `StartupValidator.Validate(configuration)` at `Program.cs:41` → `ConfigPreflight.For(
configuration).RequireValue(...)/RequirePort(...)/When(...).ThrowIfInvalid()`. **Must
keep `StartupValidatorTests` green** — the thrown message is byte-compatible with
`ConfigPreflight.ThrowIfInvalid()`.
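
The fluent preflight shape in the last bullet (chain checks, gate some behind `When`, throw once with every failure) can be sketched with no dependencies. Everything here is a stand-in: `SketchPreflight` reads a plain dictionary rather than `IConfiguration`, and the failure wording is invented for the sketch. The real message text is whatever `StartupValidatorTests` pins and must not be taken from this example.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a pre-host checker over raw key/value configuration.
public sealed class SketchPreflight
{
    private readonly IReadOnlyDictionary<string, string?> _config;
    private readonly List<string> _failures = new();

    private SketchPreflight(IReadOnlyDictionary<string, string?> config) => _config = config;

    public static SketchPreflight For(IReadOnlyDictionary<string, string?> config) => new(config);

    // Missing keys read as null, like an absent configuration value.
    private string? Get(string key) => _config.TryGetValue(key, out var value) ? value : null;

    // General rule: records "key: reason" when the predicate rejects the value.
    public SketchPreflight Require(string key, Func<string?, bool> predicate, string reason)
    {
        if (!predicate(Get(key))) _failures.Add($"{key}: {reason}");
        return this;
    }

    public SketchPreflight RequireValue(string key) =>
        Require(key, v => !string.IsNullOrWhiteSpace(v), "value is required");

    public SketchPreflight RequirePort(string key) =>
        Require(key, v => int.TryParse(v, out var port) && port is >= 1 and <= 65535,
            "must be a port between 1 and 65535");

    // Runs the nested checks only when the condition holds.
    public SketchPreflight When(bool condition, Action<SketchPreflight> block)
    {
        if (condition) block(this);
        return this;
    }

    // Throws once, listing every accumulated failure; no-op when valid.
    public void ThrowIfInvalid()
    {
        if (_failures.Count > 0)
            throw new InvalidOperationException(
                "Configuration is invalid:" + Environment.NewLine +
                string.Join(Environment.NewLine, _failures));
    }
}
```

A call chain then reads `SketchPreflight.For(cfg).RequireValue("Node:Role").RequirePort("Node:Port").When(isSite, p => p.RequireValue("Node:SiteId")).ThrowIfInvalid();`, where the keys and `isSite` are hypothetical. All failing checks surface in a single exception, so an operator can fix them in one pass.
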
---
## 5. Error handling / behaviour preservation
- Failure surface is unchanged everywhere: `OptionsValidationException` thrown at host
start via `ValidateOnStart`; `ConfigPreflight.ThrowIfInvalid()` throws the same
`InvalidOperationException` text ScadaBridge's `StartupValidator` throws today.
- MxGateway + ScadaBridge: **zero message changes** — the existing validator tests and
`StartupValidatorTests` are the regression guard.
- OtOpcUa: **additive** — a config that was silently accepted (then failed late as an LDAP
error on first login, or an OPC UA bind error) now fails fast at startup. That is the
intended improvement, called out so it is not mistaken for a regression.
---
## 6. Testing & verification (gate per repo, before moving on)
- Library: re-run `dotnet test` (already 42 green).
- Each repo on its branch: `dotnet build` + `dotnet test` green.
- MxGateway: `src/MxGateway.Tests` (fake worker — no MXAccess needed).
- OtOpcUa: full solution test + the new validator unit tests.
- ScadaBridge: four validator tests + `StartupValidatorTests` still green.
- **Restore proof** per repo: a clean restore pulls `ZB.MOM.WW.Configuration 0.1.0` from
Gitea — confirms both the push and the source-mapping edit.
---
## 7. Risks & mitigations
| Risk | Mitigation |
|---|---|
| Package 404 / source-mapping omission breaks restore | Foundation phase + per-repo restore proof gate. |
| A "trivial" message tweak during refactor changes behaviour | Behaviour-preserving rule; existing tests fail loudly if a message drifts. |
| ScadaBridge preflight message drift | `StartupValidatorTests` must pass unchanged. |
| OtOpcUa `OpcUa`/`Ldap` rule set guesses wrong fields | Plan finalizes rules from the actual options classes; additive-only. |
| `AddValidatedOptions` singleton constraint (no scoped deps in validators) | All four ScadaBridge + the gateway validators are already stateless singletons. |
---
## 8. Deliverable & next step
This design doc, then a step-by-step implementation plan produced via the **writing-plans**
skill. No source changes in any repo until the plan is approved and execution begins.
> Note: `~/Desktop/scadaproj` is **not** a git repository, so this design is not committed
> here; it is saved under `docs/plans/`. (Per memory, do not `git init` it without asking.)