The C# DraftValidator/DraftSnapshot has NO live caller in OtOpcUa src/ (verified
repo-wide) — it is dormant complement code. The enforced pre-publish draft
validation runs DB-side in the sp_ValidateDraft stored procedure (Status='Draft'
-> sp_PublishGeneration lifecycle). Reframe across current-state/SPEC/GAPS/README/
CLAUDE.md from 'runtime draft validation' + a false publish-pipeline caller to
'dormant managed validator; enforcement is DB-side'. Out-of-scope conclusion
for ZB.MOM.WW.Configuration is unchanged.
Shared startup-options-validation library (single package, 27 tests) — OptionsValidatorBase,
ValidationBuilder primitives, AddValidatedOptions (ValidateOnStart), and pre-host ConfigPreflight
(byte-compatible with ScadaBridge's StartupValidator). Plus components/configuration normalization
docs (spec, shared-contract, 3x current-state, GAPS) and index registration. Not yet adopted by
the three apps — adoption tracked in components/configuration/GAPS.md.
The sibling libs (Auth/Theme/Health/Telemetry) are tracked as regular files in
the outer scadaproj repo, not separate git repos. Remove the git-init/nested-repo
instructions; all commits target the outer repo on feat/zb-mom-ww-configuration.
Folds the approved design into the sibling combined-doc format and adds the
phased, bite-sized TDD implementation plan: normalization docs (T1-2), library
scaffold + 4 public types via TDD (T3-7), pack + register (T8-9). Co-located
.tasks.json for executing-plans resume.
Approved brainstorming design for the Config + validation normalization pass
(Tier-2 candidate in upcoming.md). Scope: startup options validation only,
single package ZB.MOM.WW.Configuration, Approach A (lightweight base + rule
primitives + DI/startup helpers). Full pass = components/configuration/ docs +
built library.
Fix 1 — symmetric OTLP trigger: ZbSerilogConfig.ApplyOpenTelemetryExport now activates only
when options.Exporter == ZbExporter.Otlp, matching the core OTel metrics/traces path. The
previous fallback that also triggered on a bare OtlpEndpoint is removed; OtlpEndpoint is the
address to use when Otlp is selected, not an independent enable.
Fix 2 — deterministic service.instance.id: ZbResource.InstanceId (MachineName:ProcessId) is
a new public property that produces a stable, process-unique id without a random GUID.
ZbResource.Configure passes autoGenerateServiceInstanceId:false + serviceInstanceId:InstanceId
so metrics and traces never get a random auto-generated id. ZbSerilogConfig.BuildResourceAttributes
adds service.instance.id from ZbResource.InstanceId so the Serilog OTLP log sink carries the
exact same value — all three signals now share an identical resource for cross-signal joins.
Tests: +2 in ZbResourceTests (InstanceId determinism, no-GUID check), +2 in RedactionTests
(service.instance.id parity assertion in BuildResourceAttributes, symmetric OTLP trigger tests).
Total: 9 + 14 = 23 tests, all green.
Remove the Stage-1 bootstrap-logger line (Log.Logger = new LoggerConfiguration()
.WriteTo.Console().CreateBootstrapLogger()) from AddZbSerilog. A shared library must
not mutate process-global state: when multiple hosts are built in one process (integration
tests, Aspire multi-host, parallel test runs) the second call throws "The logger is
already frozen".
AddSerilog is now called with preserveStaticLogger: true so Serilog.Extensions.Hosting
leaves the static Log.Logger entirely untouched. The DI-registered application logger is
the only artifact AddZbSerilog produces.
Apps that want a pre-Build() bootstrap logger should set Log.Logger themselves in
Program.cs before calling AddZbSerilog — that decision belongs to the application.
Three new regression tests in MultiHostTests verify: two hosts build in the same process
without throwing; Log.Logger is not mutated; each host gets its own independent DI ILogger.
Docs (SPEC.md §5 and shared-contract ZB.MOM.WW.Telemetry.md) updated: the "two-stage
bootstrap" framing is replaced with the correct description — library registers only the
DI application logger; optional Stage-1 bootstrap is the app's responsibility.
C1: NodeHostname is AUTO throughout. Shared-contract AddZbSerilog doc comment now reads
"SiteId + NodeRole from ZbTelemetryOptions; NodeHostname from Environment.MachineName (auto)".
SPEC.md §0 and §5 prose updated to match. ScadaBridge adoption snippet no longer sets
o.NodeHostname (removed; NodeHostname is auto, not caller-supplied).
C2: METRIC-CONVENTIONS §6.1 OtOpcUa instrument table replaced with code-verified set:
counters otopcua.deploy.applied / driver.lifecycle / virtualtag.eval / scriptedalarm.transition /
opcua.sink.write / redundancy.service_level_change; histogram otopcua.deploy.apply.duration (s);
ActivitySource ZB.MOM.WW.OtOpcUa with spans otopcua.deploy.apply + otopcua.opcua.address_space_rebuild.
Removed invented names (deploy.failed, tag.subscriptions, tag.reads, tag.writes, session.active,
connection.gateway).
C3: METRIC-CONVENTIONS §6.2 MxGateway instrument table replaced with code-verified names from
GatewayMetrics.cs: 13 counters (sessions.opened/closed, commands.started/succeeded/failed,
events.received, queues.overflows, faults, workers.killed/exited, heartbeats.failed,
grpc.streams.disconnected, retries.attempted); 3 histograms ms (workers.startup.duration,
commands.duration, events.stream_send.duration); 4 gauges (sessions.open, workers.running,
events.worker_queue.depth, events.grpc_stream_queue.depth). Removed invented names.
m3: §2 example table replaced mxgateway.session.active + mxgateway.worker.call.duration
(invented) with mxgateway.sessions.open + mxgateway.commands.duration (real). Also fixed
the §2 rule-2 body text example which referenced mxgateway.worker.call.duration.
I4: §5 standard instrumentation table corrected — OtOpcUa now shows ⛔ not added for all
five baseline instrumentations, matching current-state/otopcua. All three projects lack
standard instrumentation today; AddZbTelemetry adds it on adoption.
I1+m1: GAPS.md "Decisions still open" — removed the two settled questions (Prometheus-default
and ms→s/meter-rename bundling). Moved them to a new "Decisions settled" section with explicit
resolution notes. One genuinely open question remains (SiteId/NodeRole config binding path).
I2: SPEC.md §5 AddZbSerilog: added note that AddZbSerilog reads Serilog:MinimumLevel from
IConfiguration; callers with a different config key (e.g. ScadaBridge:Logging:MinimumLevel)
apply that override themselves — stays per-project. Shared-contract doc comment updated to match.
I3: MxAccessGateway adoption plan Meters = ["MxGateway.Server"] annotated as temporary with
note to update to ZB.MOM.WW.MxGateway when Gap N1 (Meter-rename) is closed.
m2: SPEC.md §1 now notes AddZbTelemetry also has an IServiceCollection overload for non-standard
hosts, with the IHostApplicationBuilder overload as the primary path.
- Added PackageTags to all 3 library csproj files (health-checks;aspnetcore/akka/efcore;scada;wonderware;zb-mom-ww)
- Full solution dotnet test: 58 tests green (32 Akka + 20 core + 6 EFCore)
- dotnet pack -c Release produces ZB.MOM.WW.Health.0.1.0.nupkg, ZB.MOM.WW.Health.Akka.0.1.0.nupkg, ZB.MOM.WW.Health.EntityFrameworkCore.0.1.0.nupkg; artifacts/ not committed
- ZB.MOM.WW.Health/README.md: overview, packages table, consumer matrix, versioning, build/test/pack instructions, status note
- components/README.md: Health row added to component registry
- CLAUDE.md: Health row in Component-normalization table + Health paragraph; intro updated from "two pieces" to "three pieces"
- upcoming.md: Health checks item checked off with pointer to components/health/ and ZB.MOM.WW.Health/
- components/health/README.md: status updated from "Draft / scaffolded / follow-on" to "Built @ 0.1.0"
- Eagerly call CancelAfter(InfiniteTimeSpan) after a successful probe so the pending OS
timer is released on the happy path rather than held for the full timeout window.
- Add ProbeTimeout_Unhealthy test: 50 ms timeout with an infinite-blocking probe delegate
asserts Unhealthy, covering the timeout code path.
- Fix ProbeQueryThrows_Unhealthy to use Task.FromException rather than a synchronous throw,
accurately modelling a faulted async delegate.
- Wrap all BuildServiceProvider() results in await using so ServiceProvider is disposed
after each test (no DI provider leak).
- Remove unused Microsoft.EntityFrameworkCore.InMemory package reference; tests use
SQLite only (InMemory CanConnect semantics differ and the package was not exercised).
- Add <remarks> to DatabaseHealthCheck<TContext> noting the scoped-resolution path is
safe for AddDbContextPool (scope dispose returns context to pool, not destroys it).