Commit Graph

87 Commits

Author SHA1 Message Date
Joseph Doherty 5f75cd4dab Add per-library code-review scaffolding for the ZB.MOM.WW.* shared libs
Adapts the code-reviews convention (process, README generator, template) from
the ScadaBridge app model (per-src/-module, Akka conventions) to scadaproj's
reality: six shared libraries reviewed against their components/ specs.

- REVIEW-PROCESS.md: review unit is a library; library->component-spec mapping;
  checklist re-targeted for reusable .NET libs (public API/semver, packaging &
  dependency hygiene, spec/shared-contract adherence) instead of actor/supervision.
- _template/findings.md: library/packages/component-spec/shared-contract header.
- regen-readme.py: per-library prose, data-driven Summary, '-' for unreviewed.
- Seed Auth/Theme/Health/Telemetry/Configuration/Audit findings stubs (0 findings).
- README.md generated; --check passes.
2026-06-01 10:46:16 -04:00
Joseph Doherty 899efc2cbf Merge fix/config-otopcua-draft-accuracy: correct OtOpcUa draft-validation description
DraftValidator is dormant (no src/ caller); enforcement is DB-side sp_ValidateDraft.
2026-06-01 10:13:34 -04:00
Joseph Doherty fbf0f23e76 docs(config): correct OtOpcUa draft-validation description
The C# DraftValidator/DraftSnapshot has NO live caller in OtOpcUa src/ (verified
repo-wide) — it is dormant complement code. The enforced pre-publish draft
validation runs DB-side in the sp_ValidateDraft stored procedure (Status='Draft'
-> sp_PublishGeneration lifecycle). Reframe across current-state/SPEC/GAPS/README/
CLAUDE.md from 'runtime draft validation' + a false publish-pipeline caller to
'dormant managed validator; enforcement is DB-side'. Out-of-scope conclusion
for ZB.MOM.WW.Configuration is unchanged.
2026-06-01 10:13:29 -04:00
Joseph Doherty e47ecacb0d Merge feat/zb-mom-ww-configuration: Configuration normalization component + ZB.MOM.WW.Configuration (0.1.0)
Shared startup-options-validation library (single package, 27 tests) — OptionsValidatorBase,
ValidationBuilder primitives, AddValidatedOptions (ValidateOnStart), and pre-host ConfigPreflight
(byte-compatible with ScadaBridge's StartupValidator). Plus components/configuration normalization
docs (spec, shared-contract, 3x current-state, GAPS) and index registration. Not yet adopted by
the three apps — adoption tracked in components/configuration/GAPS.md.
2026-06-01 09:58:08 -04:00
Joseph Doherty 69fb6cb077 chore: mark configuration plan tasks complete 2026-06-01 09:56:01 -04:00
Joseph Doherty a29f226a70 docs: list Checks.cs in library CLAUDE.md src tree 2026-06-01 09:55:47 -04:00
Joseph Doherty 3fa77b70fc docs: register ZB.MOM.WW.Configuration in indexes 2026-06-01 09:51:22 -04:00
Joseph Doherty 46c4bfae31 docs(config): components/configuration normalization (spec, shared-contract, current-state x3, GAPS, README) 2026-06-01 09:48:49 -04:00
Joseph Doherty b754873a44 docs: README + CLAUDE.md; verify 0.1.0 pack
ZB.MOM.WW.Configuration — README with purpose, what's-in-the-box,
three usage snippets (validator subclass, DI wiring, ConfigPreflight),
build/test/pack instructions, and dependency note.
CLAUDE.md with one-screen orientation: package table, commands,
source layout, and component-normalization status note.
27 tests pass; dotnet pack produces exactly one nupkg (0.1.0).
2026-06-01 09:40:20 -04:00
Joseph Doherty 8d91a3021d fix(config): centralize port wording, harden HostPort/key guards, doc null/singleton semantics, add tests 2026-06-01 09:37:53 -04:00
Joseph Doherty 8145d79dc6 feat: ConfigPreflight raw-config aggregator 2026-06-01 09:32:44 -04:00
Joseph Doherty e191893738 feat: AddValidatedOptions bind+validate+ValidateOnStart 2026-06-01 09:31:14 -04:00
Joseph Doherty 563cf44c60 feat: OptionsValidatorBase<TOptions> 2026-06-01 09:29:46 -04:00
Joseph Doherty d18c121033 feat: Checks primitives + ValidationBuilder 2026-06-01 09:28:19 -04:00
Joseph Doherty a104372eac chore: scaffold ZB.MOM.WW.Configuration solution 2026-06-01 09:25:26 -04:00
Joseph Doherty 80e4d59209 plan(config): correct git layout — library committed to outer repo, no nested .git
The sibling libs (Auth/Theme/Health/Telemetry) are tracked as regular files in
the outer scadaproj repo, not separate git repos. Remove the git-init/nested-repo
instructions; all commits target the outer repo on feat/zb-mom-ww-configuration.
2026-06-01 09:23:08 -04:00
Joseph Doherty 229b82efbc plan(config): ZB.MOM.WW.Configuration implementation plan (9 tasks, TDD)
Folds the approved design into the sibling combined-doc format and adds the
phased, bite-sized TDD implementation plan: normalization docs (T1-2), library
scaffold + 4 public types via TDD (T3-7), pack + register (T8-9). Co-located
.tasks.json for executing-plans resume.
2026-06-01 09:18:23 -04:00
Joseph Doherty 18e4b70572 docs(plans): design ZB.MOM.WW.Configuration shared startup-options-validation library
Approved brainstorming design for the Config + validation normalization pass
(Tier-2 candidate in upcoming.md). Scope: startup options validation only,
single package ZB.MOM.WW.Configuration, Approach A (lightweight base + rule
primitives + DI/startup helpers). Full pass = components/configuration/ docs +
built library.
2026-06-01 09:10:35 -04:00
Joseph Doherty a09cc02d46 Merge feat/zb-mom-ww-audit: Audit normalization component + ZB.MOM.WW.Audit (0.1.0)
# Conflicts:
#	CLAUDE.md
#	components/README.md
2026-06-01 09:09:44 -04:00
Joseph Doherty 88c557dee8 fix(telemetry): identical resource across all 3 signals (symmetric OTLP trigger + deterministic service.instance.id)
Fix 1 — symmetric OTLP trigger: ZbSerilogConfig.ApplyOpenTelemetryExport now activates only
when options.Exporter == ZbExporter.Otlp, matching the core OTel metrics/traces path. The
previous fallback that also triggered on a bare OtlpEndpoint is removed; OtlpEndpoint is the
address to use when Otlp is selected, not an independent enable.

Fix 2 — deterministic service.instance.id: ZbResource.InstanceId (MachineName:ProcessId) is
a new public property that produces a stable, process-unique id without a random GUID.
ZbResource.Configure passes autoGenerateServiceInstanceId:false + serviceInstanceId:InstanceId
so metrics and traces never get a random auto-generated id. ZbSerilogConfig.BuildResourceAttributes
adds service.instance.id from ZbResource.InstanceId so the Serilog OTLP log sink carries the
exact same value — all three signals now share an identical resource for cross-signal joins.

Tests: +2 in ZbResourceTests (InstanceId determinism, no-GUID check), +2 in RedactionTests
(service.instance.id parity assertion in BuildResourceAttributes, symmetric OTLP trigger tests).
Total: 9 + 14 = 23 tests, all green.
2026-06-01 08:26:09 -04:00
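The second fix in the commit above can be pictured with a short, language-neutral sketch (Python here, not the library's C#). It assumes only what the message states: the id has the form MachineName:ProcessId, and the same value is attached to the resource of every signal. The helper names and the "demo-service" value are hypothetical.

```python
import os
import socket


def instance_id() -> str:
    """Stable for the life of the process: '<machine name>:<process id>'."""
    return f"{socket.gethostname()}:{os.getpid()}"


def resource_attributes(service_name: str) -> dict:
    """Attributes shared by metrics, traces, and logs (hypothetical helper)."""
    return {
        "service.name": service_name,
        "service.instance.id": instance_id(),
    }


# Each signal builds its resource from the same function, so the values match
# and no random per-signal id is generated.
metrics_resource = resource_attributes("demo-service")
traces_resource = resource_attributes("demo-service")
logs_resource = resource_attributes("demo-service")
```

Because no random component is involved, repeated calls within one process agree, which is what makes cross-signal joins on service.instance.id possible.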
Joseph Doherty 8311912f40 feat(telemetry): pack ZB.MOM.WW.Telemetry 0.1.0 + README/CLAUDE + register observability component in indexes
- NuGet metadata: expanded Description and PackageTags on both library csproj files
  (opentelemetry;observability;metrics;tracing;prometheus;otlp;... / serilog;logging;...)
- Full dotnet test: 7 (Telemetry) + 12 (Serilog) = 19 tests, all green
- dotnet pack: ZB.MOM.WW.Telemetry.0.1.0.nupkg + ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg
  (artifacts/ gitignored, not committed)
- ZB.MOM.WW.Telemetry/README.md: overview, 2 packages, unifying hinge prose,
  exporter options, OTel signals + trace-log correlation, test/pack commands, status
- ZB.MOM.WW.Telemetry/CLAUDE.md: package responsibilities, consumer matrix,
  build/test/pack commands, status + pointers to components/observability/
- components/README.md: Observability row added to component registry table
- CLAUDE.md: Telemetry row added to component-normalization table; intro count
  updated to four shared libs; observability prose paragraph added (MxGateway
  logging adoption noted)
- upcoming.md: Observability item ticked done, pointing at components/observability/
  and ZB.MOM.WW.Telemetry; MxGateway MEL->Serilog adoption noted
- components/observability/README.md: status updated to Built @ 0.1.0, library
  build/pack commands added, MxGateway adoption row updated
2026-06-01 08:20:05 -04:00
Joseph Doherty f569d537d1 fix(telemetry.serilog): don't set process-global Log.Logger in AddZbSerilog (multi-host safe)
Remove the Stage-1 bootstrap-logger line (Log.Logger = new LoggerConfiguration()
.WriteTo.Console().CreateBootstrapLogger()) from AddZbSerilog. A shared library must
not mutate process-global state: when multiple hosts are built in one process (integration
tests, Aspire multi-host, parallel test runs) the second call throws "The logger is
already frozen".

AddSerilog is now called with preserveStaticLogger: true so Serilog.Extensions.Hosting
leaves the static Log.Logger entirely untouched. The DI-registered application logger is
the only artifact AddZbSerilog produces.

Apps that want a pre-Build() bootstrap logger should set Log.Logger themselves in
Program.cs before calling AddZbSerilog — that decision belongs to the application.

Three new regression tests in MultiHostTests verify: two hosts build in the same process
without throwing; Log.Logger is not mutated; each host gets its own independent DI ILogger.

Docs (SPEC.md §5 and shared-contract ZB.MOM.WW.Telemetry.md) updated: the "two-stage
bootstrap" framing is replaced with the correct description — library registers only the
DI application logger; optional Stage-1 bootstrap is the app's responsibility.
2026-06-01 08:13:35 -04:00
Joseph Doherty f1240c0bd4 refactor(telemetry.serilog): review fixes (thread-safe redactor, bootstrap logger, minlevel ordering, test coverage) 2026-06-01 07:48:57 -04:00
Joseph Doherty 37fb84f477 feat(telemetry): core review fixes (Prometheus+OTLP coexistence, ServiceName validation, null guards) + contract overload note
- Fix #1: Prometheus exporter always wired for metrics; OTLP is additive overlay
  when Exporter == ZbExporter.Otlp so /metrics + MapZbMetrics work in all modes.
- Fix #2: BuildOptions throws ArgumentException when ServiceName is null/whitespace.
- Fix #3: AddZbTelemetry(IHostApplicationBuilder) guard: ThrowIfNull(configure)
  added alongside existing ThrowIfNull(builder).
- Fix #6: Contract doc adds IServiceCollection convenience overload signature.
- Tests: +3 new tests (OtlpExporter still serves /metrics, empty ServiceName throws,
  whitespace ServiceName throws). Total: 7 passed (was 4).
2026-06-01 07:43:47 -04:00
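As a rough illustration of Fix #1 and Fix #2 in the commit above, the sketch below (Python, not the library's C#) shows an exporter selection where Prometheus is always present and OTLP is layered on top, plus a service-name guard. The enum member PROMETHEUS, the function names, and the dictionary shape are assumptions made for the example; only the Otlp choice and the null/whitespace rule come from the message.

```python
from enum import Enum


class Exporter(Enum):
    PROMETHEUS = "prometheus"  # hypothetical member name
    OTLP = "otlp"


def build_options(service_name, exporter=Exporter.PROMETHEUS):
    """Reject a missing or blank service name before anything is wired."""
    if service_name is None or not service_name.strip():
        raise ValueError("ServiceName must be non-empty")
    return {"service_name": service_name, "exporter": exporter}


def metric_exporters(options):
    """Prometheus is always present; OTLP is added on top when selected."""
    exporters = ["prometheus"]
    if options["exporter"] is Exporter.OTLP:
        exporters.append("otlp")
    return exporters
```

With this shape, a scrape endpoint keeps working in every mode because choosing OTLP never removes the Prometheus exporter.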
Joseph Doherty c284e4d68d docs(audit): register component in indexes + GAPS cross-check 2026-06-01 07:41:45 -04:00
Joseph Doherty 2b856074d5 feat(telemetry.serilog): ILogRedactor seam + OTel log export 2026-06-01 07:40:58 -04:00
Joseph Doherty 70f91a855a feat(telemetry.serilog): TraceContextEnricher for trace<->log correlation 2026-06-01 07:38:54 -04:00
Joseph Doherty 1344f249d0 feat(telemetry.serilog): AddZbSerilog bootstrap + identity enrichers 2026-06-01 07:38:07 -04:00
Joseph Doherty 7f05107c1d feat(audit): AddZbAudit DI extension with safe defaults
TryAdd registers NullAuditRedactor + NoOpAuditWriter so consumer
registrations win; symmetric override tests for both writer and redactor.
2026-06-01 07:34:48 -04:00
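The "consumer registrations win" behaviour described in the commit above can be sketched with a toy service collection (Python, not the library's C#). The Services class, the string keys, and "SqlAuditWriter" are hypothetical; the point is only that a try-add registers a default when nothing is registered yet, so it never displaces a consumer's choice.

```python
class Services:
    """Tiny stand-in for a DI service collection (hypothetical)."""

    def __init__(self):
        self._registrations = {}

    def add(self, key, implementation):
        """Always register; the most recent registration is the one resolved."""
        self._registrations.setdefault(key, []).append(implementation)

    def try_add(self, key, implementation):
        """Register only when nothing is registered for this key yet."""
        if key not in self._registrations:
            self.add(key, implementation)

    def resolve(self, key):
        return self._registrations[key][-1]


def add_audit_defaults(services):
    """Register safe no-op defaults without overriding the consumer."""
    services.try_add("writer", "NoOpAuditWriter")
    services.try_add("redactor", "NullAuditRedactor")


# Consumer registers before the defaults: the default writer is skipped.
before = Services()
before.add("writer", "SqlAuditWriter")
add_audit_defaults(before)

# Consumer registers after the defaults: the later registration wins.
after = Services()
add_audit_defaults(after)
after.add("writer", "SqlAuditWriter")
```

In both orderings the consumer's writer is resolved, while the untouched redactor falls back to the no-op default.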
Joseph Doherty 3e4d4369bf feat(telemetry): MapZbMetrics Prometheus scrape endpoint 2026-06-01 07:34:26 -04:00
Joseph Doherty 4126e1df54 feat(telemetry): AddZbTelemetry metrics+traces bootstrap 2026-06-01 07:33:51 -04:00
Joseph Doherty 215a646e35 docs(observability): fix metric-convention instrument names + NodeHostname-auto + resolve settled questions
C1: NodeHostname is AUTO throughout. Shared-contract AddZbSerilog doc comment now reads
"SiteId + NodeRole from ZbTelemetryOptions; NodeHostname from Environment.MachineName (auto)".
SPEC.md §0 and §5 prose updated to match. ScadaBridge adoption snippet no longer sets
o.NodeHostname (removed; NodeHostname is auto, not caller-supplied).

C2: METRIC-CONVENTIONS §6.1 OtOpcUa instrument table replaced with code-verified set:
counters otopcua.deploy.applied / driver.lifecycle / virtualtag.eval / scriptedalarm.transition /
opcua.sink.write / redundancy.service_level_change; histogram otopcua.deploy.apply.duration (s);
ActivitySource ZB.MOM.WW.OtOpcUa with spans otopcua.deploy.apply + otopcua.opcua.address_space_rebuild.
Removed invented names (deploy.failed, tag.subscriptions, tag.reads, tag.writes, session.active,
connection.gateway).

C3: METRIC-CONVENTIONS §6.2 MxGateway instrument table replaced with code-verified names from
GatewayMetrics.cs: 13 counters (sessions.opened/closed, commands.started/succeeded/failed,
events.received, queues.overflows, faults, workers.killed/exited, heartbeats.failed,
grpc.streams.disconnected, retries.attempted); 3 histograms ms (workers.startup.duration,
commands.duration, events.stream_send.duration); 4 gauges (sessions.open, workers.running,
events.worker_queue.depth, events.grpc_stream_queue.depth). Removed invented names.

m3: §2 example table replaced mxgateway.session.active + mxgateway.worker.call.duration
(invented) with mxgateway.sessions.open + mxgateway.commands.duration (real). Also fixed
the §2 rule-2 body text example which referenced mxgateway.worker.call.duration.

I4: §5 standard instrumentation table corrected — OtOpcUa now shows "not added" for all
five baseline instrumentations, matching current-state/otopcua. All three projects lack
standard instrumentation today; AddZbTelemetry adds it on adoption.

I1+m1: GAPS.md "Decisions still open" — removed the two settled questions (Prometheus-default
and ms→s/meter-rename bundling). Moved them to a new "Decisions settled" section with explicit
resolution notes. One genuinely open question remains (SiteId/NodeRole config binding path).

I2: SPEC.md §5 AddZbSerilog: added note that AddZbSerilog reads Serilog:MinimumLevel from
IConfiguration; callers with a different config key (e.g. ScadaBridge:Logging:MinimumLevel)
apply that override themselves — stays per-project. Shared-contract doc comment updated to match.

I3: MxAccessGateway adoption plan Meters = ["MxGateway.Server"] annotated as temporary with
note to update to ZB.MOM.WW.MxGateway when Gap N1 (Meter-rename) is closed.

m2: SPEC.md §1 now notes AddZbTelemetry also has an IServiceCollection overload for non-standard
hosts, with the IHostApplicationBuilder overload as the primary path.
2026-06-01 07:32:58 -04:00
Joseph Doherty 453ec7358d feat(audit): redactor + writer helpers (Null/Truncating/NoOp/Composite/Redacting)
Code-review fixes: CompositeAuditWriter re-throws OperationCanceledException
(honors cancellation) + evt null-guard; RedactingAuditWriter evt null-guard;
added marker-longer-than-max and cancellation-propagation regression tests.
2026-06-01 07:31:28 -04:00
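A minimal sketch of the cancellation rule from the commit above, in Python rather than the library's C#. OperationCancelled stands in for .NET's OperationCanceledException. The commit only states that cancellation is re-thrown and that the event is null-guarded; recording other failures and continuing with the remaining writers is an assumption made here for illustration.

```python
class OperationCancelled(Exception):
    """Stand-in for .NET's OperationCanceledException."""


class CompositeAuditWriter:
    def __init__(self, writers):
        self._writers = list(writers)
        self.failures = []  # recorded instead of raised (assumed behaviour)

    def write(self, event):
        if event is None:
            raise ValueError("event must not be None")  # null guard
        for writer in self._writers:
            try:
                writer(event)
            except OperationCancelled:
                raise  # cancellation always propagates to the caller
            except Exception as exc:
                self.failures.append(exc)  # assumption: other errors are isolated


received = []


def failing_writer(event):
    raise RuntimeError("sink unavailable")


def cancelling_writer(event):
    raise OperationCancelled()
```

The ordering of the except clauses matters: cancellation is matched first, so it is never swallowed by the broader handler that isolates ordinary sink failures.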
Joseph Doherty 645388b1f1 feat(telemetry): options + shared OTel Resource 2026-06-01 07:30:54 -04:00
Joseph Doherty a1c3d5ec81 chore: scaffold ZB.MOM.WW.Telemetry solution and projects
Two library projects (ZB.MOM.WW.Telemetry core + Serilog) and two xUnit
test projects; central PM via Directory.Packages.props; dotnet build green.
2026-06-01 07:27:30 -04:00
Joseph Doherty 3934e528f2 feat(audit): AuditEvent record + AuditOutcome + writer/redactor seams
Includes equality-as-normalized-instant remarks on OccurredAtUtc and a
same-instant/different-offset equality regression test (code-review follow-up).
2026-06-01 07:25:31 -04:00
Joseph Doherty fba3d09eed docs(observability): current-state x3 + GAPS + README
Complete the observability normalization component docs:

- components/observability/current-state/otopcua/CURRENT-STATE.md — full
  OTel SDK (metrics + tracing) + Prometheus; 7 otopcua.* instruments + 2
  spans; Serilog with driver-scope LogContextEnricher; no Resource/service.name
  anywhere; tracing pipeline wired but no exporter; adoption plan: AddZbTelemetry
  gains shared Resource + trace↔log correlation; LogContextEnricher kept bespoke.

- components/observability/current-state/mxaccessgw/CURRENT-STATE.md — 20
  hand-rolled instruments (13 counters, 3 histograms ms-unit, 4 gauges) in
  GatewayMetrics.cs; no OTel SDK → metrics never export; MEL logging with
  GatewayLogScope correlation and GatewayLogRedactor; adoption plan: in-pass
  MEL → AddZbSerilog migration (LogContext correlation, ILogRedactor seam) +
  AddZbTelemetry wires OTel SDK so GatewayMetrics finally exports.

- components/observability/current-state/scadabridge/CURRENT-STATE.md —
  OpenTelemetry.Api is a CVE-patch override only (zero instrumentation); Serilog
  with SiteId/NodeRole/NodeHostname enrichers (strongest set in family); adoption
  plan: replace CVE ref with AddZbTelemetry; adopt AddZbSerilog (LoggerConfigurationFactory
  deleted); add first scadabridge.* instruments.

- components/observability/GAPS.md — divergence table across §1 Resource (P1,
  nobody), §2 metrics export (P1, MxGateway invisible), §3 MxGateway MEL→Serilog
  (P1, in-pass done), §4 trace↔log correlation, §5 ms→s unit, §6 Meter naming,
  §7 standard instrumentation, §8 Serilog version, §9 ScadaBridge zero
  instrumentation; 11-item prioritized backlog.

- components/observability/README.md — overview, per-project status table
  (OTel today / metrics / tracing / logging / enrichers / adoption status),
  normalized vs. left-per-project boundary, 2-package structure, component status.
2026-06-01 07:23:08 -04:00
Joseph Doherty 7d243890ed docs(observability): spec + METRIC-CONVENTIONS + ZB.MOM.WW.Telemetry shared contract
Author the three normalization docs for the observability component:
- components/observability/spec/SPEC.md — Section 0 scope (normalized vs. per-project),
  AddZbTelemetry pipeline, shared Resource attribute set, standard instrumentation baseline,
  exporter conventions, Serilog two-stage bootstrap with identity enrichers and
  TraceContextEnricher, ILogRedactor redaction seam, per-project migration table, and
  acceptance criteria.
- components/observability/spec/METRIC-CONVENTIONS.md — meter naming convention (app
  namespace; MxGateway.Server flagged as convergence target), instrument naming pattern
  (<app>.<subsystem>.<event>), mandatory duration unit = seconds (MxGateway ms histograms
  flagged), Resource attribute set table, standard instrumentation baseline, and per-app
  instrument tables (OtOpcUa 7 instruments + 2 spans; MxGateway 13 counters / 3 histograms
  / 4 gauges; ScadaBridge TBD).
- components/observability/shared-contract/ZB.MOM.WW.Telemetry.md — paper API for the two
  packages: ZbTelemetryOptions, ZbExporter enum, AddZbTelemetry (IHostApplicationBuilder +
  IServiceCollection overloads), ZbResource.Build, MapZbMetrics; AddZbSerilog,
  ZbLogEnricherNames constants, TraceContextEnricher, ILogRedactor, RedactionEnricher.
  Consumer matrix and open contract questions included.
2026-06-01 07:19:38 -04:00
Joseph Doherty 54654a49af chore(audit): scaffold ZB.MOM.WW.Audit solution 2026-06-01 07:19:36 -04:00
Joseph Doherty 76295695ee docs(health): align shared-contract to shipped API + per-lib CLAUDE.md + cleanup
- Contract: DatabaseHealthCheck<TContext> ctor now shows IServiceProvider (resolves
  IDbContextFactory<TContext> when registered, else a scoped TContext; pool-safe)
- Contract: RequireActiveNode gains retryAfterSeconds = 5 default parameter
- Packages: remove dangling AspNetCore.HealthChecks.UI.Client PackageVersion (no
  csproj referenced it)
- Tests: fix CS8625 in RoleLessCases — use object?[] so null role rows compile
  warning-free under Nullable=enable
- Add ZB.MOM.WW.Health/CLAUDE.md (packages, responsibilities, consumer matrix,
  build/test/pack commands, status + pointer to components/health/)
2026-06-01 07:17:18 -04:00
Joseph Doherty 6588e15f57 docs(audit): fix canonical record field count (10 not 8) + drop BCL-only overstatement (review fixes) 2026-06-01 07:16:18 -04:00
Joseph Doherty 0c087d150d feat(health): pack ZB.MOM.WW.Health 0.1.0 + README + register health component in indexes
- Added PackageTags to all 3 library csproj files (health-checks;aspnetcore/akka/efcore;scada;wonderware;zb-mom-ww)
- Full solution dotnet test: 58 tests green (32 Akka + 20 core + 6 EFCore)
- dotnet pack -c Release produces ZB.MOM.WW.Health.0.1.0.nupkg, ZB.MOM.WW.Health.Akka.0.1.0.nupkg, ZB.MOM.WW.Health.EntityFrameworkCore.0.1.0.nupkg; artifacts/ not committed
- ZB.MOM.WW.Health/README.md: overview, packages table, consumer matrix, versioning, build/test/pack instructions, status note
- components/README.md: Health row added to component registry
- CLAUDE.md: Health row in Component-normalization table + Health paragraph; intro updated from "two pieces" to "three pieces"
- upcoming.md: Health checks item checked off with pointer to components/health/ and ZB.MOM.WW.Health/
- components/health/README.md: status updated from "Draft / scaffolded / follow-on" to "Built @ 0.1.0"
2026-06-01 07:09:14 -04:00
Joseph Doherty 69c1be943e docs(audit): README + GAPS adoption backlog 2026-06-01 07:08:31 -04:00
Joseph Doherty ef234d3574 docs(audit): shared-contract ZB.MOM.WW.Audit 2026-06-01 07:08:31 -04:00
Joseph Doherty 8f0b70d12f docs(audit): spec + event-model 2026-06-01 07:04:54 -04:00
Joseph Doherty 1c2b23cbbb refactor(health.akka): review polish (internal decision helper, role guard, factory results, test coverage) + fix SPEC §4 gate description 2026-06-01 07:04:29 -04:00
Joseph Doherty edbc79204f refactor(health.ef): review polish (timer release, timeout test, provider disposal, drop unused dep)
- Eagerly call CancelAfter(InfiniteTimeSpan) after a successful probe so the pending OS
  timer is released on the happy path rather than held for the full timeout window.
- Add ProbeTimeout_Unhealthy test: 50 ms timeout with an infinite-blocking probe delegate
  asserts Unhealthy, covering the timeout code path.
- Fix ProbeQueryThrows_Unhealthy to use Task.FromException rather than a synchronous throw,
  accurately modelling a faulted async delegate.
- Wrap all BuildServiceProvider() results in await using so ServiceProvider is disposed
  after each test (no DI provider leak).
- Remove unused Microsoft.EntityFrameworkCore.InMemory package reference; tests use
  SQLite only (InMemory CanConnect semantics differ and the package was not exercised).
- Add <remarks> to DatabaseHealthCheck<TContext> noting the scoped-resolution path is
  safe for AddDbContextPool (disposing the scope returns the context to the pool rather than destroying it).
2026-06-01 07:03:16 -04:00
Joseph Doherty a7a8f1e493 docs(audit): correct file:line refs + split MxGateway CLI/dashboard action vocab (review fixes) 2026-06-01 07:01:46 -04:00
Joseph Doherty aa2251b93d feat(health): core review fixes (async writer, gRPC cancellation, validation, configurable retry-after) 2026-06-01 07:00:21 -04:00
Joseph Doherty cf277eb7df feat(health.akka): active/leader check with role filter + IActiveNodeGate impl 2026-06-01 06:55:46 -04:00