31 Commits

Author SHA1 Message Date
Joseph Doherty 544a6ddb77 Fix all baseline code-review findings across the six shared libraries
Resolves the 35 findings from the 2026-06-01 baseline (commit 26ba1c7),
test-first for every behavioral change. +51 tests (331 -> 382 passing, 0 failed).

- Telemetry-001 (HIGH): RedactionEnricher now honours property removal, so a
  redactor that drops a key actually scrubs the secret from the event.
- Auth: LDAP validator ValidateOnStart; API-key verify no longer fails on a
  best-effort MarkUsed write or a corrupt scopes column (fail-closed); LDAP cert
  validation hook; KeyPrefix persistence aligned; README algorithm corrected.
- Health: Akka checks return Degraded (not throw) when the cluster isn't up yet;
  GrpcDependencyHealthCheck catch-all; null 'description' rendered; composite
  endpoint builder; XML docs shipped.
- Audit: CompositeAuditWriter no longer re-throws OperationCanceledException;
  TruncatingAuditRedactor over-redact scrubs Target + safe negative max; options
  record; XML docs shipped.
- Configuration: TryAddEnumerable idempotent registration; consistent port
  quoting; strict invariant port parsing; XML docs + README packaged.
- Theme: mobile toggle is now CSS-only (no Bootstrap JS); token/CSS hygiene;
  XML docs on the public parameter surface.

Shared-contract/spec docs updated where the code was the source of truth
(observability service.instance.id, MapZbMetrics, redactor reach). All changes
additive/back-compatible at v0.1.0. code-reviews bookkeeping follows separately.
2026-06-01 11:22:14 -04:00
Joseph Doherty fbf0f23e76 docs(config): correct OtOpcUa draft-validation description
The C# DraftValidator/DraftSnapshot has NO live caller in OtOpcUa src/ (verified
repo-wide) — it is dormant complement code. The enforced pre-publish draft
validation runs DB-side in the sp_ValidateDraft stored procedure (Status='Draft'
-> sp_PublishGeneration lifecycle). Reframe across current-state/SPEC/GAPS/README/
CLAUDE.md from 'runtime draft validation' + a false publish-pipeline caller to
'dormant managed validator; enforcement is DB-side'. Out-of-scope conclusion
for ZB.MOM.WW.Configuration is unchanged.
2026-06-01 10:13:29 -04:00
Joseph Doherty 3fa77b70fc docs: register ZB.MOM.WW.Configuration in indexes 2026-06-01 09:51:22 -04:00
Joseph Doherty 46c4bfae31 docs(config): components/configuration normalization (spec, shared-contract, current-state x3, GAPS, README) 2026-06-01 09:48:49 -04:00
Joseph Doherty a09cc02d46 Merge feat/zb-mom-ww-audit: Audit normalization component + ZB.MOM.WW.Audit (0.1.0)
# Conflicts:
#	CLAUDE.md
#	components/README.md
2026-06-01 09:09:44 -04:00
Joseph Doherty 8311912f40 feat(telemetry): pack ZB.MOM.WW.Telemetry 0.1.0 + README/CLAUDE + register observability component in indexes
- NuGet metadata: expanded Description and PackageTags on both library csproj files
  (opentelemetry;observability;metrics;tracing;prometheus;otlp;... / serilog;logging;...)
- Full dotnet test: 7 (Telemetry) + 12 (Serilog) = 19 tests, all green
- dotnet pack: ZB.MOM.WW.Telemetry.0.1.0.nupkg + ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg
  (artifacts/ gitignored, not committed)
- ZB.MOM.WW.Telemetry/README.md: overview, 2 packages, unifying hinge prose,
  exporter options, OTel signals + trace-log correlation, test/pack commands, status
- ZB.MOM.WW.Telemetry/CLAUDE.md: package responsibilities, consumer matrix,
  build/test/pack commands, status + pointers to components/observability/
- components/README.md: Observability row added to component registry table
- CLAUDE.md: Telemetry row added to component-normalization table; intro count
  updated to four shared libs; observability prose paragraph added (MxGateway
  logging adoption noted)
- upcoming.md: Observability item ticked done, pointing at components/observability/
  and ZB.MOM.WW.Telemetry; MxGateway MEL->Serilog adoption noted
- components/observability/README.md: status updated to Built @ 0.1.0, library
  build/pack commands added, MxGateway adoption row updated
2026-06-01 08:20:05 -04:00
Joseph Doherty f569d537d1 fix(telemetry.serilog): don't set process-global Log.Logger in AddZbSerilog (multi-host safe)
Remove the Stage-1 bootstrap-logger line (Log.Logger = new LoggerConfiguration()
.WriteTo.Console().CreateBootstrapLogger()) from AddZbSerilog. A shared library must
not mutate process-global state: when multiple hosts are built in one process (integration
tests, Aspire multi-host, parallel test runs) the second call throws "The logger is
already frozen".

AddSerilog is now called with preserveStaticLogger: true so Serilog.Extensions.Hosting
leaves the static Log.Logger entirely untouched. The DI-registered application logger is
the only artifact AddZbSerilog produces.

Apps that want a pre-Build() bootstrap logger should set Log.Logger themselves in
Program.cs before calling AddZbSerilog — that decision belongs to the application.

Three new regression tests in MultiHostTests verify: two hosts build in the same process
without throwing; Log.Logger is not mutated; each host gets its own independent DI ILogger.

Docs (SPEC.md §5 and shared-contract ZB.MOM.WW.Telemetry.md) updated: the "two-stage
bootstrap" framing is replaced with the correct description — library registers only the
DI application logger; optional Stage-1 bootstrap is the app's responsibility.
2026-06-01 08:13:35 -04:00
Joseph Doherty 37fb84f477 feat(telemetry): core review fixes (Prometheus+OTLP coexistence, ServiceName validation, null guards) + contract overload note
- Fix #1: Prometheus exporter always wired for metrics; OTLP is additive overlay
  when Exporter == ZbExporter.Otlp so /metrics + MapZbMetrics work in all modes.
- Fix #2: BuildOptions throws ArgumentException when ServiceName is null/whitespace.
- Fix #3: AddZbTelemetry(IHostApplicationBuilder) guard: ThrowIfNull(configure)
  added alongside existing ThrowIfNull(builder).
- Fix #6: Contract doc adds IServiceCollection convenience overload signature.
- Tests: +3 new tests (OtlpExporter still serves /metrics, empty ServiceName throws,
  whitespace ServiceName throws). Total: 7 passed (was 4).
2026-06-01 07:43:47 -04:00
Joseph Doherty c284e4d68d docs(audit): register component in indexes + GAPS cross-check 2026-06-01 07:41:45 -04:00
Joseph Doherty 215a646e35 docs(observability): fix metric-convention instrument names + NodeHostname-auto + resolve settled questions
C1: NodeHostname is AUTO throughout. Shared-contract AddZbSerilog doc comment now reads
"SiteId + NodeRole from ZbTelemetryOptions; NodeHostname from Environment.MachineName (auto)".
SPEC.md §0 and §5 prose updated to match. ScadaBridge adoption snippet no longer sets
o.NodeHostname (removed; NodeHostname is auto, not caller-supplied).

C2: METRIC-CONVENTIONS §6.1 OtOpcUa instrument table replaced with code-verified set:
counters otopcua.deploy.applied / driver.lifecycle / virtualtag.eval / scriptedalarm.transition /
opcua.sink.write / redundancy.service_level_change; histogram otopcua.deploy.apply.duration (s);
ActivitySource ZB.MOM.WW.OtOpcUa with spans otopcua.deploy.apply + otopcua.opcua.address_space_rebuild.
Removed invented names (deploy.failed, tag.subscriptions, tag.reads, tag.writes, session.active,
connection.gateway).

C3: METRIC-CONVENTIONS §6.2 MxGateway instrument table replaced with code-verified names from
GatewayMetrics.cs: 13 counters (sessions.opened/closed, commands.started/succeeded/failed,
events.received, queues.overflows, faults, workers.killed/exited, heartbeats.failed,
grpc.streams.disconnected, retries.attempted); 3 histograms ms (workers.startup.duration,
commands.duration, events.stream_send.duration); 4 gauges (sessions.open, workers.running,
events.worker_queue.depth, events.grpc_stream_queue.depth). Removed invented names.

m3: §2 example table replaced mxgateway.session.active + mxgateway.worker.call.duration
(invented) with mxgateway.sessions.open + mxgateway.commands.duration (real). Also fixed
the §2 rule-2 body text example which referenced mxgateway.worker.call.duration.

I4: §5 standard instrumentation table corrected: OtOpcUa now shows "not added" for all
five baseline instrumentations, matching current-state/otopcua. All three projects lack
standard instrumentation today; AddZbTelemetry adds it on adoption.

I1+m1: GAPS.md "Decisions still open" — removed the two settled questions (Prometheus-default
and ms→s/meter-rename bundling). Moved them to a new "Decisions settled" section with explicit
resolution notes. One genuinely open question remains (SiteId/NodeRole config binding path).

I2: SPEC.md §5 AddZbSerilog: added note that AddZbSerilog reads Serilog:MinimumLevel from
IConfiguration; callers with a different config key (e.g. ScadaBridge:Logging:MinimumLevel)
apply that override themselves — stays per-project. Shared-contract doc comment updated to match.

I3: MxAccessGateway adoption plan Meters = ["MxGateway.Server"] annotated as temporary with
note to update to ZB.MOM.WW.MxGateway when Gap N1 (Meter-rename) is closed.

m2: SPEC.md §1 now notes AddZbTelemetry also has an IServiceCollection overload for non-standard
hosts, with the IHostApplicationBuilder overload as the primary path.
2026-06-01 07:32:58 -04:00
Joseph Doherty fba3d09eed docs(observability): current-state x3 + GAPS + README
Complete the observability normalization component docs:

- components/observability/current-state/otopcua/CURRENT-STATE.md — full
  OTel SDK (metrics + tracing) + Prometheus; 7 otopcua.* instruments + 2
  spans; Serilog with driver-scope LogContextEnricher; no Resource/service.name
  anywhere; tracing pipeline wired but no exporter; adoption plan: AddZbTelemetry
  gains shared Resource + trace↔log correlation; LogContextEnricher kept bespoke.

- components/observability/current-state/mxaccessgw/CURRENT-STATE.md — 20
  hand-rolled instruments (13 counters, 3 histograms ms-unit, 4 gauges) in
  GatewayMetrics.cs; no OTel SDK → metrics never export; MEL logging with
  GatewayLogScope correlation and GatewayLogRedactor; adoption plan: in-pass
  MEL → AddZbSerilog migration (LogContext correlation, ILogRedactor seam) +
  AddZbTelemetry wires OTel SDK so GatewayMetrics finally exports.

- components/observability/current-state/scadabridge/CURRENT-STATE.md —
  OpenTelemetry.Api is a CVE-patch override only (zero instrumentation); Serilog
  with SiteId/NodeRole/NodeHostname enrichers (strongest set in family); adoption
  plan: replace CVE ref with AddZbTelemetry; adopt AddZbSerilog (LoggerConfigurationFactory
  deleted); add first scadabridge.* instruments.

- components/observability/GAPS.md — divergence table across §1 Resource (P1,
  nobody), §2 metrics export (P1, MxGateway invisible), §3 MxGateway MEL→Serilog
  (P1, in-pass done), §4 trace↔log correlation, §5 ms→s unit, §6 Meter naming,
  §7 standard instrumentation, §8 Serilog version, §9 ScadaBridge zero
  instrumentation; 11-item prioritized backlog.

- components/observability/README.md — overview, per-project status table
  (OTel today / metrics / tracing / logging / enrichers / adoption status),
  normalized vs. left-per-project boundary, 2-package structure, component status.
2026-06-01 07:23:08 -04:00
Joseph Doherty 7d243890ed docs(observability): spec + METRIC-CONVENTIONS + ZB.MOM.WW.Telemetry shared contract
Author the three normalization docs for the observability component:
- components/observability/spec/SPEC.md — Section 0 scope (normalized vs. per-project),
  AddZbTelemetry pipeline, shared Resource attribute set, standard instrumentation baseline,
  exporter conventions, Serilog two-stage bootstrap with identity enrichers and
  TraceContextEnricher, ILogRedactor redaction seam, per-project migration table, and
  acceptance criteria.
- components/observability/spec/METRIC-CONVENTIONS.md — meter naming convention (app
  namespace; MxGateway.Server flagged as convergence target), instrument naming pattern
  (<app>.<subsystem>.<event>), mandatory duration unit = seconds (MxGateway ms histograms
  flagged), Resource attribute set table, standard instrumentation baseline, and per-app
  instrument tables (OtOpcUa 7 instruments + 2 spans; MxGateway 13 counters / 3 histograms
  / 4 gauges; ScadaBridge TBD).
- components/observability/shared-contract/ZB.MOM.WW.Telemetry.md — paper API for the two
  packages: ZbTelemetryOptions, ZbExporter enum, AddZbTelemetry (IHostApplicationBuilder +
  IServiceCollection overloads), ZbResource.Build, MapZbMetrics; AddZbSerilog,
  ZbLogEnricherNames constants, TraceContextEnricher, ILogRedactor, RedactionEnricher.
  Consumer matrix and open contract questions included.
2026-06-01 07:19:38 -04:00
Joseph Doherty 76295695ee docs(health): align shared-contract to shipped API + per-lib CLAUDE.md + cleanup
- Contract: DatabaseHealthCheck<TContext> ctor now shows IServiceProvider (resolves
  IDbContextFactory<TContext> when registered, else a scoped TContext; pool-safe)
- Contract: RequireActiveNode gains retryAfterSeconds = 5 default parameter
- Packages: remove dangling AspNetCore.HealthChecks.UI.Client PackageVersion (no
  csproj referenced it)
- Tests: fix CS8625 in RoleLessCases — use object?[] so null role rows compile
  warning-free under Nullable=enable
- Add ZB.MOM.WW.Health/CLAUDE.md (packages, responsibilities, consumer matrix,
  build/test/pack commands, status + pointer to components/health/)
2026-06-01 07:17:18 -04:00
Joseph Doherty 6588e15f57 docs(audit): fix canonical record field count (10 not 8) + drop BCL-only overstatement (review fixes) 2026-06-01 07:16:18 -04:00
Joseph Doherty 0c087d150d feat(health): pack ZB.MOM.WW.Health 0.1.0 + README + register health component in indexes
- Added PackageTags to all 3 library csproj files (health-checks;aspnetcore/akka/efcore;scada;wonderware;zb-mom-ww)
- Full solution dotnet test: 58 tests green (32 Akka + 20 core + 6 EFCore)
- dotnet pack -c Release produces ZB.MOM.WW.Health.0.1.0.nupkg, ZB.MOM.WW.Health.Akka.0.1.0.nupkg, ZB.MOM.WW.Health.EntityFrameworkCore.0.1.0.nupkg; artifacts/ not committed
- ZB.MOM.WW.Health/README.md: overview, packages table, consumer matrix, versioning, build/test/pack instructions, status note
- components/README.md: Health row added to component registry
- CLAUDE.md: Health row in Component-normalization table + Health paragraph; intro updated from "two pieces" to "three pieces"
- upcoming.md: Health checks item checked off with pointer to components/health/ and ZB.MOM.WW.Health/
- components/health/README.md: status updated from "Draft / scaffolded / follow-on" to "Built @ 0.1.0"
2026-06-01 07:09:14 -04:00
Joseph Doherty 69c1be943e docs(audit): README + GAPS adoption backlog 2026-06-01 07:08:31 -04:00
Joseph Doherty ef234d3574 docs(audit): shared-contract ZB.MOM.WW.Audit 2026-06-01 07:08:31 -04:00
Joseph Doherty 8f0b70d12f docs(audit): spec + event-model 2026-06-01 07:04:54 -04:00
Joseph Doherty 1c2b23cbbb refactor(health.akka): review polish (internal decision helper, role guard, factory results, test coverage) + fix SPEC §4 gate description 2026-06-01 07:04:29 -04:00
Joseph Doherty a7a8f1e493 docs(audit): correct file:line refs + split MxGateway CLI/dashboard action vocab (review fixes) 2026-06-01 07:01:46 -04:00
Joseph Doherty aa2251b93d feat(health): core review fixes (async writer, gRPC cancellation, validation, configurable retry-after) 2026-06-01 07:00:21 -04:00
Joseph Doherty 9c8c1431af docs(audit): current-state ScadaBridge 2026-06-01 06:55:07 -04:00
Joseph Doherty 02cc687556 docs(audit): current-state MxAccessGateway 2026-06-01 06:55:07 -04:00
Joseph Doherty e498bb7c5a docs(audit): current-state OtOpcUa 2026-06-01 06:55:07 -04:00
Joseph Doherty 07d5907258 docs(health): resolve spec/contract/gaps consistency (review fixes)
Applies canonical resolutions for eight settled decisions:
- GAPS: remove three stale "Decisions still open" bullets (#1 IActiveNodeGate placement, #2 GrpcChannel type, #3 OtOpcUaCompat named constant)
- Shared contract: AkkaClusterHealthCheck, ActiveNodeHealthCheck constructors take IServiceProvider (lazy ActorSystem, Degraded-when-not-ready)
- Shared contract: AkkaActiveNodeGate takes IServiceProvider; reads SelfMember+leader directly, null-guarded; does not proxy ActiveNodeHealthCheck
- Shared contract: DatabaseHealthCheckOptions.Probe renamed to ProbeQuery; consumer matrix updated
- Shared contract: settled AddZbHealthChecks open question removed (spec §5 is per-project AddHealthChecks)
- SPEC §2.2: OtOpcUaCompat Leaving/Exiting cell updated from — to Degraded + footnote; §2.3 startup-safety note added
- README: status line corrected from "built and tested" to "scaffolded … implementation is follow-on (task #7)"; IActiveNodeGate "left per-project" bullet removed
- OtOpcUa current-state: AddZbHealthChecks → AddHealthChecks().AddCheck<...>(); IClusterRoleInfo note reframed as accepted trade-off
- ScadaBridge current-state: IActiveNodeGate bullet rewritten — interface moves to ZB.MOM.WW.Health on adoption, InboundApiEndpointFilter references shared interface
2026-06-01 06:33:42 -04:00
Joseph Doherty 3d25ee5090 docs(health): current-state x3 + GAPS + README
Code-verified current-state docs for OtOpcUa (three-tier full), ScadaBridge
(two-tier, no /healthz), and MxAccessGateway (bare liveness only / no probes).
GAPS backlog with P1 for MxGateway and convergence items for Akka status policy,
DB probe technique, and response writer. README with per-project status table.
2026-06-01 06:23:53 -04:00
Joseph Doherty 1dc35a8c43 docs(health): spec + ZB.MOM.WW.Health shared contract
Authors components/health/spec/SPEC.md (normalized three-tier endpoint
convention, probe catalog, response-writer contract, migration notes) and
components/health/shared-contract/ZB.MOM.WW.Health.md (paper API for the
3-package library: core, Akka, EntityFrameworkCore).
2026-06-01 06:20:19 -04:00
Joseph Doherty 2485d86205 docs: register ui-theme component in indexes 2026-06-01 05:16:58 -04:00
Joseph Doherty 029ac0719b docs(ui-theme): current-state ×3 + GAPS adoption backlog 2026-06-01 05:15:38 -04:00
Joseph Doherty 95975d0754 docs(ui-theme): spec, design tokens, shared contract 2026-06-01 05:11:43 -04:00
dohertj2 37e23cf9f2 Initial commit: scadaproj umbrella — sister-project index, auth component normalization (design + GAPS), and the built ZB.MOM.WW.Auth shared library (0.1.0, flattened in). 2026-06-01 03:59:23 -04:00