# ZB.MOM.WW.Telemetry Shared Library Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

**Goal:** Author the `components/observability/` normalization docs and build the `ZB.MOM.WW.Telemetry` shared library (2 NuGet packages) that gives the fleet one OpenTelemetry bootstrap across all three signals (metrics + traces + logs) with a shared `Resource` and a shared Serilog logging stack, then migrate MxAccessGateway's logging from `Microsoft.Extensions.Logging` onto that shared stack. That migration is the one sister-repo adoption that proves the contract.

**Architecture:** A new standalone nested repo (`~/Desktop/scadaproj/ZB.MOM.WW.Telemetry`), .NET 10, two library projects: `ZB.MOM.WW.Telemetry` (OTel metrics+traces bootstrap, shared `Resource`, standard instrumentation, Prometheus/OTLP exporters) and `ZB.MOM.WW.Telemetry.Serilog` (shared Serilog bootstrap, `SiteId`/`NodeRole`/`NodeHostname` enrichers, a new `TraceContextEnricher`, OTel log export, `ILogRedactor` seam). The unifying hinge: one `ZbTelemetryOptions` identity triple (`ServiceName`/`SiteId`/`NodeRole`) feeds **both** the OTel Resource and the Serilog enrichers. Reference implementations: OTel bootstrap from OtOpcUa `ObservabilityExtensions`, Serilog bootstrap + enrichers from ScadaBridge `LoggerConfigurationFactory`, redaction from MxGateway `GatewayLogRedactor`.
**Health/Telemetry wiring into OtOpcUa & ScadaBridge stays a future `GAPS.md` item; the ONLY app touched here is MxGateway's logging.**

**Tech Stack:** .NET 10, C#; xUnit + coverlet; OpenTelemetry SDK 1.15.3 (`OpenTelemetry.Extensions.Hosting`), `OpenTelemetry.Exporter.Prometheus.AspNetCore` 1.15.3-beta.1, `OpenTelemetry.Exporter.OpenTelemetryProtocol` 1.15.3, `OpenTelemetry.Instrumentation.{AspNetCore,Http,GrpcNetClient,Runtime,Process}` (~1.12–1.15); Serilog 4.3.1, `Serilog.AspNetCore` (see version note), `Serilog.Settings.Configuration`, `Serilog.Sinks.{Console,File,OpenTelemetry}`; central package management; `.slnx`; `Version` 0.1.0 lockstep.

**Version note (a real convergence item):** OtOpcUa pins `Serilog.AspNetCore` 9.0.0, ScadaBridge 10.0.0. Pin **9.0.0** in this library (works on net10, lowest common); record the 9↔10 split in `GAPS.md` as a convergence task. Consumers' central package management governs the final version at adoption.

**Source references (read-only, to port/generalize from):**

- OTel bootstrap + Meter/ActivitySource: OtOpcUa `~/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs` + `~/Desktop/OtOpcUa/src/Core/ZB.MOM.WW.OtOpcUa.Commons/Observability/OtOpcUaTelemetry.cs`
- Serilog bootstrap + enrichers: ScadaBridge `~/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/{LoggerConfigurationFactory,LoggingOptions}.cs` + `appsettings.json:3-23`
- Hand-rolled metrics to re-home onto OTel export: MxGateway `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs`
- Logging to migrate: MxGateway `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/{GatewayRequestLoggingMiddlewareExtensions,GatewayLogScope,GatewayLoggerExtensions,GatewayLogRedactor}.cs` + `GatewayApplication.cs:34,61`
- Design: `~/Desktop/scadaproj/docs/plans/2026-06-01-health-observability-components-design.md`

**Conventions for every task:** TDD: failing test first, minimal impl, green, commit.
File-scoped namespaces, `sealed` by default. **Never log secrets.** Commit after each green task. The `Files:` block is the `files_to_edit` contract.

---

## Phase 0 - Normalization docs (spec drives the API)

### Task 1: components/observability spec + METRIC-CONVENTIONS + shared-contract

**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 2

**Files:**

- Create: `components/observability/spec/SPEC.md`
- Create: `components/observability/spec/METRIC-CONVENTIONS.md`
- Create: `components/observability/shared-contract/ZB.MOM.WW.Telemetry.md`

**Steps:**

1. `spec/SPEC.md`: Section 0 **Scope**. Normalized = OTel bootstrap (3 signals), the shared `Resource` attribute set, standard instrumentation, exporter conventions (Prometheus default / OTLP opt-in), Serilog bootstrap + enrichers + trace↔log correlation + redaction seam. NOT normalized = each app's actual instruments (`otopcua.*`, `mxgateway.*`), redaction policy (which fields), the net48 worker's `IWorkerLogger`.
2. `spec/METRIC-CONVENTIONS.md` (mirrors auth `CANONICAL-ROLES.md` / theme `DESIGN-TOKENS.md`): Meter name = app namespace; instrument name = `..`; **duration unit = seconds** (OTel semconv; flag MxGateway's `ms` histograms); the Resource attribute list (`service.name`, `service.namespace=ZB.MOM.WW`, `service.version`, `site.id`, `node.role`, `host.name`); the standard instrumentation everyone enables.
3. `shared-contract/ZB.MOM.WW.Telemetry.md`: paper API of both packages: `ZbTelemetryOptions`, `AddZbTelemetry`, `MapZbMetrics`, `ZbExporter` enum; `AddZbSerilog`, `ZbLogEnricherNames`, `TraceContextEnricher`, `ILogRedactor`.

**Acceptance:** Three files exist; `SPEC.md` has explicit Section 0; `METRIC-CONVENTIONS.md` states the seconds rule and the Resource set. No tests (docs).
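To make the seconds rule from `METRIC-CONVENTIONS.md` concrete, here is a minimal sketch using only `System.Diagnostics.Metrics`. The meter name `Example.App` and instrument name `example.request.duration` are hypothetical placeholders, not names taken from any of the three apps. The point is only that elapsed time is recorded as fractional seconds and the instrument declares unit `s`; `Capture` is a small helper that lets a test observe the recorded unit and value.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical meter and instrument names, for illustration only.
public static class DurationConventionSketch
{
    public const string MeterName = "Example.App";
    public const string InstrumentName = "example.request.duration";

    private static readonly Meter Meter = new(MeterName, "0.1.0");

    // Unit "s": durations are recorded in seconds, not milliseconds.
    private static readonly Histogram<double> Duration =
        Meter.CreateHistogram<double>(InstrumentName, unit: "s", description: "Request duration");

    // Times the action and records the elapsed time in seconds.
    public static void RecordTimed(Action action)
    {
        long start = Stopwatch.GetTimestamp();
        try { action(); }
        finally { Duration.Record(Stopwatch.GetElapsedTime(start).TotalSeconds); }
    }

    // Collects every measurement recorded on the sketch's meter while `body` runs.
    public static List<(string Unit, double Value)> Capture(Action body)
    {
        var seen = new List<(string Unit, double Value)>();
        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == MeterName) l.EnableMeasurementEvents(instrument);
        };
        listener.SetMeasurementEventCallback<double>((instrument, value, tags, state) =>
            seen.Add((instrument.Unit ?? "", value)));
        listener.Start();
        body();
        return seen;
    }
}
```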

---

### Task 2: components/observability current-state ×3 + GAPS + README

**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1

**Files:**

- Create: `components/observability/current-state/otopcua/CURRENT-STATE.md`
- Create: `components/observability/current-state/mxaccessgw/CURRENT-STATE.md`
- Create: `components/observability/current-state/scadabridge/CURRENT-STATE.md`
- Create: `components/observability/GAPS.md`
- Create: `components/observability/README.md`

**Steps:**

1. Transcribe the design doc's "Telemetry" + "Logging" current-state into the three docs at full `file:line` depth (re-verify against live repos). OtOpcUa = full OTel + Prometheus + Serilog (no Resource, no trace↔log correlation). MxGateway = hand-rolled `GatewayMetrics` (no export) + MEL logging; its Adoption plan = the migration in Phase 4. ScadaBridge = `OpenTelemetry.Api` dangling CVE-patch ref + Serilog (cleanest enrichers).
2. `GAPS.md`: top entries: no `Resource`/`service.name` anywhere (P1); MxGateway metrics never export (P1); MxGateway MEL→Serilog (P1, done here); `ms`→`s` unit convergence; no trace↔log correlation anywhere; `Serilog.AspNetCore` 9↔10 split; ScadaBridge has zero instrumentation.
3. `README.md`: overview + per-project status table.

**Acceptance:** Five files exist; current-states cite real `file:line`; `GAPS.md` lists the migration + convergence items. No tests (docs).

---

## Phase 1 - Scaffold

### Task 3: Create repo, solution, and project shells

**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** none (gates impl tasks)

**Files:**

- Create: `ZB.MOM.WW.Telemetry/ZB.MOM.WW.Telemetry.slnx`
- Create: `ZB.MOM.WW.Telemetry/Directory.Build.props`
- Create: `ZB.MOM.WW.Telemetry/Directory.Packages.props`
- Create: `ZB.MOM.WW.Telemetry/.gitignore`
- Create: `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZB.MOM.WW.Telemetry.csproj`
- Create: `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/ZB.MOM.WW.Telemetry.Serilog.csproj`
- Create: `ZB.MOM.WW.Telemetry/tests/ZB.MOM.WW.Telemetry.Tests/…csproj`
- Create: `ZB.MOM.WW.Telemetry/tests/ZB.MOM.WW.Telemetry.Serilog.Tests/…csproj`

**Steps:**

1. `cd ~/Desktop/scadaproj && mkdir ZB.MOM.WW.Telemetry && cd ZB.MOM.WW.Telemetry && git init && dotnet new gitignore`
2. `dotnet new sln -n ZB.MOM.WW.Telemetry --format slnx` (fallback `.sln`).
3. `dotnet new classlib -f net10.0` ×2 libs; `dotnet new xunit -f net10.0` ×2 tests. Delete default classes.
4. Refs: `.Serilog` → core `ZB.MOM.WW.Telemetry`; each test → its lib; core lib `` (for `MapZbMetrics` / instrumentation).
5. Copy `Directory.Build.props` from `ZB.MOM.WW.Auth`.
6. `Directory.Packages.props`: pin the OTel + Serilog versions from the Tech Stack/Version-note above + test pkgs (`Microsoft.NET.Test.Sdk` 17.14.1, `xunit` 2.9.3, `xunit.runner.visualstudio` 3.1.4, `coverlet.collector` 6.0.4) + `Serilog.Sinks.InMemory` or `Serilog.Sinks.TestCorrelator` for tests.
7. `dotnet sln add` all 4; `dotnet build`.
8. **Commit:** `chore: scaffold ZB.MOM.WW.Telemetry solution and projects`

**Acceptance:** `dotnet build` green; 4 projects.

---

## Phase 2 - `ZB.MOM.WW.Telemetry` (metrics + traces)

### Task 4: `ZbTelemetryOptions` + shared `Resource` builder

**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (Tasks 5-6 build on it)

**Files:**

- Create: `src/ZB.MOM.WW.Telemetry/ZbTelemetryOptions.cs`
- Create: `src/ZB.MOM.WW.Telemetry/ZbResource.cs`
- Test: `tests/ZB.MOM.WW.Telemetry.Tests/ZbResourceTests.cs`

**Step 1 - failing test:** `ZbResource.Build(options)` returns a `ResourceBuilder` whose attributes include `service.name` (= `ServiceName`), `service.namespace` (= `ServiceNamespace`, default `"ZB.MOM.WW"`), `service.version`, `site.id` (= `SiteId`), `node.role` (= `NodeRole`), `host.name`. Assert all six present with expected values (build the `Resource`, inspect `Attributes`).

**Step 2 - FAIL. Step 3 - implement** `ZbTelemetryOptions` (ServiceName, ServiceNamespace=ZB.MOM.WW, ServiceVersion, SiteId, NodeRole, `string[] Meters`, `string[] ActivitySources`, `ZbExporter Exporter=Prometheus`, OTLP endpoint) + `ZbResource.Build`.

**Step 4 - PASS. Step 5 - commit:** `feat(telemetry): options + shared OTel Resource`

---

### Task 5: `AddZbTelemetry` (metrics + traces wiring)

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none

**Files:**

- Create: `src/ZB.MOM.WW.Telemetry/ZbTelemetryExtensions.cs`
- Test: `tests/ZB.MOM.WW.Telemetry.Tests/AddZbTelemetryTests.cs`

**Step 1 - failing test:** build a host with `AddZbTelemetry(o => { o.ServiceName="t"; o.Meters=["Test.Meter"]; })` using an **in-memory metrics exporter** (`MetricReader`/`InMemoryExporter` test harness); emit a counter on `Test.Meter`; assert the metric is collected and the export carries the Resource `service.name="t"`.
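As a rough illustration of what the Task 4 and Task 5 tests pin down, the sketch below builds the six-attribute Resource and runs a counter through an in-memory exporter. `ZbIdentity` and `ZbResourceSketch` are hypothetical stand-ins for `ZbTelemetryOptions` and `ZbResource`, not the library's actual types, and the sketch assumes the `OpenTelemetry` and `OpenTelemetry.Exporter.InMemory` packages are referenced. It calls the SDK's `MeterProviderBuilder` directly rather than going through `AddZbTelemetry`, which does not exist yet at this step.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Linq;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;

// Hypothetical stand-in for the identity fields of ZbTelemetryOptions.
public sealed record ZbIdentity(
    string ServiceName,
    string ServiceVersion,
    string SiteId,
    string NodeRole,
    string ServiceNamespace = "ZB.MOM.WW");

public static class ZbResourceSketch
{
    // Builds the Resource attribute set listed in METRIC-CONVENTIONS.md.
    public static ResourceBuilder Build(ZbIdentity id) =>
        ResourceBuilder.CreateEmpty()
            .AddService(
                serviceName: id.ServiceName,
                serviceNamespace: id.ServiceNamespace,
                serviceVersion: id.ServiceVersion,
                autoGenerateServiceInstanceId: false)
            .AddAttributes(new KeyValuePair<string, object>[]
            {
                new("site.id", id.SiteId),
                new("node.role", id.NodeRole),
                new("host.name", Environment.MachineName),
            });

    // Flattens the built Resource into a dictionary for assertions.
    public static Dictionary<string, object> Attributes(ZbIdentity id) =>
        Build(id).Build().Attributes.ToDictionary(kv => kv.Key, kv => kv.Value);

    // Emits one counter increment on `meterName` and returns the exported metric names.
    public static List<string> CollectMetricNames(ZbIdentity id, string meterName)
    {
        var exported = new List<Metric>();
        using var meter = new Meter(meterName);
        using var provider = Sdk.CreateMeterProviderBuilder()
            .SetResourceBuilder(Build(id))
            .AddMeter(meterName)
            .AddInMemoryExporter(exported)
            .Build();

        meter.CreateCounter<long>("test.events").Add(1);
        provider.ForceFlush(); // push the pending measurement through the in-memory exporter
        return exported.Select(m => m.Name).ToList();
    }
}
```

The sketch starts from `ResourceBuilder.CreateEmpty()` so the attribute set is exactly what it adds; whether the real `ZbResource.Build` starts from an empty or a default builder is a design choice this plan leaves open.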
Port the builder shape from OtOpcUa `ObservabilityExtensions.cs:18-25` (`AddOpenTelemetry().WithMetrics(...).WithTracing(...)`), generalized to register `o.Meters`/`o.ActivitySources` by name + standard instrumentation (`AddAspNetCoreInstrumentation`, `AddHttpClientInstrumentation`, `AddGrpcClientInstrumentation`, `AddRuntimeInstrumentation`, `AddProcessInstrumentation`) + exporter switch (Prometheus default, OTLP when `o.Exporter==Otlp`).

**Step 2 - FAIL. Step 3 - implement.** Classification high-risk → executor runs spec+code review serially (this is the fleet's telemetry front door).

**Step 4 - PASS. Step 5 - commit:** `feat(telemetry): AddZbTelemetry metrics+traces bootstrap`

---

### Task 6: `MapZbMetrics` Prometheus endpoint

**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 7, Task 8 (different package)

**Files:**

- Create: `src/ZB.MOM.WW.Telemetry/ZbMetricsEndpointExtensions.cs`
- Test: `tests/ZB.MOM.WW.Telemetry.Tests/MapZbMetricsTests.cs`

**Step 1 - failing test:** `WebApplicationFactory` app with `AddZbTelemetry(Prometheus)` + `app.MapZbMetrics()` → `GET /metrics` returns 200 with `text/plain; version=0.0.4` Prometheus exposition. Port `/metrics` mapping from OtOpcUa `ObservabilityExtensions.cs:36-38`.

**Step 2 - FAIL. Step 3 - implement** `MapZbMetrics` delegating to `MapPrometheusScrapingEndpoint`.

**Step 4 - PASS. Step 5 - commit:** `feat(telemetry): MapZbMetrics Prometheus scrape endpoint`

---

## Phase 3 - `ZB.MOM.WW.Telemetry.Serilog` (logs signal)

### Task 7: Identity enrichers + `AddZbSerilog` bootstrap

**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 6

**Files:**

- Create: `src/ZB.MOM.WW.Telemetry.Serilog/ZbLogEnricherNames.cs`
- Create: `src/ZB.MOM.WW.Telemetry.Serilog/ZbSerilogExtensions.cs`
- Test: `tests/ZB.MOM.WW.Telemetry.Serilog.Tests/EnricherTests.cs`

**Step 1 - failing test:** using `Serilog.Sinks.InMemory`, configure via `AddZbSerilog(options)` with `SiteId="s1"`, `NodeRole="Central"` and log one event; assert the event carries properties `SiteId=s1`, `NodeRole=Central`, `NodeHostname=`. Bind these from the **same `ZbTelemetryOptions`** (reference the core package) so the dimensions match the Resource. Port two-stage bootstrap + `MinimumLevel.Is` override + `ReadFrom.Configuration` from ScadaBridge `LoggerConfigurationFactory.cs:62-88`.

**Step 2 - FAIL. Step 3 - implement** `AddZbSerilog(this IHostApplicationBuilder, Action)` (or `LoggerConfiguration` factory mirroring ScadaBridge) with `Enrich.WithProperty` for the triple.

**Step 4 - PASS. Step 5 - commit:** `feat(telemetry.serilog): AddZbSerilog bootstrap + identity enrichers`

---

### Task 8: `TraceContextEnricher` (trace↔log correlation)

**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 6

**Files:**

- Create: `src/ZB.MOM.WW.Telemetry.Serilog/TraceContextEnricher.cs`
- Modify: `src/ZB.MOM.WW.Telemetry.Serilog/ZbSerilogExtensions.cs` (register enricher)
- Test: `tests/ZB.MOM.WW.Telemetry.Serilog.Tests/TraceContextEnricherTests.cs`

**Step 1 - failing test:** with an active `Activity` (start an `ActivitySource` span), a logged event carries `trace_id` and `span_id` equal to `Activity.Current.TraceId`/`SpanId`; with no active Activity, neither property is added (clean omission).
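The behavior this test describes can be sketched with a small `ILogEventEnricher`. The class names below (`TraceContextEnricherSketch`, `CollectingSink`, `TraceContextDemo`) are illustrative assumptions, and the hand-written sink is used only so the sketch depends on nothing beyond the core `Serilog` package; the plan's real test uses an in-memory sink package instead.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using Serilog;
using Serilog.Core;
using Serilog.Events;

// Sketch of an enricher that copies the ambient Activity's ids onto each event.
public sealed class TraceContextEnricherSketch : ILogEventEnricher
{
    public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
    {
        Activity? activity = Activity.Current;
        if (activity is null) return; // clean omission: add nothing when no span is active

        logEvent.AddPropertyIfAbsent(
            propertyFactory.CreateProperty("trace_id", activity.TraceId.ToString()));
        logEvent.AddPropertyIfAbsent(
            propertyFactory.CreateProperty("span_id", activity.SpanId.ToString()));
    }
}

// Minimal collecting sink so the sketch needs no extra sink package.
public sealed class CollectingSink : ILogEventSink
{
    public List<LogEvent> Events { get; } = new();
    public void Emit(LogEvent logEvent) => Events.Add(logEvent);
}

public static class TraceContextDemo
{
    // Logs one event through the enricher and returns it as the sink saw it.
    public static LogEvent LogOne(string message)
    {
        var sink = new CollectingSink();
        using var logger = new LoggerConfiguration()
            .Enrich.With(new TraceContextEnricherSketch())
            .WriteTo.Sink(sink)
            .CreateLogger();
        logger.Information("{Message}", message);
        return sink.Events[0];
    }
}
```

Calling `TraceContextDemo.LogOne` inside `using var activity = new Activity("op").Start();` should yield an event whose `trace_id` matches `activity.TraceId.ToString()`, while calling it with no ambient activity yields an event without either property.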
This is **new shared glue**; no existing app has it.

**Step 2 - FAIL. Step 3 - implement** `ILogEventEnricher` reading `Activity.Current`; add to `AddZbSerilog`.

**Step 4 - PASS. Step 5 - commit:** `feat(telemetry.serilog): TraceContextEnricher for trace<->log correlation`

---

### Task 9: `ILogRedactor` seam + OTel log export

**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none

**Files:**

- Create: `src/ZB.MOM.WW.Telemetry.Serilog/ILogRedactor.cs`
- Create: `src/ZB.MOM.WW.Telemetry.Serilog/RedactionEnricher.cs`
- Modify: `src/ZB.MOM.WW.Telemetry.Serilog/ZbSerilogExtensions.cs` (optional `WriteTo.OpenTelemetry` with shared Resource)
- Test: `tests/ZB.MOM.WW.Telemetry.Serilog.Tests/RedactionTests.cs`

**Step 1 - failing test:** register a fake `ILogRedactor` that masks a property named `apiKey`; log an event with `apiKey="mxgw_secret"`; assert the sink sees it masked. The **seam** is shared; policy is the consumer's (generalize MxGateway `GatewayLogRedactor.cs`). Also assert (separate test) that when `o.Exporter` routes logs to OTLP, the log records carry the same Resource as metrics/traces.

**Step 2 - FAIL. Step 3 - implement** `ILogRedactor { void Redact(IDictionary properties); }` + a `RedactionEnricher` that applies the registered redactor; wire optional `WriteTo.OpenTelemetry(resource: ZbResource…)`.

**Step 4 - PASS. Step 5 - commit:** `feat(telemetry.serilog): ILogRedactor seam + OTel log export`

---

## Phase 4 - MxGateway MEL → Serilog migration (the one sister-repo touch)

> Touches `~/Desktop/MxAccessGateway`. Prereq: Phase 3 complete (`ZB.MOM.WW.Telemetry.Serilog` packed, or referenced via local project/`nupkg` source). Add the package via a local NuGet source or `ProjectReference` to the packed lib. The net48 x86 **worker** is OUT of scope; leave `WorkerConsoleLogger`/`IWorkerLogger` untouched.
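Because Task 11 plugs MxGateway's existing policy into the Task 9 seam, it may help to see one possible shape of that seam before starting the migration. Everything here is an assumption made for illustration: the `IDictionary<string, object?>` generic arguments, the `Sketch`-suffixed type names, the `MaskApiKeyRedactor` policy, and the choice to redact only scalar properties. The sketch uses only the core `Serilog` package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Serilog;
using Serilog.Core;
using Serilog.Events;

// Assumed shape of the seam; the generic arguments are a guess.
public interface ILogRedactorSketch
{
    void Redact(IDictionary<string, object?> properties);
}

// Hypothetical consumer policy: mask one property by name.
public sealed class MaskApiKeyRedactor : ILogRedactorSketch
{
    public void Redact(IDictionary<string, object?> properties)
    {
        if (properties.ContainsKey("apiKey")) properties["apiKey"] = "***";
    }
}

// Applies the redactor to scalar properties and writes back any changed or added values.
public sealed class RedactionEnricherSketch : ILogEventEnricher
{
    private readonly ILogRedactorSketch _redactor;
    public RedactionEnricherSketch(ILogRedactorSketch redactor) => _redactor = redactor;

    public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
    {
        var before = logEvent.Properties
            .Where(p => p.Value is ScalarValue)
            .ToDictionary(p => p.Key, p => ((ScalarValue)p.Value).Value);
        var after = new Dictionary<string, object?>(before);
        _redactor.Redact(after);

        foreach (var (name, value) in after)
        {
            if (!before.TryGetValue(name, out var old) || !Equals(old, value))
                logEvent.AddOrUpdateProperty(new LogEventProperty(name, new ScalarValue(value)));
        }
    }
}

public sealed class ListSink : ILogEventSink
{
    private readonly List<LogEvent> _events;
    public ListSink(List<LogEvent> events) => _events = events;
    public void Emit(LogEvent logEvent) => _events.Add(logEvent);
}

public static class RedactionDemo
{
    // Logs a secret under the `apiKey` property and returns what the sink received.
    public static string? LoggedApiKey(string secret)
    {
        var events = new List<LogEvent>();
        using var logger = new LoggerConfiguration()
            .Enrich.With(new RedactionEnricherSketch(new MaskApiKeyRedactor()))
            .WriteTo.Sink(new ListSink(events))
            .CreateLogger();
        logger.Information("Calling with {apiKey}", secret);
        return ((ScalarValue)events[0].Properties["apiKey"]).Value as string;
    }
}
```

With this shape, `RedactionDemo.LoggedApiKey("mxgw_secret")` returns the masked value rather than the secret. A production enricher would also need a decision on structured and sequence values, which this sketch deliberately skips.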
### Task 10: Swap gateway bootstrap to `AddZbSerilog`

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none

**Files:**

- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (replace default MEL logging with `AddZbSerilog`)
- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj` (add package ref)
- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/appsettings.json` (Serilog section: Console+File sinks, MinimumLevel)
- Test: existing `~/Desktop/MxAccessGateway/src/MxGateway.Tests/` (fake worker, no MXAccess needed)

**Step 1 - failing/red state:** add a focused test (or reuse an existing logging test) asserting the host builds with Serilog as the provider and a log event carries `SiteId`/`NodeRole`.

**Step 2 - run, expect FAIL** (still MEL).

**Step 3 - implement:** reference `ZB.MOM.WW.Telemetry.Serilog`; call `AddZbSerilog` mapping `o.ServiceName="mxgateway"`, `SiteId`/`NodeRole` from config; add the `Serilog` config section. Remove the default logging assumptions.

**Step 4 - run, expect PASS;** then `dotnet build src/MxGateway.sln` + `dotnet test src/MxGateway.Tests` green.
**Step 5 - commit (in MxGateway repo):** `refactor(logging): adopt ZB.MOM.WW.Telemetry.Serilog bootstrap`

---

### Task 11: Re-express correlation scope + redactor on the shared seam

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none

**Files:**

- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayRequestLoggingMiddlewareExtensions.cs` (BeginScope → `LogContext.PushProperty`)
- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogScope.cs` (emit via LogContext)
- Create: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorAdapter.cs` (implements `ILogRedactor`, delegates to existing `GatewayLogRedactor` policy)
- Test: existing `MxGateway.Tests` correlation/redaction tests

**Step 1 - failing test:** assert (a) a request still emits the correlation properties (`SessionId`/`CorrelationId`/etc.) now via Serilog `LogContext`, and (b) a `mxgw_`-prefixed secret is still redacted through the registered `ILogRedactor`.

**Step 2 - FAIL** (still MEL `BeginScope`/old redactor path).

**Step 3 - implement:** convert the scope middleware to push Serilog `LogContext` properties (keep header parsing from `GatewayRequestLoggingMiddlewareExtensions.cs:22-41`); register `GatewayLogRedactorAdapter : ILogRedactor` wrapping the existing `GatewayLogRedactor` field/command policy.

**Step 4 - PASS;** full `dotnet test src/MxGateway.Tests` green (record counts); verify no secret leakage.
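The `BeginScope` to `LogContext.PushProperty` conversion in Step 3 can be sketched in isolation from the gateway middleware. The `CorrelationScopeSketch` class and its list-backed sink are hypothetical, and the property name `CorrelationId` is taken from the test description above; the real middleware would push whichever properties it currently parses from request headers.

```csharp
using System;
using System.Collections.Generic;
using Serilog;
using Serilog.Context;
using Serilog.Core;
using Serilog.Events;

public static class CorrelationScopeSketch
{
    // Minimal sink that stores events in a caller-supplied list.
    private sealed class ScopeListSink : ILogEventSink
    {
        private readonly List<LogEvent> _events;
        public ScopeListSink(List<LogEvent> events) => _events = events;
        public void Emit(LogEvent logEvent) => _events.Add(logEvent);
    }

    // Logs one event inside a correlation scope and one after the scope is disposed.
    public static (LogEvent Inside, LogEvent Outside) Run(string correlationId)
    {
        var events = new List<LogEvent>();
        using var logger = new LoggerConfiguration()
            .Enrich.FromLogContext() // required for pushed properties to reach events
            .WriteTo.Sink(new ScopeListSink(events))
            .CreateLogger();

        // Plays the role of ILogger.BeginScope: the property applies until disposal.
        using (LogContext.PushProperty("CorrelationId", correlationId))
        {
            logger.Information("handling request");
        }

        logger.Information("after request");
        return (events[0], events[1]);
    }
}
```

The event logged inside the `using` block carries `CorrelationId`, and the one logged afterwards does not, which is the same lifetime a `BeginScope` scope gave the MEL path. Note that the properties only appear if the logger is configured with `Enrich.FromLogContext()`.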
**Step 5 - commit (MxGateway repo):** `refactor(logging): correlation scope + redaction on shared ILogRedactor seam`

---

## Phase 5 - Package & register

### Task 12: Pack, README, register in indexes

**Classification:** small
**Estimated implement time:** ~5 min
**Parallelizable with:** none (final)

**Files:**

- Create: `ZB.MOM.WW.Telemetry/README.md`
- Modify: both lib `.csproj` (PackageId/Description/metadata)
- Modify: `components/README.md` (registry row)
- Modify: `CLAUDE.md` (Component-normalization table row)
- Modify: `upcoming.md` (check off Observability)

**Steps:**

1. NuGet metadata on both lib `.csproj`s.
2. `dotnet test` (both test projects green); record counts.
3. `dotnet pack -c Release -o ./artifacts` → confirm 2 `*.0.1.0.nupkg`.
4. `README.md`: packages, the identity-triple hinge, exporter options (Prometheus default / OTLP opt-in), consumer matrix, "built; MxGateway logging adopted; broader adoption deferred" note.
5. Register: `components/README.md` row (status `Draft`), `CLAUDE.md` row, tick Observability in `upcoming.md`.
6. **Commit:** lib repo `docs: README + pack metadata`; scadaproj `git add components/observability CLAUDE.md components/README.md upcoming.md docs/plans && git commit -m "feat(telemetry): ZB.MOM.WW.Telemetry library + observability normalization component + MxGateway logging adoption"`

**Acceptance:** 2 nupkgs @ 0.1.0; all library tests green + MxGateway tests green (counts recorded); indexes updated; design-doc build-order steps 2-6 (telemetry side) complete.

---

## Summary of parallelism

- **Phase 0** docs: Task 1 ∥ Task 2.
- **Phase 1** scaffold: Task 3 (barrier).
- **Phase 2** core: Task 4 → Task 5 (sequential); Task 6 ∥ the Serilog tasks.
- **Phase 3** serilog: Task 7 ∥ Task 8 (Task 8 modifies the same extensions file as Task 7; sequence if conflict), then Task 9.
- **Phase 4** migration: Task 10 → Task 11 (serial, same repo; needs Phase 3).
- **Phase 5**: Task 12 (barrier).