docs: implementation plan for ZB.MOM.WW.Telemetry adoption across the 3 sister apps

13 tasks: Task 0 publishes/verifies the 2 nupkgs on Gitea (gates all); then 3
independent per-repo phases — OtOpcUa (1-3), ScadaBridge (4-6), MxGateway (7-11,
incl. the high-risk MEL->Serilog swap) — and Task 12 scadaproj bookkeeping last.
Records two behaviour-preserving refinements vs the design: ScadaBridge keeps
LoggerConfigurationFactory (+TraceContextEnricher) instead of AddZbSerilog, and
MxGateway keeps GatewayLogScope as-is. Breaking items #6/#7 deferred.
Joseph Doherty
2026-06-01 15:24:28 -04:00
parent 3729ff2152
commit 30425726d4
2 changed files with 868 additions and 0 deletions
@@ -0,0 +1,848 @@
# ZB.MOM.WW.Telemetry Adoption Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.
**Goal:** Adopt the shared `ZB.MOM.WW.Telemetry` + `ZB.MOM.WW.Telemetry.Serilog` packages across OtOpcUa, MxAccessGateway, and ScadaBridge — giving all three the OTel Resource identity triple, standard instrumentation, Prometheus `/metrics`, and shared Serilog correlation — behaviour-preserving, with breaking items deferred.
**Architecture:** Gitea-registry distribution (`dohertj2-gitea`, creds-only at user level). Each app references the shared packages and swaps its bespoke wiring for `AddZbTelemetry` / `AddZbSerilog`, keeping existing meter names, units, log messages, and the `/metrics` path. Each sister repo is its own git repo; work happens on branch `feat/adopt-zb-telemetry`, one commit per task, **never skip hooks, never force-push.**
**Tech Stack:** .NET 10, OpenTelemetry SDK, Prometheus exporter, Serilog, NuGet Central Package Management (OtOpcUa + ScadaBridge; MxGateway has none).
**Source design:** [`2026-06-01-telemetry-library-adoption-design.md`](2026-06-01-telemetry-library-adoption-design.md)
---
## Two refinements discovered during planning (deviations from the design doc)
Both serve the approved **behaviour-preserving** acceptance bar:
1. **ScadaBridge logging — KEEP `LoggerConfigurationFactory`.** The design doc said "delete the
factory and swap to `AddZbSerilog`." Code review showed the factory implements a documented
governance contract (REQ-HOST-8 / Host-011/014/020/022): `ScadaBridge:Logging:MinimumLevel` is
the floor and **overrides** `Serilog:MinimumLevel`, with operator warnings when both are set or
a level is mistyped. `AddZbSerilog` hard-codes `MinimumLevel.Is(Information)` *before*
`ReadFrom.Configuration`, which inverts that precedence and silently drops the
`ScadaBridge:Logging:MinimumLevel` knob (and breaks its tests). **Plan: keep the factory, add the
shared `TraceContextEnricher` to it** (gaining trace↔log correlation) and do NOT adopt
`AddZbSerilog` for ScadaBridge. ScadaBridge still fully adopts the metrics/Resource half.
2. **MxGateway logging — keep `GatewayLogScope` + request-logging middleware as-is.** The Serilog
MEL provider captures MEL `BeginScope` dictionaries as structured properties, so the existing
middleware keeps producing the same scope properties once Serilog is the provider. The only
logging code changes are: register Serilog as the provider (`AddZbSerilog`), migrate the
`appsettings` `Logging` section to a `Serilog` section, and wrap the static `GatewayLogRedactor`
behind the `ILogRedactor` seam. No rewrite of working scope code.
---
## Execution order & parallelism
- **Task 0 gates everything** (packages must be on the feed before any repo can restore).
- After Task 0, the **three repo phases are independent** (separate working directories) and may run
concurrently: OtOpcUa (Tasks 1-3), ScadaBridge (Tasks 4-6), MxGateway (Tasks 7-11).
- **Within a repo, tasks are sequential** (same working tree / same branch — do not dispatch two
implementers against one repo concurrently).
- **Task 12** (scadaproj bookkeeping) runs last, after all three phases land.
Branch setup (first task in each repo creates it): `git checkout -b feat/adopt-zb-telemetry` from the
repo's default branch (`master` for OtOpcUa, `main` for the others).
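The branch rule above can be sketched as a small helper. This is only an illustration: `default_branch` and `start_feature_branch` are hypothetical names, and the `~/Desktop/<repo>` layout is taken from the task file paths rather than from any script in the repos.

```bash
# Hypothetical helper: map each sister repo to the default branch named in this plan.
default_branch() {
  case "$1" in
    OtOpcUa) echo "master" ;;
    ScadaBridge|MxAccessGateway) echo "main" ;;
    *) echo "unknown repo: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: create the feature branch from an up-to-date default branch.
# Runs in a subshell so the caller's working directory is left alone.
start_feature_branch() {
  local repo="$1" base
  base="$(default_branch "$repo")" || return 1
  (
    cd "$HOME/Desktop/$repo" || exit 1
    git checkout "$base" && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
  )
}
```

Calling `start_feature_branch OtOpcUa` would then be equivalent to the Step 1 one-liner in Task 1.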
---
## Task 0: Publish/verify Telemetry packages on the Gitea feed
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** none (gates all)
**Files:**
- Work in: `/Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry/`
- No repo files edited (publish only). Credentials already at `~/.nuget/NuGet/NuGet.Config`.
**Context:** The library CLAUDE.md claims these are "published to the Gitea NuGet feed." The Health
round proved that claim unreliable. Verify; pack + push only if missing. Mirrors Health Task 0.
**Step 1: Check whether `ZB.MOM.WW.Telemetry` 0.1.0 is already on the feed**
```bash
cd /Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry
# Use the user-level creds (source name dohertj2-gitea) already configured.
dotnet nuget list source # confirm dohertj2-gitea is NOT registered globally (creds are user-level only)
curl -s -u "dohertj2:$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')" \
"https://gitea.dohertylan.com/api/packages/dohertj2/nuget/registration/zb.mom.ww.telemetry/index.json" -o /tmp/tele.json -w "%{http_code}\n"
```
Expected: `200` if already published (then SKIP to Step 4), `404` if missing (continue).
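A minimal sketch of that decision, assuming the status code printed by `curl -w` is captured into a variable. The function name and the `investigate` branch are illustrative additions, not part of the plan; the point is that anything other than 200 or 404 (for example bad credentials) should not trigger a publish.

```bash
# Hypothetical helper: turn the registration-index HTTP status into the next action.
next_step_for_status() {
  case "$1" in
    200) echo "skip-to-verify" ;;  # already published: go to Step 4
    404) echo "pack-and-push" ;;   # missing: continue with Steps 2 and 3
    *)   echo "investigate" ;;     # e.g. 401 or 5xx: do not publish blindly
  esac
}

# Usage sketch: STATUS holds the code printed by the curl command in Step 1.
#   next_step_for_status "$STATUS"
```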
**Step 2: Pack the two packages (only if missing)**
```bash
dotnet pack ZB.MOM.WW.Telemetry.slnx -c Release -o ./artifacts
ls ./artifacts/*.nupkg
```
Expected: `ZB.MOM.WW.Telemetry.0.1.0.nupkg` and `ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg`.
**Step 3: Push both to Gitea (only if missing)**
```bash
TOKEN=$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')
for pkg in ./artifacts/ZB.MOM.WW.Telemetry.0.1.0.nupkg ./artifacts/ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg; do
dotnet nuget push "$pkg" --source "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" --api-key "$TOKEN"
done
```
Expected: `Your package was pushed.` for each (or `409 Conflict` if a version already exists — acceptable).
**Step 4: Verify both ids resolve**
```bash
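# Re-read the token here: Step 3 (where TOKEN is first set) is skipped when Step 1 returns 200.
TOKEN=$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')
for id in zb.mom.ww.telemetry zb.mom.ww.telemetry.serilog; do
  curl -s -u "dohertj2:$TOKEN" "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/registration/$id/index.json" -w " -> %{http_code}\n" -o /dev/null
done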
```
Expected: `-> 200` for both.
**Step 5: No commit** (publish-only task). Record completion.
> **SECURITY:** the Gitea token must NEVER be written into any repo file or commit. It lives only in
> `~/.nuget/NuGet/NuGet.Config`. The `curl`/`push` commands read it from there at runtime.
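One way to back that rule with a check is to scan staged changes for the literal token before each commit. The sketch below is an assumption about how such a guard could look; `diff_is_free_of` is a hypothetical name and nothing in the plan requires it.

```bash
# Hypothetical guard: exits 0 when the secret does NOT appear in the text on stdin.
# -F matches the token as a literal string; -q prints nothing, so the token is never echoed.
# "--" ends option parsing in case a token starts with "-".
diff_is_free_of() {
  local secret="$1"
  [ -n "$secret" ] || return 2   # refuse to run with an empty secret
  ! grep -qF -- "$secret"
}

# Usage sketch (inside a repo, before committing):
#   git diff --cached | diff_is_free_of "$TOKEN" || echo "token found in staged changes" >&2
```

The empty-secret guard matters because `grep -F ""` matches every line, which would make the check fail for the wrong reason.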
---
## Task 1: OtOpcUa — distribution wiring (source mapping + package refs)
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 4, Task 7 (other repos)
**Files:**
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/NuGet.config`
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/Directory.Packages.props`
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj`
**Step 1: Branch**
```bash
cd /Users/dohertj2/Desktop/OtOpcUa && git checkout master && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
```
**Step 2: Add Telemetry patterns to `NuGet.config`** — under `<packageSource key="dohertj2-gitea">`, add BOTH patterns (the `.*` glob does NOT match the bare core id):
```xml
<packageSource key="dohertj2-gitea">
<package pattern="ZB.MOM.WW.Health" />
<package pattern="ZB.MOM.WW.Health.*" />
<package pattern="ZB.MOM.WW.Telemetry" />
<package pattern="ZB.MOM.WW.Telemetry.*" />
</packageSource>
```
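Why both patterns are needed can be illustrated with shell globs, which behave the same way on this one point: a trailing `.*` requires the dot, so the bare id falls through. This is an analogy only; NuGet's own matcher may differ in other details (such as case handling), and `matches_any_pattern` is a hypothetical helper.

```bash
# Hypothetical check: does a package id match at least one mapped pattern?
# Shell globs stand in for NuGet source-mapping patterns purely as an analogy.
matches_any_pattern() {
  local id="$1"; shift
  local pattern
  for pattern in "$@"; do
    # An unquoted right-hand side of == inside [[ ]] is treated as a glob.
    [[ "$id" == $pattern ]] && return 0
  done
  return 1
}

# Usage sketch:
#   matches_any_pattern "ZB.MOM.WW.Telemetry" "ZB.MOM.WW.Telemetry.*" || echo "bare id is unmapped"
```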
**Step 3: Add versions to `Directory.Packages.props`** (next to the Health `<PackageVersion>` lines):
```xml
<PackageVersion Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
```
**Step 4: Add versionless refs to the Host csproj** (next to the `ZB.MOM.WW.Health` refs):
```xml
<PackageReference Include="ZB.MOM.WW.Telemetry" />
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" />
```
**Step 5: Restore + build to confirm the Gitea feed resolves and Serilog floor is satisfied**
```bash
dotnet restore ZB.MOM.WW.OtOpcUa.slnx
dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
```
Expected: restore pulls both packages from `dohertj2-gitea`; build succeeds. If restore fails on a
`Serilog.AspNetCore` floor (OtOpcUa pins 9.0.0), bump `Serilog.AspNetCore` (and the related
`Serilog.*` 9.x lines) in `Directory.Packages.props` to the floor the package requires, then rebuild.
**Step 6: Commit**
```bash
git add NuGet.config Directory.Packages.props src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj
git commit -m "build(otopcua): reference ZB.MOM.WW.Telemetry packages from Gitea feed"
```
---
## Task 2: OtOpcUa — swap OTel wiring to AddZbTelemetry
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within OtOpcUa)
**Files:**
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs` (rewrite body; keep both method names + signatures)
- Test (oracle, do not edit): `/Users/dohertj2/Desktop/OtOpcUa/tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Observability/OtOpcUaTelemetryHookTests.cs`
**Context:** Today `AddOtOpcUaObservability()` (called at `Program.cs:138`) hand-wires
`AddOpenTelemetry().WithMetrics(...AddMeter("ZB.MOM.WW.OtOpcUa")...AddPrometheusExporter()).WithTracing(...AddSource("ZB.MOM.WW.OtOpcUa"))`,
and `MapOtOpcUaMetrics()` (called at `Program.cs:160`) maps `/metrics`. Keep both call sites
unchanged; rewrite the extension bodies to delegate to the shared library. **Same meter/source
names + same `/metrics` path** ⇒ behaviour-preserving; gains the Resource identity triple +
standard instrumentation.
**Step 1: Rewrite `ObservabilityExtensions.cs`** preserving the two public method signatures:
```csharp
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.OtOpcUa.Commons.Observability; // OtOpcUaTelemetry
using ZB.MOM.WW.Telemetry;
namespace ZB.MOM.WW.OtOpcUa.Host.Observability;
/// <summary>
/// OtOpcUa observability wiring, delegated to the shared ZB.MOM.WW.Telemetry library.
/// Keeps the existing meter/ActivitySource names ("ZB.MOM.WW.OtOpcUa") and the "/metrics"
/// scrape path, and adds the shared OTel Resource + standard instrumentation.
/// </summary>
public static class ObservabilityExtensions
{
public static IServiceCollection AddOtOpcUaObservability(this IServiceCollection services)
{
ArgumentNullException.ThrowIfNull(services);
return services.AddZbTelemetry(o =>
{
o.ServiceName = "otopcua";
o.Meters = [OtOpcUaTelemetry.MeterName]; // "ZB.MOM.WW.OtOpcUa"
o.ActivitySources = [OtOpcUaTelemetry.ActivitySourceName]; // "ZB.MOM.WW.OtOpcUa"
// Exporter defaults to Prometheus — preserves the existing /metrics posture.
});
}
// Keep the SAME signature the Program.cs:160 call site uses (app.MapOtOpcUaMetrics()).
// MapZbMetrics() maps MapPrometheusScrapingEndpoint() whose default path is "/metrics".
public static IEndpointRouteBuilder MapOtOpcUaMetrics(this IEndpointRouteBuilder endpoints)
{
ArgumentNullException.ThrowIfNull(endpoints);
endpoints.MapZbMetrics();
return endpoints;
}
}
```
> If the existing `MapOtOpcUaMetrics` extends `WebApplication`/`IApplicationBuilder` rather than
> `IEndpointRouteBuilder`, keep THAT receiver type and call `app.MapZbMetrics();` — match the
> current signature so `Program.cs:160` compiles unchanged.
**Step 2: Build**
```bash
cd /Users/dohertj2/Desktop/OtOpcUa && dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
```
Expected: PASS. (The now-redundant direct `OpenTelemetry.Extensions.Hosting` /
`OpenTelemetry.Exporter.Prometheus.AspNetCore` refs may stay — they resolve the same assemblies the
shared package brings; leaving them is lower-risk than pruning.)
**Step 3: Run the telemetry hook tests (the behaviour oracle)**
```bash
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~OtOpcUaTelemetryHookTests"
```
Expected: PASS — the meter `ZB.MOM.WW.OtOpcUa` and ActivitySource still emit (the shared
`AddZbTelemetry` registered them via `o.Meters`/`o.ActivitySources`).
**Step 4: Commit**
```bash
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs
git commit -m "feat(otopcua): wire OTel via AddZbTelemetry (shared Resource + std instrumentation)"
```
---
## Task 3: OtOpcUa — swap Serilog to AddZbSerilog + move sinks to config
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (within OtOpcUa)
**Files:**
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs:49-52` (the inline `UseSerilog` block)
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json` (currently `{}`)
- Test (oracle): `/Users/dohertj2/Desktop/OtOpcUa/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Tests/Observability/LogContextEnricherTests.cs`
**Context:** Today `Program.cs:49-52` configures Serilog in code with `ReadFrom.Configuration` +
`WriteTo.Console()` + `WriteTo.File("logs/otopcua-.log", rollingInterval: Day)`. `AddZbSerilog` uses
`ReadFrom.Configuration` only, so the Console/File sinks must move into config to be reproduced. The
role-specific `appsettings.*.json` already carry `Serilog:MinimumLevel` overrides — those keep
working through `ReadFrom.Configuration`.
**Step 1: Add the sinks to `appsettings.json`** (replace the empty `{}`):
```json
{
"Serilog": {
"Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
"WriteTo": [
{ "Name": "Console" },
{ "Name": "File", "Args": { "path": "logs/otopcua-.log", "rollingInterval": "Day" } }
]
}
}
```
> Do NOT add `"Enrich": ["FromLogContext"]` unless it is already enabled today — adding it would
> newly surface driver-scope properties and change output. Preserve the current enrich set.
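Because the sinks now live only in configuration, a quick text check that both are declared can catch a missed edit before the host is run. The helper below is a hypothetical sketch; it greps for the `"Name"` entry rather than parsing JSON, so a hit is a hint, not proof.

```bash
# Hypothetical check: is a sink with the given name declared in the settings file?
# A plain text match, not a JSON parse, so it cannot tell WriteTo apart from other sections.
declares_sink() {
  local file="$1" sink="$2"
  grep -Eq "\"Name\"[[:space:]]*:[[:space:]]*\"$sink\"" "$file"
}

# Usage sketch, run from the OtOpcUa repo root:
#   f=src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
#   declares_sink "$f" Console && declares_sink "$f" File || echo "a sink is missing" >&2
```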
**Step 2: Replace the inline `UseSerilog` block in `Program.cs`.** Remove lines 49-52:
```csharp
builder.Host.UseSerilog((ctx, lc) => lc
.ReadFrom.Configuration(ctx.Configuration)
.WriteTo.Console()
.WriteTo.File("logs/otopcua-.log", rollingInterval: RollingInterval.Day));
```
and replace with:
```csharp
builder.AddZbSerilog(o => o.ServiceName = "otopcua");
```
Add `using ZB.MOM.WW.Telemetry.Serilog;` to the `using` block. Keep `app.UseSerilogRequestLogging();`
(line 141) unchanged. `RollingInterval` lives in the `Serilog` namespace, so there is no separate
import to drop: keep `using Serilog;` while anything in the file still references it, and remove it
only if it is now unused.
**Step 3: Build + run the LogContextEnricher tests**
```bash
cd /Users/dohertj2/Desktop/OtOpcUa
dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~LogContextEnricherTests"
```
Expected: build PASS; tests PASS (the static `LogContextEnricher.Push` helper is unaffected — it is
not registered in DI and AddZbSerilog does not change its disposable contract).
**Step 4: Sanity-check that logs still emit** (no automated log-output harness here):
```bash
# Quick smoke: build runs; optionally run the host briefly in a role that doesn't need infra
# and confirm console log lines appear. If no safe role exists, rely on the build + the request-
# logging path remaining wired (UseSerilogRequestLogging at Program.cs:141).
```
**Step 5: Commit**
```bash
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
git commit -m "feat(otopcua): adopt AddZbSerilog (shared enrichers + trace correlation); sinks to config"
```
---
## Task 4: ScadaBridge — distribution wiring (source mapping + package refs)
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 7 (other repos)
**Files:**
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/nuget.config`
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/Directory.Packages.props`
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj`
**Step 1: Branch**
```bash
cd /Users/dohertj2/Desktop/ScadaBridge && git checkout main && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
```
**Step 2: Add Telemetry patterns to `nuget.config`** under `<packageSource key="dohertj2-gitea">`:
```xml
<package pattern="ZB.MOM.WW.Telemetry" />
<package pattern="ZB.MOM.WW.Telemetry.*" />
```
**Step 3: Add versions to `Directory.Packages.props`** (next to the Health lines):
```xml
<PackageVersion Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
```
**Step 4: Add versionless refs to the Host csproj** (next to the Health refs):
```xml
<PackageReference Include="ZB.MOM.WW.Telemetry" />
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" />
```
> `ZB.MOM.WW.Telemetry.Serilog` is referenced here only for the public `TraceContextEnricher` type
> used in Task 6 — ScadaBridge does NOT call `AddZbSerilog`.
**Step 5: Restore + build** (watch for OTel version conflicts with the pinned `OpenTelemetry.Api 1.15.3`)
```bash
dotnet restore ZB.MOM.WW.ScadaBridge.slnx
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
```
Expected: PASS. If a transitive OTel version conflicts with the CVE-override `OpenTelemetry.Api`,
align the override version to what the shared package requires.
**Step 6: Commit**
```bash
git add nuget.config Directory.Packages.props src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj
git commit -m "build(scadabridge): reference ZB.MOM.WW.Telemetry packages from Gitea feed"
```
---
## Task 5: ScadaBridge — AddZbTelemetry in both composition roots + MapZbMetrics
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (within ScadaBridge)
**Files:**
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/SiteServiceRegistration.cs` (`BindSharedOptions`, ~lines 100-117 — add the registration; called by BOTH roots)
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/Program.cs` (Central endpoint section ~206-259; Site endpoint section ~307-320 — add `app.MapZbMetrics()` in each)
- Test: `/Users/dohertj2/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/` (add a `/metrics`-served assertion; HealthCheckTests pattern with `WebApplicationFactory<Program>`)
**Context:** ScadaBridge has NO OTel today (only the `OpenTelemetry.Api` CVE override). `SiteId`,
`NodeRole`, `NodeHostname` are available from config (`ScadaBridge:Node:*`). `BindSharedOptions` is
called by both the Central and Site roots, so registering telemetry there covers both without
duplication. This is purely additive (no metrics exist to break).
**Step 1: Register telemetry in `BindSharedOptions`.** Inside `SiteServiceRegistration.BindSharedOptions(IServiceCollection services, IConfiguration config)`, after the existing `services.Configure<...>` calls, add:
```csharp
// Shared OTel: Resource identity (service.name / site.id / node.role) + standard instrumentation
// + Prometheus exporter. Mounted at /metrics by app.MapZbMetrics() in each composition root.
services.AddZbTelemetry(o =>
{
o.ServiceName = "scadabridge";
o.SiteId = config["ScadaBridge:Node:SiteId"] ?? "central";
o.NodeRole = config["ScadaBridge:Node:Role"];
// o.Meters left empty — application instruments are a deferred follow-on (GAPS #9).
});
```
Add `using ZB.MOM.WW.Telemetry;`. (Use the SAME default `?? "central"` for SiteId that
`Program.cs:45` uses, so the Resource attribute matches the log enricher value.)
**Step 2: Map `/metrics` in BOTH roots.** In `Program.cs`:
- Central block — after `app.UseRouting()` and alongside the other `Map*` calls (e.g. just after `app.MapZbHealth();`), add:
```csharp
app.MapZbMetrics();
```
- Site block — in its endpoint section (where `app.MapGrpcService<...>()` is mapped, ~307-320), add:
```csharp
app.MapZbMetrics();
```
Add `using ZB.MOM.WW.Telemetry;` to `Program.cs` if not already present. `MapZbMetrics()` requires
routing; the Central block already calls `UseRouting()`, and the Site block's `MapGrpcService`
implies endpoint routing — if the Site app lacks `UseRouting()`, add it before `MapZbMetrics()`.
**Step 3: Add a `/metrics` integration test** in the Host.Tests project (mirror `HealthCheckTests`):
```csharp
[Fact]
public async Task Metrics_Endpoint_IsMapped()
{
using var factory = /* existing WebApplicationFactory<Program> setup for Central role */;
using var client = factory.CreateClient();
var response = await client.GetAsync("/metrics");
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
var body = await response.Content.ReadAsStringAsync();
Assert.Contains("# ", body); // Prometheus exposition format (HELP/TYPE comments)
}
```
> Reuse the exact `WebApplicationFactory<Program>` + in-memory config bootstrapping that
> `HealthCheckTests.cs` already uses for the Central role (it sets the env to "Central" and removes
> the Akka hosted service). Do not invent a new harness.
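For a manual smoke check outside the test harness, the same idea can be sketched in shell. `looks_like_prometheus` is a hypothetical helper and the URL in the usage line is an assumption; note that it is stricter than the test's `# ` check, so an endpoint that emits no HELP/TYPE lines would fail here while still passing the test.

```bash
# Hypothetical check: does the text on stdin contain Prometheus HELP or TYPE comment lines?
looks_like_prometheus() {
  grep -Eq '^# (HELP|TYPE) '
}

# Usage sketch (host and port are assumptions; use whatever the app actually binds):
#   curl -s http://localhost:5000/metrics | looks_like_prometheus && echo "metrics exposed"
```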
**Step 4: Build + test**
```bash
cd /Users/dohertj2/Desktop/ScadaBridge
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
dotnet test ZB.MOM.WW.ScadaBridge.slnx --filter "FullyQualifiedName~HealthCheckTests|FullyQualifiedName~Metrics_Endpoint_IsMapped|FullyQualifiedName~CompositionRoot"
```
Expected: PASS (existing composition-root + health tests stay green; new metrics test passes).
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.ScadaBridge.Host/SiteServiceRegistration.cs src/ZB.MOM.WW.ScadaBridge.Host/Program.cs tests/ZB.MOM.WW.ScadaBridge.Host.Tests/
git commit -m "feat(scadabridge): wire AddZbTelemetry + /metrics in both composition roots"
```
---
## Task 6: ScadaBridge — add shared TraceContextEnricher to LoggerConfigurationFactory
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (within ScadaBridge)
**Files:**
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/LoggerConfigurationFactory.cs` (the `Build` return expression)
- Test (oracle): `/Users/dohertj2/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/SerilogTests.cs` (+ any `LoggerConfigurationFactory` tests)
**Context (deviation from design doc — see top of plan):** KEEP `LoggerConfigurationFactory` intact
(it owns the Host-011/014/020/022 minimum-level governance). Only add the shared
`TraceContextEnricher` so logs emitted inside a span carry `trace_id`/`span_id` and can be joined to
traces. This gains the cross-cutting correlation win without regressing ScadaBridge's logging
contract.
**Step 1: Add the enricher to the `Build` return.** In `LoggerConfigurationFactory.Build(...)`, the
final expression currently ends:
```csharp
return new LoggerConfiguration()
.ReadFrom.Configuration(configuration)
.MinimumLevel.Is(minimumLevel)
.Enrich.WithProperty("SiteId", siteId)
.Enrich.WithProperty("NodeHostname", nodeHostname)
.Enrich.WithProperty("NodeRole", nodeRole);
```
Add the shared enricher as the last `.Enrich`:
```csharp
.Enrich.WithProperty("NodeRole", nodeRole)
.Enrich.With(new ZB.MOM.WW.Telemetry.Serilog.TraceContextEnricher());
```
(Or add `using ZB.MOM.WW.Telemetry.Serilog;` and use `.Enrich.With(new TraceContextEnricher())`.)
**Step 2: Build + run the Serilog tests**
```bash
cd /Users/dohertj2/Desktop/ScadaBridge
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
dotnet test ZB.MOM.WW.ScadaBridge.slnx --filter "FullyQualifiedName~SerilogTests|FullyQualifiedName~LoggerConfiguration"
```
Expected: PASS. The three node-identity enrichers and the min-level governance are untouched;
`trace_id`/`span_id` only appear when an `Activity.Current` exists (none in these tests → no change
to asserted properties).
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.ScadaBridge.Host/LoggerConfigurationFactory.cs
git commit -m "feat(scadabridge): add shared TraceContextEnricher to log pipeline (trace correlation)"
```
---
## Task 7: MxAccessGateway — distribution wiring (source mapping + package refs)
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 4 (other repos)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/nuget.config`
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj` (NO CPM — direct versioned refs)
**Step 1: Branch**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway && git checkout main && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
```
**Step 2: Add Telemetry patterns to `nuget.config`** under `<packageSource key="dohertj2-gitea">`:
```xml
<package pattern="ZB.MOM.WW.Telemetry" />
<package pattern="ZB.MOM.WW.Telemetry.*" />
```
**Step 3: Add direct versioned refs to the Server csproj** (in the main `<ItemGroup>` of `<PackageReference>`s). MxGateway has no Serilog/OTel today, so it needs the shared packages AND the concrete sink assemblies referenced by the `appsettings` `Using` block:
```xml
<PackageReference Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
<PackageReference Include="Serilog.AspNetCore" Version="10.0.0" />
<PackageReference Include="Serilog.Sinks.Console" Version="6.1.1" />
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0" />
```
> Versions align with ScadaBridge's pins (Serilog.AspNetCore 10.0.0, Console 6.1.1, File 7.0.0). If
> the `.Serilog` package requires a different `Serilog.AspNetCore` floor, match it.
**Step 4: Restore + build**
```bash
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
```
Expected: PASS (packages resolve from Gitea + nuget.org).
**Step 5: Commit**
```bash
git add nuget.config src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
git commit -m "build(mxgateway): reference ZB.MOM.WW.Telemetry + Serilog packages"
```
---
## Task 8: MxAccessGateway — migrate appsettings Logging → Serilog section
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/appsettings.json`
**Context:** Current `Logging` (MEL) section: `Default: Information`, `Microsoft.AspNetCore: Warning`.
`AddZbSerilog` reads sinks/levels via `ReadFrom.Configuration` from a `Serilog` section. Translate
the levels and add Console + File sinks so logging output is preserved after the provider swap.
**Step 1: Replace the `Logging` block with a `Serilog` block.** Remove:
```json
"Logging": {
"LogLevel": { "Default": "Information", "Microsoft.AspNetCore": "Warning" }
},
```
Add:
```json
"Serilog": {
"Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
"MinimumLevel": {
"Default": "Information",
"Override": { "Microsoft.AspNetCore": "Warning" }
},
"WriteTo": [
{ "Name": "Console" },
{ "Name": "File", "Args": { "path": "logs/mxgateway-.log", "rollingInterval": "Day" } }
]
},
```
> Keep the rest of `appsettings.json` (gateway config) unchanged. Note: `AddZbSerilog` applies its
> own `MinimumLevel.Is(Information)` before `ReadFrom.Configuration`, so the `Serilog:MinimumLevel`
> block above is applied last and wins: the default level is Information and `Microsoft.AspNetCore`
> is overridden to Warning, matching today's MEL levels.
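The two levels in this file happen to share names in both systems, but the vocabularies are not identical, which matters if other `LogLevel` entries are translated later. A small sketch of the name mapping follows; `mel_to_serilog_level` is a hypothetical helper, not part of either library.

```bash
# Hypothetical helper: translate a Microsoft.Extensions.Logging level name to Serilog's.
mel_to_serilog_level() {
  case "$1" in
    Trace) echo "Verbose" ;;                          # MEL's lowest level is Serilog's Verbose
    Debug|Information|Warning|Error) echo "$1" ;;     # same name on both sides
    Critical) echo "Fatal" ;;                         # MEL's highest level is Serilog's Fatal
    *) echo "no Serilog level for MEL value: $1" >&2; return 1 ;;  # e.g. None
  esac
}
```

MEL's `None` has no Serilog level of the same kind, so the sketch reports it instead of guessing.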
**Step 2: Commit** (config-only; build happens in Task 9 once the provider is wired)
```bash
git add src/ZB.MOM.WW.MxGateway.Server/appsettings.json
git commit -m "config(mxgateway): translate MEL Logging section to Serilog"
```
---
## Task 9: MxAccessGateway — wire AddZbSerilog (MEL → Serilog provider swap)
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (`CreateBuilder`, after `ConfigureSelfSignedTls(builder)` ~line 63)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs` (add a provider-swap assertion)
**Context (high-risk — logging on the most operational app):** Register Serilog as the host's
logging provider so all existing MEL `ILogger`/`ILoggerFactory` calls (including
`UseGatewayRequestLoggingScope`'s middleware) route through Serilog. The Serilog MEL provider
captures MEL `BeginScope` dictionaries as structured properties, so `GatewayLogScope` and the
request-logging middleware keep working unchanged. The temporary `LoggerFactory.Create(...AddConsole())`
at lines 96-100 (used only by the TLS cert provider) may remain as-is.
**Step 1: Add the failing test** in `GatewayApplicationTests.cs` — assert the logger factory is now Serilog-backed:
```csharp
[Fact]
public void Build_UsesSerilogLoggerProvider()
{
using var app = GatewayApplication.Build([]);
var factory = app.Services.GetRequiredService<ILoggerFactory>();
// Serilog.Extensions.Hosting registers SerilogLoggerFactory when AddSerilog replaces the factory.
Assert.Equal("SerilogLoggerFactory", factory.GetType().Name);
}
```
**Step 2: Run it — expect FAIL** (`dotnet test ... --filter Build_UsesSerilogLoggerProvider`) → today the factory is the default MEL `LoggerFactory`.
**Step 3: Wire `AddZbSerilog`.** In `GatewayApplication.CreateBuilder`, immediately after
`ConfigureSelfSignedTls(builder);`, add:
```csharp
builder.AddZbSerilog(o => o.ServiceName = "mxgateway");
```
Add `using ZB.MOM.WW.Telemetry.Serilog;`. (`AddZbSerilog` calls `services.AddSerilog(..., preserveStaticLogger: true)`,
which registers `SerilogLoggerFactory` — replacing the MEL factory, so default providers do not
double-log.)
**Step 4: Run the test — expect PASS**, then run the broader logging-adjacent suites:
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayApplicationTests"
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~FakeWorker"
```
Expected: PASS — `Build_MapsCanonicalHealthEndpoints`, `Build_RegistersGatewayMetrics`, the
config-validation cases, and the fake-worker smoke all stay green; the new provider-swap test passes.
**Step 5: Verify no double console logging** — if `SerilogLoggerFactory` is confirmed in Step 4, the
default providers are bypassed and no extra step is needed. If you observe duplicated console lines
in any manual run, add `builder.Logging.ClearProviders();` immediately before `AddZbSerilog`.
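If a manual run is captured to a file, duplication can be spotted by counting how many lines carry one known message. The sketch assumes the two providers would format the same event differently, so it matches on a message fragment rather than on identical lines; `count_lines_containing` and the sample message are hypothetical.

```bash
# Hypothetical check: how many lines on stdin contain the given message fragment?
# "|| true" keeps the exit status at 0 when grep finds nothing (it still prints 0).
count_lines_containing() {
  local fragment="$1"
  grep -cF -- "$fragment" || true
}

# Usage sketch: a message logged once should be counted once.
#   count_lines_containing "Gateway started" < console-capture.log
```

A count of 2 for a message that is logged once would suggest that a default provider is still attached.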
**Step 6: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
git commit -m "feat(mxgateway): adopt AddZbSerilog — MEL→Serilog provider swap (behaviour-preserving)"
```
---
## Task 10: MxAccessGateway — wrap GatewayLogRedactor behind the ILogRedactor seam
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Create: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorSeam.cs`
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (register the seam in DI in `CreateBuilder`)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs`
**Context:** The shared `RedactionEnricher` applies any DI-registered `ILogRedactor` to every log
event before it reaches a sink. MxGateway's redaction lives in the static `GatewayLogRedactor`
(API-key Bearer tokens, client identity). Provide a thin `ILogRedactor` that redacts the relevant
log-event properties (`ClientIdentity`, `authorization`) via the existing static helper. Keep
`GatewayLogRedactor` for its current callers (`GatewayLogScope`, `DashboardRedactor`).
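The masking shape that the Step 1 test expects can be illustrated with a one-line `sed` filter. This is a sketch under an assumption: that keys look like `mxgw_<id>_<secret>`, as in the test input. The real rules live in `GatewayLogRedactor` and may cover more cases; `redact_api_key` is a hypothetical name.

```bash
# Hypothetical illustration of the masking shape, not the gateway's actual redactor.
# Keeps the "mxgw_<id>_" prefix and replaces the rest of the token with "[redacted]".
redact_api_key() {
  sed -E 's/(mxgw_[^_[:space:]]+_)[^[:space:]]+/\1[redacted]/g'
}

# Usage sketch:
#   printf 'Bearer mxgw_operator01_super-secret\n' | redact_api_key
```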
**Step 1: Write the failing test** (`GatewayLogRedactorSeamTests.cs`):
```csharp
using System.Collections.Generic;
using ZB.MOM.WW.MxGateway.Server.Diagnostics;
using Xunit;

public class GatewayLogRedactorSeamTests
{
    [Fact]
    public void Redact_MasksApiKeyInClientIdentity()
    {
        var redactor = new GatewayLogRedactorSeam();
        var props = new Dictionary<string, object?>
        {
            ["ClientIdentity"] = "Bearer mxgw_operator01_super-secret"
        };

        redactor.Redact(props);

        Assert.Equal("Bearer mxgw_operator01_[redacted]", props["ClientIdentity"]);
    }
}
```
**Step 2: Run it — expect FAIL** (type doesn't exist).
**Step 3: Implement `GatewayLogRedactorSeam.cs`:**
```csharp
using System;
using System.Collections.Generic;
using ZB.MOM.WW.Telemetry.Serilog;

namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;

/// <summary>
/// Adapts the static <see cref="GatewayLogRedactor"/> to the shared <see cref="ILogRedactor"/> seam
/// so the telemetry RedactionEnricher masks API-key/credential material on every log event.
/// </summary>
public sealed class GatewayLogRedactorSeam : ILogRedactor
{
    // Log-event property names that may carry API-key or credential material.
    private static readonly string[] IdentityKeys = ["ClientIdentity", "authorization", "Authorization"];

    public void Redact(IDictionary<string, object?> properties)
    {
        ArgumentNullException.ThrowIfNull(properties);

        foreach (var key in IdentityKeys)
        {
            // Only string values are rewritten; other value types pass through untouched.
            if (properties.TryGetValue(key, out var value) && value is string s)
            {
                properties[key] = GatewayLogRedactor.RedactClientIdentity(s);
            }
        }
    }
}
```
**Step 4: Register in DI.** In `GatewayApplication.CreateBuilder`, alongside the other singletons, add:
```csharp
builder.Services.AddSingleton<ZB.MOM.WW.Telemetry.Serilog.ILogRedactor, Diagnostics.GatewayLogRedactorSeam>();
```
**Step 5: Run the test + build**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayLogRedactorSeamTests"
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
```
Expected: PASS.
**Step 6: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorSeam.cs src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs
git commit -m "feat(mxgateway): expose GatewayLogRedactor via shared ILogRedactor seam"
```
---
## Task 11: MxAccessGateway — wire AddZbTelemetry (export GatewayMetrics) + MapZbMetrics
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (`CreateBuilder` after `AddSingleton<GatewayMetrics>()` ~line 72; `MapGatewayEndpoints` after `MapZbHealth()` ~line 177)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs` (add `/metrics`-served assertion) + existing `GatewayMetricsTests` as oracle
**Context:** The `MxGateway.Server` meter (13 counters, 3 ms-histograms, 4 gauges) exists but is
never exported (no OTel SDK, no `/metrics`). `AddZbTelemetry` with `Meters = ["MxGateway.Server"]`
registers the meter with the OTel MeterProvider + Prometheus exporter; `MapZbMetrics()` mounts
`/metrics`. **Keep the `MxGateway.Server` name and the `ms` histogram units** (rename #7 + unit #6
are deferred). `GetSnapshot()` is untouched.
**Step 1: Add `AddZbTelemetry` in `CreateBuilder`**, immediately after `builder.Services.AddSingleton<GatewayMetrics>();`:
```csharp
builder.AddZbTelemetry(o =>
{
    o.ServiceName = "mxgateway";
    o.Meters = [GatewayMetrics.MeterName]; // "MxGateway.Server", unchanged (rename deferred)
});
```
Add `using ZB.MOM.WW.Telemetry;`.
**Step 2: Map `/metrics` in `MapGatewayEndpoints`**, after `endpoints.MapZbHealth();`:
```csharp
endpoints.MapZbMetrics();
```
**Step 3: Add the served-endpoint test** in `GatewayApplicationTests.cs`:
```csharp
// Needs System.Linq, System.Net, and System.Net.Http in scope (add explicit usings if implicit usings are off).
[Fact]
public async Task Build_MapsMetricsEndpoint()
{
    using var app = GatewayApplication.Build([]);
    await app.StartAsync();
    try
    {
        using var client = new HttpClient { BaseAddress = new Uri(app.Urls.First()) };
        var response = await client.GetAsync("/metrics");
        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    }
    finally
    {
        await app.StopAsync();
    }
}
```
> If the existing test class already has a started-host helper (the config-validation tests call
> `StartAsync`), reuse it rather than starting a fresh host. Tests bind ephemeral ports (`:0`).
**Step 4: Build + test**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayApplicationTests|FullyQualifiedName~GatewayMetricsTests"
```
Expected: PASS — the `MeterListener`-based `GatewayMetricsTests` (Tests-027 isolation) stay green
because the meter name/instruments are unchanged; the new `/metrics` test passes.
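For a manual spot check beyond the status-code assertion, the exported metric families can be listed from the endpoint's response. The sketch below assumes `/metrics` returns the standard Prometheus text exposition format with `# TYPE` lines; `metric_families` is a hypothetical helper, not part of the plan's test suite.
```bash
# Hypothetical helper: print "<name> <type>" for each "# TYPE <name> <type>"
# line of Prometheus text exposition read from stdin.
metric_families() {
  awk '$1 == "#" && $2 == "TYPE" { print $3, $4 }'
}
```
With the gateway running locally, `curl -s http://localhost:<port>/metrics | metric_families` lists what the exporter serves; `<port>` is a placeholder for whatever port the manual run binds.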
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
git commit -m "feat(mxgateway): export GatewayMetrics via AddZbTelemetry + /metrics (name/units unchanged)"
```
---
## Task 12: scadaproj — bookkeeping (GAPS + correct the false "MxGateway logging adopted" claim)
**Classification:** trivial
**Estimated implement time:** ~4 min
**Parallelizable with:** none (runs after all repo phases)
**Files:**
- Modify: `/Users/dohertj2/Desktop/scadaproj/components/observability/GAPS.md` (add "Adoption status — 2026-06-01 (DONE)" section)
- Modify: `/Users/dohertj2/Desktop/scadaproj/components/observability/README.md` (correct the "MxGateway logging adopted" claim)
- Modify: `/Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry/CLAUDE.md` (same correction)
- Modify: `/Users/dohertj2/Desktop/scadaproj/CLAUDE.md` (observability row + "MxAccessGateway logging adopted" note)
**Step 1: Add an adoption-status section to `GAPS.md`** with a per-repo table (what each app now
does), the **accepted scope note** (ScadaBridge keeps `LoggerConfigurationFactory` and adds
`TraceContextEnricher` rather than adopting `AddZbSerilog`; MxGateway keeps `GatewayLogScope`), and a
**Deferred** subsection listing #6 (histogram ms→s), #7 (meter rename), #9 (ScadaBridge app
instruments), and #10/#11 (OTLP) as still open.
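**Step 2: Correct the false claim** everywhere it appears. The prior text said MxGateway's MEL→Serilog
migration was "done on its own branch." Replace it with: "MxGateway MEL→Serilog migration + metrics
export landed on `main` via the 2026-06-01 telemetry adoption (branch `feat/adopt-zb-telemetry`)."
Use this wording only once the branch has actually merged to `main`; until then, state that the work
is committed on `feat/adopt-zb-telemetry`, so the correction does not introduce a new inaccurate claim.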
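Because the claim appears in several files, a quick search helps confirm that none were missed. A minimal sketch: `find_stale_claim` is a hypothetical helper, and the search phrase is an assumption taken from the quoted prior text, so adjust it to the exact wording found in the files. The match is line-based, so a phrase wrapped across two lines will not be found.
```bash
# Hypothetical helper: print the files that still contain a phrase
# (fixed-string, case-insensitive match).
# $1 = phrase to search for; remaining arguments = files to check.
find_stale_claim() {
  local phrase=$1
  shift
  grep -liF -- "$phrase" "$@"
}
```
Run from the `scadaproj` root, `find_stale_claim "done on its own branch" components/observability/README.md ZB.MOM.WW.Telemetry/CLAUDE.md CLAUDE.md` prints any file that still carries the old wording, and prints nothing (with a non-zero exit status) once all are corrected.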
**Step 3: Commit**
```bash
cd /Users/dohertj2/Desktop/scadaproj
git add components/observability/GAPS.md components/observability/README.md ZB.MOM.WW.Telemetry/CLAUDE.md CLAUDE.md
git commit -m "docs(observability): record ZB.MOM.WW.Telemetry adoption across 3 apps; correct MxGateway logging-status claim"
```
---
## Acceptance checklist (whole plan)
- [ ] Both Telemetry packages resolve from the Gitea feed (Task 0 verified `200`).
- [ ] OtOpcUa: builds; `OtOpcUaTelemetryHookTests` + `LogContextEnricherTests` green; `/metrics` still served; meter `ZB.MOM.WW.OtOpcUa` unchanged.
- [ ] ScadaBridge: builds; composition-root + health + new metrics tests green; `/metrics` served in both roles; `LoggerConfigurationFactory` governance intact.
- [ ] MxGateway: builds; `GatewayApplicationTests` + `GatewayMetricsTests` + fake-worker smoke green; logger is Serilog-backed; redaction applied via seam; `/metrics` served; `MxGateway.Server` name + `ms` units unchanged.
- [ ] No secrets committed to any repo (token stays in `~/.nuget/NuGet/NuGet.Config`).
- [ ] `components/observability/GAPS.md` updated; the false "MxGateway logging adopted" claim corrected.
- [ ] All three feature branches committed (one commit per task), no hooks skipped, no force-push.
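The no-secrets item can be spot-checked mechanically before the final commits. A rough sketch, assuming a leaked token would show up as a `packageSourceCredentials` section in a repo-level NuGet config file; `find_nuget_credentials` is a hypothetical helper and does not replace a review of the staged diff.
```bash
# Hypothetical helper: list NuGet config files under a repo root that contain
# a packageSourceCredentials section (case-insensitive match).
# $1 = repo root. Prints nothing when no such file exists.
find_nuget_credentials() {
  find "$1" -type f -iname 'nuget.config' ! -path '*/.git/*' \
    -exec grep -il 'packageSourceCredentials' {} +
}
```
For example, `find_nuget_credentials /Users/dohertj2/Desktop/MxAccessGateway` should print nothing; repeat for the other two repo roots.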
@@ -0,0 +1,20 @@
{
  "planPath": "docs/plans/2026-06-01-telemetry-library-adoption.md",
  "tasks": [
    {"id": 0, "taskId": 23, "subject": "Task 0: Publish/verify Telemetry packages on Gitea", "status": "pending", "classification": "small"},
    {"id": 1, "taskId": 24, "subject": "Task 1: OtOpcUa — distribution wiring", "status": "pending", "classification": "small", "blockedBy": [0]},
    {"id": 2, "taskId": 25, "subject": "Task 2: OtOpcUa — swap OTel to AddZbTelemetry", "status": "pending", "classification": "standard", "blockedBy": [1]},
    {"id": 3, "taskId": 26, "subject": "Task 3: OtOpcUa — swap Serilog to AddZbSerilog", "status": "pending", "classification": "standard", "blockedBy": [2]},
    {"id": 4, "taskId": 27, "subject": "Task 4: ScadaBridge — distribution wiring", "status": "pending", "classification": "small", "blockedBy": [0]},
    {"id": 5, "taskId": 28, "subject": "Task 5: ScadaBridge — AddZbTelemetry both roots + MapZbMetrics", "status": "pending", "classification": "standard", "blockedBy": [4]},
    {"id": 6, "taskId": 29, "subject": "Task 6: ScadaBridge — TraceContextEnricher in LoggerConfigurationFactory", "status": "pending", "classification": "small", "blockedBy": [5]},
    {"id": 7, "taskId": 30, "subject": "Task 7: MxAccessGateway — distribution wiring", "status": "pending", "classification": "small", "blockedBy": [0]},
    {"id": 8, "taskId": 31, "subject": "Task 8: MxAccessGateway — appsettings Logging → Serilog", "status": "pending", "classification": "small", "blockedBy": [7]},
    {"id": 9, "taskId": 32, "subject": "Task 9: MxAccessGateway — AddZbSerilog (MEL→Serilog provider swap)", "status": "pending", "classification": "high-risk", "blockedBy": [8]},
    {"id": 10, "taskId": 33, "subject": "Task 10: MxAccessGateway — ILogRedactor seam", "status": "pending", "classification": "standard", "blockedBy": [9]},
    {"id": 11, "taskId": 34, "subject": "Task 11: MxAccessGateway — AddZbTelemetry metrics export + MapZbMetrics", "status": "pending", "classification": "standard", "blockedBy": [10]},
    {"id": 12, "taskId": 35, "subject": "Task 12: scadaproj — bookkeeping + correct false claim", "status": "pending", "classification": "trivial", "blockedBy": [3, 6, 11]}
  ],
  "notes": "Task 0 gates all. After Task 0 the three repo phases (OtOpcUa 1-3, ScadaBridge 4-6, MxGateway 7-11) are independent and may run concurrently across their separate working directories; within a repo tasks are sequential. Task 12 last.",
  "lastUpdated": "2026-06-01"
}