scadaproj/docs/plans/2026-06-01-telemetry-library-adoption.md
Joseph Doherty 30425726d4 docs: implementation plan for ZB.MOM.WW.Telemetry adoption across the 3 sister apps
13 tasks: Task 0 publishes/verifies the 2 nupkgs on Gitea (gates all); then 3
independent per-repo phases — OtOpcUa (1-3), ScadaBridge (4-6), MxGateway (7-11,
incl. the high-risk MEL->Serilog swap) — and Task 12 scadaproj bookkeeping last.
Records two behaviour-preserving refinements vs the design: ScadaBridge keeps
LoggerConfigurationFactory (+TraceContextEnricher) instead of AddZbSerilog, and
MxGateway keeps GatewayLogScope as-is. Breaking items #6/#7 deferred.
2026-06-01 15:24:28 -04:00


# ZB.MOM.WW.Telemetry Adoption Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.
**Goal:** Adopt the shared `ZB.MOM.WW.Telemetry` + `ZB.MOM.WW.Telemetry.Serilog` packages across OtOpcUa, MxAccessGateway, and ScadaBridge — giving all three the OTel Resource identity triple, standard instrumentation, Prometheus `/metrics`, and shared Serilog correlation — behaviour-preserving, with breaking items deferred.
**Architecture:** Gitea-registry distribution (`dohertj2-gitea`, creds-only at user level). Each app references the shared packages and swaps its bespoke wiring for `AddZbTelemetry` / `AddZbSerilog`, keeping existing meter names, units, log messages, and the `/metrics` path. Each sister repo is its own git repo; work happens on branch `feat/adopt-zb-telemetry`, one commit per task, **never skip hooks, never force-push.**
**Tech Stack:** .NET 10, OpenTelemetry SDK, Prometheus exporter, Serilog, NuGet Central Package Management (OtOpcUa + ScadaBridge; MxGateway has none).
**Source design:** [`2026-06-01-telemetry-library-adoption-design.md`](2026-06-01-telemetry-library-adoption-design.md)
---
## Two refinements discovered during planning (deviations from the design doc)
Both serve the approved **behaviour-preserving** acceptance bar:
1. **ScadaBridge logging — KEEP `LoggerConfigurationFactory`.** The design doc said "delete the
factory and swap to `AddZbSerilog`." Code review showed the factory implements a documented
governance contract (REQ-HOST-8 / Host-011/014/020/022): `ScadaBridge:Logging:MinimumLevel` is
the floor and **overrides** `Serilog:MinimumLevel`, with operator warnings when both are set or
a level is mistyped. `AddZbSerilog` hard-codes `MinimumLevel.Is(Information)` *before*
`ReadFrom.Configuration`, which inverts that precedence and silently drops the
`ScadaBridge:Logging:MinimumLevel` knob (and breaks its tests). **Plan: keep the factory, add the
shared `TraceContextEnricher` to it** (gaining trace↔log correlation) and do NOT adopt
`AddZbSerilog` for ScadaBridge. ScadaBridge still fully adopts the metrics/Resource half.
2. **MxGateway logging — keep `GatewayLogScope` + request-logging middleware as-is.** The Serilog
MEL provider captures MEL `BeginScope` dictionaries as structured properties, so the existing
middleware keeps producing the same scope properties once Serilog is the provider. The only
logging code changes are: register Serilog as the provider (`AddZbSerilog`), migrate the
`appsettings` `Logging` section to a `Serilog` section, and wrap the static `GatewayLogRedactor`
behind the `ILogRedactor` seam. No rewrite of working scope code.
---
## Execution order & parallelism
- **Task 0 gates everything** (packages must be on the feed before any repo can restore).
- After Task 0, the **three repo phases are independent** (separate working directories) and may run
concurrently: OtOpcUa (Tasks 1-3), ScadaBridge (Tasks 4-6), MxGateway (Tasks 7-11).
- **Within a repo, tasks are sequential** (same working tree / same branch — do not dispatch two
implementers against one repo concurrently).
- **Task 12** (scadaproj bookkeeping) runs last, after all three phases land.
Branch setup (first task in each repo creates it): `git checkout -b feat/adopt-zb-telemetry` from the
repo's default branch (`master` for OtOpcUa, `main` for the others).
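
The ordering rules above can be sketched as a small driver script. This is an illustration only: `run_phase` and `run_plan` are hypothetical names, and each phase is reduced to a single log line standing in for "run that repo's tasks sequentially".

```bash
# Sketch of the plan's ordering: Task 0 gates, three repo phases fan out, Task 12 runs last.
# run_phase is a hypothetical stand-in for "execute these tasks sequentially in that repo".
run_phase() {
  local repo="$1" tasks="$2" log="$3"
  # Tasks inside one repo stay sequential (one working tree, one branch).
  echo "${repo}: tasks ${tasks}" >> "$log"
}

run_plan() {
  local log="$1"
  echo "task 0: packages verified on feed" >> "$log"   # the gate

  # Separate working directories, so the three phases may run concurrently.
  run_phase OtOpcUa     "1-3"  "$log" &
  run_phase ScadaBridge "4-6"  "$log" &
  run_phase MxGateway   "7-11" "$log" &
  wait   # block until all three phases have finished

  echo "task 12: scadaproj bookkeeping" >> "$log"
}
```

Calling `run_plan /tmp/plan.log` writes the gate line first and the bookkeeping line last; the three phase lines in between may appear in any order.
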
---
## Task 0: Publish/verify Telemetry packages on the Gitea feed
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** none (gates all)
**Files:**
- Work in: `/Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry/`
- No repo files edited (publish only). Credentials already at `~/.nuget/NuGet/NuGet.Config`.
**Context:** The library CLAUDE.md claims these are "published to the Gitea NuGet feed." The Health
round proved that claim unreliable. Verify; pack + push only if missing. Mirrors Health Task 0.
**Step 1: Check whether `ZB.MOM.WW.Telemetry` 0.1.0 is already on the feed**
```bash
cd /Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry
# Use the user-level creds (source name dohertj2-gitea) already configured.
dotnet nuget list source # confirm dohertj2-gitea is NOT registered globally (creds are user-level only)
curl -s -u "dohertj2:$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')" \
"https://gitea.dohertylan.com/api/packages/dohertj2/nuget/registration/zb.mom.ww.telemetry/index.json" -o /tmp/tele.json -w "%{http_code}\n"
```
Expected: `200` if already published (then SKIP to Step 4), `404` if missing (continue).
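
The branch on the status code can be made explicit with a small helper. This is a sketch: the function name and the returned labels are hypothetical, and the catch-all branch is an assumption, since the plan only names `200` and `404`.

```bash
# Map the registration-index HTTP status to the next action in Task 0.
next_step_for_status() {
  case "$1" in
    200) echo "skip-to-step-4" ;;        # already published: verify only
    404) echo "pack-and-push" ;;         # missing: continue with Steps 2-3
    *)   echo "stop-and-investigate" ;;  # assumption: e.g. 401 bad creds or a feed outage
  esac
}
```

For example, `next_step_for_status 404` prints `pack-and-push`.
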
**Step 2: Pack the two packages (only if missing)**
```bash
dotnet pack ZB.MOM.WW.Telemetry.slnx -c Release -o ./artifacts
ls ./artifacts/*.nupkg
```
Expected: `ZB.MOM.WW.Telemetry.0.1.0.nupkg` and `ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg`.
**Step 3: Push both to Gitea (only if missing)**
```bash
TOKEN=$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')
for pkg in ./artifacts/ZB.MOM.WW.Telemetry.0.1.0.nupkg ./artifacts/ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg; do
  dotnet nuget push "$pkg" --source "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" --api-key "$TOKEN"
done
```
Expected: `Your package was pushed.` for each (or `409 Conflict` if a version already exists — acceptable).
**Step 4: Verify both ids resolve**
```bash
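# Re-derive the token here: Step 3 (which sets TOKEN) is skipped when Step 1 returned 200.
TOKEN=$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')
for id in zb.mom.ww.telemetry zb.mom.ww.telemetry.serilog; do
  curl -s -u "dohertj2:$TOKEN" "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/registration/$id/index.json" -w " -> %{http_code}\n" -o /dev/null
done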
```
Expected: `-> 200` for both.
**Step 5: No commit** (publish-only task). Record completion.
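
Because the token is read at runtime from the user-level config, a quick leak check before any later commit can be useful. The sketch below reuses the same grep/sed pipeline as Steps 1 and 3; `read_feed_token` and `assert_token_not_in_tree` are hypothetical helper names, and the pipeline assumes the `ClearTextPassword` entry sits within two lines of the `dohertj2-gitea` element.

```bash
# Read the feed token from a NuGet.Config using the same pipeline as Steps 1 and 3.
read_feed_token() {
  local config="$1"
  grep -A2 dohertj2-gitea "$config" \
    | grep ClearTextPassword \
    | sed -E 's/.*value="([^"]+)".*/\1/'
}

# Return non-zero if the token appears in any file under the given directory.
assert_token_not_in_tree() {
  local dir="$1" token="$2"
  # An empty token would match every line, so treat it as a usage error.
  [ -n "$token" ] || return 2
  if grep -rFq -- "$token" "$dir"; then
    echo "token found under $dir" >&2
    return 1
  fi
  return 0
}
```

A possible use is `assert_token_not_in_tree . "$(read_feed_token ~/.nuget/NuGet/NuGet.Config)"` from a repo root before committing.
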
> **SECURITY:** the Gitea token must NEVER be written into any repo file or commit. It lives only in
> `~/.nuget/NuGet/NuGet.Config`. The `curl`/`push` commands read it from there at runtime.
---
## Task 1: OtOpcUa — distribution wiring (source mapping + package refs)
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 4, Task 7 (other repos)
**Files:**
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/NuGet.config`
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/Directory.Packages.props`
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj`
**Step 1: Branch**
```bash
cd /Users/dohertj2/Desktop/OtOpcUa && git checkout master && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
```
**Step 2: Add Telemetry patterns to `NuGet.config`** — under `<packageSource key="dohertj2-gitea">`, add BOTH patterns (the `.*` glob does NOT match the bare core id):
```xml
<packageSource key="dohertj2-gitea">
  <package pattern="ZB.MOM.WW.Health" />
  <package pattern="ZB.MOM.WW.Health.*" />
  <package pattern="ZB.MOM.WW.Telemetry" />
  <package pattern="ZB.MOM.WW.Telemetry.*" />
</packageSource>
```
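
Why both patterns are needed can be illustrated with shell globbing as a stand-in for the source-mapping match. This is an analogy only, not NuGet's matcher (real matching is also case-insensitive and prefers the most specific pattern); `matches_any_pattern` is a hypothetical helper.

```bash
# Shell globbing as a stand-in for package source mapping patterns:
# a trailing ".*" requires the dot, so it cannot match the bare core id.
matches_any_pattern() {
  local id="$1"
  shift
  local pattern
  for pattern in "$@"; do
    case "$id" in
      # Unquoted $pattern in a case arm is matched as a glob.
      $pattern) return 0 ;;
    esac
  done
  return 1
}
```

With only `ZB.MOM.WW.Telemetry.*` supplied, the id `ZB.MOM.WW.Telemetry.Serilog` matches but `ZB.MOM.WW.Telemetry` does not; supplying both patterns covers both ids.
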
**Step 3: Add versions to `Directory.Packages.props`** (next to the Health `<PackageVersion>` lines):
```xml
<PackageVersion Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
```
**Step 4: Add versionless refs to the Host csproj** (next to the `ZB.MOM.WW.Health` refs):
```xml
<PackageReference Include="ZB.MOM.WW.Telemetry" />
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" />
```
**Step 5: Restore + build to confirm the Gitea feed resolves and Serilog floor is satisfied**
```bash
dotnet restore ZB.MOM.WW.OtOpcUa.slnx
dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
```
Expected: restore pulls both packages from `dohertj2-gitea`; build succeeds. If restore fails on a
`Serilog.AspNetCore` floor (OtOpcUa pins 9.0.0), bump `Serilog.AspNetCore` (and the related
`Serilog.*` 9.x lines) in `Directory.Packages.props` to the floor the package requires, then rebuild.
**Step 6: Commit**
```bash
git add NuGet.config Directory.Packages.props src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj
git commit -m "build(otopcua): reference ZB.MOM.WW.Telemetry packages from Gitea feed"
```
---
## Task 2: OtOpcUa — swap OTel wiring to AddZbTelemetry
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within OtOpcUa)
**Files:**
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs` (rewrite body; keep both method names + signatures)
- Test (oracle, do not edit): `/Users/dohertj2/Desktop/OtOpcUa/tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Observability/OtOpcUaTelemetryHookTests.cs`
**Context:** Today `AddOtOpcUaObservability()` (called at `Program.cs:138`) hand-wires
`AddOpenTelemetry().WithMetrics(...AddMeter("ZB.MOM.WW.OtOpcUa")...AddPrometheusExporter()).WithTracing(...AddSource("ZB.MOM.WW.OtOpcUa"))`,
and `MapOtOpcUaMetrics()` (called at `Program.cs:160`) maps `/metrics`. Keep both call sites
unchanged; rewrite the extension bodies to delegate to the shared library. **Same meter/source
names + same `/metrics` path** ⇒ behaviour-preserving; gains the Resource identity triple +
standard instrumentation.
**Step 1: Rewrite `ObservabilityExtensions.cs`** preserving the two public method signatures:
```csharp
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.OtOpcUa.Commons.Observability; // OtOpcUaTelemetry
using ZB.MOM.WW.Telemetry;

namespace ZB.MOM.WW.OtOpcUa.Host.Observability;

/// <summary>
/// OtOpcUa observability wiring, delegated to the shared ZB.MOM.WW.Telemetry library.
/// Keeps the existing meter/ActivitySource names ("ZB.MOM.WW.OtOpcUa") and the "/metrics"
/// scrape path, and adds the shared OTel Resource + standard instrumentation.
/// </summary>
public static class ObservabilityExtensions
{
    public static IServiceCollection AddOtOpcUaObservability(this IServiceCollection services)
    {
        ArgumentNullException.ThrowIfNull(services);
        return services.AddZbTelemetry(o =>
        {
            o.ServiceName = "otopcua";
            o.Meters = [OtOpcUaTelemetry.MeterName]; // "ZB.MOM.WW.OtOpcUa"
            o.ActivitySources = [OtOpcUaTelemetry.ActivitySourceName]; // "ZB.MOM.WW.OtOpcUa"
            // Exporter defaults to Prometheus, which preserves the existing /metrics posture.
        });
    }

    // Keep the SAME signature the Program.cs:160 call site uses (app.MapOtOpcUaMetrics()).
    // MapZbMetrics() maps MapPrometheusScrapingEndpoint() whose default path is "/metrics".
    public static IEndpointRouteBuilder MapOtOpcUaMetrics(this IEndpointRouteBuilder endpoints)
    {
        ArgumentNullException.ThrowIfNull(endpoints);
        endpoints.MapZbMetrics();
        return endpoints;
    }
}
```
> If the existing `MapOtOpcUaMetrics` extends `WebApplication`/`IApplicationBuilder` rather than
> `IEndpointRouteBuilder`, keep THAT receiver type and call `app.MapZbMetrics();` — match the
> current signature so `Program.cs:160` compiles unchanged.
**Step 2: Build**
```bash
cd /Users/dohertj2/Desktop/OtOpcUa && dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
```
Expected: PASS. (The now-redundant direct `OpenTelemetry.Extensions.Hosting` /
`OpenTelemetry.Exporter.Prometheus.AspNetCore` refs may stay — they resolve the same assemblies the
shared package brings; leaving them is lower-risk than pruning.)
**Step 3: Run the telemetry hook tests (the behaviour oracle)**
```bash
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~OtOpcUaTelemetryHookTests"
```
Expected: PASS — the meter `ZB.MOM.WW.OtOpcUa` and ActivitySource still emit (the shared
`AddZbTelemetry` registered them via `o.Meters`/`o.ActivitySources`).
**Step 4: Commit**
```bash
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs
git commit -m "feat(otopcua): wire OTel via AddZbTelemetry (shared Resource + std instrumentation)"
```
---
## Task 3: OtOpcUa — swap Serilog to AddZbSerilog + move sinks to config
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (within OtOpcUa)
**Files:**
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs:49-52` (the inline `UseSerilog` block)
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json` (currently `{}`)
- Test (oracle): `/Users/dohertj2/Desktop/OtOpcUa/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Tests/Observability/LogContextEnricherTests.cs`
**Context:** Today `Program.cs:49-52` configures Serilog in code with `ReadFrom.Configuration` +
`WriteTo.Console()` + `WriteTo.File("logs/otopcua-.log", rollingInterval: Day)`. `AddZbSerilog` uses
`ReadFrom.Configuration` only, so the Console/File sinks must move into config to be reproduced. The
role-specific `appsettings.*.json` already carry `Serilog:MinimumLevel` overrides — those keep
working through `ReadFrom.Configuration`.
**Step 1: Add the sinks to `appsettings.json`** (replace the empty `{}`):
```json
{
  "Serilog": {
    "Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
    "WriteTo": [
      { "Name": "Console" },
      { "Name": "File", "Args": { "path": "logs/otopcua-.log", "rollingInterval": "Day" } }
    ]
  }
}
```
> Do NOT add `"Enrich": ["FromLogContext"]` unless it is already enabled today — adding it would
> newly surface driver-scope properties and change output. Preserve the current enrich set.
**Step 2: Replace the inline `UseSerilog` block in `Program.cs`.** Remove lines 49-52:
```csharp
builder.Host.UseSerilog((ctx, lc) => lc
    .ReadFrom.Configuration(ctx.Configuration)
    .WriteTo.Console()
    .WriteTo.File("logs/otopcua-.log", rollingInterval: RollingInterval.Day));
```
and replace with:
```csharp
builder.AddZbSerilog(o => o.ServiceName = "otopcua");
```
Add `using ZB.MOM.WW.Telemetry.Serilog;` to the `using` block. Keep `app.UseSerilogRequestLogging();`
(line 141) unchanged. Keep the existing `using Serilog;` if it is still referenced; `RollingInterval`
lives in that same namespace, so there is no separate import to remove.
**Step 3: Build + run the LogContextEnricher tests**
```bash
cd /Users/dohertj2/Desktop/OtOpcUa
dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~LogContextEnricherTests"
```
Expected: build PASS; tests PASS (the static `LogContextEnricher.Push` helper is unaffected — it is
not registered in DI and AddZbSerilog does not change its disposable contract).
**Step 4: Sanity-check that logs still emit** (no automated log-output harness here):
```bash
# Quick smoke: build runs; optionally run the host briefly in a role that doesn't need infra
# and confirm console log lines appear. If no safe role exists, rely on the build + the request-
# logging path remaining wired (UseSerilogRequestLogging at Program.cs:141).
```
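
For the manual smoke check, it helps to know which file the daily-rolling sink should have written. The helper below is a sketch under one assumption: that the File sink inserts a `yyyyMMdd` stamp immediately before the extension of the configured path. `rolled_log_path` is a hypothetical name.

```bash
# Derive the dated file name a daily-rolling File sink would write for a configured path.
# Assumption: the sink inserts yyyyMMdd immediately before the file extension.
rolled_log_path() {
  local path="$1" stamp="$2"
  local stem="${path%.*}"   # e.g. "logs/otopcua-"
  local ext="${path##*.}"   # e.g. "log"
  echo "${stem}${stamp}.${ext}"
}
```

After a brief host run, `ls -l "$(rolled_log_path logs/otopcua-.log "$(date +%Y%m%d)")"` should list today's file, relative to the host's working directory.
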
**Step 5: Commit**
```bash
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
git commit -m "feat(otopcua): adopt AddZbSerilog (shared enrichers + trace correlation); sinks to config"
```
---
## Task 4: ScadaBridge — distribution wiring (source mapping + package refs)
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 7 (other repos)
**Files:**
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/nuget.config`
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/Directory.Packages.props`
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj`
**Step 1: Branch**
```bash
cd /Users/dohertj2/Desktop/ScadaBridge && git checkout main && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
```
**Step 2: Add Telemetry patterns to `nuget.config`** under `<packageSource key="dohertj2-gitea">`:
```xml
<package pattern="ZB.MOM.WW.Telemetry" />
<package pattern="ZB.MOM.WW.Telemetry.*" />
```
**Step 3: Add versions to `Directory.Packages.props`** (next to the Health lines):
```xml
<PackageVersion Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
```
**Step 4: Add versionless refs to the Host csproj** (next to the Health refs):
```xml
<PackageReference Include="ZB.MOM.WW.Telemetry" />
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" />
```
> `ZB.MOM.WW.Telemetry.Serilog` is referenced here only for the public `TraceContextEnricher` type
> used in Task 6 — ScadaBridge does NOT call `AddZbSerilog`.
**Step 5: Restore + build** (watch for OTel version conflicts with the pinned `OpenTelemetry.Api 1.15.3`)
```bash
dotnet restore ZB.MOM.WW.ScadaBridge.slnx
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
```
Expected: PASS. If a transitive OTel version conflicts with the CVE-override `OpenTelemetry.Api`,
align the override version to what the shared package requires.
**Step 6: Commit**
```bash
git add nuget.config Directory.Packages.props src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj
git commit -m "build(scadabridge): reference ZB.MOM.WW.Telemetry packages from Gitea feed"
```
---
## Task 5: ScadaBridge — AddZbTelemetry in both composition roots + MapZbMetrics
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (within ScadaBridge)
**Files:**
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/SiteServiceRegistration.cs` (`BindSharedOptions`, ~lines 100-117 — add the registration; called by BOTH roots)
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/Program.cs` (Central endpoint section ~206-259; Site endpoint section ~307-320 — add `app.MapZbMetrics()` in each)
- Test: `/Users/dohertj2/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/` (add a `/metrics`-served assertion; HealthCheckTests pattern with `WebApplicationFactory<Program>`)
**Context:** ScadaBridge has NO OTel today (only the `OpenTelemetry.Api` CVE override). `SiteId`,
`NodeRole`, `NodeHostname` are available from config (`ScadaBridge:Node:*`). `BindSharedOptions` is
called by both the Central and Site roots, so registering telemetry there covers both without
duplication. This is purely additive (no metrics exist to break).
**Step 1: Register telemetry in `BindSharedOptions`.** Inside `SiteServiceRegistration.BindSharedOptions(IServiceCollection services, IConfiguration config)`, after the existing `services.Configure<...>` calls, add:
```csharp
// Shared OTel: Resource identity (service.name / site.id / node.role) + standard instrumentation
// + Prometheus exporter. Mounted at /metrics by app.MapZbMetrics() in each composition root.
services.AddZbTelemetry(o =>
{
    o.ServiceName = "scadabridge";
    o.SiteId = config["ScadaBridge:Node:SiteId"] ?? "central";
    o.NodeRole = config["ScadaBridge:Node:Role"];
    // o.Meters left empty: application instruments are a deferred follow-on (GAPS #9).
});
```
Add `using ZB.MOM.WW.Telemetry;`. (Use the SAME default `?? "central"` for SiteId that
`Program.cs:45` uses, so the Resource attribute matches the log enricher value.)
**Step 2: Map `/metrics` in BOTH roots.** In `Program.cs`:
- Central block — after `app.UseRouting()` and alongside the other `Map*` calls (e.g. just after `app.MapZbHealth();`), add:
```csharp
app.MapZbMetrics();
```
- Site block — in its endpoint section (where `app.MapGrpcService<...>()` is mapped, ~307-320), add:
```csharp
app.MapZbMetrics();
```
Add `using ZB.MOM.WW.Telemetry;` to `Program.cs` if not already present. `MapZbMetrics()` requires
routing; the Central block already calls `UseRouting()`, and the Site block's `MapGrpcService`
implies endpoint routing — if the Site app lacks `UseRouting()`, add it before `MapZbMetrics()`.
**Step 3: Add a `/metrics` integration test** in the Host.Tests project (mirror `HealthCheckTests`):
```csharp
[Fact]
public async Task Metrics_Endpoint_IsMapped()
{
    using var factory = /* existing WebApplicationFactory<Program> setup for Central role */;
    using var client = factory.CreateClient();
    var response = await client.GetAsync("/metrics");
    Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    var body = await response.Content.ReadAsStringAsync();
    Assert.Contains("# ", body); // Prometheus exposition format (HELP/TYPE comments)
}
```
> Reuse the exact `WebApplicationFactory<Program>` + in-memory config bootstrapping that
> `HealthCheckTests.cs` already uses for the Central role (it sets the env to "Central" and removes
> the Akka hosted service). Do not invent a new harness.
**Step 4: Build + test**
```bash
cd /Users/dohertj2/Desktop/ScadaBridge
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
dotnet test ZB.MOM.WW.ScadaBridge.slnx --filter "FullyQualifiedName~HealthCheckTests|FullyQualifiedName~Metrics_Endpoint_IsMapped|FullyQualifiedName~CompositionRoot"
```
Expected: PASS (existing composition-root + health tests stay green; new metrics test passes).
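
The same check can be made by hand against a running host. The sketch below is slightly stricter than the `"# "` assertion in the test, since it looks for a `# HELP` or `# TYPE` comment line; `looks_like_prometheus` is a hypothetical helper, and the host address in the usage note is a placeholder.

```bash
# Succeed if stdin looks like Prometheus text exposition output,
# i.e. it contains at least one "# HELP" or "# TYPE" comment line.
looks_like_prometheus() {
  grep -Eq '^# (HELP|TYPE) '
}
```

A typical use would be `curl -s http://localhost:5000/metrics | looks_like_prometheus && echo ok`, substituting the address the host actually listens on.
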
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.ScadaBridge.Host/SiteServiceRegistration.cs src/ZB.MOM.WW.ScadaBridge.Host/Program.cs tests/ZB.MOM.WW.ScadaBridge.Host.Tests/
git commit -m "feat(scadabridge): wire AddZbTelemetry + /metrics in both composition roots"
```
---
## Task 6: ScadaBridge — add shared TraceContextEnricher to LoggerConfigurationFactory
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (within ScadaBridge)
**Files:**
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/LoggerConfigurationFactory.cs` (the `Build` return expression)
- Test (oracle): `/Users/dohertj2/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/SerilogTests.cs` (+ any `LoggerConfigurationFactory` tests)
**Context (deviation from design doc — see top of plan):** KEEP `LoggerConfigurationFactory` intact
(it owns the Host-011/014/020/022 minimum-level governance). Only add the shared
`TraceContextEnricher` so logs emitted inside a span carry `trace_id`/`span_id` and can be joined to
traces. This gains the cross-cutting correlation win without regressing ScadaBridge's logging
contract.
**Step 1: Add the enricher to the `Build` return.** In `LoggerConfigurationFactory.Build(...)`, the
final expression currently ends:
```csharp
return new LoggerConfiguration()
    .ReadFrom.Configuration(configuration)
    .MinimumLevel.Is(minimumLevel)
    .Enrich.WithProperty("SiteId", siteId)
    .Enrich.WithProperty("NodeHostname", nodeHostname)
    .Enrich.WithProperty("NodeRole", nodeRole);
```
Add the shared enricher as the last `.Enrich`:
```csharp
    .Enrich.WithProperty("NodeRole", nodeRole)
    .Enrich.With(new ZB.MOM.WW.Telemetry.Serilog.TraceContextEnricher());
```
(Or add `using ZB.MOM.WW.Telemetry.Serilog;` and use `.Enrich.With(new TraceContextEnricher())`.)
**Step 2: Build + run the Serilog tests**
```bash
cd /Users/dohertj2/Desktop/ScadaBridge
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
dotnet test ZB.MOM.WW.ScadaBridge.slnx --filter "FullyQualifiedName~SerilogTests|FullyQualifiedName~LoggerConfiguration"
```
Expected: PASS. The three node-identity enrichers and the min-level governance are untouched;
`trace_id`/`span_id` only appear when an `Activity.Current` exists (none in these tests → no change
to asserted properties).
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.ScadaBridge.Host/LoggerConfigurationFactory.cs
git commit -m "feat(scadabridge): add shared TraceContextEnricher to log pipeline (trace correlation)"
```
---
## Task 7: MxAccessGateway — distribution wiring (source mapping + package refs)
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 4 (other repos)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/nuget.config`
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj` (NO CPM — direct versioned refs)
**Step 1: Branch**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway && git checkout main && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
```
**Step 2: Add Telemetry patterns to `nuget.config`** under `<packageSource key="dohertj2-gitea">`:
```xml
<package pattern="ZB.MOM.WW.Telemetry" />
<package pattern="ZB.MOM.WW.Telemetry.*" />
```
**Step 3: Add direct versioned refs to the Server csproj** (in the main `<ItemGroup>` of `<PackageReference>`s). MxGateway has no Serilog/OTel today, so it needs the shared packages AND the concrete sink assemblies referenced by the `appsettings` `Using` block:
```xml
<PackageReference Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
<PackageReference Include="Serilog.AspNetCore" Version="10.0.0" />
<PackageReference Include="Serilog.Sinks.Console" Version="6.1.1" />
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0" />
```
> Versions align with ScadaBridge's pins (Serilog.AspNetCore 10.0.0, Console 6.1.1, File 7.0.0). If
> the `.Serilog` package requires a different `Serilog.AspNetCore` floor, match it.
**Step 4: Restore + build**
```bash
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
```
Expected: PASS (packages resolve from Gitea + nuget.org).
**Step 5: Commit**
```bash
git add nuget.config src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
git commit -m "build(mxgateway): reference ZB.MOM.WW.Telemetry + Serilog packages"
```
---
## Task 8: MxAccessGateway — migrate appsettings Logging → Serilog section
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/appsettings.json`
**Context:** Current `Logging` (MEL) section: `Default: Information`, `Microsoft.AspNetCore: Warning`.
`AddZbSerilog` reads sinks/levels via `ReadFrom.Configuration` from a `Serilog` section. Translate
the levels and add Console + File sinks so logging output is preserved after the provider swap.
**Step 1: Replace the `Logging` block with a `Serilog` block.** Remove:
```json
"Logging": {
"LogLevel": { "Default": "Information", "Microsoft.AspNetCore": "Warning" }
},
```
Add:
```json
"Serilog": {
  "Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
  "MinimumLevel": {
    "Default": "Information",
    "Override": { "Microsoft.AspNetCore": "Warning" }
  },
  "WriteTo": [
    { "Name": "Console" },
    { "Name": "File", "Args": { "path": "logs/mxgateway-.log", "rollingInterval": "Day" } }
  ]
},
```
> Keep the rest of `appsettings.json` (gateway config) unchanged. Note: `AddZbSerilog` applies its
> own `MinimumLevel.Is(Information)` before `ReadFrom.Configuration`, so the `Serilog:MinimumLevel`
> above is applied afterwards and takes effect: the default level is Information and
> `Microsoft.AspNetCore` is overridden to Warning, matching today's MEL levels.
**Step 2: Commit** (config-only; build happens in Task 9 once the provider is wired)
```bash
git add src/ZB.MOM.WW.MxGateway.Server/appsettings.json
git commit -m "config(mxgateway): translate MEL Logging section to Serilog"
```
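
The level translation in Step 1 happens to be name-for-name, but MEL and Serilog do not share all level names. A small lookup, sketched here with a hypothetical function name, covers the cases that differ should other MEL categories need migrating later.

```bash
# Translate a Microsoft.Extensions.Logging level name to its Serilog equivalent.
mel_to_serilog_level() {
  case "$1" in
    Trace)       echo "Verbose" ;;
    Debug)       echo "Debug" ;;
    Information) echo "Information" ;;
    Warning)     echo "Warning" ;;
    Error)       echo "Error" ;;
    Critical)    echo "Fatal" ;;
    # MEL "None" (and anything unknown) has no Serilog level name; handle it by hand.
    *)           echo "unmapped: $1" >&2; return 1 ;;
  esac
}
```

For the two levels in the current config, `mel_to_serilog_level Information` and `mel_to_serilog_level Warning` return the same names, which is why the block above copies them verbatim.
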
---
## Task 9: MxAccessGateway — wire AddZbSerilog (MEL → Serilog provider swap)
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (`CreateBuilder`, after `ConfigureSelfSignedTls(builder)` ~line 63)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs` (add a provider-swap assertion)
**Context (high-risk — logging on the most operational app):** Register Serilog as the host's
logging provider so all existing MEL `ILogger`/`ILoggerFactory` calls (including
`UseGatewayRequestLoggingScope`'s middleware) route through Serilog. The Serilog MEL provider
captures MEL `BeginScope` dictionaries as structured properties, so `GatewayLogScope` and the
request-logging middleware keep working unchanged. The temporary `LoggerFactory.Create(...AddConsole())`
at lines 96-100 (used only by the TLS cert provider) may remain as-is.
**Step 1: Add the failing test** in `GatewayApplicationTests.cs` — assert the logger factory is now Serilog-backed:
```csharp
[Fact]
public void Build_UsesSerilogLoggerProvider()
{
    using var app = GatewayApplication.Build([]);
    var factory = app.Services.GetRequiredService<ILoggerFactory>();
    // Serilog.Extensions.Hosting registers SerilogLoggerFactory when AddSerilog replaces the factory.
    Assert.Equal("SerilogLoggerFactory", factory.GetType().Name);
}
```
**Step 2: Run it — expect FAIL** (`dotnet test ... --filter Build_UsesSerilogLoggerProvider`) → today the factory is the default MEL `LoggerFactory`.
**Step 3: Wire `AddZbSerilog`.** In `GatewayApplication.CreateBuilder`, immediately after
`ConfigureSelfSignedTls(builder);`, add:
```csharp
builder.AddZbSerilog(o => o.ServiceName = "mxgateway");
```
Add `using ZB.MOM.WW.Telemetry.Serilog;`. (`AddZbSerilog` calls `services.AddSerilog(..., preserveStaticLogger: true)`,
which registers `SerilogLoggerFactory` — replacing the MEL factory, so default providers do not
double-log.)
**Step 4: Run the test — expect PASS**, then run the broader logging-adjacent suites:
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayApplicationTests"
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~FakeWorker"
```
Expected: PASS — `Build_MapsCanonicalHealthEndpoints`, `Build_RegistersGatewayMetrics`, the
config-validation cases, and the fake-worker smoke all stay green; the new provider-swap test passes.
**Step 5: Verify no double console logging** — if `SerilogLoggerFactory` is confirmed in Step 4, the
default providers are bypassed and no extra step is needed. If you observe duplicated console lines
in any manual run, add `builder.Logging.ClearProviders();` immediately before `AddZbSerilog`.
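
One way to make the manual check concrete is to capture console output and count how often a known message appears. The two providers format lines differently, so the sketch below matches on the message text rather than whole lines. `count_marker_lines` is a hypothetical helper, and any marker message is a placeholder for one the gateway logs exactly once.

```bash
# Count how many captured output lines contain a given marker message.
# More than one hit for a message logged once suggests two providers are writing it.
count_marker_lines() {
  local marker="$1" capture="$2"
  grep -cF -- "$marker" "$capture"
}
```

If the count for a once-only startup message is greater than 1 in a captured run, apply the `ClearProviders()` remedy described in Step 5.
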
**Step 6: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
git commit -m "feat(mxgateway): adopt AddZbSerilog — MEL→Serilog provider swap (behaviour-preserving)"
```
---
## Task 10: MxAccessGateway — wrap GatewayLogRedactor behind the ILogRedactor seam
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Create: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorSeam.cs`
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (register the seam in DI in `CreateBuilder`)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs`
**Context:** The shared `RedactionEnricher` applies any DI-registered `ILogRedactor` to every log
event before it reaches a sink. MxGateway's redaction lives in the static `GatewayLogRedactor`
(API-key Bearer tokens, client identity). Provide a thin `ILogRedactor` adapter that redacts the
relevant log-event properties (`ClientIdentity`, plus `authorization` in either casing) by calling
the existing static helper `GatewayLogRedactor.RedactClientIdentity`. Keep `GatewayLogRedactor`
itself for its current callers (`GatewayLogScope`, `DashboardRedactor`).
**Step 1: Write the failing test** (`GatewayLogRedactorSeamTests.cs`):
```csharp
using System.Collections.Generic;
using ZB.MOM.WW.MxGateway.Server.Diagnostics;
using Xunit;

public class GatewayLogRedactorSeamTests
{
    [Fact]
    public void Redact_MasksApiKeyInClientIdentity()
    {
        var redactor = new GatewayLogRedactorSeam();
        var props = new Dictionary<string, object?>
        {
            ["ClientIdentity"] = "Bearer mxgw_operator01_super-secret"
        };
        redactor.Redact(props);
        Assert.Equal("Bearer mxgw_operator01_[redacted]", props["ClientIdentity"]);
    }
}
```
**Step 2: Run it — expect FAIL** (type doesn't exist).
**Step 3: Implement `GatewayLogRedactorSeam.cs`:**
```csharp
using System;
using System.Collections.Generic;
using ZB.MOM.WW.Telemetry.Serilog;

namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;

/// <summary>
/// Adapts the static <see cref="GatewayLogRedactor"/> to the shared <see cref="ILogRedactor"/> seam
/// so the telemetry RedactionEnricher masks API-key/credential material on every log event.
/// </summary>
public sealed class GatewayLogRedactorSeam : ILogRedactor
{
    // Lookups are case-sensitive unless the caller's dictionary uses a case-insensitive
    // comparer, so both casings of the header name are listed.
    private static readonly string[] IdentityKeys = ["ClientIdentity", "authorization", "Authorization"];

    public void Redact(IDictionary<string, object?> properties)
    {
        ArgumentNullException.ThrowIfNull(properties);
        foreach (var key in IdentityKeys)
        {
            // Only string values are rewritten; other value types are left as they are.
            if (properties.TryGetValue(key, out var value) && value is string s)
            {
                properties[key] = GatewayLogRedactor.RedactClientIdentity(s);
            }
        }
    }
}
```
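The `value is string s` guard in the implementation above implies two behaviours that the Step 1 test does not exercise: non-string values and absent keys are left alone. A minimal sketch of extra cases, assuming `GatewayLogRedactorSeam` exactly as written in Step 3; the class name, test names, and the `RequestPath` key are illustrative and not part of the plan:

```csharp
using System.Collections.Generic;
using ZB.MOM.WW.MxGateway.Server.Diagnostics;
using Xunit;

// Illustrative edge cases; assumes GatewayLogRedactorSeam as implemented in Step 3.
public class GatewayLogRedactorSeamEdgeCaseSketch
{
    [Fact]
    public void Redact_LeavesNonStringAndUnlistedValuesUntouched()
    {
        var redactor = new GatewayLogRedactorSeam();
        var props = new Dictionary<string, object?>
        {
            ["ClientIdentity"] = 42,      // not a string, so the `is string` guard skips it
            ["RequestPath"] = "/health"   // hypothetical key that is not in IdentityKeys
        };

        redactor.Redact(props);

        Assert.Equal((object)42, props["ClientIdentity"]);
        Assert.Equal((object)"/health", props["RequestPath"]);
    }

    [Fact]
    public void Redact_DoesNotAddMissingKeys()
    {
        var redactor = new GatewayLogRedactorSeam();
        var props = new Dictionary<string, object?>();

        redactor.Redact(props);

        // TryGetValue fails for every key, so nothing is written back.
        Assert.Empty(props);
    }
}
```

These cases only pin down what the guard already does; they are optional and could live alongside the Step 1 test if the extra coverage is wanted.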
**Step 4: Register in DI.** In `GatewayApplication.CreateBuilder`, alongside the other singletons, add:
```csharp
builder.Services.AddSingleton<ZB.MOM.WW.Telemetry.Serilog.ILogRedactor, Diagnostics.GatewayLogRedactorSeam>();
```
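The registration can be checked without starting the host. The sketch below assumes that `GatewayApplication.Build([])` returns an application exposing `Services` (as the existing tests suggest), that `GatewayApplication` lives in the `ZB.MOM.WW.MxGateway.Server` namespace, and that no later registration replaces the seam; the class and test names are illustrative:

```csharp
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.MxGateway.Server;              // assumed namespace of GatewayApplication
using ZB.MOM.WW.MxGateway.Server.Diagnostics;
using ZB.MOM.WW.Telemetry.Serilog;
using Xunit;

// Illustrative check that Step 4's registration is visible in the built container.
public class GatewayLogRedactorRegistrationSketch
{
    [Fact]
    public void Build_RegistersGatewayLogRedactorSeam()
    {
        using var app = GatewayApplication.Build([]);

        // GetRequiredService throws if no ILogRedactor is registered at all.
        var redactor = app.Services.GetRequiredService<ILogRedactor>();

        Assert.IsType<GatewayLogRedactorSeam>(redactor);
    }
}
```

If more than one `ILogRedactor` is ever registered, `GetRequiredService` returns the last one, so this assertion would need to switch to `GetServices<ILogRedactor>()` and check that the seam is among them.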
**Step 5: Run the test + build**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayLogRedactorSeamTests"
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
```
Expected: PASS.
**Step 6: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorSeam.cs src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs
git commit -m "feat(mxgateway): expose GatewayLogRedactor via shared ILogRedactor seam"
```
---
## Task 11: MxAccessGateway — wire AddZbTelemetry (export GatewayMetrics) + MapZbMetrics
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within MxGateway)
**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (`CreateBuilder` after `AddSingleton<GatewayMetrics>()` ~line 72; `MapGatewayEndpoints` after `MapZbHealth()` ~line 177)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs` (add `/metrics`-served assertion) + existing `GatewayMetricsTests` as oracle
**Context:** The `MxGateway.Server` meter (13 counters, 3 ms-histograms, 4 gauges) exists but is
never exported (no OTel SDK, no `/metrics`). `AddZbTelemetry` with `Meters = ["MxGateway.Server"]`
registers the meter with the OTel MeterProvider + Prometheus exporter; `MapZbMetrics()` mounts
`/metrics`. **Keep the `MxGateway.Server` name and the `ms` histogram units** (rename #7 + unit #6
are deferred). `GetSnapshot()` is untouched.
**Step 1: Add `AddZbTelemetry` in `CreateBuilder`**, immediately after `builder.Services.AddSingleton<GatewayMetrics>();`:
```csharp
builder.AddZbTelemetry(o =>
{
    o.ServiceName = "mxgateway";
    o.Meters = [GatewayMetrics.MeterName]; // "MxGateway.Server" — unchanged (rename deferred)
});
```
Add `using ZB.MOM.WW.Telemetry;`.
**Step 2: Map `/metrics` in `MapGatewayEndpoints`**, after `endpoints.MapZbHealth();`:
```csharp
endpoints.MapZbMetrics();
```
**Step 3: Add the served-endpoint test** in `GatewayApplicationTests.cs`:
```csharp
// Needs System.Linq (First), System.Net (HttpStatusCode), and System.Net.Http (HttpClient)
// in scope, unless the test project's implicit usings already cover them.
[Fact]
public async Task Build_MapsMetricsEndpoint()
{
    using var app = GatewayApplication.Build([]);
    await app.StartAsync();
    try
    {
        using var client = new HttpClient { BaseAddress = new Uri(app.Urls.First()) };
        var response = await client.GetAsync("/metrics");
        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    }
    finally { await app.StopAsync(); }
}
```
> If the existing test class already has a started-host helper (the config-validation tests call
> `StartAsync`), reuse it rather than starting a fresh host. Tests bind ephemeral ports (`:0`), so
> read `app.Urls` only after `StartAsync` completes, when it reports the port actually bound.
**Step 4: Build + test**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayApplicationTests|FullyQualifiedName~GatewayMetricsTests"
```
Expected: PASS — the `MeterListener`-based `GatewayMetricsTests` (Tests-027 isolation) stay green
because the meter name/instruments are unchanged; the new `/metrics` test passes.
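To illustrate why a `MeterListener`-based test is unaffected by adding an exporter, here is a self-contained sketch using only `System.Diagnostics.Metrics`. It does not touch `GatewayMetrics`: the meter is a local stand-in with the same name, and the instrument name `sketch.duration` is hypothetical.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using Xunit;

// Stand-in for the real GatewayMetricsTests: a listener keyed on the meter name
// keeps receiving measurements as long as the name and instruments do not change.
public class MeterNameStabilitySketch
{
    private const string MeterName = "MxGateway.Server";

    [Fact]
    public void Listener_KeyedOnMeterName_ReceivesMsMeasurements()
    {
        using var meter = new Meter(MeterName);
        var histogram = meter.CreateHistogram<double>("sketch.duration", unit: "ms");

        var seen = new List<double>();
        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            // Match on the name, and on the instance so parallel tests cannot leak in.
            if (instrument.Meter.Name == MeterName && ReferenceEquals(instrument.Meter, meter))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<double>(
            (instrument, value, tags, state) => seen.Add(value));
        listener.Start(); // publishes instruments that already exist

        histogram.Record(12.5);

        Assert.Single(seen);
        Assert.Equal(12.5, seen[0]);
        Assert.Equal("ms", histogram.Unit);
    }
}
```

An exporter registered through `AddZbTelemetry` would be one more subscriber to the same meter, which is why the existing listener-based assertions should not need to change while the name and units stay as they are.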
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
git commit -m "feat(mxgateway): export GatewayMetrics via AddZbTelemetry + /metrics (name/units unchanged)"
```
---
## Task 12: scadaproj — bookkeeping (GAPS + correct the false "MxGateway logging adopted" claim)
**Classification:** trivial
**Estimated implement time:** ~4 min
**Parallelizable with:** none (runs after all repo phases)
**Files:**
- Modify: `/Users/dohertj2/Desktop/scadaproj/components/observability/GAPS.md` (add "Adoption status — 2026-06-01 (DONE)" section)
- Modify: `/Users/dohertj2/Desktop/scadaproj/components/observability/README.md` (correct the "MxGateway logging adopted" claim)
- Modify: `/Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry/CLAUDE.md` (same correction)
- Modify: `/Users/dohertj2/Desktop/scadaproj/CLAUDE.md` (observability row + "MxAccessGateway logging adopted" note)
**Step 1: Add an adoption-status section to `GAPS.md`** with a per-repo table (what each app now
does), the **accepted scope note** (ScadaBridge keeps `LoggerConfigurationFactory` + adds
`TraceContextEnricher` rather than adopting `AddZbSerilog`; MxGateway keeps `GatewayLogScope`), and a
**Deferred** subsection listing #6 (histogram ms→s), #7 (meter rename), #9 (ScadaBridge app
instruments), #10/#11 (OTLP) as still-open.
**Step 2: Correct the false claim** everywhere it appears — the prior text said MxGateway's MEL→Serilog
migration was "done on its own branch." Replace with: "MxGateway MEL→Serilog migration + metrics
export landed on `main` via the 2026-06-01 telemetry adoption (branch `feat/adopt-zb-telemetry`)."
**Step 3: Commit**
```bash
cd /Users/dohertj2/Desktop/scadaproj
git add components/observability/GAPS.md components/observability/README.md ZB.MOM.WW.Telemetry/CLAUDE.md CLAUDE.md
git commit -m "docs(observability): record ZB.MOM.WW.Telemetry adoption across 3 apps; correct MxGateway logging-status claim"
```
---
## Acceptance checklist (whole plan)
- [ ] Both Telemetry packages resolve from the Gitea feed (Task 0 verified `200`).
- [ ] OtOpcUa: builds; `OtOpcUaTelemetryHookTests` + `LogContextEnricherTests` green; `/metrics` still served; meter `ZB.MOM.WW.OtOpcUa` unchanged.
- [ ] ScadaBridge: builds; composition-root + health + new metrics tests green; `/metrics` served in both roles; `LoggerConfigurationFactory` governance intact.
- [ ] MxGateway: builds; `GatewayApplicationTests` + `GatewayMetricsTests` + fake-worker smoke green; logger is Serilog-backed; redaction applied via seam; `/metrics` served; `MxGateway.Server` name + `ms` units unchanged.
- [ ] No secrets committed to any repo (token stays in `~/.nuget/NuGet/NuGet.Config`).
- [ ] `components/observability/GAPS.md` updated; the false "MxGateway logging adopted" claim corrected.
- [ ] All three feature branches committed (one commit per task), no hooks skipped, no force-push.