Compare commits
3 Commits
19f7ea5eeb
...
dee55aadc6
| Author | SHA1 | Date |
|---|---|---|
| | dee55aadc6 | |
| | 30425726d4 | |
| | 3729ff2152 | |
@@ -183,9 +183,14 @@ enrichers, and redaction policies.
The shared library is **built and lives in this repo** at [`ZB.MOM.WW.Telemetry/`](ZB.MOM.WW.Telemetry/)
|
||||
(.NET 10; 2 packages — `ZB.MOM.WW.Telemetry`, `ZB.MOM.WW.Telemetry.Serilog`; 19 tests;
|
||||
`dotnet pack` → 2 nupkgs @ 0.1.0). **MxAccessGateway logging adopted** (MEL → Serilog migration done on
|
||||
its own branch) — the one in-pass adoption. Broader OtOpcUa and ScadaBridge telemetry adoption is
|
||||
follow-on, tracked in [`components/observability/GAPS.md`](components/observability/GAPS.md).
|
||||
`dotnet pack` → 2 nupkgs @ 0.1.0). **Adopted across all three apps on 2026-06-01** (branch
|
||||
`feat/adopt-zb-telemetry` per repo, behaviour-preserving): `AddZbTelemetry` (Resource + standard
|
||||
instrumentation + Prometheus `/metrics`) everywhere; OtOpcUa + MxGateway on `AddZbSerilog` (MxGateway's
|
||||
MEL→Serilog migration + metrics export both landed in this pass — they were *not* actually done
|
||||
beforehand despite an earlier claim); ScadaBridge keeps its `LoggerConfigurationFactory` (min-level
|
||||
governance) and only adds the shared `TraceContextEnricher`. Deferred: MxGateway `ms`→`s` + Meter
|
||||
rename, ScadaBridge app instruments + Site-node HTTP/1.1 metrics listener, OTLP wiring. Per-repo
|
||||
result tracked in [`components/observability/GAPS.md`](components/observability/GAPS.md).
|
||||
Build/test from `ZB.MOM.WW.Telemetry/`: `dotnet test`. Consumer matrix: all three apps consume both
|
||||
packages after adoption (OtOpcUa, MxGateway Server, ScadaBridge Host + any instrumented project).
@@ -4,7 +4,7 @@ Observability libraries for the **ZB.MOM.WW SCADA family** (OtOpcUa, MxAccessGat
|
||||
|
||||
The library normalizes the three-project observability surface: a shared OpenTelemetry Resource driven by a single identity triple (`service.name` / `site.id` / `node.role`), standard instrumentation wiring, Prometheus and OTLP export, and a Serilog bootstrap with enrichers and `TraceContextEnricher` for trace↔log correlation.
|
||||
|
||||
**Built at 0.1.0. MxAccessGateway logging adopted (MEL → Serilog migration done on its own branch). OtOpcUa and ScadaBridge telemetry adoption is follow-on.** Adoption tracked in `~/Desktop/scadaproj/components/observability/GAPS.md`.
|
||||
**Built at 0.1.0, published to the Gitea NuGet feed, and adopted across all three apps on 2026-06-01** (branch `feat/adopt-zb-telemetry` per repo, behaviour-preserving). MxAccessGateway's MEL→Serilog migration + metrics export both landed in this pass — they were *not* actually done beforehand despite the earlier claim. ScadaBridge keeps its `LoggerConfigurationFactory` (min-level governance) and only adds the shared `TraceContextEnricher`; it does not call `AddZbSerilog`. Per-repo result + deferred follow-ons tracked in `~/Desktop/scadaproj/components/observability/GAPS.md`.
|
||||
|
||||
---
|
||||
|
||||
@@ -21,12 +21,13 @@ The library normalizes the three-project observability surface: a shared OpenTel
|
||||
|
||||
| Consumer | `ZB.MOM.WW.Telemetry` (core) | `ZB.MOM.WW.Telemetry.Serilog` |
|
||||
|---|:---:|:---:|
|
||||
| **OtOpcUa** | yes (after adoption) | yes (after adoption) |
|
||||
| **MxAccessGateway** | yes (after adoption) | yes (MEL → Serilog adopted now) |
|
||||
| **ScadaBridge** | yes (after adoption) | yes (after adoption) |
|
||||
| **OtOpcUa** | ✅ adopted | ✅ adopted (`AddZbSerilog`) |
|
||||
| **MxAccessGateway** | ✅ adopted (`GatewayMetrics` exported) | ✅ adopted (MEL→Serilog migrated in this pass) |
|
||||
| **ScadaBridge** | ✅ adopted (both roots) | ⚠️ referenced for `TraceContextEnricher` only — keeps `LoggerConfigurationFactory`, does **not** call `AddZbSerilog` |
|
||||
|
||||
MxAccessGateway's logging adoption is the one in-pass migration. Full metrics/tracing wiring
|
||||
for all three apps is follow-on.
|
||||
All three adopted on 2026-06-01 (branch `feat/adopt-zb-telemetry` per repo). ScadaBridge's logging
|
||||
deviates: it keeps its own `LoggerConfigurationFactory` (min-level governance contract) and only
|
||||
adds the shared `TraceContextEnricher`. See `components/observability/GAPS.md` for the full result.
|
||||
|
||||
---
|
||||
|
||||
@@ -60,11 +61,13 @@ All test assemblies run offline:
|
||||
|
||||
## Status
|
||||
|
||||
Built at **0.1.0** and published to the Gitea NuGet feed. MxAccessGateway logging (MEL → Serilog)
|
||||
adopted on its own branch. **OtOpcUa and ScadaBridge telemetry adoption not yet started** —
|
||||
tracked in the component backlog:
|
||||
Built at **0.1.0**, published to the Gitea NuGet feed, and **adopted across all three apps on
|
||||
2026-06-01** (branch `feat/adopt-zb-telemetry` per repo, behaviour-preserving). MxAccessGateway's
|
||||
MEL→Serilog migration and metrics export both landed in this pass (not beforehand, despite the
|
||||
earlier claim). Deferred follow-ons (MxGateway `ms`→`s` + Meter rename, ScadaBridge app instruments
|
||||
+ Site-node HTTP/1.1 metrics listener, OTLP wiring) are tracked in the component backlog:
|
||||
|
||||
- `~/Desktop/scadaproj/components/observability/GAPS.md` — adoption order, effort, and risk
|
||||
- `~/Desktop/scadaproj/components/observability/GAPS.md` — adoption status + deferred follow-ons
|
||||
|
||||
Design documentation:
@@ -181,3 +181,47 @@ app is opt-in and tracked here, not forced.
|
||||
unit migration (Gap U1) and the Meter rename (Gap N1) are deferred from the initial MxGateway
|
||||
adoption (Task #9). They are breaking dashboard/alert changes requiring ops coordination and
|
||||
are tracked as separate backlog items #6 and #7 in the adoption backlog above.
|
||||
|
||||
## Adoption status — 2026-06-01 (DONE)
|
||||
|
||||
`ZB.MOM.WW.Telemetry` + `ZB.MOM.WW.Telemetry.Serilog` (`0.1.0`) were adopted across **all three**
|
||||
sister apps in one pass, behaviour-preserving. Each adoption landed on a per-repo branch
|
||||
`feat/adopt-zb-telemetry` (one commit per task). Plan + design:
|
||||
[`docs/plans/2026-06-01-telemetry-library-adoption.md`](../../docs/plans/2026-06-01-telemetry-library-adoption.md).
|
||||
|
||||
> **Correction:** the prior claim that *"MxAccessGateway logging was adopted (MEL → Serilog) on its
> own branch"* was **false on `main`** — MxGateway was still MEL-only, and its `MxGateway.Server`
> meter was never exported. The full MEL→Serilog migration **and** the metrics export both landed
> in this 2026-06-01 pass.
|
||||
|
||||
| Repo | `AddZbTelemetry` (Resource + std instrumentation + Prometheus) | `/metrics` | Logging | Meter (unchanged) |
|---|---|---|---|---|
| **OtOpcUa** | ✅ replaced hand-rolled `ObservabilityExtensions` | ✅ `/metrics` (path unchanged) | ✅ `AddZbSerilog` (sinks moved to `appsettings`; `LogContextEnricher` kept) | `ZB.MOM.WW.OtOpcUa` |
| **ScadaBridge** | ✅ added in `BindSharedOptions` (both Central + Site roots) | ✅ Central; mapped on Site too (see follow-on) | ⚠️ **kept `LoggerConfigurationFactory`** + added shared `TraceContextEnricher` — did **not** adopt `AddZbSerilog` | (none yet; #9) |
| **MxAccessGateway** | ✅ exports existing `GatewayMetrics` | ✅ new `/metrics` | ✅ MEL→`AddZbSerilog`; `GatewayLogRedactor` exposed via `ILogRedactor` seam (`GatewayLogRedactorSeam`); `GatewayLogScope`/middleware kept as-is | `MxGateway.Server` (name + `ms` units unchanged) |
|
||||
|
||||
### Accepted scope decisions (deviations from the original backlog)
|
||||
|
||||
- **ScadaBridge keeps `LoggerConfigurationFactory` (backlog #5 revised).** The factory implements a
|
||||
documented governance contract (REQ-HOST-8 / Host-011/014/020/022): `ScadaBridge:Logging:MinimumLevel`
|
||||
is the floor and **overrides** `Serilog:MinimumLevel`, with operator warnings. `AddZbSerilog`
|
||||
hard-codes `MinimumLevel.Is(Information)` before `ReadFrom.Configuration`, which would invert that
|
||||
precedence and silently drop the knob. So ScadaBridge keeps the factory and only **adds the shared
|
||||
`TraceContextEnricher`** to it — gaining trace↔log correlation without regressing the contract. Full
|
||||
`AddZbSerilog` adoption for ScadaBridge would first require teaching the shared bootstrap to accept a
|
||||
caller-supplied minimum-level governance hook.
|
||||
- **MxGateway keeps `GatewayLogScope` + request-logging middleware as-is.** The Serilog MEL provider
|
||||
captures MEL `BeginScope` dictionaries as structured properties, so the scope/correlation code keeps
|
||||
producing the same properties under Serilog. Only the provider swap + the `ILogRedactor` adapter were
|
||||
needed.
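
The precedence rule behind the ScadaBridge decision can be sketched in a few lines of shell. This is only an illustration of the documented contract (the `ScadaBridge:Logging:MinimumLevel` floor wins over `Serilog:MinimumLevel`, with a warning when both are set); the function name `effective_min_level` and the warning text are hypothetical and are not taken from `LoggerConfigurationFactory`.

```bash
# effective_min_level GOVERNANCE_LEVEL SERILOG_LEVEL
# Hypothetical helper: prints the level that should take effect under the
# governance contract described above. Either argument may be empty.
effective_min_level() {
  governance=$1
  serilog=$2
  if [ -n "$governance" ]; then
    # Both knobs set: the governance floor overrides, and the operator is warned.
    if [ -n "$serilog" ]; then
      echo "warning: ScadaBridge:Logging:MinimumLevel overrides Serilog:MinimumLevel" >&2
    fi
    echo "$governance"
  else
    # No governance floor configured: fall through to the Serilog section.
    echo "$serilog"
  fi
}
```

A bootstrap that fixes the level first and reads configuration afterwards would let the second argument win instead, which is the inversion the decision above avoids.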
|
||||
|
||||
### Deferred (still open follow-ons)
|
||||
|
||||
- **#6** MxGateway histogram `ms`→`s`, **#7** Meter rename `MxGateway.Server`→`ZB.MOM.WW.MxGateway`
|
||||
(both break dashboards — ops-coordinated).
|
||||
- **#9** ScadaBridge application instruments (`ScadaBridgeTelemetry` + `scadabridge.*`).
|
||||
- **#10/#11** OTLP exporter wiring; OtOpcUa trace export is still a no-op (Prometheus is metrics-only).
|
||||
- **NEW — ScadaBridge Site-node `/metrics` scrape:** the Site role's Kestrel is HTTP/2-only (gRPC),
|
||||
so the mapped `/metrics` is not HTTP/1.1-scrapable on that listener. The in-process metrics + Resource
|
||||
still apply; Central serves `/metrics` normally. A follow-on should add a dedicated HTTP/1.1 (or
|
||||
`Http1AndHttp2`) listener/port for site-node scraping.
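
One way to confirm the Site-node symptom is to probe `/metrics` once per protocol and compare the results. The sketch below assumes the Site listener speaks cleartext HTTP/2 (so curl needs `--http2-prior-knowledge`); if the listener uses TLS, that assumption does not hold. The helper name `probe_metrics` and the example host are hypothetical.

```bash
# probe_metrics PROTO BASE_URL
# Prints the HTTP status code of GET BASE_URL/metrics.
# PROTO is "h1" (force HTTP/1.1) or "h2c" (cleartext HTTP/2, prior knowledge).
probe_metrics() {
  case "$1" in
    h1)  flag=--http1.1 ;;
    h2c) flag=--http2-prior-knowledge ;;
    *)   echo "unknown protocol: $1" >&2; return 2 ;;
  esac
  # -s: quiet, -o /dev/null: discard the body, -w: print only the status code.
  curl -s -o /dev/null -w '%{http_code}' "$flag" "$2/metrics"
}

# Example (hypothetical host and port):
#   probe_metrics h1  http://site-node:5000
#   probe_metrics h2c http://site-node:5000
```

A Prometheus scraper behaves like the `h1` probe, so a dedicated HTTP/1.1 or `Http1AndHttp2` listener is what would make the first call succeed.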
|
||||
|
||||
@@ -40,16 +40,20 @@ Serilog with the same options as enricher properties and adds `TraceContextEnric
|
||||
`node.role`) populates both the OTel Resource and the Serilog enrichers, so a metric, a span, and
|
||||
a log line from the same node carry identical dimensions and join up in a backend.
|
||||
|
||||
One adoption happens **in this task**: MxAccessGateway migrates off MEL onto `AddZbSerilog`. All
|
||||
other app wiring is follow-on, consistent with how Auth and UI-Theme are structured.
|
||||
**Adopted across all three apps on 2026-06-01** (branch `feat/adopt-zb-telemetry` per repo,
|
||||
behaviour-preserving). Note: MxAccessGateway's MEL→Serilog migration was *not* actually done at
|
||||
library-build time despite an earlier claim — it landed in this adoption pass, along with the
|
||||
metrics export. See [`GAPS.md` → Adoption status — 2026-06-01](GAPS.md) for the per-repo result,
|
||||
the accepted scope decisions (ScadaBridge keeps `LoggerConfigurationFactory`; MxGateway keeps its
|
||||
log-scope code), and the deferred follow-ons.
|
||||
|
||||
## Status by project
|
||||
|
||||
| Project | OTel SDK today | Metrics today | Tracing today | Logging today | Enrichers today | Adoption status |
|
||||
|---|---|---|---|---|---|---|
|
||||
| **OtOpcUa** | ✅ full SDK (`WithMetrics`+`WithTracing`) | ✅ 7 instruments (`otopcua.*`); Prometheus `/metrics` | 🟡 2 spans defined; no exporter | Serilog (Console+File) | `DriverInstanceId`/`DriverType`/`CapabilityName`/`CorrelationId` (driver-scope) | Not started (follow-on) |
|
||||
| **MxAccessGateway** | ⛔ none (hand-rolled `Meter`) | 🟡 20 instruments (`mxgateway.*`); **never exported** | ⛔ none | **Serilog (migrated from MEL — adopted)** | `SiteId`/`NodeRole`/`NodeHostname` (via `AddZbSerilog`); session/worker enrichers via `LogContext.PushProperty` | **Logging adopted; OTel metrics/traces follow-on** |
|
||||
| **ScadaBridge** | ⛔ (`OpenTelemetry.Api` CVE-patch only) | ⛔ zero instruments | ⛔ none | Serilog (Console+File) | `SiteId`/`NodeRole`/`NodeHostname` (process-level; strongest set) | Not started (follow-on) |
|
||||
| **OtOpcUa** | ✅ full SDK via `AddZbTelemetry` | ✅ 7 instruments (`otopcua.*`); Prometheus `/metrics` | 🟡 2 spans defined; no exporter | Serilog via `AddZbSerilog` (sinks in `appsettings`) | `DriverInstanceId`/`DriverType`/`CapabilityName`/`CorrelationId` (driver-scope, kept) + shared | ✅ **Adopted 2026-06-01** |
|
||||
| **MxAccessGateway** | ✅ `AddZbTelemetry` exports `GatewayMetrics` | ✅ 20 instruments (`mxgateway.*`) now exported; new `/metrics` | ⛔ none | ✅ **Serilog (migrated from MEL in this pass)** | `SiteId`/`NodeRole`/`NodeHostname` via `AddZbSerilog`; `GatewayLogScope` kept; `ILogRedactor` seam | ✅ **Adopted 2026-06-01** |
|
||||
| **ScadaBridge** | ✅ `AddZbTelemetry` (both roots) | ✅ Resource + std instrumentation; `/metrics` (Central) | ⛔ none | Serilog via `LoggerConfigurationFactory` (kept) + shared `TraceContextEnricher` | `SiteId`/`NodeRole`/`NodeHostname` (process-level) + trace context | ✅ **Adopted 2026-06-01** (logging via factory, not `AddZbSerilog` — see GAPS) |
|
||||
|
||||
See each project's [`current-state/<project>/CURRENT-STATE.md`](current-state/) for the
|
||||
code-verified detail and its adoption plan.
|
||||
@@ -100,8 +104,11 @@ hinge that makes a metric, a span, and a log line from the same node carry ident
|
||||
|
||||
## Component status
|
||||
|
||||
**Status: Built @ 0.1.0. MxAccessGateway MEL → Serilog logging adopted (on its own branch).
|
||||
OtOpcUa and ScadaBridge telemetry adoption is follow-on, tracked in [`GAPS.md`](GAPS.md).**
|
||||
**Status: Built @ 0.1.0 and published to the Gitea NuGet feed. Adopted across all three apps on
|
||||
2026-06-01** (OtOpcUa, MxAccessGateway, ScadaBridge — branch `feat/adopt-zb-telemetry` per repo).
|
||||
The MxAccessGateway MEL→Serilog migration and metrics export both landed in this pass (they were
|
||||
not actually done beforehand despite an earlier claim). Per-repo result + deferred follow-ons:
|
||||
[`GAPS.md` → Adoption status — 2026-06-01](GAPS.md).
|
||||
|
||||
The shared library lives at
|
||||
[`~/Desktop/scadaproj/ZB.MOM.WW.Telemetry/`](../../ZB.MOM.WW.Telemetry/) (.NET 10; 2 packages —
|
||||
|
||||
@@ -0,0 +1,234 @@
|
||||
# Adopt `ZB.MOM.WW.Telemetry` across the three sister apps — design
|
||||
|
||||
**Date:** 2026-06-01
|
||||
**Status:** Approved (design); implementation plan to follow via writing-plans.
|
||||
**Scope:** Integrate the built-but-unadopted `ZB.MOM.WW.Telemetry` (+ `.Serilog`) shared library
|
||||
into all three sister apps — **OtOpcUa**, **MxAccessGateway**, **ScadaBridge** — wiring the shared
|
||||
OpenTelemetry Resource, standard instrumentation, Prometheus `/metrics`, and the shared Serilog
|
||||
bootstrap with identity enrichers and trace↔log correlation.
|
||||
|
||||
This is the second full cross-fleet adoption of one of the six shared `ZB.MOM.WW.*` libraries
|
||||
(after `ZB.MOM.WW.Health`). It follows the adoption backlog in
|
||||
[`components/observability/GAPS.md`](../../components/observability/GAPS.md), re-verified against
|
||||
current code on 2026-06-01.
|
||||
|
||||
> **Correction recorded during design:** the library CLAUDE.md and
> [`components/observability/README.md`](../../components/observability/README.md) claim
> *"MxAccessGateway logging adopted (MEL → Serilog migration done on its own branch)."* This is
> **false on `main`** — MxGateway is still MEL-only (no Serilog packages, `GatewayLogScope` /
> `GatewayLogRedactor` still bespoke), and its `MxGateway.Server` meter is **not exported at all**
> (no `AddOpenTelemetry`, no `/metrics`). That branch never landed. This design therefore includes
> the full MxGateway MEL→Serilog migration, and the bookkeeping task corrects the false claim.
|
||||
|
||||
---
|
||||
|
||||
## 1. Goal & scope
|
||||
|
||||
Wire the two shared packages into all three apps:
|
||||
|
||||
- **`ZB.MOM.WW.Telemetry`** — `AddZbTelemetry(options)`: shared OTel Resource (the identity triple
  `service.name` / `site.id` / `node.role` + `service.namespace` / `service.version` / `host.name`),
  caller-supplied Meters/ActivitySources, standard instrumentation (ASP.NET Core, HttpClient, gRPC
  client, runtime, process), Prometheus always-on exporter (OTLP opt-in), and `app.MapZbMetrics()`
  to mount `/metrics`.
- **`ZB.MOM.WW.Telemetry.Serilog`** — `AddZbSerilog(options)`: two-stage Serilog bootstrap,
  `ReadFrom.Configuration` sinks, `SiteId`/`NodeRole`/`NodeHostname` enrichers, `TraceContextEnricher`
  (writes `trace_id`/`span_id` from `Activity.Current`), and the `ILogRedactor` seam via
  `RedactionEnricher`. Uses `preserveStaticLogger: true` so it is test-safe.
|
||||
|
||||
**The headline gap (§1 of GAPS):** *no* app sets a single OTel Resource attribute today, so every
|
||||
metric and span from every node is indistinguishable in a backend — no service identity, no
|
||||
site/role topology, no version label. `AddZbTelemetry` closes this for all three at once. This is
|
||||
the single highest-value observability gap across the fleet.
|
||||
|
||||
**Behaviour-preserving bar** (same as the Health adoption): same log messages at the same levels,
|
||||
same metric series with the same names and units, same `/metrics` path. New series produced by
|
||||
standard instrumentation are *additive*. All genuinely breaking items are **deferred** (see §6).
|
||||
|
||||
---
|
||||
|
||||
## 2. Distribution
|
||||
|
||||
- **Feed:** Gitea NuGet registry `dohertj2-gitea`
|
||||
(`https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json`). Credentials live
|
||||
**creds-only at the user level** (`~/.nuget/NuGet/NuGet.Config` `<packageSourceCredentials>`),
|
||||
matched by source name — **never committed to any repo**. Already configured during the Health
|
||||
round; no change needed here.
|
||||
- **Source-mapping — the two-pattern gotcha (carried from Health):** under
|
||||
`packageSourceMapping`, the glob `ZB.MOM.WW.Telemetry.*` matches `ZB.MOM.WW.Telemetry.Serilog`
|
||||
but **not** the bare core id `ZB.MOM.WW.Telemetry`. Each repo therefore needs **both**:
|
||||
  ```xml
  <package pattern="ZB.MOM.WW.Telemetry" />
  <package pattern="ZB.MOM.WW.Telemetry.*" />
  ```
|
||||
- **Per-repo wiring:**
|
||||
  | Repo | CPM? | Change |
  |---|---|---|
  | OtOpcUa | yes (`Directory.Packages.props`) | add 2 `<PackageVersion>` @ `0.1.0`; extend existing `NuGet.config` mapping with both Telemetry patterns; add 2 versionless `<PackageReference>` to the Host csproj |
  | ScadaBridge | yes | add 2 `<PackageVersion>` @ `0.1.0`; extend existing `nuget.config` mapping; add 2 versionless `<PackageReference>` to the Host csproj |
  | MxAccessGateway | **no CPM** | add 2 direct versioned `<PackageReference>` to the Server csproj; extend its `nuget.config` mapping (the file created during the Health round) |
|
||||
- **Task 0 (gating, like Health):** the library docs claim these two packages are already on the
|
||||
feed. **Verify first; pack + push the two `.nupkg`s if missing** — the Health round proved this
|
||||
claim cannot be trusted.
|
||||
- **Serilog version floor (Gap V1):** OtOpcUa pins `Serilog.AspNetCore` 9.0.0, ScadaBridge 10.0.0.
|
||||
Confirm the `.Serilog` package's Serilog dependency floor is satisfied by both (bump if not), and
|
||||
pick MxGateway's fresh `Serilog.AspNetCore` version to align.
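
The two-pattern gotcha above lends itself to a quick pre-restore check. The sketch below is an illustration only: `id_matches` uses shell globbing as a rough, case-sensitive stand-in for `packageSourceMapping` pattern matching (NuGet's own matcher is not invoked), and `has_both_patterns` simply greps a config file for the two `<package pattern=... />` entries. Both function names are hypothetical.

```bash
# id_matches PATTERN PACKAGE_ID
# Shell-glob approximation of a source-mapping pattern: succeeds when
# PACKAGE_ID matches PATTERN. PATTERN is deliberately left unquoted in the
# case arm so that "*" acts as a wildcard.
id_matches() {
  case "$2" in
    $1) return 0 ;;
    *)  return 1 ;;
  esac
}

# has_both_patterns CONFIG_FILE
# Succeeds only when the file maps both the bare core id and the ".*" glob.
# grep -F treats the strings literally; the closing quote in the first string
# keeps it from also matching the ".*" entry.
has_both_patterns() {
  grep -Fq 'pattern="ZB.MOM.WW.Telemetry"' "$1" &&
    grep -Fq 'pattern="ZB.MOM.WW.Telemetry.*"' "$1"
}
```

Under this approximation `id_matches 'ZB.MOM.WW.Telemetry.*' 'ZB.MOM.WW.Telemetry'` fails because the glob requires the trailing dot, which is the same reason each repo needs both entries.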
|
||||
|
||||
---
|
||||
|
||||
## 3. Per-app adoption surface
|
||||
|
||||
### OtOpcUa (`master`) — moderate
|
||||
|
||||
Already has Serilog (inline `UseSerilog`), full OTel, and Prometheus `/metrics`.
|
||||
|
||||
- **Metrics/traces:** replace the hand-rolled
|
||||
`src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs`
|
||||
(`AddOpenTelemetry().WithMetrics(...AddPrometheusExporter()).WithTracing(...)` +
|
||||
`MapPrometheusScrapingEndpoint("/metrics")`) with
|
||||
  ```csharp
  builder.AddZbTelemetry(o =>
  {
      o.ServiceName = "otopcua";
      // The design leaves this as "AssemblyInformationalVersion". One way to read it
      // (needs `using System.Reflection;`; assumes ServiceVersion is a string):
      o.ServiceVersion = typeof(Program).Assembly
          .GetCustomAttribute<AssemblyInformationalVersionAttribute>()?.InformationalVersion;
      o.Meters = ["ZB.MOM.WW.OtOpcUa"];
      o.ActivitySources = ["ZB.MOM.WW.OtOpcUa"];
      // Exporter defaults to Prometheus
  });
  // ...
  app.MapZbMetrics();
  ```
|
||||
**Same meter/source names and same `/metrics` path** → behaviour-preserving; *gains* the Resource
|
||||
identity + standard instrumentation. (OtOpcUa records spans but has no trace exporter today;
|
||||
Prometheus is metrics-only, so traces remain a no-op exporter-wise — unchanged. OTLP trace wiring
|
||||
is deferred, §6.)
|
||||
- **Logging:** replace the inline
|
||||
`builder.Host.UseSerilog((ctx, lc) => lc.ReadFrom.Configuration(...).WriteTo.Console().WriteTo.File(...))`
|
||||
with `builder.AddZbSerilog(o => { o.ServiceName = "otopcua"; })`, moving the Console/File sinks
|
||||
into `appsettings` `Serilog:WriteTo` so `ReadFrom.Configuration` reproduces them. Keep the
|
||||
existing driver-scope `LogContextEnricher` alongside the shared enrichers.
|
||||
- **Identity:** `ServiceName="otopcua"`; `SiteId`/`NodeRole` omitted (none in config).
|
||||
|
||||
### ScadaBridge (`main`) — moderate, two composition roots
|
||||
|
||||
Serilog already (via `LoggerConfigurationFactory`); **no OTel at all**; `SiteId` + `NodeRole`
|
||||
already read from config (`ScadaBridge:Node:*`, `NodeOptions`).
|
||||
|
||||
- **Metrics:** add `builder.AddZbTelemetry(o => { o.ServiceName="scadabridge"; o.SiteId=siteId; o.NodeRole=nodeRole; })`
|
||||
+ `app.MapZbMetrics()` in **both** composition roots — the Central block and the Site block of
|
||||
`Program.cs` (the same two-root pattern the Health adoption used). `Meters=[]` for now (app
|
||||
instruments are deferred, §6). Purely additive — no metrics exist today to break.
|
||||
- **Logging:** replace `LoggerConfigurationFactory.Build(config, nodeRole, siteId, nodeHostname)` +
|
||||
`builder.Host.UseSerilog()` with
|
||||
`builder.AddZbSerilog(o => { o.ServiceName="scadabridge"; o.SiteId=siteId; o.NodeRole=nodeRole; })`
|
||||
— its enrichers reproduce the factory's `SiteId`/`NodeRole`/`NodeHostname`. Keep a minimal
|
||||
`CreateBootstrapLogger()` line for early-startup capture per the library's documented pattern,
|
||||
then delete `LoggerConfigurationFactory`. Verify the existing sinks are config-driven (`Serilog`
|
||||
section in `appsettings`) so the swap is byte-equivalent; mirror any code-side sinks into config.
|
||||
|
||||
### MxAccessGateway (`main`) — heaviest (the MEL→Serilog migration)
|
||||
|
||||
MEL-only; custom `MxGateway.Server` meter **not exported**; no `/metrics`. The x86 net48 worker is
|
||||
a separate process and **out of scope** — telemetry is for the Server.
|
||||
|
||||
- **Logging (MEL → Serilog):**
|
||||
- Add Serilog packages (`Serilog.AspNetCore` + sinks) to the Server csproj (direct versioned ref).
|
||||
- Replace the temporary `LoggerFactory.Create(...)` MEL bootstrap in `GatewayApplication.cs`
|
||||
(and `builder.Logging` config) with `builder.AddZbSerilog(o => { o.ServiceName="mxgateway"; })`
|
||||
+ a `CreateBootstrapLogger()` line.
|
||||
- `GatewayLogScope` → `Serilog.Context.LogContext.PushProperty(...)`.
|
||||
- `GatewayLogRedactor` → implement the `ILogRedactor` seam, register in DI (picked up by
|
||||
`RedactionEnricher`).
|
||||
- Request-logging middleware → `UseSerilogRequestLogging()` (or keep the middleware but emit via
|
||||
a Serilog `ILogger`). Sinks to `appsettings`.
|
||||
- **Metrics:** `builder.AddZbTelemetry(o => { o.ServiceName="mxgateway"; o.Meters=["MxGateway.Server"]; })`
|
||||
+ `app.MapZbMetrics()` → the 20 existing instruments (13 counters, 3 histograms, 4 gauges) finally
|
||||
export. **Keep the `MxGateway.Server` meter name and the `ms` histogram units** (rename and unit
|
||||
conversion are deferred, §6). `GetSnapshot()` in-memory read path stays untouched.
|
||||
|
||||
---
|
||||
|
||||
## 4. Shared seam
|
||||
|
||||
```
            ZbTelemetryOptions (ServiceName / SiteId / NodeRole / Meters / ActivitySources / Exporter)
                              │
            ┌─────────────────┴──────────────────┐
   AddZbTelemetry (core)                 AddZbSerilog (.Serilog)
   • ZbResource (identity triple)        • ReadFrom.Configuration sinks
   • app Meters + ActivitySources        • SiteId / NodeRole / NodeHostname enrichers
   • standard instrumentation            • TraceContextEnricher (trace_id / span_id)
   • Prometheus always + OTLP opt-in     • ILogRedactor seam (RedactionEnricher)
            │                                    │
   app.MapZbMetrics() → /metrics         preserveStaticLogger: true (test-safe)
```
|
||||
|
||||
Both packages share the single `ZbTelemetryOptions`. The Serilog OTLP log sink derives its Resource
|
||||
attributes from `ZbResource.BuildAttributes` (single source of truth), so logs can never drift from
|
||||
metrics and traces in a backend.
|
||||
|
||||
---
|
||||
|
||||
## 5. Sequencing & execution
|
||||
|
||||
Subagent-driven, classification-driven review chain. **Task 0 gates everything** (verify/publish the
|
||||
feed). Then three **independent** per-repo phases — each its own git repo, branch
|
||||
**`feat/adopt-zb-telemetry`**, commit per task, **never skip hooks, never force-push**:
|
||||
|
||||
1. **Task 0 (gating):** verify the two Telemetry `.nupkg`s are on the Gitea feed; pack + push if
   missing (creds-only user config, already set).
2. **OtOpcUa:** source-mapping + package refs → `AddZbTelemetry` swap → `AddZbSerilog` swap → tests.
3. **ScadaBridge:** source-mapping + package refs → `AddZbTelemetry` (both roots) → `AddZbSerilog`
   (replace `LoggerConfigurationFactory`) → tests.
4. **MxAccessGateway:** source-mapping + package refs → **MEL→Serilog** (sub-tasked, `high-risk`)
   → `AddZbTelemetry` metrics export → tests.
5. **scadaproj bookkeeping:** add an "Adoption status — DONE" section to
   `components/observability/GAPS.md` (per-repo table + deferred items), **and correct the false
   "MxGateway logging already adopted" claim** in CLAUDE.md, the library CLAUDE.md, and
   `components/observability/README.md`.
|
||||
|
||||
The MxGateway MEL→Serilog migration is the one `high-risk` change (logging behaviour on the most
|
||||
operational app) and gets the full spec→code serial review chain. The other per-app swaps are
|
||||
`standard`.
|
||||
|
||||
---
|
||||
|
||||
## 6. Deferred (out of scope this round; recorded in GAPS)
|
||||
|
||||
| # | Item | Why deferred |
|---|---|---|
| #6 | MxGateway histogram `ms` → `s` | Breaking dashboard/alert change — needs ops coordination |
| #7 | MxGateway meter rename `MxGateway.Server` → `ZB.MOM.WW.MxGateway` | Breaking Prometheus label change — needs ops coordination |
| #9 | ScadaBridge app instruments (`ScadaBridgeTelemetry` + `scadabridge.*`) | Application-specific work, not shared-library adoption |
| #10 | OtOpcUa OTLP exporter alongside Prometheus | Opt-in; no consumer for OTLP yet |
| #11 | OtOpcUa trace-export no-op (spans recorded, no exporter) | Resolved by #10 / OTLP; or document |
|
||||
|
||||
None of these block the behaviour-preserving initial adoption.
|
||||
|
||||
---
|
||||
|
||||
## 7. Testing
|
||||
|
||||
All tests run **offline** — Prometheus is in-process, no OTLP collector required, and the library's
|
||||
own test suites are network-free.
|
||||
|
||||
- **OtOpcUa:** assert `/metrics` is still served, the `ZB.MOM.WW.OtOpcUa` meter is present, the
  Resource carries `service.name`, and the shared Serilog enrichers are wired.
- **ScadaBridge:** assert `/metrics` is served in **both** roles, the logger carries
  `SiteId`/`NodeRole` enrichers, and startup is clean after `LoggerConfigurationFactory` removal.
- **MxAccessGateway** (the careful one): assert log messages are still emitted at the same levels,
  redaction still applies, request logging still fires, `/metrics` is now served, and the
  `GetSnapshot()` path is unchanged — using the existing fake-worker test harness (no MXAccess
  needed).
|
||||
|
||||
---
|
||||
|
||||
## 8. Acceptance bar
|
||||
|
||||
- Each app builds and its test suite is green.
|
||||
- `/metrics` serves the same existing series (plus additive standard-instrumentation series); meter
|
||||
names and units unchanged.
|
||||
- Logs carry the same messages at the same levels, plus the shared identity enrichers and
|
||||
`trace_id`/`span_id` correlation.
|
||||
- No secrets committed to any repo (the Gitea token stays creds-only at the user level).
|
||||
- `components/observability/GAPS.md` updated; the false "MxGateway logging adopted" claim corrected.
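
The "same existing series, plus additive ones" bullet can be spot-checked by comparing two saved `/metrics` scrapes, one taken before the swap and one after. The sketch below is a rough aid, not part of the plan: it compares metric names only (labels, units and values are ignored), it assumes the Prometheus text exposition format, and the helper names `metric_names` and `missing_series` are hypothetical.

```bash
# metric_names SCRAPE_FILE
# Prints the sorted, unique metric names in a Prometheus text-format scrape.
# Comment lines (# HELP / # TYPE) are dropped; everything from the first "{"
# or space onward is stripped from each sample line.
metric_names() {
  grep -v '^#' "$1" | sed -E 's/[{ ].*$//' | grep -v '^$' | sort -u
}

# missing_series BEFORE_FILE AFTER_FILE
# Prints every metric name present in BEFORE_FILE but absent from AFTER_FILE.
# Empty output means no pre-existing name disappeared.
missing_series() {
  before_names=$(mktemp)
  after_names=$(mktemp)
  metric_names "$1" > "$before_names"
  metric_names "$2" > "$after_names"
  # comm -23 keeps lines unique to the first (sorted) input.
  comm -23 "$before_names" "$after_names"
  rm -f "$before_names" "$after_names"
}
```

Names that appear only in the second scrape are the additive standard-instrumentation series, so they are intentionally not reported.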
|
||||
@@ -0,0 +1,848 @@
|
||||
# ZB.MOM.WW.Telemetry Adoption Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Adopt the shared `ZB.MOM.WW.Telemetry` + `ZB.MOM.WW.Telemetry.Serilog` packages across OtOpcUa, MxAccessGateway, and ScadaBridge — giving all three the OTel Resource identity triple, standard instrumentation, Prometheus `/metrics`, and shared Serilog correlation — behaviour-preserving, with breaking items deferred.
|
||||
|
||||
**Architecture:** Gitea-registry distribution (`dohertj2-gitea`, creds-only at user level). Each app references the shared packages and swaps its bespoke wiring for `AddZbTelemetry` / `AddZbSerilog`, keeping existing meter names, units, log messages, and the `/metrics` path. Each sister repo is its own git repo; work happens on branch `feat/adopt-zb-telemetry`, one commit per task, **never skip hooks, never force-push.**
|
||||
|
||||
**Tech Stack:** .NET 10, OpenTelemetry SDK, Prometheus exporter, Serilog, NuGet Central Package Management (OtOpcUa + ScadaBridge; MxGateway has none).
|
||||
|
||||
**Source design:** [`2026-06-01-telemetry-library-adoption-design.md`](2026-06-01-telemetry-library-adoption-design.md)
|
||||
|
||||
---
|
||||
|
||||
## Two refinements discovered during planning (deviations from the design doc)
|
||||
|
||||
Both serve the approved **behaviour-preserving** acceptance bar:
|
||||
|
||||
1. **ScadaBridge logging — KEEP `LoggerConfigurationFactory`.** The design doc said "delete the
|
||||
factory and swap to `AddZbSerilog`." Code review showed the factory implements a documented
|
||||
governance contract (REQ-HOST-8 / Host-011/014/020/022): `ScadaBridge:Logging:MinimumLevel` is
|
||||
the floor and **overrides** `Serilog:MinimumLevel`, with operator warnings when both are set or
|
||||
a level is mistyped. `AddZbSerilog` hard-codes `MinimumLevel.Is(Information)` *before*
|
||||
`ReadFrom.Configuration`, which inverts that precedence and silently drops the
|
||||
`ScadaBridge:Logging:MinimumLevel` knob (and breaks its tests). **Plan: keep the factory, add the
|
||||
shared `TraceContextEnricher` to it** (gaining trace↔log correlation) and do NOT adopt
|
||||
`AddZbSerilog` for ScadaBridge. ScadaBridge still fully adopts the metrics/Resource half.
|
||||
|
||||
2. **MxGateway logging — keep `GatewayLogScope` + request-logging middleware as-is.** The Serilog
|
||||
MEL provider captures MEL `BeginScope` dictionaries as structured properties, so the existing
|
||||
middleware keeps producing the same scope properties once Serilog is the provider. The only
|
||||
logging code changes are: register Serilog as the provider (`AddZbSerilog`), migrate the
|
||||
`appsettings` `Logging` section to a `Serilog` section, and wrap the static `GatewayLogRedactor`
|
||||
behind the `ILogRedactor` seam. No rewrite of working scope code.
|
||||
|
||||
---
|
||||
|
||||
## Execution order & parallelism
|
||||
|
||||
- **Task 0 gates everything** (packages must be on the feed before any repo can restore).
|
||||
- After Task 0, the **three repo phases are independent** (separate working directories) and may run
|
||||
concurrently: OtOpcUa (Tasks 1–3), ScadaBridge (Tasks 4–6), MxGateway (Tasks 7–11).
|
||||
- **Within a repo, tasks are sequential** (same working tree / same branch — do not dispatch two
|
||||
implementers against one repo concurrently).
|
||||
- **Task 12** (scadaproj bookkeeping) runs last, after all three phases land.
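
The ordering rules above (a gate, three concurrent repo phases that are each sequential inside, then a final task) can be sketched with shell background jobs. This is a toy model under stated assumptions: each "task" only appends a marker line to a log file, and the function names `run_repo` and `run_plan` are hypothetical stand-ins for whatever actually dispatches the work.

```bash
# run_repo NAME FIRST_TASK LAST_TASK LOGFILE
# Runs one repo phase: its tasks execute strictly in order.
run_repo() {
  t=$2
  while [ "$t" -le "$3" ]; do
    echo "$1 task $t" >> "$4"
    t=$((t + 1))
  done
}

# run_plan LOGFILE
# Task 0 gates everything, the three repo phases run concurrently,
# and Task 12 runs only after all three have finished.
run_plan() {
  log=$1
  echo "task 0" >> "$log" || return 1
  run_repo OtOpcUa 1 3 "$log" &
  run_repo ScadaBridge 4 6 "$log" &
  run_repo MxGateway 7 11 "$log" &
  wait   # block until every background phase has exited
  echo "task 12" >> "$log"
}
```

Lines from different repos may interleave in the log, but each repo's own tasks always appear in ascending order, mirroring the "one implementer per repo" rule.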
|
||||
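The ordering rules above can be sketched as a small shell driver. This is only an illustration under stated assumptions: `run_task` is a hypothetical placeholder that just echoes the task number, standing in for whatever actually dispatches an implementer. Only the gating structure comes from the plan (Task 0 first, the three repo phases concurrently, tasks sequential within a repo, Task 12 last).

```bash
# Hypothetical placeholder: a real driver would dispatch the task here.
run_task() { echo "task $1"; }

# Tasks inside one repo share a working tree, so they run strictly in order.
run_phase() {
  local t
  for t in "$@"; do run_task "$t" || return 1; done
}

run_plan() {
  local p1 p2 p3 pid rc=0
  run_task 0 || return 1            # Task 0 gates everything
  run_phase 1 2 3 &                 # OtOpcUa
  p1=$!
  run_phase 4 5 6 &                 # ScadaBridge
  p2=$!
  run_phase 7 8 9 10 11 &           # MxGateway
  p3=$!
  # Wait for every phase and remember whether any of them failed.
  for pid in "$p1" "$p2" "$p3"; do wait "$pid" || rc=1; done
  [ "$rc" -eq 0 ] || return 1
  run_task 12                       # bookkeeping runs last
}
```

Calling `run_plan` prints `task 0` first and `task 12` last; the lines in between may interleave across repos but stay ordered within each repo.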
|
||||
Branch setup (first task in each repo creates it): `git checkout -b feat/adopt-zb-telemetry` from the
|
||||
repo's default branch (`master` for OtOpcUa, `main` for the others).
|
||||
|
||||
---
|
||||
|
||||
## Task 0: Publish/verify Telemetry packages on the Gitea feed
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none (gates all)
|
||||
|
||||
**Files:**
|
||||
- Work in: `/Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry/`
|
||||
- No repo files edited (publish only). Credentials already at `~/.nuget/NuGet/NuGet.Config`.
|
||||
|
||||
**Context:** The library CLAUDE.md claims these are "published to the Gitea NuGet feed." The Health
|
||||
round proved that claim unreliable. Verify; pack + push only if missing. Mirrors Health Task 0.
|
||||
|
||||
**Step 1: Check whether `ZB.MOM.WW.Telemetry` 0.1.0 is already on the feed**
|
||||
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry
|
||||
# Use the user-level creds (source name dohertj2-gitea) already configured.
|
||||
dotnet nuget list source # confirm dohertj2-gitea is NOT registered globally (creds are user-level only)
|
||||
curl -s -u "dohertj2:$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')" \
|
||||
"https://gitea.dohertylan.com/api/packages/dohertj2/nuget/registration/zb.mom.ww.telemetry/index.json" -o /tmp/tele.json -w "%{http_code}\n"
|
||||
```
|
||||
Expected: `200` if the package id is already on the feed (confirm `0.1.0` is listed in `/tmp/tele.json`, then SKIP to Step 4), `404` if missing (continue).
|
||||
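A `200` only shows that the package id has a registration index; it does not by itself show that version `0.1.0` is listed. A minimal sketch of a follow-up check, assuming the saved index inlines its entries with a `"version": "<x.y.z>"` field (registration indexes may instead reference separate pages, so verify this against the actual `/tmp/tele.json`). `has_version` is a hypothetical helper name.

```bash
# has_version FILE VERSION: succeed when FILE contains a "version" field equal to VERSION.
has_version() {
  local file="$1" version="$2" escaped
  # Escape dots so that 0.1.0 is matched literally rather than as a regex wildcard.
  escaped=$(printf '%s' "$version" | sed 's/\./\\./g')
  grep -Eq "\"version\"[[:space:]]*:[[:space:]]*\"${escaped}\"" "$file"
}
```

Usage: `has_version /tmp/tele.json 0.1.0 && echo "0.1.0 is listed"`.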
|
||||
**Step 2: Pack the two packages (only if missing)**
|
||||
|
||||
```bash
|
||||
dotnet pack ZB.MOM.WW.Telemetry.slnx -c Release -o ./artifacts
|
||||
ls ./artifacts/*.nupkg
|
||||
```
|
||||
Expected: `ZB.MOM.WW.Telemetry.0.1.0.nupkg` and `ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg`.
|
||||
|
||||
**Step 3: Push both to Gitea (only if missing)**
|
||||
|
||||
```bash
|
||||
TOKEN=$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')
|
||||
for pkg in ./artifacts/ZB.MOM.WW.Telemetry.0.1.0.nupkg ./artifacts/ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg; do
|
||||
dotnet nuget push "$pkg" --source "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" --api-key "$TOKEN"
|
||||
done
|
||||
```
|
||||
Expected: `Your package was pushed.` for each (or `409 Conflict` if a version already exists — acceptable).
|
||||
|
||||
**Step 4: Verify both ids resolve**
|
||||
|
||||
```bash
|
||||
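[ -n "$TOKEN" ] || TOKEN=$(grep -A2 dohertj2-gitea ~/.nuget/NuGet/NuGet.Config | grep ClearTextPassword | sed -E 's/.*value="([^"]+)".*/\1/')  # Step 3 is skipped when already published, so read the token here if unset
for id in zb.mom.ww.telemetry zb.mom.ww.telemetry.serilog; do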
|
||||
curl -s -u "dohertj2:$TOKEN" "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/registration/$id/index.json" -w " -> %{http_code}\n" -o /dev/null
|
||||
done
|
||||
```
|
||||
Expected: `-> 200` for both.
|
||||
|
||||
**Step 5: No commit** (publish-only task). Record completion.
|
||||
|
||||
> **SECURITY:** the Gitea token must NEVER be written into any repo file or commit. It lives only in
|
||||
> `~/.nuget/NuGet/NuGet.Config`. The `curl`/`push` commands read it from there at runtime.
|
||||
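One way to back this rule with a check before each commit is to scan the staged diff for the token value. A minimal sketch: `leaks_secret` is a hypothetical helper, and the commented usage assumes `TOKEN` has been read from `~/.nuget/NuGet/NuGet.Config` as in Step 3.

```bash
# leaks_secret SECRET: read text on stdin, succeed (exit 0) when SECRET appears in it.
# An empty SECRET is treated as "nothing to look for" so the check cannot match everything.
leaks_secret() {
  [ -n "$1" ] || return 1
  grep -qF -- "$1"
}

# Usage (inside a repo, before `git commit`):
#   if git diff --cached | leaks_secret "$TOKEN"; then
#     echo "token found in staged changes; aborting" >&2
#     exit 1
#   fi
```

The fixed-string match (`-F`) avoids treating characters in the token as regex syntax.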
|
||||
---
|
||||
|
||||
## Task 1: OtOpcUa — distribution wiring (source mapping + package refs)
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** Task 4, Task 7 (other repos)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/NuGet.config`
|
||||
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/Directory.Packages.props`
|
||||
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj`
|
||||
|
||||
**Step 1: Branch**
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/OtOpcUa && git checkout master && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
|
||||
```
|
||||
|
||||
**Step 2: Add Telemetry patterns to `NuGet.config`** — under `<packageSource key="dohertj2-gitea">`, add BOTH patterns (the `.*` glob does NOT match the bare core id):
|
||||
```xml
|
||||
<packageSource key="dohertj2-gitea">
|
||||
<package pattern="ZB.MOM.WW.Health" />
|
||||
<package pattern="ZB.MOM.WW.Health.*" />
|
||||
<package pattern="ZB.MOM.WW.Telemetry" />
|
||||
<package pattern="ZB.MOM.WW.Telemetry.*" />
|
||||
</packageSource>
|
||||
```
|
||||
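Why both patterns are needed can be illustrated with ordinary shell globbing, which treats the trailing `.*` the same way for these ids: the pattern requires the literal dot after `Telemetry`, so the bare id does not match. This is an analogy only, not NuGet's implementation. `matches` is a hypothetical helper, and ids are lower-cased first on the assumption that package ids compare case-insensitively.

```bash
# matches ID PATTERN: succeed when ID matches the glob PATTERN, ignoring case.
matches() {
  local id pattern
  id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  pattern=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$id" in
    $pattern) return 0 ;;   # unquoted on purpose: treat PATTERN as a glob
    *) return 1 ;;
  esac
}

# Print how each id fares against each of the two mapping patterns.
for id in ZB.MOM.WW.Telemetry ZB.MOM.WW.Telemetry.Serilog; do
  for pattern in 'ZB.MOM.WW.Telemetry' 'ZB.MOM.WW.Telemetry.*'; do
    if matches "$id" "$pattern"; then result="match"; else result="no match"; fi
    echo "$id vs $pattern: $result"
  done
done
```

Each id matches exactly one of the two patterns, which is why dropping either line leaves one package unmapped.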
|
||||
**Step 3: Add versions to `Directory.Packages.props`** (next to the Health `<PackageVersion>` lines):
|
||||
```xml
|
||||
<PackageVersion Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
|
||||
<PackageVersion Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
|
||||
```
|
||||
|
||||
**Step 4: Add versionless refs to the Host csproj** (next to the `ZB.MOM.WW.Health` refs):
|
||||
```xml
|
||||
<PackageReference Include="ZB.MOM.WW.Telemetry" />
|
||||
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" />
|
||||
```
|
||||
|
||||
**Step 5: Restore + build to confirm the Gitea feed resolves and Serilog floor is satisfied**
|
||||
```bash
|
||||
dotnet restore ZB.MOM.WW.OtOpcUa.slnx
|
||||
dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
|
||||
```
|
||||
Expected: restore pulls both packages from `dohertj2-gitea`; build succeeds. If restore fails on a
|
||||
`Serilog.AspNetCore` floor (OtOpcUa pins 9.0.0), bump `Serilog.AspNetCore` (and the related
|
||||
`Serilog.*` 9.x lines) in `Directory.Packages.props` to the floor the package requires, then rebuild.
|
||||
|
||||
**Step 6: Commit**
|
||||
```bash
|
||||
git add NuGet.config Directory.Packages.props src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj
|
||||
git commit -m "build(otopcua): reference ZB.MOM.WW.Telemetry packages from Gitea feed"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 2: OtOpcUa — swap OTel wiring to AddZbTelemetry
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none (within OtOpcUa)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs` (rewrite body; keep both method names + signatures)
|
||||
- Test (oracle, do not edit): `/Users/dohertj2/Desktop/OtOpcUa/tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/Observability/OtOpcUaTelemetryHookTests.cs`
|
||||
|
||||
**Context:** Today `AddOtOpcUaObservability()` (called at `Program.cs:138`) hand-wires
|
||||
`AddOpenTelemetry().WithMetrics(...AddMeter("ZB.MOM.WW.OtOpcUa")...AddPrometheusExporter()).WithTracing(...AddSource("ZB.MOM.WW.OtOpcUa"))`,
|
||||
and `MapOtOpcUaMetrics()` (called at `Program.cs:160`) maps `/metrics`. Keep both call sites
|
||||
unchanged; rewrite the extension bodies to delegate to the shared library. **Same meter/source
|
||||
names + same `/metrics` path** ⇒ behaviour-preserving; gains the Resource identity triple +
|
||||
standard instrumentation.
|
||||
|
||||
**Step 1: Rewrite `ObservabilityExtensions.cs`** preserving the two public method signatures:
|
||||
```csharp
|
||||
using Microsoft.AspNetCore.Routing;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using ZB.MOM.WW.OtOpcUa.Commons.Observability; // OtOpcUaTelemetry
|
||||
using ZB.MOM.WW.Telemetry;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Host.Observability;
|
||||
|
||||
/// <summary>
|
||||
/// OtOpcUa observability wiring, delegated to the shared ZB.MOM.WW.Telemetry library.
|
||||
/// Keeps the existing meter/ActivitySource names ("ZB.MOM.WW.OtOpcUa") and the "/metrics"
|
||||
/// scrape path, and adds the shared OTel Resource + standard instrumentation.
|
||||
/// </summary>
|
||||
public static class ObservabilityExtensions
|
||||
{
|
||||
public static IServiceCollection AddOtOpcUaObservability(this IServiceCollection services)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(services);
|
||||
return services.AddZbTelemetry(o =>
|
||||
{
|
||||
o.ServiceName = "otopcua";
|
||||
o.Meters = [OtOpcUaTelemetry.MeterName]; // "ZB.MOM.WW.OtOpcUa"
|
||||
o.ActivitySources = [OtOpcUaTelemetry.ActivitySourceName]; // "ZB.MOM.WW.OtOpcUa"
|
||||
// Exporter defaults to Prometheus — preserves the existing /metrics posture.
|
||||
});
|
||||
}
|
||||
|
||||
// Keep the SAME signature the Program.cs:160 call site uses (app.MapOtOpcUaMetrics()).
|
||||
// MapZbMetrics() maps MapPrometheusScrapingEndpoint() whose default path is "/metrics".
|
||||
public static IEndpointRouteBuilder MapOtOpcUaMetrics(this IEndpointRouteBuilder endpoints)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(endpoints);
|
||||
endpoints.MapZbMetrics();
|
||||
return endpoints;
|
||||
}
|
||||
}
|
||||
```
|
||||
> If the existing `MapOtOpcUaMetrics` extends `WebApplication`/`IApplicationBuilder` rather than
|
||||
> `IEndpointRouteBuilder`, keep THAT receiver type and call `app.MapZbMetrics();` — match the
|
||||
> current signature so `Program.cs:160` compiles unchanged.
|
||||
|
||||
**Step 2: Build**
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/OtOpcUa && dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
|
||||
```
|
||||
Expected: PASS. (The now-redundant direct `OpenTelemetry.Extensions.Hosting` /
|
||||
`OpenTelemetry.Exporter.Prometheus.AspNetCore` refs may stay — they resolve the same assemblies the
|
||||
shared package brings; leaving them is lower-risk than pruning.)
|
||||
|
||||
**Step 3: Run the telemetry hook tests (the behaviour oracle)**
|
||||
```bash
|
||||
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~OtOpcUaTelemetryHookTests"
|
||||
```
|
||||
Expected: PASS — the meter `ZB.MOM.WW.OtOpcUa` and ActivitySource still emit (the shared
|
||||
`AddZbTelemetry` registered them via `o.Meters`/`o.ActivitySources`).
|
||||
|
||||
**Step 4: Commit**
|
||||
```bash
|
||||
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs
|
||||
git commit -m "feat(otopcua): wire OTel via AddZbTelemetry (shared Resource + std instrumentation)"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 3: OtOpcUa — swap Serilog to AddZbSerilog + move sinks to config
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none (within OtOpcUa)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs:49-52` (the inline `UseSerilog` block)
|
||||
- Modify: `/Users/dohertj2/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json` (currently `{}`)
|
||||
- Test (oracle): `/Users/dohertj2/Desktop/OtOpcUa/tests/Core/ZB.MOM.WW.OtOpcUa.Core.Tests/Observability/LogContextEnricherTests.cs`
|
||||
|
||||
**Context:** Today `Program.cs:49-52` configures Serilog in code with `ReadFrom.Configuration` +
|
||||
`WriteTo.Console()` + `WriteTo.File("logs/otopcua-.log", rollingInterval: Day)`. `AddZbSerilog` uses
|
||||
`ReadFrom.Configuration` only, so the Console/File sinks must move into config to be reproduced. The
|
||||
role-specific `appsettings.*.json` already carry `Serilog:MinimumLevel` overrides — those keep
|
||||
working through `ReadFrom.Configuration`.
|
||||
|
||||
**Step 1: Add the sinks to `appsettings.json`** (replace the empty `{}`):
|
||||
```json
|
||||
{
|
||||
"Serilog": {
|
||||
"Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
|
||||
"WriteTo": [
|
||||
{ "Name": "Console" },
|
||||
{ "Name": "File", "Args": { "path": "logs/otopcua-.log", "rollingInterval": "Day" } }
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
> Do NOT add `"Enrich": ["FromLogContext"]` unless it is already enabled today — adding it would
|
||||
> newly surface driver-scope properties and change output. Preserve the current enrich set.
|
||||
|
||||
**Step 2: Replace the inline `UseSerilog` block in `Program.cs`.** Remove lines 49-52:
|
||||
```csharp
|
||||
builder.Host.UseSerilog((ctx, lc) => lc
|
||||
.ReadFrom.Configuration(ctx.Configuration)
|
||||
.WriteTo.Console()
|
||||
.WriteTo.File("logs/otopcua-.log", rollingInterval: RollingInterval.Day));
|
||||
```
|
||||
and replace with:
|
||||
```csharp
|
||||
builder.AddZbSerilog(o => o.ServiceName = "otopcua");
|
||||
```
|
||||
Add `using ZB.MOM.WW.Telemetry.Serilog;` to the `using` block. Keep `app.UseSerilogRequestLogging();`
|
||||
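(line 141) unchanged. Keep the existing `using Serilog;`: `UseSerilogRequestLogging` is an extension
method in the `Serilog` namespace, so the directive is still referenced after the inline block (and its `RollingInterval` usage) is removed.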
|
||||
|
||||
**Step 3: Build + run the LogContextEnricher tests**
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/OtOpcUa
|
||||
dotnet build ZB.MOM.WW.OtOpcUa.slnx -c Debug
|
||||
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~LogContextEnricherTests"
|
||||
```
|
||||
Expected: build PASS; tests PASS (the static `LogContextEnricher.Push` helper is unaffected — it is
|
||||
not registered in DI and AddZbSerilog does not change its disposable contract).
|
||||
|
||||
**Step 4: Sanity-check that logs still emit** (no automated log-output harness here):
|
||||
```bash
|
||||
# Quick smoke: build runs; optionally run the host briefly in a role that doesn't need infra
|
||||
# and confirm console log lines appear. If no safe role exists, rely on the build + the request-
|
||||
# logging path remaining wired (UseSerilogRequestLogging at Program.cs:141).
|
||||
```
|
||||
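If the host can be run briefly, the file sink gives a simple check. With `rollingInterval: Day`, Serilog's file sink inserts the date as `yyyyMMdd` before the extension, so `logs/otopcua-.log` becomes, for example, `logs/otopcua-20260601.log`. A minimal sketch: `todays_log` and `log_has_lines` are hypothetical helper names, and the check assumes the host was started from the directory that contains `logs/`.

```bash
# todays_log PATH_TEMPLATE: print the rolled file name for today,
# e.g. logs/otopcua-.log -> logs/otopcua-YYYYMMDD.log
todays_log() {
  local template="$1"
  printf '%s%s.log\n' "${template%.log}" "$(date +%Y%m%d)"
}

# log_has_lines FILE: succeed when FILE exists and is non-empty.
log_has_lines() { [ -s "$1" ]; }
```

Usage: `log_has_lines "$(todays_log logs/otopcua-.log)" && echo "file sink is writing"`.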
|
||||
**Step 5: Commit**
|
||||
```bash
|
||||
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs src/Server/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
|
||||
git commit -m "feat(otopcua): adopt AddZbSerilog (shared enrichers + trace correlation); sinks to config"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4: ScadaBridge — distribution wiring (source mapping + package refs)
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** Task 1, Task 7 (other repos)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/nuget.config`
|
||||
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/Directory.Packages.props`
|
||||
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj`
|
||||
|
||||
**Step 1: Branch**
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/ScadaBridge && git checkout main && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
|
||||
```
|
||||
|
||||
**Step 2: Add Telemetry patterns to `nuget.config`** under `<packageSource key="dohertj2-gitea">`:
|
||||
```xml
|
||||
<package pattern="ZB.MOM.WW.Telemetry" />
|
||||
<package pattern="ZB.MOM.WW.Telemetry.*" />
|
||||
```
|
||||
|
||||
**Step 3: Add versions to `Directory.Packages.props`** (next to the Health lines):
|
||||
```xml
|
||||
<PackageVersion Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
|
||||
<PackageVersion Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
|
||||
```
|
||||
|
||||
**Step 4: Add versionless refs to the Host csproj** (next to the Health refs):
|
||||
```xml
|
||||
<PackageReference Include="ZB.MOM.WW.Telemetry" />
|
||||
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" />
|
||||
```
|
||||
> `ZB.MOM.WW.Telemetry.Serilog` is referenced here only for the public `TraceContextEnricher` type
|
||||
> used in Task 6 — ScadaBridge does NOT call `AddZbSerilog`.
|
||||
|
||||
**Step 5: Restore + build** (watch for OTel version conflicts with the pinned `OpenTelemetry.Api 1.15.3`)
|
||||
```bash
|
||||
dotnet restore ZB.MOM.WW.ScadaBridge.slnx
|
||||
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
|
||||
```
|
||||
Expected: PASS. If a transitive OTel version conflicts with the CVE-override `OpenTelemetry.Api`,
|
||||
align the override version to what the shared package requires.
|
||||
|
||||
**Step 6: Commit**
|
||||
```bash
|
||||
git add nuget.config Directory.Packages.props src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj
|
||||
git commit -m "build(scadabridge): reference ZB.MOM.WW.Telemetry packages from Gitea feed"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 5: ScadaBridge — AddZbTelemetry in both composition roots + MapZbMetrics
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none (within ScadaBridge)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/SiteServiceRegistration.cs` (`BindSharedOptions`, ~lines 100-117 — add the registration; called by BOTH roots)
|
||||
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/Program.cs` (Central endpoint section ~206-259; Site endpoint section ~307-320 — add `app.MapZbMetrics()` in each)
|
||||
- Test: `/Users/dohertj2/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/` (add a `/metrics`-served assertion; HealthCheckTests pattern with `WebApplicationFactory<Program>`)
|
||||
|
||||
**Context:** ScadaBridge has NO OTel today (only the `OpenTelemetry.Api` CVE override). `SiteId`,
|
||||
`NodeRole`, `NodeHostname` are available from config (`ScadaBridge:Node:*`). `BindSharedOptions` is
|
||||
called by both the Central and Site roots, so registering telemetry there covers both without
|
||||
duplication. This is purely additive (no metrics exist to break).
|
||||
|
||||
**Step 1: Register telemetry in `BindSharedOptions`.** Inside `SiteServiceRegistration.BindSharedOptions(IServiceCollection services, IConfiguration config)`, after the existing `services.Configure<...>` calls, add:
|
||||
```csharp
|
||||
// Shared OTel: Resource identity (service.name / site.id / node.role) + standard instrumentation
|
||||
// + Prometheus exporter. Mounted at /metrics by app.MapZbMetrics() in each composition root.
|
||||
services.AddZbTelemetry(o =>
|
||||
{
|
||||
o.ServiceName = "scadabridge";
|
||||
o.SiteId = config["ScadaBridge:Node:SiteId"] ?? "central";
|
||||
o.NodeRole = config["ScadaBridge:Node:Role"];
|
||||
// o.Meters left empty — application instruments are a deferred follow-on (GAPS #9).
|
||||
});
|
||||
```
|
||||
Add `using ZB.MOM.WW.Telemetry;`. (Use the SAME default `?? "central"` for SiteId that
|
||||
`Program.cs:45` uses, so the Resource attribute matches the log enricher value.)
|
||||
|
||||
**Step 2: Map `/metrics` in BOTH roots.** In `Program.cs`:
|
||||
- Central block — after `app.UseRouting()` and alongside the other `Map*` calls (e.g. just after `app.MapZbHealth();`), add:
|
||||
```csharp
|
||||
app.MapZbMetrics();
|
||||
```
|
||||
- Site block — in its endpoint section (where `app.MapGrpcService<...>()` is mapped, ~307-320), add:
|
||||
```csharp
|
||||
app.MapZbMetrics();
|
||||
```
|
||||
Add `using ZB.MOM.WW.Telemetry;` to `Program.cs` if not already present. `MapZbMetrics()` requires
|
||||
routing; the Central block already calls `UseRouting()`, and the Site block's `MapGrpcService`
|
||||
implies endpoint routing — if the Site app lacks `UseRouting()`, add it before `MapZbMetrics()`.
|
||||
|
||||
**Step 3: Add a `/metrics` integration test** in the Host.Tests project (mirror `HealthCheckTests`):
|
||||
```csharp
|
||||
[Fact]
|
||||
public async Task Metrics_Endpoint_IsMapped()
|
||||
{
|
||||
using var factory = /* existing WebApplicationFactory<Program> setup for Central role */;
|
||||
using var client = factory.CreateClient();
|
||||
var response = await client.GetAsync("/metrics");
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
var body = await response.Content.ReadAsStringAsync();
|
||||
Assert.Contains("# ", body); // Prometheus exposition format (HELP/TYPE comments)
|
||||
}
|
||||
```
|
||||
> Reuse the exact `WebApplicationFactory<Program>` + in-memory config bootstrapping that
|
||||
> `HealthCheckTests.cs` already uses for the Central role (it sets the env to "Central" and removes
|
||||
> the Akka hosted service). Do not invent a new harness.
|
||||
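The same assertion can be made by hand against a running node. A sketch, assuming the host is reachable on a local URL: the address in the usage comment is a placeholder, not taken from the repo, and `looks_like_prometheus` is a hypothetical helper. It is slightly stricter than the `"# "` check in the test because it requires a `# HELP` or `# TYPE` comment line.

```bash
# looks_like_prometheus: read a response body on stdin and succeed when it
# contains at least one Prometheus exposition comment line (# HELP / # TYPE).
looks_like_prometheus() {
  grep -Eq '^# (HELP|TYPE) '
}

# Usage (placeholder address):
#   curl -s http://localhost:<port>/metrics | looks_like_prometheus && echo "metrics OK"
```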
|
||||
**Step 4: Build + test**
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/ScadaBridge
|
||||
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
|
||||
dotnet test ZB.MOM.WW.ScadaBridge.slnx --filter "FullyQualifiedName~HealthCheckTests|FullyQualifiedName~Metrics_Endpoint_IsMapped|FullyQualifiedName~CompositionRoot"
|
||||
```
|
||||
Expected: PASS (existing composition-root + health tests stay green; new metrics test passes).
|
||||
|
||||
**Step 5: Commit**
|
||||
```bash
|
||||
git add src/ZB.MOM.WW.ScadaBridge.Host/SiteServiceRegistration.cs src/ZB.MOM.WW.ScadaBridge.Host/Program.cs tests/ZB.MOM.WW.ScadaBridge.Host.Tests/
|
||||
git commit -m "feat(scadabridge): wire AddZbTelemetry + /metrics in both composition roots"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 6: ScadaBridge — add shared TraceContextEnricher to LoggerConfigurationFactory
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none (within ScadaBridge)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/LoggerConfigurationFactory.cs` (the `Build` return expression)
|
||||
- Test (oracle): `/Users/dohertj2/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/SerilogTests.cs` (+ any `LoggerConfigurationFactory` tests)
|
||||
|
||||
**Context (deviation from design doc — see top of plan):** KEEP `LoggerConfigurationFactory` intact
|
||||
(it owns the Host-011/014/020/022 minimum-level governance). Only add the shared
|
||||
`TraceContextEnricher` so logs emitted inside a span carry `trace_id`/`span_id` and can be joined to
|
||||
traces. This gains the cross-cutting correlation win without regressing ScadaBridge's logging
|
||||
contract.
|
||||
|
||||
**Step 1: Add the enricher to the `Build` return.** In `LoggerConfigurationFactory.Build(...)`, the
|
||||
final expression currently ends:
|
||||
```csharp
|
||||
return new LoggerConfiguration()
|
||||
.ReadFrom.Configuration(configuration)
|
||||
.MinimumLevel.Is(minimumLevel)
|
||||
.Enrich.WithProperty("SiteId", siteId)
|
||||
.Enrich.WithProperty("NodeHostname", nodeHostname)
|
||||
.Enrich.WithProperty("NodeRole", nodeRole);
|
||||
```
|
||||
Add the shared enricher as the last `.Enrich`:
|
||||
```csharp
|
||||
.Enrich.WithProperty("NodeRole", nodeRole)
|
||||
.Enrich.With(new ZB.MOM.WW.Telemetry.Serilog.TraceContextEnricher());
|
||||
```
|
||||
(Or add `using ZB.MOM.WW.Telemetry.Serilog;` and use `.Enrich.With(new TraceContextEnricher())`.)
|
||||
|
||||
**Step 2: Build + run the Serilog tests**
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/ScadaBridge
|
||||
dotnet build ZB.MOM.WW.ScadaBridge.slnx -c Debug
|
||||
dotnet test ZB.MOM.WW.ScadaBridge.slnx --filter "FullyQualifiedName~SerilogTests|FullyQualifiedName~LoggerConfiguration"
|
||||
```
|
||||
Expected: PASS. The three node-identity enrichers and the min-level governance are untouched;
|
||||
`trace_id`/`span_id` only appear when an `Activity.Current` exists (none in these tests → no change
|
||||
to asserted properties).
|
||||
|
||||
**Step 3: Commit**
|
||||
```bash
|
||||
git add src/ZB.MOM.WW.ScadaBridge.Host/LoggerConfigurationFactory.cs
|
||||
git commit -m "feat(scadabridge): add shared TraceContextEnricher to log pipeline (trace correlation)"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 7: MxAccessGateway — distribution wiring (source mapping + package refs)
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** Task 1, Task 4 (other repos)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/nuget.config`
|
||||
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj` (NO CPM — direct versioned refs)
|
||||
|
||||
**Step 1: Branch**
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/MxAccessGateway && git checkout main && git pull --ff-only && git checkout -b feat/adopt-zb-telemetry
|
||||
```
|
||||
|
||||
**Step 2: Add Telemetry patterns to `nuget.config`** under `<packageSource key="dohertj2-gitea">`:
|
||||
```xml
|
||||
<package pattern="ZB.MOM.WW.Telemetry" />
|
||||
<package pattern="ZB.MOM.WW.Telemetry.*" />
|
||||
```
|
||||
|
||||
**Step 3: Add direct versioned refs to the Server csproj** (in the main `<ItemGroup>` of `<PackageReference>`s). MxGateway has no Serilog/OTel today, so it needs the shared packages AND the concrete sink assemblies referenced by the `appsettings` `Using` block:
|
||||
```xml
|
||||
<PackageReference Include="ZB.MOM.WW.Telemetry" Version="0.1.0" />
|
||||
<PackageReference Include="ZB.MOM.WW.Telemetry.Serilog" Version="0.1.0" />
|
||||
<PackageReference Include="Serilog.AspNetCore" Version="10.0.0" />
|
||||
<PackageReference Include="Serilog.Sinks.Console" Version="6.1.1" />
|
||||
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0" />
|
||||
```
|
||||
> Versions align with ScadaBridge's pins (Serilog.AspNetCore 10.0.0, Console 6.1.1, File 7.0.0). If
|
||||
> the `.Serilog` package requires a different `Serilog.AspNetCore` floor, match it.
|
||||
|
||||
**Step 4: Restore + build**
|
||||
```bash
|
||||
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
|
||||
```
|
||||
Expected: PASS (packages resolve from Gitea + nuget.org).
|
||||
|
||||
**Step 5: Commit**
|
||||
```bash
|
||||
git add nuget.config src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
|
||||
git commit -m "build(mxgateway): reference ZB.MOM.WW.Telemetry + Serilog packages"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 8: MxAccessGateway — migrate appsettings Logging → Serilog section
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none (within MxGateway)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/appsettings.json`
|
||||
|
||||
**Context:** Current `Logging` (MEL) section: `Default: Information`, `Microsoft.AspNetCore: Warning`.
|
||||
`AddZbSerilog` reads sinks/levels via `ReadFrom.Configuration` from a `Serilog` section. Translate
|
||||
the levels and add Console + File sinks so logging output is preserved after the provider swap.
|
||||
|
||||
**Step 1: Replace the `Logging` block with a `Serilog` block.** Remove:
|
||||
```json
|
||||
"Logging": {
|
||||
"LogLevel": { "Default": "Information", "Microsoft.AspNetCore": "Warning" }
|
||||
},
|
||||
```
|
||||
Add:
|
||||
```json
|
||||
"Serilog": {
|
||||
"Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
|
||||
"MinimumLevel": {
|
||||
"Default": "Information",
|
||||
"Override": { "Microsoft.AspNetCore": "Warning" }
|
||||
},
|
||||
"WriteTo": [
|
||||
{ "Name": "Console" },
|
||||
{ "Name": "File", "Args": { "path": "logs/mxgateway-.log", "rollingInterval": "Day" } }
|
||||
]
|
||||
},
|
||||
```
|
||||
> Keep the rest of `appsettings.json` (gateway config) unchanged. Note: `AddZbSerilog` applies its
|
||||
> own `MinimumLevel.Is(Information)` before `ReadFrom.Configuration`, so the `Serilog:MinimumLevel`
|
||||
> above is honoured (keeping the default at Information and overriding Microsoft.AspNetCore to Warning
|
||||
> — matching today's MEL levels).
|
||||
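The level names translate almost one-to-one; only the two extremes are renamed (`Trace` becomes `Verbose`, `Critical` becomes `Fatal`). A small sketch of that mapping for translating any further `LogLevel` entries by hand: `mel_to_serilog` is a hypothetical helper, and `None` is deliberately left unmapped because Serilog has no named equivalent.

```bash
# mel_to_serilog LEVEL: print the Serilog level name for a Microsoft.Extensions.Logging level.
mel_to_serilog() {
  case "$1" in
    Trace)       echo Verbose ;;
    Debug)       echo Debug ;;
    Information) echo Information ;;
    Warning)     echo Warning ;;
    Error)       echo Error ;;
    Critical)    echo Fatal ;;
    *)           echo "no Serilog equivalent for '$1'" >&2; return 1 ;;
  esac
}
```

For the two entries in this file the names are unchanged: `Default: Information` stays `Information`, and `Microsoft.AspNetCore: Warning` stays `Warning` under `Override`.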
|
||||
**Step 2: Commit** (config-only; build happens in Task 9 once the provider is wired)
|
||||
```bash
|
||||
git add src/ZB.MOM.WW.MxGateway.Server/appsettings.json
|
||||
git commit -m "config(mxgateway): translate MEL Logging section to Serilog"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 9: MxAccessGateway — wire AddZbSerilog (MEL → Serilog provider swap)
|
||||
|
||||
**Classification:** high-risk
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none (within MxGateway)
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (`CreateBuilder`, after `ConfigureSelfSignedTls(builder)` ~line 63)
|
||||
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs` (add a provider-swap assertion)
|
||||
|
||||
**Context (high-risk — logging on the most operational app):** Register Serilog as the host's
|
||||
logging provider so all existing MEL `ILogger`/`ILoggerFactory` calls (including
|
||||
`UseGatewayRequestLoggingScope`'s middleware) route through Serilog. The Serilog MEL provider
|
||||
captures MEL `BeginScope` dictionaries as structured properties, so `GatewayLogScope` and the
|
||||
request-logging middleware keep working unchanged. The temporary `LoggerFactory.Create(...AddConsole())`
|
||||
at lines 96-100 (used only by the TLS cert provider) may remain as-is.
|
||||
|
||||
**Step 1: Add the failing test** in `GatewayApplicationTests.cs` — assert the logger factory is now Serilog-backed:
|
||||
```csharp
|
||||
[Fact]
|
||||
public void Build_UsesSerilogLoggerProvider()
|
||||
{
|
||||
using var app = GatewayApplication.Build([]);
|
||||
var factory = app.Services.GetRequiredService<ILoggerFactory>();
|
||||
// Serilog.Extensions.Hosting registers SerilogLoggerFactory when AddSerilog replaces the factory.
|
||||
Assert.Equal("SerilogLoggerFactory", factory.GetType().Name);
|
||||
}
|
||||
```
|
||||
|
||||
**Step 2: Run it — expect FAIL** (`dotnet test ... --filter Build_UsesSerilogLoggerProvider`) → today the factory is the default MEL `LoggerFactory`.
|
||||
|
||||
**Step 3: Wire `AddZbSerilog`.** In `GatewayApplication.CreateBuilder`, immediately after
|
||||
`ConfigureSelfSignedTls(builder);`, add:
|
||||
```csharp
|
||||
builder.AddZbSerilog(o => o.ServiceName = "mxgateway");
|
||||
```
|
||||
Add `using ZB.MOM.WW.Telemetry.Serilog;`. (`AddZbSerilog` calls `services.AddSerilog(..., preserveStaticLogger: true)`,
|
||||
which registers `SerilogLoggerFactory` — replacing the MEL factory, so default providers do not
|
||||
double-log.)
|
||||
|
||||
**Step 4: Run the test — expect PASS**, then run the broader logging-adjacent suites:
|
||||
```bash
|
||||
cd /Users/dohertj2/Desktop/MxAccessGateway
|
||||
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
|
||||
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayApplicationTests"
|
||||
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~FakeWorker"
|
||||
```
|
||||
Expected: PASS — `Build_MapsCanonicalHealthEndpoints`, `Build_RegistersGatewayMetrics`, the
|
||||
config-validation cases, and the fake-worker smoke all stay green; the new provider-swap test passes.
|
||||
|
||||
**Step 5: Verify no double console logging** — if `SerilogLoggerFactory` is confirmed in Step 4, the
|
||||
default providers are bypassed and no extra step is needed. If you observe duplicated console lines
|
||||
in any manual run, add `builder.Logging.ClearProviders();` immediately before `AddZbSerilog`.
|
||||
|
||||
**Step 6: Commit**
|
||||
```bash
|
||||
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
|
||||
git commit -m "feat(mxgateway): adopt AddZbSerilog — MEL→Serilog provider swap (behaviour-preserving)"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 10: MxAccessGateway — wrap GatewayLogRedactor behind the ILogRedactor seam

**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within MxGateway)

**Files:**
- Create: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorSeam.cs`
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (register the seam in DI in `CreateBuilder`)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs`

**Context:** The shared `RedactionEnricher` applies any DI-registered `ILogRedactor` to every log
event before it reaches a sink. MxGateway's redaction lives in the static `GatewayLogRedactor`
(API-key Bearer tokens, client identity). Provide a thin `ILogRedactor` that redacts the relevant
log-event properties (`ClientIdentity`, `authorization`) via the existing static helper. Keep
`GatewayLogRedactor` for its current callers (`GatewayLogScope`, `DashboardRedactor`).

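The seam described in the context above can be pictured with a small stand-alone sketch. It assumes the enricher hands each registered redactor a mutable property dictionary; the interface below is a local stand-in that only mirrors the `Redact` signature used in this plan, and `AuthorizationMaskingRedactor` plus the sample values are hypothetical. The real `RedactionEnricher` may work differently.

```csharp
using System;
using System.Collections.Generic;

// Assumed flow: the event's properties are exposed as a mutable dictionary,
// and every registered redactor gets a chance to mask what it recognises.
var redactors = new List<ILogRedactor> { new AuthorizationMaskingRedactor() };
var properties = new Dictionary<string, object?>
{
    ["authorization"] = "Bearer example-token",
    ["RequestPath"] = "/health",
};

foreach (var redactor in redactors)
{
    redactor.Redact(properties);
}

Console.WriteLine(properties["authorization"]); // masked value
Console.WriteLine(properties["RequestPath"]);   // left untouched

// Hypothetical stand-in for the shared seam; only the Redact signature comes from the plan.
public interface ILogRedactor
{
    void Redact(IDictionary<string, object?> properties);
}

// Hypothetical redactor used only for this illustration.
public sealed class AuthorizationMaskingRedactor : ILogRedactor
{
    public void Redact(IDictionary<string, object?> properties)
    {
        ArgumentNullException.ThrowIfNull(properties);
        if (properties.TryGetValue("authorization", out var value) && value is string)
        {
            properties["authorization"] = "[redacted]";
        }
    }
}
```

Because redactors only see a property dictionary, an adapter over an existing static helper stays small and can be unit-tested without a logger or sink.
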
**Step 1: Write the failing test** (`GatewayLogRedactorSeamTests.cs`):
```csharp
using System.Collections.Generic;
using ZB.MOM.WW.MxGateway.Server.Diagnostics;
using Xunit;

public class GatewayLogRedactorSeamTests
{
    [Fact]
    public void Redact_MasksApiKeyInClientIdentity()
    {
        var redactor = new GatewayLogRedactorSeam();
        var props = new Dictionary<string, object?>
        {
            ["ClientIdentity"] = "Bearer mxgw_operator01_super-secret"
        };
        redactor.Redact(props);
        Assert.Equal("Bearer mxgw_operator01_[redacted]", props["ClientIdentity"]);
    }
}
```

**Step 2: Run it — expect FAIL** (type doesn't exist).

**Step 3: Implement `GatewayLogRedactorSeam.cs`:**
```csharp
using System;
using System.Collections.Generic;
using ZB.MOM.WW.Telemetry.Serilog;

namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;

/// <summary>
/// Adapts the static <see cref="GatewayLogRedactor"/> to the shared <see cref="ILogRedactor"/> seam
/// so the telemetry RedactionEnricher masks API-key/credential material on every log event.
/// </summary>
public sealed class GatewayLogRedactorSeam : ILogRedactor
{
    // Both casings of "authorization" are listed because the property lookup may be case-sensitive.
    private static readonly string[] IdentityKeys = ["ClientIdentity", "authorization", "Authorization"];

    public void Redact(IDictionary<string, object?> properties)
    {
        ArgumentNullException.ThrowIfNull(properties);
        // Iterating the fixed key array (not the dictionary) makes in-place replacement safe.
        foreach (var key in IdentityKeys)
        {
            if (properties.TryGetValue(key, out var value) && value is string s)
            {
                properties[key] = GatewayLogRedactor.RedactClientIdentity(s);
            }
        }
    }
}
```

**Step 4: Register in DI.** In `GatewayApplication.CreateBuilder`, alongside the other singletons, add:
```csharp
builder.Services.AddSingleton<ZB.MOM.WW.Telemetry.Serilog.ILogRedactor, Diagnostics.GatewayLogRedactorSeam>();
```

**Step 5: Run the test + build**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayLogRedactorSeamTests"
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
```
Expected: PASS.

**Step 6: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorSeam.cs src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs
git commit -m "feat(mxgateway): expose GatewayLogRedactor via shared ILogRedactor seam"
```

---

## Task 11: MxAccessGateway — wire AddZbTelemetry (export GatewayMetrics) + MapZbMetrics

**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (within MxGateway)

**Files:**
- Modify: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs` (`CreateBuilder` after `AddSingleton<GatewayMetrics>()` ~line 72; `MapGatewayEndpoints` after `MapZbHealth()` ~line 177)
- Test: `/Users/dohertj2/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs` (add `/metrics`-served assertion) + existing `GatewayMetricsTests` as oracle

**Context:** The `MxGateway.Server` meter (13 counters, 3 ms-histograms, 4 gauges) exists but is
never exported (no OTel SDK, no `/metrics`). `AddZbTelemetry` with `Meters = ["MxGateway.Server"]`
registers the meter with the OTel MeterProvider + Prometheus exporter; `MapZbMetrics()` mounts
`/metrics`. **Keep the `MxGateway.Server` name and the `ms` histogram units** (rename #7 + unit #6
are deferred). `GetSnapshot()` is untouched.

**Step 1: Add `AddZbTelemetry` in `CreateBuilder`**, immediately after `builder.Services.AddSingleton<GatewayMetrics>();`:
```csharp
builder.AddZbTelemetry(o =>
{
    o.ServiceName = "mxgateway";
    o.Meters = [GatewayMetrics.MeterName]; // "MxGateway.Server" — unchanged (rename deferred)
});
```
Add `using ZB.MOM.WW.Telemetry;`.

**Step 2: Map `/metrics` in `MapGatewayEndpoints`**, after `endpoints.MapZbHealth();`:
```csharp
endpoints.MapZbMetrics();
```

**Step 3: Add the served-endpoint test** in `GatewayApplicationTests.cs`:
```csharp
[Fact]
public async Task Build_MapsMetricsEndpoint()
{
    using var app = GatewayApplication.Build([]);
    await app.StartAsync();
    try
    {
        using var client = new HttpClient { BaseAddress = new Uri(app.Urls.First()) };
        var response = await client.GetAsync("/metrics");
        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    }
    finally { await app.StopAsync(); }
}
```
Add `using System.Net;` (for `HttpStatusCode`) if the test file does not already import it.

> If the existing test class already has a started-host helper (the config-validation tests call
> `StartAsync`), reuse it rather than starting a fresh host. Tests bind ephemeral ports (`:0`).

**Step 4: Build + test**
```bash
cd /Users/dohertj2/Desktop/MxAccessGateway
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -c Debug
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~GatewayApplicationTests|FullyQualifiedName~GatewayMetricsTests"
```
Expected: PASS — the `MeterListener`-based `GatewayMetricsTests` (Tests-027 isolation) stay green
because the meter name/instruments are unchanged; the new `/metrics` test passes.

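To see why keeping the meter name matters, here is a minimal stand-alone sketch of how a `MeterListener` picks up measurements by meter name, using only `System.Diagnostics.Metrics`. It is an illustration under stated assumptions, not the actual `GatewayMetricsTests` code: the instrument name `demo.requests` is hypothetical, and only the meter name is taken from this plan.

```csharp
using System;
using System.Diagnostics.Metrics;

const string meterName = "MxGateway.Server"; // meter name from the plan

using var meter = new Meter(meterName);
var counter = meter.CreateCounter<long>("demo.requests"); // hypothetical instrument

long observed = 0;
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    // Subscribe only to instruments that belong to the meter we care about.
    if (instrument.Meter.Name == meterName)
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) => observed += value);
listener.Start();

counter.Add(3);
Console.WriteLine(observed);
```

A listener (or exporter) that filters on the meter name stops receiving measurements if the meter is renamed, which is one reason the rename is handled as a separate, deferred change.
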
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
git commit -m "feat(mxgateway): export GatewayMetrics via AddZbTelemetry + /metrics (name/units unchanged)"
```

---

## Task 12: scadaproj — bookkeeping (GAPS + correct the false "MxGateway logging adopted" claim)

**Classification:** trivial
**Estimated implement time:** ~4 min
**Parallelizable with:** none (runs after all repo phases)

**Files:**
- Modify: `/Users/dohertj2/Desktop/scadaproj/components/observability/GAPS.md` (add "Adoption status — 2026-06-01 (DONE)" section)
- Modify: `/Users/dohertj2/Desktop/scadaproj/components/observability/README.md` (correct the "MxGateway logging adopted" claim)
- Modify: `/Users/dohertj2/Desktop/scadaproj/ZB.MOM.WW.Telemetry/CLAUDE.md` (same correction)
- Modify: `/Users/dohertj2/Desktop/scadaproj/CLAUDE.md` (observability row + "MxAccessGateway logging adopted" note)

**Step 1: Add an adoption-status section to `GAPS.md`** with a per-repo table (what each app now
does), the **accepted scope note** (ScadaBridge keeps `LoggerConfigurationFactory` + adds
`TraceContextEnricher` rather than adopting `AddZbSerilog`; MxGateway keeps `GatewayLogScope`), and a
**Deferred** subsection listing #6 (histogram ms→s), #7 (meter rename), #9 (ScadaBridge app
instruments), #10/#11 (OTLP) as still-open.

**Step 2: Correct the false claim** everywhere it appears — the prior text said MxGateway's MEL→Serilog
migration was "done on its own branch." Replace with: "MxGateway MEL→Serilog migration + metrics
export landed on `main` via the 2026-06-01 telemetry adoption (branch `feat/adopt-zb-telemetry`)."

**Step 3: Commit**
```bash
cd /Users/dohertj2/Desktop/scadaproj
git add components/observability/GAPS.md components/observability/README.md ZB.MOM.WW.Telemetry/CLAUDE.md CLAUDE.md
git commit -m "docs(observability): record ZB.MOM.WW.Telemetry adoption across 3 apps; correct MxGateway logging-status claim"
```

---

## Acceptance checklist (whole plan)

- [ ] Both Telemetry packages resolve from the Gitea feed (Task 0 verified `200`).
- [ ] OtOpcUa: builds; `OtOpcUaTelemetryHookTests` + `LogContextEnricherTests` green; `/metrics` still served; meter `ZB.MOM.WW.OtOpcUa` unchanged.
- [ ] ScadaBridge: builds; composition-root + health + new metrics tests green; `/metrics` served in both roles; `LoggerConfigurationFactory` governance intact.
- [ ] MxGateway: builds; `GatewayApplicationTests` + `GatewayMetricsTests` + fake-worker smoke green; logger is Serilog-backed; redaction applied via seam; `/metrics` served; `MxGateway.Server` name + `ms` units unchanged.
- [ ] No secrets committed to any repo (token stays in `~/.nuget/NuGet/NuGet.Config`).
- [ ] `components/observability/GAPS.md` updated; the false "MxGateway logging adopted" claim corrected.
- [ ] All three feature branches committed (one commit per task), no hooks skipped, no force-push.

@@ -0,0 +1,20 @@
{
  "planPath": "docs/plans/2026-06-01-telemetry-library-adoption.md",
  "tasks": [
    {"id": 0, "taskId": 23, "subject": "Task 0: Publish/verify Telemetry packages on Gitea", "status": "pending", "classification": "small"},
    {"id": 1, "taskId": 24, "subject": "Task 1: OtOpcUa — distribution wiring", "status": "pending", "classification": "small", "blockedBy": [0]},
    {"id": 2, "taskId": 25, "subject": "Task 2: OtOpcUa — swap OTel to AddZbTelemetry", "status": "pending", "classification": "standard", "blockedBy": [1]},
    {"id": 3, "taskId": 26, "subject": "Task 3: OtOpcUa — swap Serilog to AddZbSerilog", "status": "pending", "classification": "standard", "blockedBy": [2]},
    {"id": 4, "taskId": 27, "subject": "Task 4: ScadaBridge — distribution wiring", "status": "pending", "classification": "small", "blockedBy": [0]},
    {"id": 5, "taskId": 28, "subject": "Task 5: ScadaBridge — AddZbTelemetry both roots + MapZbMetrics", "status": "pending", "classification": "standard", "blockedBy": [4]},
    {"id": 6, "taskId": 29, "subject": "Task 6: ScadaBridge — TraceContextEnricher in LoggerConfigurationFactory", "status": "pending", "classification": "small", "blockedBy": [5]},
    {"id": 7, "taskId": 30, "subject": "Task 7: MxAccessGateway — distribution wiring", "status": "pending", "classification": "small", "blockedBy": [0]},
    {"id": 8, "taskId": 31, "subject": "Task 8: MxAccessGateway — appsettings Logging → Serilog", "status": "pending", "classification": "small", "blockedBy": [7]},
    {"id": 9, "taskId": 32, "subject": "Task 9: MxAccessGateway — AddZbSerilog (MEL→Serilog provider swap)", "status": "pending", "classification": "high-risk", "blockedBy": [8]},
    {"id": 10, "taskId": 33, "subject": "Task 10: MxAccessGateway — ILogRedactor seam", "status": "pending", "classification": "standard", "blockedBy": [9]},
    {"id": 11, "taskId": 34, "subject": "Task 11: MxAccessGateway — AddZbTelemetry metrics export + MapZbMetrics", "status": "pending", "classification": "standard", "blockedBy": [10]},
    {"id": 12, "taskId": 35, "subject": "Task 12: scadaproj — bookkeeping + correct false claim", "status": "pending", "classification": "trivial", "blockedBy": [3, 6, 11]}
  ],
  "notes": "Task 0 gates all. After Task 0 the three repo phases (OtOpcUa 1-3, ScadaBridge 4-6, MxGateway 7-11) are independent and may run concurrently across their separate working directories; within a repo tasks are sequential. Task 12 last.",
  "lastUpdated": "2026-06-01"
}