All 3 apps adopted on branch feat/adopt-zb-telemetry (behaviour-preserving). Records the per-repo result + accepted scope deviations (ScadaBridge keeps LoggerConfigurationFactory + TraceContextEnricher instead of AddZbSerilog; MxGateway keeps GatewayLogScope, exposes redaction via ILogRedactor seam) and deferred follow-ons (#6 ms->s, #7 meter rename, #9 app instruments, OTLP, and the new ScadaBridge Site-node HTTP/1.1 metrics-listener item). Corrects the prior false 'MxGateway logging adopted on its own branch' claim — that migration actually landed in this pass.
Observability (metrics / traces / logs)
Third normalized component under the operability cluster. Goal: path to shared code — converge
the three sister projects onto a common OpenTelemetry Resource, a shared Serilog bootstrap with
unified enrichers, and a trace↔log correlation bridge, proposed as the ZB.MOM.WW.Telemetry
library set (2 packages), while each project keeps its own application instruments and sink
configuration.
- The one target: spec/SPEC.md
- Metric naming reference: spec/METRIC-CONVENTIONS.md
- The proposed shared library: shared-contract/ZB.MOM.WW.Telemetry.md
- Divergences + backlog: GAPS.md
- Current state, per project: current-state/
Why observability is a strong normalization candidate
All three projects instrument something — but in three completely different ways and at three very different levels of completeness. The divergences are structural:
- OtOpcUa has the full OpenTelemetry SDK (metrics + tracing), Prometheus export, and a bespoke Serilog enricher for driver-lifecycle correlation, but no Resource (service.name is never set) and no trace↔log bridge.
- MxAccessGateway has 20 hand-rolled instruments (counters, histograms, gauges) recording real production data that never leave the process. No OTel SDK, no exporter, no tracing. Logging uses Microsoft.Extensions.Logging rather than Serilog, with a bespoke correlation-scope and redaction pipeline.
- ScadaBridge has zero application instruments. Its OpenTelemetry.Api reference is a CVE patch, not instrumentation. It does have the cleanest structured logging enricher set (SiteId/NodeRole/NodeHostname), but those properties exist only in Serilog, not in the OTel Resource, so logs and metrics cannot join in a backend.
Nobody sets a Resource. Nobody does trace↔log correlation. MxGateway's metrics are invisible. ScadaBridge has no metrics at all.
The common fix is a single AddZbTelemetry(options) call that: creates a shared Resource from a
service.name/site.id/node.role options object; registers the project's own Meter/ActivitySource
names with the OTel SDK; and exposes Prometheus /metrics. A companion AddZbSerilog(options) wires
Serilog with the same options as enricher properties and adds TraceContextEnricher so logs carry
trace_id/span_id. The unifying hinge: the same identity triple (service.name/site.id/
node.role) populates both the OTel Resource and the Serilog enrichers, so a metric, a span, and
a log line from the same node carry identical dimensions and join up in a backend.
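The trace↔log half of that hinge can be sketched against Serilog's public enricher interface. This is a minimal illustration under stated assumptions, not the library's actual TraceContextEnricher: the class name TraceContextEnricherSketch is hypothetical, the property names trace_id/span_id are taken from the text above, and the Serilog NuGet package is assumed to be referenced.

```csharp
using System.Diagnostics;
using Serilog.Core;
using Serilog.Events;

// Hypothetical sketch: copies the ambient Activity's ids onto each log event
// so a log line can be joined to the span that was active when it was written.
public sealed class TraceContextEnricherSketch : ILogEventEnricher
{
    public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
    {
        // Activity.Current is null when no span is active; leave the event untouched.
        var activity = Activity.Current;
        if (activity is null)
        {
            return;
        }

        // With the default W3C id format, ToString() yields lowercase hex
        // (32 characters for the trace id, 16 for the span id).
        logEvent.AddPropertyIfAbsent(
            new LogEventProperty("trace_id", new ScalarValue(activity.TraceId.ToString())));
        logEvent.AddPropertyIfAbsent(
            new LogEventProperty("span_id", new ScalarValue(activity.SpanId.ToString())));
    }
}
```

An enricher of this shape would typically be registered with new LoggerConfiguration().Enrich.With(new TraceContextEnricherSketch()); how AddZbSerilog registers its own enricher is not shown here.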
Adopted across all three apps on 2026-06-01 (branch feat/adopt-zb-telemetry per repo,
behaviour-preserving). Note: MxAccessGateway's MEL→Serilog migration was not actually done at
library-build time despite an earlier claim — it landed in this adoption pass, along with the
metrics export. See GAPS.md → Adoption status — 2026-06-01 for the per-repo result,
the accepted scope decisions (ScadaBridge keeps LoggerConfigurationFactory; MxGateway keeps its
log-scope code), and the deferred follow-ons.
Status by project
| Project | OTel SDK today | Metrics today | Tracing today | Logging today | Enrichers today | Adoption status |
|---|---|---|---|---|---|---|
| OtOpcUa | ✅ full SDK via AddZbTelemetry | ✅ 7 instruments (otopcua.*); Prometheus /metrics | 🟡 2 spans defined; no exporter | Serilog via AddZbSerilog (sinks in appsettings) | DriverInstanceId/DriverType/CapabilityName/CorrelationId (driver-scope, kept) + shared | ✅ Adopted 2026-06-01 |
| MxAccessGateway | ✅ AddZbTelemetry exports GatewayMetrics | ✅ 20 instruments (mxgateway.*) now exported; new /metrics | ⛔ none | ✅ Serilog (migrated from MEL in this pass) | SiteId/NodeRole/NodeHostname via AddZbSerilog; GatewayLogScope kept; ILogRedactor seam | ✅ Adopted 2026-06-01 |
| ScadaBridge | ✅ AddZbTelemetry (both roots) | ✅ Resource + std instrumentation; /metrics (Central) | ⛔ none | Serilog via LoggerConfigurationFactory (kept) + shared TraceContextEnricher | SiteId/NodeRole/NodeHostname (process-level) + trace context | ✅ Adopted 2026-06-01 (logging via factory, not AddZbSerilog; see GAPS) |

See each project's current-state/<project>/CURRENT-STATE.md for the
code-verified detail and its adoption plan.
Normalized vs. left per-project
Normalized (the shared target):
- AddZbTelemetry(ZbTelemetryOptions): front door for the OTel SDK. Populates the shared Resource (service.name, service.namespace, service.version, site.id, node.role, host.name). Registers the caller-supplied Meter and ActivitySource name(s). Wires standard instrumentation (ASP.NET Core, HttpClient, runtime, process). Prometheus default; OTLP opt-in.
- app.MapZbMetrics(): maps the Prometheus /metrics endpoint (shared path + shared exporter).
- AddZbSerilog(ZbTelemetryOptions): shared Serilog two-stage bootstrap generalizing ScadaBridge's LoggerConfigurationFactory. Wires SiteId/NodeRole/NodeHostname enrichers from the same options object as the OTel Resource. Wires TraceContextEnricher (trace_id/span_id from Activity.Current). Preserves ReadFrom.Configuration for sinks and explicit MinimumLevel.Is override.
- ILogRedactor seam: generalized from MxGateway's GatewayLogRedactor. The seam is shared; the redaction policy (which fields/commands) stays per-project.
- Metric naming convention: <meter>.<subsystem>.<event>; Meter name = project namespace (ZB.MOM.WW.<ProjectName>); duration unit = s (OTel semconv).
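As a rough illustration of the naming convention, the sketch below declares one duration histogram using only System.Diagnostics.Metrics. Everything specific in it is an assumption for illustration: the class GatewayMetricsSketch, the Measure helper, and the instrument name mxgateway.command.duration are hypothetical (only the mxgateway prefix and the ZB.MOM.WW.<ProjectName> rule come from this document); spec/METRIC-CONVENTIONS.md remains the authority.

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical sketch of a per-project instrument that follows the convention:
// Meter name = project namespace, instrument = <meter>.<subsystem>.<event>, unit = s.
public static class GatewayMetricsSketch
{
    // Assumed Meter name, built from the ZB.MOM.WW.<ProjectName> rule.
    public static readonly Meter Meter = new("ZB.MOM.WW.MxGateway");

    // Illustrative instrument name; the subsystem and event segments are invented.
    public static readonly Histogram<double> CommandDuration =
        Meter.CreateHistogram<double>(
            name: "mxgateway.command.duration",
            unit: "s",
            description: "Time taken to execute one gateway command.");

    // Runs the operation and records its wall-clock duration in seconds.
    public static T Measure<T>(Func<T> operation)
    {
        long start = Stopwatch.GetTimestamp();
        try
        {
            return operation();
        }
        finally
        {
            CommandDuration.Record(Stopwatch.GetElapsedTime(start).TotalSeconds);
        }
    }
}
```

Recording TotalSeconds rather than milliseconds keeps the value consistent with the declared unit, so a backend does not need a per-project conversion.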
Left per-project (not forced together):
- Application Meter, ActivitySource, and all instrument definitions: otopcua.*, mxgateway.*, scadabridge.* instruments are owned by each repo.
- Serilog sink configuration (appsettings.json Console/File templates, rolling intervals).
- Per-operation/per-session correlation enrichers (LogContextEnricher in OtOpcUa; LogContext.PushProperty scope in MxGateway after migration).
- Redaction policies (MxGatewayLogRedactor implements ILogRedactor with gateway-specific command/field rules).
- Config section paths for SiteId/NodeRole/NodeHostname: each project binds these from its own config hierarchy and passes the resolved values to AddZbTelemetry/AddZbSerilog.
Package structure
ZB.MOM.WW.Telemetry ships as two dependency-split packages:
| Package | Contents | Consumers |
|---|---|---|
| ZB.MOM.WW.Telemetry | AddZbTelemetry, ZbTelemetryOptions, Resource builder, standard instrumentation, Prometheus/OTLP exporters, app.MapZbMetrics() | All three |
| ZB.MOM.WW.Telemetry.Serilog | AddZbSerilog, shared enrichers (SiteId/NodeRole/NodeHostname/TraceContextEnricher), ILogRedactor seam | All three (Serilog users); MxGateway on migration |

Both packages share ZbTelemetryOptions as the single options object that drives Resource
attributes, Serilog enrichers, Meter/ActivitySource names, and exporter selection — the unifying
hinge that makes a metric, a span, and a log line from the same node carry identical dimensions.
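One way to picture a single object driving both sides is sketched below with the public OpenTelemetry ResourceBuilder API. It is not the real ZbTelemetryOptions: the record NodeIdentitySketch and its members are hypothetical, the attribute keys (site.id, node.role, host.name) and log property names (SiteId, NodeRole, NodeHostname) are the ones named in this document, and the OpenTelemetry NuGet package is assumed to be referenced.

```csharp
using System;
using System.Collections.Generic;
using OpenTelemetry.Resources;

// Hypothetical stand-in for the shared options object: one identity,
// projected once into OTel Resource attributes and once into log properties.
public sealed record NodeIdentitySketch(string ServiceName, string SiteId, string NodeRole)
{
    // Resource side: service.name plus the site/node dimensions.
    public ResourceBuilder ToResourceBuilder() =>
        ResourceBuilder.CreateDefault()
            .AddService(serviceName: ServiceName)
            .AddAttributes(new KeyValuePair<string, object>[]
            {
                new("site.id", SiteId),
                new("node.role", NodeRole),
                new("host.name", Environment.MachineName),
            });

    // Logging side: the same values under the Serilog property names.
    public IReadOnlyDictionary<string, string> ToLogProperties() =>
        new Dictionary<string, string>
        {
            ["SiteId"] = SiteId,
            ["NodeRole"] = NodeRole,
            ["NodeHostname"] = Environment.MachineName,
        };
}
```

Because both projections read the same record, a mismatch between the metric dimensions and the log dimensions of one node cannot arise from configuration drift between two separate option sets.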
Component status
Status: Built @ 0.1.0 and published to the Gitea NuGet feed. Adopted across all three apps on
2026-06-01 (OtOpcUa, MxAccessGateway, ScadaBridge — branch feat/adopt-zb-telemetry per repo).
The MxAccessGateway MEL→Serilog migration and metrics export both landed in this pass (they were
not actually done beforehand despite an earlier claim). Per-repo result + deferred follow-ons:
GAPS.md → Adoption status — 2026-06-01.
The shared library lives at
~/Desktop/scadaproj/ZB.MOM.WW.Telemetry/ (.NET 10; 2 packages —
ZB.MOM.WW.Telemetry and ZB.MOM.WW.Telemetry.Serilog; 19 tests; dotnet pack → 2 nupkgs @ 0.1.0).
Build/test/pack from ZB.MOM.WW.Telemetry/:
dotnet test ZB.MOM.WW.Telemetry.slnx
dotnet pack ZB.MOM.WW.Telemetry.slnx -c Release -o ./artifacts