scadaproj/components/observability/current-state/scadabridge/CURRENT-STATE.md
Joseph Doherty 7d243890ed docs(observability): spec + METRIC-CONVENTIONS + ZB.MOM.WW.Telemetry shared contract
2026-06-01 07:19:38 -04:00


Observability — current state: ScadaBridge

Repo: ~/Desktop/ScadaBridge. Stack: .NET 10, Akka.NET, Docker; solution ZB.MOM.WW.ScadaBridge.slnx. The telemetry posture is split across a dangling OTel package ref (metrics/traces) and a substantive Serilog setup (logs). All paths relative to repo root. Verified 2026-06-01.

Structurally the cleanest logging enricher set in the family — SiteId / NodeRole / NodeHostname are already first-class Serilog enricher properties — but the weakest on metrics/tracing: zero instrumentation. The OpenTelemetry.Api package reference is a CVE-patch artefact, not instrumentation.

1. Metrics and traces (absent)

OpenTelemetry.Api — CVE-patch ref, not instrumentation

src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj:

  • :31 <PackageReference Include="OpenTelemetry.Api" /> — a direct version override added to satisfy GHSA-g94r-2vxg-569j / GHSA-8785-wc3w-h8q6 (OpenTelemetry 1.9.0 CVEs introduced via Akka.Hosting's pinned transitive dependency).

There is no AddOpenTelemetry() call in the solution. No Meter is created. No ActivitySource is declared. No exporter is configured. The package reference solely overrides the transitive version — it has no runtime effect on observability.

Instrument coverage

Zero application instruments. There is no custom Meter, no counter, no histogram, no gauge, and no span in the ScadaBridge codebase. This is the largest gap in the family.

2. Logging (Serilog — strongest enricher set)

Two-stage bootstrap

src/ZB.MOM.WW.ScadaBridge.Host/Program.cs:

  • :27-54 — two-stage Serilog bootstrap: an initial logger is created for startup messages before the host is built; the full logger replaces it during UseSerilog.
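A minimal sketch of the two-stage pattern described above, assuming the Serilog.Extensions.Hosting, Serilog.Sinks.Console and Serilog.Settings.Configuration packages are referenced. The class and method names are hypothetical. Whether Program.cs uses CreateBootstrapLogger specifically is an assumption; the source only states that an initial logger is replaced during UseSerilog.

```csharp
using Microsoft.Extensions.Hosting;
using Serilog;

// Hypothetical sketch, not the contents of Program.cs.
public static class TwoStageBootstrapSketch
{
    public static IHost BuildHost(string[] args)
    {
        // Stage 1: a console-only logger so that failures while the host
        // is being built are still reported.
        Log.Logger = new LoggerConfiguration()
            .WriteTo.Console()
            .CreateBootstrapLogger();

        Log.Information("Building host");

        // Stage 2: once configuration is available, UseSerilog reconfigures
        // the bootstrap logger with the full, configuration-driven setup.
        return Host.CreateDefaultBuilder(args)
            .UseSerilog((context, services, loggerConfiguration) =>
                loggerConfiguration.ReadFrom.Configuration(context.Configuration))
            .Build();
    }
}
```

The value of the first stage is narrow: it only covers the window before configuration and dependency injection exist, so it is deliberately kept free of any configuration lookup.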

LoggerConfigurationFactory.cs

src/ZB.MOM.WW.ScadaBridge.Host/LoggerConfigurationFactory.cs:

Full factory method signature: Build(IConfiguration config, string nodeRole, string siteId, string nodeHostname).

  • :62 — reads ScadaBridge:Logging:MinimumLevel from configuration.
  • :84 ReadFrom.Configuration(config) pulls sink configuration from appsettings.json.
  • :85 — explicit MinimumLevel.Is(...) override from the typed option.
  • :86-88 — three structural enrichers:
    • .Enrich.WithProperty("SiteId", siteId) — site identifier (e.g. "site-a").
    • .Enrich.WithProperty("NodeHostname", nodeHostname) — node hostname.
    • .Enrich.WithProperty("NodeRole", nodeRole) — Akka cluster role (e.g. "central", "site").

These three properties are the cleanest and most complete set in the family. ScadaBridge's property names (SiteId / NodeRole / NodeHostname) are also the ones the shared AddZbTelemetry options object maps onto site.id / node.role / host.name OTel Resource attributes — no renaming needed on adoption.
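The factory steps listed above can be sketched as follows. This is a reconstruction from the line-by-line description, not a copy of LoggerConfigurationFactory.cs: the class name, the LoggerConfiguration return type and the string parsing of the minimum level are assumptions (the real code binds a typed LoggingOptions). It assumes the Serilog and Serilog.Settings.Configuration packages.

```csharp
using System;
using Microsoft.Extensions.Configuration;
using Serilog;
using Serilog.Events;

// Hypothetical reconstruction of the factory described in the text.
public static class LoggerConfigurationFactorySketch
{
    public static LoggerConfiguration Build(
        IConfiguration config, string nodeRole, string siteId, string nodeHostname)
    {
        // Read the minimum level from configuration; fall back to Information
        // when the key is missing or not a valid level name.
        var levelText = config["ScadaBridge:Logging:MinimumLevel"];
        var level = Enum.TryParse<LogEventLevel>(levelText, ignoreCase: true, out var parsed)
            ? parsed
            : LogEventLevel.Information;

        return new LoggerConfiguration()
            // Sinks come from the Serilog section of appsettings.json.
            .ReadFrom.Configuration(config)
            // Applied after ReadFrom so the typed option wins.
            .MinimumLevel.Is(level)
            // Structural identity attached to every log event.
            .Enrich.WithProperty("SiteId", siteId)
            .Enrich.WithProperty("NodeHostname", nodeHostname)
            .Enrich.WithProperty("NodeRole", nodeRole);
    }
}
```

Because the enrichers are attached on the LoggerConfiguration itself rather than pushed onto a scoped context, they appear on every event the process emits, which is the property the shared bootstrap is meant to preserve.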

Sink configuration

appsettings.json:3-23 — Serilog sinks configured via ReadFrom.Configuration:

  • Console sink with output template that includes [{NodeRole}/{NodeHostname}].
  • File sink (path in config; rolling interval).

LoggingOptions.cs

src/ZB.MOM.WW.ScadaBridge.Host/LoggingOptions.cs:

  • MinimumLevel — config-bound minimum level; default Information.

Missing elements

  • No custom enrichers beyond the three structural properties. LogContextEnricher (OtOpcUa's driver-correlation enricher) has no equivalent; MxGateway's per-session correlation scope has no equivalent. Per-request/per-operation correlation is not present.
  • No trace_id / span_id enricher. As with the other two projects, log lines do not carry trace context. Because ScadaBridge has zero ActivitySource instrumentation, this is consistent — but it means no trace↔log correlation path exists even hypothetically.

3. Signal summary

| Signal | Provider | Export | Resource / service.name |
|---|---|---|---|
| Metrics | none | none | none |
| Traces | none | none | none |
| Logs | Serilog | Console + file (appsettings.json) | none (no service.name property) |
| Trace↔log correlation | absent (no ActivitySource; no enricher) | | |

4. Notable design choices

  • SiteId / NodeRole / NodeHostname as first-class enrichers — unlike OtOpcUa's driver-scoped LogContextEnricher, ScadaBridge's structural enrichers are attached at logger creation and appear on every log line from the process. This is the target pattern for the shared bootstrap.
  • nodeRole + siteId passed into the factory — ScadaBridge's LoggerConfigurationFactory.Build takes these as method parameters rather than reading them from a registered options object. The shared AddZbSerilog approach binds them from the same ZbTelemetryOptions used for the OTel Resource, unifying the source.
  • Config-driven MinimumLevel — ScadaBridge:Logging:MinimumLevel is a typed config path, and sinks are loaded through ReadFrom.Configuration. The shared bootstrap's AddZbSerilog must support the same pattern.
  • No custom enrichers — ScadaBridge's logging is intentionally minimal on operation-scoped context. Correlation in the distributed model is provided by structured log fields from Akka actor context, not a log enricher pipeline.
  • CVE-patch ref discipline — the OpenTelemetry.Api pin is a responsible CVE response but leaves the telemetry story incomplete. On adoption, the CVE pin is superseded by the full OTel SDK pulled in by AddZbTelemetry; the explicit <PackageReference> override can be removed.

Adoption plan → ZB.MOM.WW.Telemetry

Replace CVE-patch ref with full OTel SDK via AddZbTelemetry:

  • Remove the lone OpenTelemetry.Api override from src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj:31.
  • Add builder.AddZbTelemetry(o => { o.ServiceName = "scadabridge"; o.SiteId = cfg.SiteId; o.NodeRole = cfg.NodeRole; o.Meters = ["ZB.MOM.WW.ScadaBridge"]; }). The full OTel SDK supersedes the transitive version override; the CVE is resolved transitively via the SDK's current dependency.
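For orientation, the kind of OpenTelemetry SDK wiring that a call like AddZbTelemetry would have to perform can be sketched directly against the OpenTelemetry.Extensions.Hosting package. This is not the shared library's implementation, which exists only as a paper API at this point. The extension method name and its parameters are hypothetical, and exporter setup is left out because the exporter conventions are defined in the spec rather than here.

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

// Hypothetical stand-in for what AddZbTelemetry might do internally.
public static class TelemetryWiringSketch
{
    public const string SourceName = "ZB.MOM.WW.ScadaBridge";

    public static IServiceCollection AddScadaBridgeTelemetrySketch(
        this IServiceCollection services,
        string siteId, string nodeRole, string nodeHostname)
    {
        services.AddOpenTelemetry()
            // The same identity values used by the Serilog enrichers,
            // expressed as OTel Resource attributes.
            .ConfigureResource(resource => resource
                .AddService(serviceName: "scadabridge")
                .AddAttributes(new KeyValuePair<string, object>[]
                {
                    new("site.id", siteId),
                    new("node.role", nodeRole),
                    new("host.name", nodeHostname),
                }))
            // Subscribe the SDK to the application's meter and span source.
            .WithMetrics(metrics => metrics.AddMeter(SourceName))
            .WithTracing(tracing => tracing.AddSource(SourceName));

        return services;
    }
}
```

Until a Meter or ActivitySource with the subscribed name actually exists in the application, this registration produces no application telemetry, which is why the instrumentation step that follows is part of the same adoption plan.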

Add first application instruments:

  • Define a ScadaBridgeTelemetry class (mirror OtOpcUaTelemetry) with a Meter named "ZB.MOM.WW.ScadaBridge" and an initial set of instruments covering the most observable operations: site connection lifecycle, alarm received, data-change received, actor supervision events. Naming convention: scadabridge.<subsystem>.<event>.
  • Register the meter name in AddZbTelemetry options. Expose /metrics via app.MapZbMetrics(). ScadaBridge goes from zero instrumentation to a baseline exportable set.
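A minimal sketch of what such a ScadaBridgeTelemetry class could look like, using only the in-box System.Diagnostics.Metrics API. Every instrument name below is illustrative and follows the scadabridge.<subsystem>.<event> pattern; none of them is an agreed contract, and the choice to reuse the meter name for the ActivitySource is an assumption. The demo class is a hypothetical helper that uses MeterListener to confirm measurements are emitted without any exporter.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Hypothetical sketch: instrument names are illustrative only.
public static class ScadaBridgeTelemetry
{
    public const string MeterName = "ZB.MOM.WW.ScadaBridge";

    private static readonly Meter AppMeter = new(MeterName);

    // Assumption: spans use the same name as the meter.
    public static readonly ActivitySource ActivitySource = new(MeterName);

    // Site connection lifecycle.
    public static readonly Counter<long> SiteConnected =
        AppMeter.CreateCounter<long>("scadabridge.site.connected");
    public static readonly Counter<long> SiteDisconnected =
        AppMeter.CreateCounter<long>("scadabridge.site.disconnected");

    // Inbound events.
    public static readonly Counter<long> AlarmReceived =
        AppMeter.CreateCounter<long>("scadabridge.alarm.received");
    public static readonly Counter<long> DataChangeReceived =
        AppMeter.CreateCounter<long>("scadabridge.datachange.received");

    // Actor supervision events.
    public static readonly Counter<long> ActorRestarted =
        AppMeter.CreateCounter<long>("scadabridge.supervision.restarted");

    // Durations are recorded in seconds.
    public static readonly Histogram<double> AlarmHandlingDuration =
        AppMeter.CreateHistogram<double>("scadabridge.alarm.duration", unit: "s");
}

// Hypothetical helper: observes the meter in-process with MeterListener.
public static class ScadaBridgeTelemetryDemo
{
    public static Dictionary<string, long> RecordSample()
    {
        var totals = new Dictionary<string, long>();

        using var listener = new MeterListener();
        // Subscribe only to instruments published by the ScadaBridge meter.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == ScadaBridgeTelemetry.MeterName)
                l.EnableMeasurementEvents(instrument);
        };
        // Sum long-valued measurements per instrument name.
        listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
            totals[instrument.Name] = totals.GetValueOrDefault(instrument.Name) + value);
        listener.Start();

        ScadaBridgeTelemetry.SiteConnected.Add(1);
        ScadaBridgeTelemetry.AlarmReceived.Add(2);
        return totals;
    }
}
```

Keeping all instruments as static fields on one class gives call sites a single place to discover what is measured, and keeps the meter name in one constant that can also be passed to the Meters option.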

Adopt AddZbSerilog:

  • Replace the LoggerConfigurationFactory.Build(config, nodeRole, siteId, nodeHostname) call in Program.cs:27-54 with builder.AddZbSerilog(o => { o.ServiceName = "scadabridge"; o.SiteId = cfg.SiteId; o.NodeRole = cfg.NodeRole; o.NodeHostname = cfg.NodeHostname; }). The three enrichers (SiteId, NodeRole, NodeHostname) are now provided by the shared AddZbSerilog path; LoggerConfigurationFactory can be deleted.
  • ReadFrom.Configuration for sinks and MinimumLevel.Is override from config are preserved inside AddZbSerilog — behavior is unchanged.
  • The TraceContextEnricher is wired automatically by AddZbSerilog; once application instruments are added (above), trace_id / span_id will appear on log lines emitted during spans.
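To make the trace_id / span_id behaviour concrete, here is a sketch of an enricher of that kind built on Serilog's ILogEventEnricher and Activity.Current. It is an illustration of the technique, not the shared library's TraceContextEnricher: the class names, the collecting sink and the demo helper are all hypothetical. It assumes the Serilog package.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using Serilog;
using Serilog.Core;
using Serilog.Events;

// Hypothetical enricher: copies the ambient Activity's ids onto each event.
public sealed class TraceContextEnricherSketch : ILogEventEnricher
{
    public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
    {
        var activity = Activity.Current;
        if (activity is null) return; // no span in scope, nothing to add

        logEvent.AddPropertyIfAbsent(
            propertyFactory.CreateProperty("trace_id", activity.TraceId.ToString()));
        logEvent.AddPropertyIfAbsent(
            propertyFactory.CreateProperty("span_id", activity.SpanId.ToString()));
    }
}

// Hypothetical in-memory sink so the demo can inspect emitted events.
public sealed class CollectingSink : ILogEventSink
{
    public List<LogEvent> Events { get; } = new();
    public void Emit(LogEvent logEvent) => Events.Add(logEvent);
}

public static class TraceContextDemo
{
    // Logs once outside any span and once inside one; returns both events
    // together with the trace id of the span.
    public static (string TraceId, LogEvent Outside, LogEvent Inside) Run()
    {
        var sink = new CollectingSink();
        using var logger = new LoggerConfiguration()
            .Enrich.With(new TraceContextEnricherSketch())
            .WriteTo.Sink(sink)
            .CreateLogger();

        logger.Information("outside span");

        // Starting an Activity directly sets Activity.Current.
        using var activity = new Activity("scadabridge.demo").Start();
        logger.Information("inside span");

        return (activity.TraceId.ToString(), sink.Events[0], sink.Events[1]);
    }
}
```

One practical consequence: ActivitySource.StartActivity returns null when nothing is listening to that source, so in the real host the ids only appear once the SDK has been configured to subscribe to the application's source.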

Keep bespoke:

  • LoggingOptions.cs — the MinimumLevel typed option and its config path (ScadaBridge:Logging:MinimumLevel) remain; AddZbSerilog must accept the minimum-level override from configuration. The config path stays ScadaBridge's own.
  • Console output template including [{NodeRole}/{NodeHostname}] — driven by appsettings.json; no change.
  • Akka actor-context log fields — per-operation context emitted by Akka infrastructure; not an enricher concern.
  • ZB.MOM.WW.ScadaBridge.Host.csproj package set otherwise — no other changes to the project file.

Adoption is a follow-on task (tracked in GAPS.md), not part of the ZB.MOM.WW.Telemetry library build. Adding instruments and adopting AddZbSerilog/AddZbTelemetry lands in the ScadaBridge repo as a separate commit once the nupkg is available.