ZB.MOM.WW.Telemetry — Follow-ons Implementation Plan
Continuation of 2026-06-01-telemetry-library-adoption.md. Executes the deferred follow-ons recorded in components/observability/GAPS.md; the user selected all four groups.
Goal: Close the recorded telemetry follow-ons across the three apps — additive/hygiene fixes, MxGateway metric normalization, ScadaBridge first application instruments, and OTLP opt-in.
Branches: new feat/telemetry-followons per repo (off the now-updated default). Commit per task;
never skip hooks, never force-push. The three repo phases are independent and run in parallel;
within a repo, tasks run sequentially unless marked parallelizable (see Sequencing).
Behaviour bar: additive/opt-in by default (Prometheus stays the default exporter; new instruments
are new series; the MxGateway ms→s + rename are the one intentional metric-shape change, safe
because those series were never Prometheus-exported before the adoption).
OtOpcUa (branch feat/telemetry-followons off master)
Task O-A2: align Serilog to the 10.x line
Classification: small · Files: Directory.Packages.props
Bump Serilog.AspNetCore, Serilog.Extensions.Hosting, Serilog.Settings.Configuration from
9.0.0 → 10.0.0 (ScadaBridge already runs 10.0.0 with Serilog 4.x, so 10.x is 4.x-compatible —
no Serilog 5 needed). Keep Serilog 4.3.0 (or bump to 4.3.1 to match ScadaBridge). Restore + build
ZB.MOM.WW.OtOpcUa.slnx; run --filter LogContextEnricherTests. Commit.
Task O-D: OTLP exporter opt-in (config-driven)
Classification: standard · Parallelizable with: O-A2 (disjoint files)
Files: src/Server/.../Observability/ObservabilityExtensions.cs, src/Server/.../Program.cs:138
Refactor AddOtOpcUaObservability to accept IConfiguration and read
OtOpcUa:Telemetry:Exporter (Prometheus|Otlp, default Prometheus) + OtOpcUa:Telemetry:OtlpEndpoint;
set o.Exporter/o.OtlpEndpoint accordingly. Update the call site to
builder.Services.AddOtOpcUaObservability(builder.Configuration). Default (no config) stays Prometheus.
This also makes OtOpcUa's recorded spans exportable when OTLP is configured (resolves the trace no-op).
Build; run OtOpcUaTelemetryHookTests. Commit.
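The config-driven selection described above can be sketched as follows. This is a minimal sketch, not the library's API: TelemetryExporter, TelemetryExporterSettings and FromConfiguration are hypothetical stand-ins for whatever type exposes o.Exporter/o.OtlpEndpoint, and the assumption is that an absent or unrecognized exporter value should fall back to Prometheus.

```csharp
using System;
using Microsoft.Extensions.Configuration;

// Hypothetical stand-in for the exporter choice exposed by the telemetry options.
public enum TelemetryExporter { Prometheus, Otlp }

// Hypothetical holder for the two values the plan reads from configuration.
public sealed class TelemetryExporterSettings
{
    public TelemetryExporter Exporter { get; init; } = TelemetryExporter.Prometheus;
    public string? OtlpEndpoint { get; init; }

    // Reads "<prefix>:Telemetry:Exporter" and "<prefix>:Telemetry:OtlpEndpoint".
    // A missing or unrecognized exporter value keeps the Prometheus default.
    public static TelemetryExporterSettings FromConfiguration(IConfiguration config, string prefix)
    {
        var raw = config[$"{prefix}:Telemetry:Exporter"];
        var exporter =
            Enum.TryParse<TelemetryExporter>(raw, ignoreCase: true, out var parsed)
            && Enum.IsDefined(parsed)
                ? parsed
                : TelemetryExporter.Prometheus;

        return new TelemetryExporterSettings
        {
            Exporter = exporter,
            OtlpEndpoint = config[$"{prefix}:Telemetry:OtlpEndpoint"],
        };
    }
}
```

Inside the options lambda the result would be copied onto o.Exporter/o.OtlpEndpoint. Passing the prefix as a parameter ("OtOpcUa" here) is a design choice that would let the same helper serve the MxGateway and ScadaBridge opt-in tasks.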
MxAccessGateway (branch feat/telemetry-followons off main)
Task M-A3: gitignore stray doc artifacts
Classification: trivial · Files: .gitignore
Append a # Documentation review artifacts block ignoring *-docs-issues.md, *-docs-fixed.md,
*-docs-final.md (the 5 untracked *-docs-*.md files are CommentChecker "Documentation Analysis
Report" output). Commit. (Do NOT delete the files — just ignore.)
Task M-B: metric normalization (ms→s + meter rename)
Classification: standard · Files: src/.../Metrics/GatewayMetrics.cs, test if needed
- Rename MeterName const "MxGateway.Server" → "ZB.MOM.WW.MxGateway". (AddZbTelemetry uses the const, so it follows automatically; no test asserts the literal; GatewayMetricsTests filter by meter instance, not name.)
- Change the 3 histograms' unit "ms" → "s" (CreateHistogram lines) and their 4 record sites (.TotalMilliseconds → .TotalSeconds). The snapshot/dashboard do NOT read these histograms, so no read-path impact. Check GatewayMetricsTests for any histogram-value assertion in ms and update.
Build the Server project; run --filter "GatewayMetricsTests|GatewayApplicationTests". Commit.
Task M-D: OTLP exporter opt-in
Classification: small · Files: src/.../GatewayApplication.cs (the AddZbTelemetry lambda)
In the AddZbTelemetry lambda, read MxGateway:Telemetry:Exporter + MxGateway:Telemetry:OtlpEndpoint
from builder.Configuration (in scope) and set o.Exporter/o.OtlpEndpoint. Default Prometheus. Build.
Commit. (Sequential after M-B — both touch GatewayApplication.cs / metrics area.)
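For reference, the ms→s change in Task M-B amounts to the following shape. This is a sketch under assumptions: the class name GatewayMetricsSketch, the instrument name mxgateway.request.duration and the RecordRequest helper are hypothetical, and only the meter name and the unit change come from the plan.

```csharp
using System;
using System.Diagnostics.Metrics;

public static class GatewayMetricsSketch
{
    // New meter name from Task M-B (previously "MxGateway.Server").
    public const string MeterName = "ZB.MOM.WW.MxGateway";

    private static readonly Meter Meter = new(MeterName);

    // Hypothetical instrument name; the unit is declared as seconds, not milliseconds.
    private static readonly Histogram<double> RequestDuration =
        Meter.CreateHistogram<double>(
            "mxgateway.request.duration",
            unit: "s",
            description: "Request duration in seconds.");

    // Record site: TotalSeconds replaces TotalMilliseconds so the recorded
    // value matches the declared unit.
    public static void RecordRequest(TimeSpan elapsed) =>
        RequestDuration.Record(elapsed.TotalSeconds);
}
```

Changing the unit string and the recorded value together matters: updating only one of them would leave the series mislabeled by a factor of 1000.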
ScadaBridge (branch feat/telemetry-followons off main)
Task S-A1: site-node HTTP/1.1 /metrics listener
Classification: standard · Files: src/.../NodeOptions.cs, src/.../Program.cs (Site Kestrel)
Add MetricsPort (default 8082) to NodeOptions. In the Site block's ConfigureKestrel, add a
second ListenAnyIP(metricsPort, lo => lo.Protocols = Http1AndHttp2) alongside the existing HTTP/2-only
gRPC-port listener, so the already-mapped /metrics becomes scrapable over HTTP/1.1 on site nodes.
Read the port from ScadaBridge:Node:MetricsPort (default 8082). Build; existing Host.Tests stay green.
Commit.
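A minimal sketch of the two-listener setup, assuming a WebApplicationBuilder-based host. SiteKestrelSketch, ResolveMetricsPort, ConfigureSiteListeners and the grpcPort parameter are hypothetical names; the real Site block and NodeOptions binding may be structured differently.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Server.Kestrel.Core;
using Microsoft.Extensions.Configuration;

public static class SiteKestrelSketch
{
    // Reads ScadaBridge:Node:MetricsPort, falling back to 8082 when unset.
    public static int ResolveMetricsPort(IConfiguration config) =>
        config.GetValue("ScadaBridge:Node:MetricsPort", 8082);

    // grpcPort stands in for however the existing gRPC port is obtained.
    public static void ConfigureSiteListeners(WebApplicationBuilder builder, int grpcPort)
    {
        var metricsPort = ResolveMetricsPort(builder.Configuration);

        builder.WebHost.ConfigureKestrel(kestrel =>
        {
            // Existing listener: HTTP/2 only, used for gRPC.
            kestrel.ListenAnyIP(grpcPort, lo => lo.Protocols = HttpProtocols.Http2);

            // New listener: accepts HTTP/1.1, so a scraper can reach /metrics.
            kestrel.ListenAnyIP(metricsPort, lo => lo.Protocols = HttpProtocols.Http1AndHttp2);
        });
    }
}
```

Keeping the gRPC listener untouched and adding a separate port means existing gRPC clients are unaffected; only scrape configuration needs to learn the new port.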
Task S-D: OTLP exporter opt-in
Classification: small · Files: src/.../SiteServiceRegistration.cs (the AddZbTelemetry lambda)
In BindSharedOptions, read ScadaBridge:Telemetry:Exporter + ScadaBridge:Telemetry:OtlpEndpoint
from config (in scope) and set o.Exporter/o.OtlpEndpoint. Default Prometheus. Build. Commit.
(Sequential after S-C0 — both edit the AddZbTelemetry call.)
Task S-C0: ScadaBridgeTelemetry meter + registration
Classification: standard · Files: Create src/ZB.MOM.WW.ScadaBridge.Commons/Observability/ScadaBridgeTelemetry.cs; edit SiteServiceRegistration.cs (AddZbTelemetry Meters)
Create a ScadaBridgeTelemetry static class: Meter "ZB.MOM.WW.ScadaBridge" + the four instruments
(scadabridge.deployments.applied counter; scadabridge.store_and_forward.queue.depth observable
gauge; scadabridge.inbound_api.requests counter; scadabridge.site.connection.up up/down gauge) with
thin static emit helpers. Register o.Meters = ["ZB.MOM.WW.ScadaBridge"] in the AddZbTelemetry call.
Build. Commit. (Precedes C1–C4.)
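One possible shape for the class, using System.Diagnostics.Metrics. The meter and instrument names come from the plan; the helper names (DeploymentApplied, InboundApiRequest, SiteConnectionOpened, SiteConnectionClosed, SetQueueDepthProvider), the "method" tag key, and the choice of UpDownCounter<long> for the "up/down gauge" are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class ScadaBridgeTelemetry
{
    public const string MeterName = "ZB.MOM.WW.ScadaBridge";

    private static readonly Meter Meter = new(MeterName);

    private static readonly Counter<long> DeploymentsApplied =
        Meter.CreateCounter<long>("scadabridge.deployments.applied");

    private static readonly Counter<long> InboundApiRequests =
        Meter.CreateCounter<long>("scadabridge.inbound_api.requests");

    // Assumption: the "up/down gauge" is modeled as an UpDownCounter.
    private static readonly UpDownCounter<long> SiteConnectionUp =
        Meter.CreateUpDownCounter<long>("scadabridge.site.connection.up");

    // The depth source is injected later; until then the gauge reports 0.
    private static Func<long> _queueDepthProvider = static () => 0;

    private static readonly ObservableGauge<long> QueueDepth =
        Meter.CreateObservableGauge<long>(
            "scadabridge.store_and_forward.queue.depth",
            () => _queueDepthProvider());

    // Thin emit helpers: call sites never touch the instruments directly.
    public static void DeploymentApplied() => DeploymentsApplied.Add(1);

    public static void InboundApiRequest(string method) =>
        InboundApiRequests.Add(1, new KeyValuePair<string, object?>("method", method));

    public static void SiteConnectionOpened() => SiteConnectionUp.Add(1);

    public static void SiteConnectionClosed() => SiteConnectionUp.Add(-1);

    public static void SetQueueDepthProvider(Func<long> provider) =>
        _queueDepthProvider = provider ?? throw new ArgumentNullException(nameof(provider));
}
```

With helpers like these, each of the later wiring tasks reduces to a single call at the chosen emit point, and the observable gauge only needs a depth accessor registered once at startup.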
Tasks S-C1…S-C4: wire the four emit points
Classification: standard each · depend on S-C0
- S-C1 deployments.applied: increment on the DeploymentManager/DeploymentService success path.
- S-C2 store_and_forward.queue.depth: observable-gauge callback reading the StoreAndForward depth (SQLite COUNT/existing depth accessor).
- S-C3 inbound_api.requests: increment (tag = method) in the InboundAPI endpoint filter/middleware.
- S-C4 site.connection.up: +1 on site-stream open, −1 on close in the Communication/SiteStream gRPC server.
Each implementer finds the cleanest emit point and STOPs + reports if no clean point exists rather than forcing a fragile edit. Add a focused test where practical. Build; commit per instrument.
scadaproj bookkeeping
Task Z: update GAPS.md
Classification: trivial · Files: components/observability/GAPS.md
Move the handled follow-ons (#6/#7 done; A1 site-listener done; #9 first instruments done; #10/#11 OTLP
opt-in done) from "Deferred" to a "Follow-ons — DONE 2026-06-01" subsection; note what each app now does.
Commit + (on user request) push all branches/merges.
Sequencing
After each repo branch is cut: OtOpcUa {O-A2 ∥ O-D}; MxGateway {M-A3 → M-B → M-D}; ScadaBridge {S-A1 ∥ (S-C0 → {S-C1, S-C2, S-C3, S-C4} → S-D)}. Repos run in parallel. Z + merge/push last.