Compare commits: b95c413c08...18e4b70572 (49 commits)
| SHA1 |
|---|
| 18e4b70572 |
| a09cc02d46 |
| 88c557dee8 |
| 8311912f40 |
| f569d537d1 |
| f1240c0bd4 |
| 37fb84f477 |
| c284e4d68d |
| 2b856074d5 |
| 70f91a855a |
| 1344f249d0 |
| 7f05107c1d |
| 3e4d4369bf |
| 4126e1df54 |
| 215a646e35 |
| 453ec7358d |
| 645388b1f1 |
| a1c3d5ec81 |
| 3934e528f2 |
| fba3d09eed |
| 7d243890ed |
| 54654a49af |
| 76295695ee |
| 6588e15f57 |
| 0c087d150d |
| 69c1be943e |
| ef234d3574 |
| 8f0b70d12f |
| 1c2b23cbbb |
| edbc79204f |
| a7a8f1e493 |
| aa2251b93d |
| cf277eb7df |
| 9c8c1431af |
| 02cc687556 |
| e498bb7c5a |
| 2dbedce0ac |
| 25dd328280 |
| 1ab2f32e8e |
| 5b82d68ea9 |
| d1b837e718 |
| 5fb579c2f0 |
| 18be42d0e2 |
| 07d5907258 |
| 16540b3001 |
| 3d25ee5090 |
| 1dc35a8c43 |
| c77df2a2cd |
| 29b309c6c1 |
@@ -6,9 +6,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 `scadaproj` is primarily an umbrella/index workspace that aggregates a family of
 related SCADA / OT / Wonderware / OPC UA "sister projects" that live as **sibling
-directories under `~/Desktop/`**. It now also **hosts two pieces of source itself** —
-the shared [`ZB.MOM.WW.Auth/`](ZB.MOM.WW.Auth/) library and the shared
-[`ZB.MOM.WW.Theme/`](ZB.MOM.WW.Theme/) UI kit — both the realized output of their
+directories under `~/Desktop/`**. It now also **hosts four pieces of source itself** —
+the shared [`ZB.MOM.WW.Auth/`](ZB.MOM.WW.Auth/) library, the shared
+[`ZB.MOM.WW.Theme/`](ZB.MOM.WW.Theme/) UI kit, the shared
+[`ZB.MOM.WW.Health/`](ZB.MOM.WW.Health/) health-check library, and the shared
+[`ZB.MOM.WW.Telemetry/`](ZB.MOM.WW.Telemetry/) observability library — all the realized output of their
 respective component normalizations (see [Component normalization](#component-normalization)).
 The point of this file is to give a high-level scan of each sister project — its purpose,
 location, stack, and primary commands — so a fresh Claude Code session can orient across
@@ -119,6 +121,9 @@ each project's **code-verified current state**, and the **gaps** between. See
 |---|---|---|---|---|
 | Auth (login / identity / authz) | Built (lib `0.1.0`) | Shared `ZB.MOM.WW.Auth` lib | [`components/auth/`](components/auth/) | [`ZB.MOM.WW.Auth/`](ZB.MOM.WW.Auth/) |
 | UI Theme (layout / tokens / components) | Built (lib `0.1.0`) | Shared `ZB.MOM.WW.Theme` RCL | [`components/ui-theme/`](components/ui-theme/) | [`ZB.MOM.WW.Theme/`](ZB.MOM.WW.Theme/) |
+| Health (readiness / liveness / active-node) | Built (lib `0.1.0`) | Shared `ZB.MOM.WW.Health` lib | [`components/health/`](components/health/) | [`ZB.MOM.WW.Health/`](ZB.MOM.WW.Health/) |
+| Observability (metrics / traces / logs) | Built (lib `0.1.0`) | Shared `ZB.MOM.WW.Telemetry` lib + `.Serilog` | [`components/observability/`](components/observability/) | [`ZB.MOM.WW.Telemetry/`](ZB.MOM.WW.Telemetry/) |
+| Audit (event model + writer seam) | Built (lib `0.1.0`) | Shared `ZB.MOM.WW.Audit` lib | [`components/audit/`](components/audit/) | [`ZB.MOM.WW.Audit/`](ZB.MOM.WW.Audit/) |
 
 The auth component is fully populated: a normalized [`spec`](components/auth/spec/SPEC.md), a
 proposed [`shared-contract`](components/auth/shared-contract/ZB.MOM.WW.Auth.md), three
@@ -149,6 +154,57 @@ The implementation plan is at
 Build/test from `ZB.MOM.WW.Theme/`: `dotnet test`. Consumer matrix: all three apps consume
 the single `ZB.MOM.WW.Theme` package (OtOpcUa AdminUI, MxGateway Server, ScadaBridge Host + CentralUI).
 
+The health component is fully populated: a normalized [`spec`](components/health/spec/SPEC.md), a
+[`shared-contract`](components/health/shared-contract/ZB.MOM.WW.Health.md), three
+[`current-state`](components/health/current-state/) docs, and an adoption [`GAPS`](components/health/GAPS.md)
+backlog. Shared = three-tier endpoint convention (ready/active/healthz) + canonical JSON writer +
+`IActiveNodeGate` seam + `GrpcDependencyHealthCheck` + `AkkaClusterHealthCheck` + `ActiveNodeHealthCheck`
++ `DatabaseHealthCheck<TContext>`; left per-project = which probes each app registers,
+orchestrator wiring, and ScadaBridge's distributed health-monitoring pipeline.
+
+The shared library is **built and lives in this repo** at [`ZB.MOM.WW.Health/`](ZB.MOM.WW.Health/)
+(.NET 10; 3 packages — `ZB.MOM.WW.Health`, `ZB.MOM.WW.Health.Akka`, `ZB.MOM.WW.Health.EntityFrameworkCore`;
+58 tests; `dotnet pack` → 3 nupkgs @ 0.1.0).
+**Not yet adopted** by the three apps — that's the follow-on tracked in [`components/health/GAPS.md`](components/health/GAPS.md).
+Build/test from `ZB.MOM.WW.Health/`: `dotnet test`. Consumer matrix: MxAccessGateway → core only;
+OtOpcUa & ScadaBridge → all three packages.
+
+The observability component is fully populated: a normalized [`spec`](components/observability/spec/SPEC.md),
+a [`metric-conventions`](components/observability/spec/METRIC-CONVENTIONS.md) reference, a
+[`shared-contract`](components/observability/shared-contract/ZB.MOM.WW.Telemetry.md), three
+[`current-state`](components/observability/current-state/) docs, and an adoption [`GAPS`](components/observability/GAPS.md)
+backlog. Shared = OTel Resource (service.name/site.id/node.role identity triple) + standard instrumentation
+(ASP.NET Core, HttpClient, gRPC client, runtime, process) + Prometheus always-on exporter + OTLP opt-in
++ Serilog two-stage bootstrap + SiteId/NodeRole/NodeHostname enrichers + TraceContextEnricher (trace_id/span_id)
++ ILogRedactor seam; left per-project = application Meters/ActivitySources, sink config, per-operation
+enrichers, and redaction policies.
+
+The shared library is **built and lives in this repo** at [`ZB.MOM.WW.Telemetry/`](ZB.MOM.WW.Telemetry/)
+(.NET 10; 2 packages — `ZB.MOM.WW.Telemetry`, `ZB.MOM.WW.Telemetry.Serilog`; 19 tests;
+`dotnet pack` → 2 nupkgs @ 0.1.0). **MxAccessGateway logging adopted** (MEL → Serilog migration done on
+its own branch) — the one in-pass adoption. Broader OtOpcUa and ScadaBridge telemetry adoption is
+follow-on, tracked in [`components/observability/GAPS.md`](components/observability/GAPS.md).
+Build/test from `ZB.MOM.WW.Telemetry/`: `dotnet test`. Consumer matrix: all three apps consume both
+packages after adoption (OtOpcUa, MxGateway Server, ScadaBridge Host + any instrumented project).
+
+The audit component is fully populated: a normalized [`spec`](components/audit/spec/SPEC.md), an
+[`event-model`](components/audit/spec/EVENT-MODEL.md) reference, a
+[`shared-contract`](components/audit/shared-contract/ZB.MOM.WW.Audit.md), three
+[`current-state`](components/audit/current-state/) docs, and an adoption [`GAPS`](components/audit/GAPS.md)
+backlog. Common ground = canonical `AuditEvent` record + `AuditOutcome` enum + `IAuditWriter` /
+`IAuditRedactor` seams + helpers (`NullAuditRedactor`, `TruncatingAuditRedactor`, `NoOpAuditWriter`,
+`CompositeAuditWriter`, `RedactingAuditWriter`) + `AddZbAudit` DI registration; left per-project =
+transport/storage and domain vocabulary. Closes the loop on Auth — audit's `Actor` field = the Auth
+principal. `IAuditRedactor` is aligned with Telemetry's `ILogRedactor` seam convention.
+
+The shared library is **built and lives in this repo** at [`ZB.MOM.WW.Audit/`](ZB.MOM.WW.Audit/)
+(.NET 10; 1 package — `ZB.MOM.WW.Audit`; only non-BCL dependency `Microsoft.Extensions.DependencyInjection.Abstractions`;
+19 tests; `dotnet pack` → 1 nupkg @ 0.1.0). Repo: `https://gitea.dohertylan.com/dohertj2/zb-mom-ww-audit`.
+**Not yet adopted** by the three apps — that's the follow-on tracked in [`components/audit/GAPS.md`](components/audit/GAPS.md).
+Build/test from `ZB.MOM.WW.Audit/`: `dotnet test`. Consumer matrix: all three apps consume the single
+`ZB.MOM.WW.Audit` package (OtOpcUa, MxAccessGateway, ScadaBridge each map their own audit record/seam
+onto the canonical type at the emit boundary).
+
 ## Per-project primary commands
 
 Run these from inside each project directory (not from `scadaproj`).
@@ -0,0 +1,482 @@
+## Ignore Visual Studio temporary files, build results, and
+## files generated by popular Visual Studio add-ons.
+##
+## Get latest from `dotnet new gitignore`
+
+# dotenv files
+.env
+
+# User-specific files
+*.rsuser
+*.suo
+*.user
+*.userosscache
+*.sln.docstates
+
+# User-specific files (MonoDevelop/Xamarin Studio)
+*.userprefs
+
+# Mono auto generated files
+mono_crash.*
+
+# Build results
+[Dd]ebug/
+[Dd]ebugPublic/
+[Rr]elease/
+[Rr]eleases/
+x64/
+x86/
+[Ww][Ii][Nn]32/
+[Aa][Rr][Mm]/
+[Aa][Rr][Mm]64/
+bld/
+[Bb]in/
+[Oo]bj/
+[Ll]og/
+[Ll]ogs/
+
+# Visual Studio 2015/2017 cache/options directory
+.vs/
+# Uncomment if you have tasks that create the project's static files in wwwroot
+#wwwroot/
+
+# Visual Studio 2017 auto generated files
+Generated\ Files/
+
+# MSTest test Results
+[Tt]est[Rr]esult*/
+[Bb]uild[Ll]og.*
+
+# NUnit
+*.VisualState.xml
+TestResult.xml
+nunit-*.xml
+
+# Build Results of an ATL Project
+[Dd]ebugPS/
+[Rr]eleasePS/
+dlldata.c
+
+# Benchmark Results
+BenchmarkDotNet.Artifacts/
+
+# .NET
+project.lock.json
+project.fragment.lock.json
+artifacts/
+
+# Tye
+.tye/
+
+# ASP.NET Scaffolding
+ScaffoldingReadMe.txt
+
+# StyleCop
+StyleCopReport.xml
+
+# Files built by Visual Studio
+*_i.c
+*_p.c
+*_h.h
+*.ilk
+*.meta
+*.obj
+*.iobj
+*.pch
+*.pdb
+*.ipdb
+*.pgc
+*.pgd
+*.rsp
+# but not Directory.Build.rsp, as it configures directory-level build defaults
+!Directory.Build.rsp
+*.sbr
+*.tlb
+*.tli
+*.tlh
+*.tmp
+*.tmp_proj
+*_wpftmp.csproj
+*.log
+*.tlog
+*.vspscc
+*.vssscc
+.builds
+*.pidb
+*.svclog
+*.scc
+
+# Chutzpah Test files
+_Chutzpah*
+
+# Visual C++ cache files
+ipch/
+*.aps
+*.ncb
+*.opendb
+*.opensdf
+*.sdf
+*.cachefile
+*.VC.db
+*.VC.VC.opendb
+
+# Visual Studio profiler
+*.psess
+*.vsp
+*.vspx
+*.sap
+
+# Visual Studio Trace Files
+*.e2e
+
+# TFS 2012 Local Workspace
+$tf/
+
+# Guidance Automation Toolkit
+*.gpState
+
+# ReSharper is a .NET coding add-in
+_ReSharper*/
+*.[Rr]e[Ss]harper
+*.DotSettings.user
+
+# TeamCity is a build add-in
+_TeamCity*
+
+# DotCover is a Code Coverage Tool
+*.dotCover
+
+# AxoCover is a Code Coverage Tool
+.axoCover/*
+!.axoCover/settings.json
+
+# Coverlet is a free, cross platform Code Coverage Tool
+coverage*.json
+coverage*.xml
+coverage*.info
+
+# Visual Studio code coverage results
+*.coverage
+*.coveragexml
+
+# NCrunch
+_NCrunch_*
+.*crunch*.local.xml
+nCrunchTemp_*
+
+# MightyMoose
+*.mm.*
+AutoTest.Net/
+
+# Web workbench (sass)
+.sass-cache/
+
+# Installshield output folder
+[Ee]xpress/
+
+# DocProject is a documentation generator add-in
+DocProject/buildhelp/
+DocProject/Help/*.HxT
+DocProject/Help/*.HxC
+DocProject/Help/*.hhc
+DocProject/Help/*.hhk
+DocProject/Help/*.hhp
+DocProject/Help/Html2
+DocProject/Help/html
+
+# Click-Once directory
+publish/
+
+# Publish Web Output
+*.[Pp]ublish.xml
+*.azurePubxml
+# Note: Comment the next line if you want to checkin your web deploy settings,
+# but database connection strings (with potential passwords) will be unencrypted
+*.pubxml
+*.publishproj
+
+# Microsoft Azure Web App publish settings. Comment the next line if you want to
+# checkin your Azure Web App publish settings, but sensitive information contained
+# in these scripts will be unencrypted
+PublishScripts/
+
+# NuGet Packages
+*.nupkg
+# NuGet Symbol Packages
+*.snupkg
+# The packages folder can be ignored because of Package Restore
+**/[Pp]ackages/*
+# except build/, which is used as an MSBuild target.
+!**/[Pp]ackages/build/
+# Uncomment if necessary however generally it will be regenerated when needed
+#!**/[Pp]ackages/repositories.config
+# NuGet v3's project.json files produces more ignorable files
+*.nuget.props
+*.nuget.targets
+
+# Microsoft Azure Build Output
+csx/
+*.build.csdef
+
+# Microsoft Azure Emulator
+ecf/
+rcf/
+
+# Windows Store app package directories and files
+AppPackages/
+BundleArtifacts/
+Package.StoreAssociation.xml
+_pkginfo.txt
+*.appx
+*.appxbundle
+*.appxupload
+
+# Visual Studio cache files
+# files ending in .cache can be ignored
+*.[Cc]ache
+# but keep track of directories ending in .cache
+!?*.[Cc]ache/
+
+# Others
+ClientBin/
+~$*
+*~
+*.dbmdl
+*.dbproj.schemaview
+*.jfm
+*.pfx
+*.publishsettings
+orleans.codegen.cs
+
+# Including strong name files can present a security risk
+# (https://github.com/github/gitignore/pull/2483#issue-259490424)
+#*.snk
+
+# Since there are multiple workflows, uncomment next line to ignore bower_components
+# (https://github.com/github/gitignore/pull/1529#issuecomment-104372622)
+#bower_components/
+
+# RIA/Silverlight projects
+Generated_Code/
+
+# Backup & report files from converting an old project file
+# to a newer Visual Studio version. Backup files are not needed,
+# because we have git ;-)
+_UpgradeReport_Files/
+Backup*/
+UpgradeLog*.XML
+UpgradeLog*.htm
+ServiceFabricBackup/
+*.rptproj.bak
+
+# SQL Server files
+*.mdf
+*.ldf
+*.ndf
+
+# Business Intelligence projects
+*.rdl.data
+*.bim.layout
+*.bim_*.settings
+*.rptproj.rsuser
+*- [Bb]ackup.rdl
+*- [Bb]ackup ([0-9]).rdl
+*- [Bb]ackup ([0-9][0-9]).rdl
+
+# Microsoft Fakes
+FakesAssemblies/
+
+# GhostDoc plugin setting file
+*.GhostDoc.xml
+
+# Node.js Tools for Visual Studio
+.ntvs_analysis.dat
+node_modules/
+
+# Visual Studio 6 build log
+*.plg
+
+# Visual Studio 6 workspace options file
+*.opt
+
+# Visual Studio 6 auto-generated workspace file (contains which files were open etc.)
+*.vbw
+
+# Visual Studio 6 auto-generated project file (contains which files were open etc.)
+*.vbp
+
+# Visual Studio 6 workspace and project file (working project files containing files to include in project)
+*.dsw
+*.dsp
+
+# Visual Studio 6 technical files
+*.ncb
+*.aps
+
+# Visual Studio LightSwitch build output
+**/*.HTMLClient/GeneratedArtifacts
+**/*.DesktopClient/GeneratedArtifacts
+**/*.DesktopClient/ModelManifest.xml
+**/*.Server/GeneratedArtifacts
+**/*.Server/ModelManifest.xml
+_Pvt_Extensions
+
+# Paket dependency manager
+.paket/paket.exe
+paket-files/
+
+# FAKE - F# Make
+.fake/
+
+# CodeRush personal settings
+.cr/personal
+
+# Python Tools for Visual Studio (PTVS)
+__pycache__/
+*.pyc
+
+# Cake - Uncomment if you are using it
+# tools/**
+# !tools/packages.config
+
+# Tabs Studio
+*.tss
+
+# Telerik's JustMock configuration file
+*.jmconfig
+
+# BizTalk build output
+*.btp.cs
+*.btm.cs
+*.odx.cs
+*.xsd.cs
+
+# OpenCover UI analysis results
+OpenCover/
+
+# Azure Stream Analytics local run output
+ASALocalRun/
+
+# MSBuild Binary and Structured Log
+*.binlog
+
+# NVidia Nsight GPU debugger configuration file
+*.nvuser
+
+# MFractors (Xamarin productivity tool) working folder
+.mfractor/
+
+# Local History for Visual Studio
+.localhistory/
+
+# Visual Studio History (VSHistory) files
+.vshistory/
+
+# BeatPulse healthcheck temp database
+healthchecksdb
+
+# Backup folder for Package Reference Convert tool in Visual Studio 2017
+MigrationBackup/
+
+# Ionide (cross platform F# VS Code tools) working folder
+.ionide/
+
+# Fody - auto-generated XML schema
+FodyWeavers.xsd
+
+# VS Code files for those working on multiple tools
+.vscode/*
+!.vscode/settings.json
+!.vscode/tasks.json
+!.vscode/launch.json
+!.vscode/extensions.json
+*.code-workspace
+
+# Local History for Visual Studio Code
+.history/
+
+# Windows Installer files from build outputs
+*.cab
+*.msi
+*.msix
+*.msm
+*.msp
+
+# JetBrains Rider
+*.sln.iml
+.idea/
+
+##
+## Visual studio for Mac
+##
+
+
+# globs
+Makefile.in
+*.userprefs
+*.usertasks
+config.make
+config.status
+aclocal.m4
+install-sh
+autom4te.cache/
+*.tar.gz
+tarballs/
+test-results/
+
+# content below from: https://github.com/github/gitignore/blob/main/Global/macOS.gitignore
+# General
+.DS_Store
+.AppleDouble
+.LSOverride
+
+# Icon must end with two \r
+Icon
+
+
+# Thumbnails
+._*
+
+# Files that might appear in the root of a volume
+.DocumentRevisions-V100
+.fseventsd
+.Spotlight-V100
+.TemporaryItems
+.Trashes
+.VolumeIcon.icns
+.com.apple.timemachine.donotpresent
+
+# Directories potentially created on remote AFP share
+.AppleDB
+.AppleDesktop
+Network Trash Folder
+Temporary Items
+.apdisk
+
+# content below from: https://github.com/github/gitignore/blob/main/Global/Windows.gitignore
+# Windows thumbnail cache files
+Thumbs.db
+ehthumbs.db
+ehthumbs_vista.db
+
+# Dump file
+*.stackdump
+
+# Folder config file
+[Dd]esktop.ini
+
+# Recycle Bin used on file shares
+$RECYCLE.BIN/
+
+# Windows Installer files
+*.cab
+*.msi
+*.msix
+*.msm
+*.msp
+
+# Windows shortcuts
+*.lnk
+
+# Vim temporary swap files
+*.swp
@@ -0,0 +1,10 @@
+<Project>
+  <PropertyGroup>
+    <TargetFramework>net10.0</TargetFramework>
+    <Nullable>enable</Nullable>
+    <ImplicitUsings>enable</ImplicitUsings>
+    <LangVersion>latest</LangVersion>
+    <Version>0.1.0</Version>
+    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
+  </PropertyGroup>
+</Project>
@@ -0,0 +1,15 @@
+<Project>
+  <PropertyGroup>
+    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
+  </PropertyGroup>
+  <ItemGroup>
+    <!-- Extensions -->
+    <PackageVersion Include="Microsoft.Extensions.DependencyInjection.Abstractions" Version="10.0.7" />
+    <PackageVersion Include="Microsoft.Extensions.DependencyInjection" Version="10.0.7" />
+    <!-- Test -->
+    <PackageVersion Include="Microsoft.NET.Test.Sdk" Version="17.14.1" />
+    <PackageVersion Include="xunit" Version="2.9.3" />
+    <PackageVersion Include="xunit.runner.visualstudio" Version="3.1.4" />
+    <PackageVersion Include="coverlet.collector" Version="6.0.4" />
+  </ItemGroup>
+</Project>
@@ -0,0 +1,59 @@
+# ZB.MOM.WW.Audit
+
+Canonical audit event model, best-effort writer seam, and redactor seam for the **ZB.MOM.WW SCADA family** (OtOpcUa, MxAccessGateway, ScadaBridge). This is a **library, not a service** — it is linked directly into the consuming application at build time. Transport and storage remain per-project; only the shared record + seams live here.
+
+---
+
+## Packages
+
+| Package | Description | Key Dependencies |
+|---|---|---|
+| `ZB.MOM.WW.Audit` | Canonical `AuditEvent` record, `AuditOutcome` enum, `IAuditWriter` + `IAuditRedactor` seams, shipped helpers (`NullAuditRedactor`, `TruncatingAuditRedactor`, `NoOpAuditWriter`, `CompositeAuditWriter`, `RedactingAuditWriter`), and `AddZbAudit` DI extension. | `Microsoft.Extensions.DependencyInjection.Abstractions` |
+
+---
+
+## Consumer Matrix
+
+| Consumer | ZB.MOM.WW.Audit |
+|---|:---:|
+| **OtOpcUa** | yes (adoption deferred) |
+| **MxAccessGateway** | yes (adoption deferred) |
+| **ScadaBridge** | yes (adoption deferred — "align, don't replace") |
+
+Adoption is tracked in `components/audit/GAPS.md` in the outer `scadaproj` workspace. Each app brings its own transport (Akka broadcast / SQLite append / SQL ingest) and domain vocabulary (channels / kinds / event-types) — those stay per-project. The shared library provides the canonical record and the two seams that decouple "what to audit" from "how to store it".
+
+---
+
+## Auth alignment
+
+`AuditEvent.Actor` is a string today. At adoption time it SHOULD be set to the `ZB.MOM.WW.Auth` principal identifier — this is the "audit closes the loop on Auth" hinge described in the spec. No compile-time dependency on `ZB.MOM.WW.Auth` is introduced here; the alignment is by convention.
+
+---
+
+## Versioning
+
+The single package is versioned from `Directory.Build.props`. The current release is **0.1.0**. A single version bump in `Directory.Build.props` bumps the package.
+
+---
+
+## Building and packing
+
+```bash
+# From ZB.MOM.WW.Audit/
+dotnet build ZB.MOM.WW.Audit.slnx
+dotnet test ZB.MOM.WW.Audit.slnx
+
+# Produce the NuGet package into ./artifacts/
+./build/pack.sh
+```
+
+---
+
+## Design documentation
+
+Full design docs live in the `components/audit` folder of the SCADA project workspace:
+
+- `~/Desktop/scadaproj-audit/components/audit/spec/SPEC.md` — overall audit specification
+- `~/Desktop/scadaproj-audit/components/audit/spec/EVENT-MODEL.md` — field-by-field event model + per-project mapping table
+- `~/Desktop/scadaproj-audit/components/audit/shared-contract/ZB.MOM.WW.Audit.md` — public API contract (on paper)
+- `~/Desktop/scadaproj-audit/components/audit/GAPS.md` — adoption backlog
@@ -0,0 +1,8 @@
+<Solution>
+  <Folder Name="/src/">
+    <Project Path="src/ZB.MOM.WW.Audit/ZB.MOM.WW.Audit.csproj" />
+  </Folder>
+  <Folder Name="/tests/">
+    <Project Path="tests/ZB.MOM.WW.Audit.Tests/ZB.MOM.WW.Audit.Tests.csproj" />
+  </Folder>
+</Solution>
Executable
+4
@@ -0,0 +1,4 @@
+#!/usr/bin/env bash
+# pack.sh — produce the ZB.MOM.WW.Audit NuGet package into ./artifacts.
+set -euo pipefail
+dotnet pack -c Release -o ./artifacts
@@ -0,0 +1,50 @@
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>
+/// Canonical, transport-agnostic audit record — who did what, when, with what outcome.
+/// Required core + optional common fields + a <see cref="DetailsJson"/> extension bag. Each
+/// sister app maps its own record onto this; domain vocabularies (channels/kinds/event-types)
+/// map into <see cref="Action"/>/<see cref="Category"/>/<see cref="DetailsJson"/> and are not
+/// modelled here. See scadaproj/components/audit/spec/EVENT-MODEL.md.
+/// </summary>
+public sealed record AuditEvent
+{
+    /// <summary>Idempotency key uniquely identifying this audit event.</summary>
+    public required Guid EventId { get; init; }
+
+    /// <summary>When the audited action occurred. Normalized to UTC on assignment.</summary>
+    /// <remarks>Participates in record value-equality as a normalized instant: two events whose
+    /// <c>OccurredAtUtc</c> denote the same instant at different offsets (e.g. <c>12:00+05:00</c> and
+    /// <c>07:00Z</c>) compare equal and share a hash code. Relevant to consumers that dedup/key on
+    /// <see cref="AuditEvent"/> value-equality.</remarks>
+    public required DateTimeOffset OccurredAtUtc
+    {
+        get => _occurredAtUtc;
+        init => _occurredAtUtc = value.ToUniversalTime();
+    }
+    private readonly DateTimeOffset _occurredAtUtc;
+
+    /// <summary>Who performed the action (identity string; the ZB.MOM.WW.Auth principal at adoption).</summary>
+    public required string Actor { get; init; }
+
+    /// <summary>What was done — a verb/event-type string.</summary>
+    public required string Action { get; init; }
+
+    /// <summary>Normalized outcome.</summary>
+    public required AuditOutcome Outcome { get; init; }
+
+    /// <summary>Optional subsystem/grouping for the action.</summary>
+    public string? Category { get; init; }
+
+    /// <summary>Optional target of the action (resource/method/connection).</summary>
+    public string? Target { get; init; }
+
+    /// <summary>Optional node that emitted the event.</summary>
+    public string? SourceNode { get; init; }
+
+    /// <summary>Optional correlation id joining this row to its originating request/workflow.</summary>
+    public Guid? CorrelationId { get; init; }
+
+    /// <summary>Optional JSON extension carrying project-specific fields.</summary>
+    public string? DetailsJson { get; init; }
+}
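As a reading aid for the `AuditEvent` record above, here is a minimal sketch of constructing an event and observing the UTC normalization done by the `OccurredAtUtc` init accessor. It assumes a project that references the `ZB.MOM.WW.Audit` library from this diff; the actor, action, and target strings are made up for illustration and are not vocabulary from the spec.

```csharp
using System;
using ZB.MOM.WW.Audit;

// Assumption: this file lives in a project that references ZB.MOM.WW.Audit.
// "operator-1", "setpoint.write" and "Tank01.Level" are illustrative values only.
var evt = new AuditEvent
{
    EventId = Guid.NewGuid(),
    // 12:00 at +05:00 denotes the same instant as 07:00 UTC.
    OccurredAtUtc = new DateTimeOffset(2025, 1, 1, 12, 0, 0, TimeSpan.FromHours(5)),
    Actor = "operator-1",
    Action = "setpoint.write",
    Outcome = AuditOutcome.Success,
    Target = "Tank01.Level",
};

// The init accessor calls ToUniversalTime(), so the stored value carries a zero offset.
Console.WriteLine(evt.OccurredAtUtc.Offset == TimeSpan.Zero); // True
Console.WriteLine(evt.OccurredAtUtc.Hour);                    // 7

// A `with` copy that changes nothing is value-equal to the original record.
var copy = evt with { };
Console.WriteLine(copy == evt);                               // True
```

Note that `DateTimeOffset` equality already compares instants rather than offsets, so the visible effect of the normalization is on the stored offset and on anything that formats or serializes the value.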
@@ -0,0 +1,12 @@
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>Normalized outcome of an audited action.</summary>
+public enum AuditOutcome
+{
+    /// <summary>The action completed successfully.</summary>
+    Success,
+    /// <summary>The action failed due to an error.</summary>
+    Failure,
+    /// <summary>The action was rejected by authentication/authorization.</summary>
+    Denied,
+}
@@ -0,0 +1,21 @@
+using Microsoft.Extensions.DependencyInjection;
+using Microsoft.Extensions.DependencyInjection.Extensions;
+
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>DI helpers for ZB.MOM.WW.Audit.</summary>
+public static class AuditServiceCollectionExtensions
+{
+    /// <summary>
+    /// Registers safe defaults — <see cref="NullAuditRedactor"/> and <see cref="NoOpAuditWriter"/> —
+    /// using TryAdd so a consumer that has already registered a real writer/redactor wins. Consumers
+    /// compose <see cref="RedactingAuditWriter"/>/<see cref="CompositeAuditWriter"/> around their own sink.
+    /// </summary>
+    public static IServiceCollection AddZbAudit(this IServiceCollection services)
+    {
+        ArgumentNullException.ThrowIfNull(services);
+        services.TryAddSingleton<IAuditRedactor>(NullAuditRedactor.Instance);
+        services.TryAddSingleton<IAuditWriter>(NoOpAuditWriter.Instance);
+        return services;
+    }
+}
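The TryAdd behaviour described in the `AddZbAudit` summary can be sketched as follows. This assumes a host project that references both `ZB.MOM.WW.Audit` and the `Microsoft.Extensions.DependencyInjection` package; `ConsoleAuditWriter` is a hypothetical sink written only for this example and is not part of the library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.Audit;

var services = new ServiceCollection();

// Registered before AddZbAudit, so the TryAdd default for IAuditWriter is skipped.
services.AddSingleton<IAuditWriter, ConsoleAuditWriter>();
services.AddZbAudit();

using var provider = services.BuildServiceProvider();

// The consumer's writer wins; the redactor falls back to the library default.
Console.WriteLine(provider.GetRequiredService<IAuditWriter>().GetType().Name);   // ConsoleAuditWriter
Console.WriteLine(provider.GetRequiredService<IAuditRedactor>().GetType().Name); // NullAuditRedactor

// Hypothetical sink used only for this sketch.
sealed class ConsoleAuditWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        Console.WriteLine($"{evt.OccurredAtUtc:O} {evt.Actor} {evt.Action} {evt.Outcome}");
        return Task.CompletedTask;
    }
}
```

Order matters here: if the consumer registered its writer after `AddZbAudit`, both registrations would exist and the container would resolve the last one, which happens to give the same result but no longer relies on TryAdd.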
@@ -0,0 +1,28 @@
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>Fans an event out to several writers. Best-effort: a failing writer does not stop the others.</summary>
+/// <remarks>A failing writer's exception is swallowed so the fan-out drains and the caller is never
+/// aborted — but <see cref="OperationCanceledException"/> is re-thrown so cancellation is honored.</remarks>
+public sealed class CompositeAuditWriter : IAuditWriter
+{
+    private readonly IReadOnlyList<IAuditWriter> _inner;
+
+    /// <summary>Creates a composite over the given writers.</summary>
+    public CompositeAuditWriter(IEnumerable<IAuditWriter> inner)
+    {
+        ArgumentNullException.ThrowIfNull(inner);
+        _inner = inner.ToArray();
+    }
+
+    /// <inheritdoc />
+    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
+    {
+        ArgumentNullException.ThrowIfNull(evt);
+        foreach (var writer in _inner)
+        {
+            try { await writer.WriteAsync(evt, ct).ConfigureAwait(false); }
+            catch (OperationCanceledException) { throw; } // honor cancellation; do not swallow
+            catch { /* best-effort seam: a failing writer must not stop the others or the caller */ }
+        }
+    }
+}
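To illustrate the best-effort fan-out in `CompositeAuditWriter`, the sketch below puts a writer that always throws in front of one that records events, and checks that the second still receives the event. It assumes a project reference to `ZB.MOM.WW.Audit`; `ThrowingWriter` and `ListWriter` are hypothetical helpers, and the event field values are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.Audit;

var recorded = new List<AuditEvent>();
var composite = new CompositeAuditWriter(new IAuditWriter[]
{
    new ThrowingWriter(),     // fails on every call
    new ListWriter(recorded), // should still receive the event
});

// The exception from ThrowingWriter is swallowed inside the composite.
await composite.WriteAsync(new AuditEvent
{
    EventId = Guid.NewGuid(),
    OccurredAtUtc = DateTimeOffset.UtcNow,
    Actor = "svc-gateway",
    Action = "session.open",
    Outcome = AuditOutcome.Denied,
});

Console.WriteLine(recorded.Count); // 1

// Hypothetical writers, defined only for this sketch.
sealed class ThrowingWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) =>
        throw new InvalidOperationException("sink unavailable");
}

sealed class ListWriter(List<AuditEvent> store) : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        store.Add(evt);
        return Task.CompletedTask;
    }
}
```

The writers run sequentially in registration order, so a slow sink delays the ones after it; whether that matters is a per-project decision.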
@@ -0,0 +1,13 @@
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>
+/// Filters an <see cref="AuditEvent"/> between construction and persistence — truncates oversized
+/// fields and scrubs sensitive content. Pure function: returns a filtered COPY and MUST NOT throw
+/// (over-redact on internal failure). Shaped to mirror Telemetry's <c>ILogRedactor</c> so a future
+/// ZB.MOM.WW.Hosting aggregator can wire both consistently; intentionally has no dependency on it.
+/// </summary>
+public interface IAuditRedactor
+{
+    /// <summary>Apply the configured truncation/redaction policy and return a filtered copy.</summary>
+    AuditEvent Apply(AuditEvent rawEvent);
+}
@@ -0,0 +1,12 @@
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>
+/// Best-effort sink for <see cref="AuditEvent"/>s. Implementations MUST swallow/log internal
+/// failures rather than propagating them — a failed audit write must never abort the
+/// user-facing action that produced it.
+/// </summary>
+public interface IAuditWriter
+{
+    /// <summary>Persist an audit event. Best-effort; must not throw to the caller.</summary>
+    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
+}
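One way a per-project sink might honor the `IAuditWriter` best-effort contract is sketched below. `JsonLinesAuditWriter` is hypothetical and not part of the library; it assumes a project reference to `ZB.MOM.WW.Audit` and uses a JSON-lines file purely as a stand-in for whatever transport a project actually uses.

```csharp
using System;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.Audit;

// Hypothetical sink: appends one JSON line per event to a local file.
sealed class JsonLinesAuditWriter(string path) : IAuditWriter
{
    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        try
        {
            var line = JsonSerializer.Serialize(evt) + Environment.NewLine;
            await File.AppendAllTextAsync(path, line, ct).ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            // Assumption: cancellation is surfaced, mirroring CompositeAuditWriter in this diff.
            throw;
        }
        catch (Exception ex)
        {
            // Swallow and report: a failed audit write must not abort the calling action.
            Console.Error.WriteLine($"audit write failed: {ex.Message}");
        }
    }
}
```

Re-throwing on cancellation is a judgment call that follows the library's own composite; a project that reads "must not throw" strictly could swallow that case as well.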
@@ -0,0 +1,12 @@
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>Writer that discards events. Default when audit is disabled, and useful in tests.</summary>
+public sealed class NoOpAuditWriter : IAuditWriter
+{
+    /// <summary>Shared singleton instance.</summary>
+    public static readonly NoOpAuditWriter Instance = new();
+    private NoOpAuditWriter() { }
+
+    /// <inheritdoc />
+    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) => Task.CompletedTask;
+}
@@ -0,0 +1,12 @@
+namespace ZB.MOM.WW.Audit;
+
+/// <summary>Identity redactor — returns the event unchanged. The default when no policy is configured.</summary>
+public sealed class NullAuditRedactor : IAuditRedactor
+{
+    /// <summary>Shared singleton instance.</summary>
+    public static readonly NullAuditRedactor Instance = new();
+    private NullAuditRedactor() { }
+
+    /// <inheritdoc />
+    public AuditEvent Apply(AuditEvent rawEvent) => rawEvent;
+}
@@ -0,0 +1,24 @@
|
||||
namespace ZB.MOM.WW.Audit;

/// <summary>Decorator: applies an <see cref="IAuditRedactor"/>, then delegates to an inner <see cref="IAuditWriter"/>.</summary>
public sealed class RedactingAuditWriter : IAuditWriter
{
    private readonly IAuditRedactor _redactor;
    private readonly IAuditWriter _inner;

    /// <summary>Creates the decorator around <paramref name="inner"/> using <paramref name="redactor"/>.</summary>
    public RedactingAuditWriter(IAuditRedactor redactor, IAuditWriter inner)
    {
        ArgumentNullException.ThrowIfNull(redactor);
        ArgumentNullException.ThrowIfNull(inner);
        _redactor = redactor;
        _inner = inner;
    }

    /// <inheritdoc />
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        ArgumentNullException.ThrowIfNull(evt);
        return _inner.WriteAsync(_redactor.Apply(evt), ct);
    }
}
|
||||
@@ -0,0 +1,41 @@
|
||||
namespace ZB.MOM.WW.Audit;

/// <summary>
/// Redactor that caps oversized <see cref="AuditEvent.DetailsJson"/> and <see cref="AuditEvent.Target"/>.
/// Never throws — over-redacts (drops DetailsJson) on internal failure. The secret-field policy
/// (which fields are sensitive) stays per-project; compose this with a project redactor as needed.
/// </summary>
public sealed class TruncatingAuditRedactor : IAuditRedactor
{
    private readonly TruncatingAuditRedactorOptions _options;

    /// <summary>Creates the redactor with the given options (defaults when null).</summary>
    public TruncatingAuditRedactor(TruncatingAuditRedactorOptions? options = null)
        => _options = options ?? new TruncatingAuditRedactorOptions();

    /// <inheritdoc />
    public AuditEvent Apply(AuditEvent rawEvent)
    {
        try
        {
            return rawEvent with
            {
                Target = Truncate(rawEvent.Target, _options.MaxTargetLength),
                DetailsJson = Truncate(rawEvent.DetailsJson, _options.MaxDetailsJsonLength),
            };
        }
        catch
        {
            // Hard contract: never throw. Over-redact on internal failure.
            return rawEvent with { DetailsJson = null };
        }
    }

    private string? Truncate(string? value, int max)
    {
        if (value is null || value.Length <= max) return value;
        var marker = _options.TruncationMarker;
        if (marker.Length >= max) return marker[..max];
        return string.Concat(value.AsSpan(0, max - marker.Length), marker);
    }
}
|
||||
@@ -0,0 +1,12 @@
|
||||
namespace ZB.MOM.WW.Audit;

/// <summary>Caps for <see cref="TruncatingAuditRedactor"/>.</summary>
public sealed class TruncatingAuditRedactorOptions
{
    /// <summary>Max length of <see cref="AuditEvent.DetailsJson"/> before truncation. Default 4096.</summary>
    public int MaxDetailsJsonLength { get; set; } = 4096;
    /// <summary>Max length of <see cref="AuditEvent.Target"/> before truncation. Default 512.</summary>
    public int MaxTargetLength { get; set; } = 512;
    /// <summary>Marker appended to a truncated value. Default "…[truncated]".</summary>
    public string TruncationMarker { get; set; } = "…[truncated]";
}
|
||||
@@ -0,0 +1,18 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>
  <PropertyGroup>
    <IsPackable>true</IsPackable>
    <PackageId>ZB.MOM.WW.Audit</PackageId>
    <Authors>ZB.MOM.WW</Authors>
    <Description>Canonical audit event model + best-effort writer and redactor seams for the ZB.MOM.WW SCADA family.</Description>
    <PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-audit</PackageProjectUrl>
    <RepositoryUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-audit</RepositoryUrl>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.DependencyInjection.Abstractions" />
  </ItemGroup>
</Project>
|
||||
@@ -0,0 +1,67 @@
|
||||
namespace ZB.MOM.WW.Audit.Tests;

public class AuditEventTests
{
    private static AuditEvent Minimal() => new()
    {
        EventId = Guid.NewGuid(),
        OccurredAtUtc = DateTimeOffset.UtcNow,
        Actor = "alice",
        Action = "ConfigPublished",
        Outcome = AuditOutcome.Success,
    };

    [Fact]
    public void Required_core_fields_round_trip()
    {
        var id = Guid.NewGuid();
        var evt = Minimal() with { EventId = id, Actor = "svc", Action = "ApiCall", Outcome = AuditOutcome.Denied };
        Assert.Equal(id, evt.EventId);
        Assert.Equal("svc", evt.Actor);
        Assert.Equal("ApiCall", evt.Action);
        Assert.Equal(AuditOutcome.Denied, evt.Outcome);
    }

    [Fact]
    public void OccurredAtUtc_is_normalized_to_utc()
    {
        var local = new DateTimeOffset(2026, 6, 1, 12, 0, 0, TimeSpan.FromHours(5));
        var evt = Minimal() with { OccurredAtUtc = local };
        Assert.Equal(TimeSpan.Zero, evt.OccurredAtUtc.Offset);
        Assert.Equal(local.UtcDateTime, evt.OccurredAtUtc.UtcDateTime);
    }

    [Fact]
    public void Optional_fields_default_to_null()
    {
        var evt = Minimal();
        Assert.Null(evt.Category);
        Assert.Null(evt.Target);
        Assert.Null(evt.SourceNode);
        Assert.Null(evt.CorrelationId);
        Assert.Null(evt.DetailsJson);
    }

    [Fact]
    public void Records_with_same_values_are_equal()
    {
        var id = Guid.NewGuid();
        var when = DateTimeOffset.UtcNow;
        AuditEvent Make() => new() { EventId = id, OccurredAtUtc = when, Actor = "a", Action = "x", Outcome = AuditOutcome.Success };
        Assert.Equal(Make(), Make());
    }

    [Fact]
    public void Same_instant_at_different_offset_compares_equal()
    {
        // Guards the UTC-normalizing init-setter: if OccurredAtUtc is ever "simplified" back to a
        // plain auto-property, these two (same instant, different offset) would stop comparing equal.
        var id = Guid.NewGuid();
        var utc = new DateTimeOffset(2026, 6, 1, 7, 0, 0, TimeSpan.Zero);
        var plus5 = new DateTimeOffset(2026, 6, 1, 12, 0, 0, TimeSpan.FromHours(5)); // same instant as utc
        AuditEvent With(DateTimeOffset when) =>
            new() { EventId = id, OccurredAtUtc = when, Actor = "a", Action = "x", Outcome = AuditOutcome.Success };
        Assert.Equal(With(utc), With(plus5));
        Assert.Equal(With(utc).GetHashCode(), With(plus5).GetHashCode());
    }
}
|
||||
@@ -0,0 +1,32 @@
|
||||
using Microsoft.Extensions.DependencyInjection;

namespace ZB.MOM.WW.Audit.Tests;

public class AuditServiceCollectionExtensionsTests
{
    [Fact]
    public void Registers_null_redactor_and_noop_writer_by_default()
    {
        var sp = new ServiceCollection().AddZbAudit().BuildServiceProvider();
        Assert.IsType<NullAuditRedactor>(sp.GetRequiredService<IAuditRedactor>());
        Assert.IsType<NoOpAuditWriter>(sp.GetRequiredService<IAuditWriter>());
    }

    [Fact]
    public void Does_not_override_a_preregistered_writer()
    {
        var services = new ServiceCollection();
        services.AddSingleton<IAuditWriter>(new CompositeAuditWriter(System.Array.Empty<IAuditWriter>()));
        var sp = services.AddZbAudit().BuildServiceProvider();
        Assert.IsType<CompositeAuditWriter>(sp.GetRequiredService<IAuditWriter>());
    }

    [Fact]
    public void Does_not_override_a_preregistered_redactor()
    {
        var services = new ServiceCollection();
        services.AddSingleton<IAuditRedactor>(new TruncatingAuditRedactor());
        var sp = services.AddZbAudit().BuildServiceProvider();
        Assert.IsType<TruncatingAuditRedactor>(sp.GetRequiredService<IAuditRedactor>());
    }
}
|
||||
@@ -0,0 +1,48 @@
|
||||
namespace ZB.MOM.WW.Audit.Tests;

public class CompositeAuditWriterTests
{
    private sealed class RecordingWriter : IAuditWriter
    {
        public int Count;
        public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) { Count++; return Task.CompletedTask; }
    }
    private sealed class ThrowingWriter : IAuditWriter
    {
        public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) => throw new InvalidOperationException("boom");
    }
    private sealed class CancellingWriter : IAuditWriter
    {
        public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) => throw new OperationCanceledException();
    }

    private static AuditEvent Evt() => new() { EventId = Guid.NewGuid(), OccurredAtUtc = DateTimeOffset.UtcNow,
        Actor = "a", Action = "x", Outcome = AuditOutcome.Success };

    [Fact]
    public async Task Fans_out_to_all_writers()
    {
        var a = new RecordingWriter(); var b = new RecordingWriter();
        await new CompositeAuditWriter(new IAuditWriter[] { a, b }).WriteAsync(Evt());
        Assert.Equal(1, a.Count);
        Assert.Equal(1, b.Count);
    }

    [Fact]
    public async Task One_failing_writer_does_not_stop_the_others()
    {
        var after = new RecordingWriter();
        var sut = new CompositeAuditWriter(new IAuditWriter[] { new ThrowingWriter(), after });
        await sut.WriteAsync(Evt()); // must not throw
        Assert.Equal(1, after.Count);
    }

    [Fact]
    public async Task Cancellation_is_propagated_not_swallowed()
    {
        // OperationCanceledException is re-thrown (unlike ordinary writer failures, which are swallowed).
        var after = new RecordingWriter();
        var sut = new CompositeAuditWriter(new IAuditWriter[] { new CancellingWriter(), after });
        await Assert.ThrowsAsync<OperationCanceledException>(() => sut.WriteAsync(Evt()));
    }
}
|
||||
@@ -0,0 +1,12 @@
|
||||
namespace ZB.MOM.WW.Audit.Tests;

public class NoOpAuditWriterTests
{
    [Fact]
    public async Task WriteAsync_completes_without_error()
    {
        var evt = new AuditEvent { EventId = Guid.NewGuid(), OccurredAtUtc = DateTimeOffset.UtcNow,
            Actor = "a", Action = "x", Outcome = AuditOutcome.Success };
        await NoOpAuditWriter.Instance.WriteAsync(evt);
    }
}
|
||||
@@ -0,0 +1,12 @@
|
||||
namespace ZB.MOM.WW.Audit.Tests;

public class NullAuditRedactorTests
{
    [Fact]
    public void Apply_returns_input_unchanged()
    {
        var evt = new AuditEvent { EventId = Guid.NewGuid(), OccurredAtUtc = DateTimeOffset.UtcNow,
            Actor = "a", Action = "x", Outcome = AuditOutcome.Success, DetailsJson = "{\"k\":1}" };
        Assert.Same(evt, NullAuditRedactor.Instance.Apply(evt));
    }
}
|
||||
@@ -0,0 +1,26 @@
|
||||
namespace ZB.MOM.WW.Audit.Tests;

public class RedactingAuditWriterTests
{
    private sealed class CapturingWriter : IAuditWriter
    {
        public AuditEvent? Last;
        public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) { Last = evt; return Task.CompletedTask; }
    }
    private sealed class StampRedactor : IAuditRedactor
    {
        public AuditEvent Apply(AuditEvent rawEvent) => rawEvent with { DetailsJson = "redacted" };
    }

    private static AuditEvent Evt() => new() { EventId = Guid.NewGuid(), OccurredAtUtc = DateTimeOffset.UtcNow,
        Actor = "a", Action = "x", Outcome = AuditOutcome.Success, DetailsJson = "secret" };

    [Fact]
    public async Task Inner_writer_receives_the_redacted_event()
    {
        var inner = new CapturingWriter();
        var sut = new RedactingAuditWriter(new StampRedactor(), inner);
        await sut.WriteAsync(Evt());
        Assert.Equal("redacted", inner.Last!.DetailsJson);
    }
}
|
||||
@@ -0,0 +1,56 @@
|
||||
namespace ZB.MOM.WW.Audit.Tests;

public class TruncatingAuditRedactorTests
{
    private static AuditEvent Evt(string? details, string? target = null) => new()
    {
        EventId = Guid.NewGuid(), OccurredAtUtc = DateTimeOffset.UtcNow,
        Actor = "a", Action = "x", Outcome = AuditOutcome.Success,
        DetailsJson = details, Target = target,
    };

    [Fact]
    public void Short_values_pass_through_unchanged()
    {
        var r = new TruncatingAuditRedactor(new() { MaxDetailsJsonLength = 100 });
        var evt = Evt("small");
        Assert.Equal("small", r.Apply(evt).DetailsJson);
    }

    [Fact]
    public void Oversized_details_are_truncated_with_marker()
    {
        var opts = new TruncatingAuditRedactorOptions { MaxDetailsJsonLength = 10, TruncationMarker = "~" };
        var r = new TruncatingAuditRedactor(opts);
        var result = r.Apply(Evt(new string('x', 50)));
        Assert.Equal(10, result.DetailsJson!.Length);
        Assert.EndsWith("~", result.DetailsJson);
    }

    [Fact]
    public void Oversized_target_is_truncated()
    {
        var r = new TruncatingAuditRedactor(new() { MaxTargetLength = 5, TruncationMarker = "" });
        var result = r.Apply(Evt(null, target: "abcdefghij"));
        Assert.Equal(5, result.Target!.Length);
    }

    [Fact]
    public void Null_fields_are_left_null()
    {
        var r = new TruncatingAuditRedactor();
        var result = r.Apply(Evt(null));
        Assert.Null(result.DetailsJson);
        Assert.Null(result.Target);
    }

    [Fact]
    public void Marker_longer_than_max_clips_the_marker_itself()
    {
        // Misconfiguration: marker longer than the cap. Must not throw; clips to the first max chars.
        var opts = new TruncatingAuditRedactorOptions { MaxDetailsJsonLength = 3, TruncationMarker = "…[truncated]" };
        var r = new TruncatingAuditRedactor(opts);
        var result = r.Apply(Evt(new string('x', 20)));
        Assert.Equal(3, result.DetailsJson!.Length);
    }
}
|
||||
@@ -0,0 +1,18 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <IsPackable>false</IsPackable>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="coverlet.collector" />
    <PackageReference Include="Microsoft.NET.Test.Sdk" />
    <PackageReference Include="xunit" />
    <PackageReference Include="xunit.runner.visualstudio" />
    <PackageReference Include="Microsoft.Extensions.DependencyInjection" />
  </ItemGroup>
  <ItemGroup>
    <Using Include="Xunit" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\..\src\ZB.MOM.WW.Audit\ZB.MOM.WW.Audit.csproj" />
  </ItemGroup>
</Project>
|
||||
@@ -0,0 +1,482 @@
|
||||
## Ignore Visual Studio temporary files, build results, and
|
||||
## files generated by popular Visual Studio add-ons.
|
||||
##
|
||||
## Get latest from `dotnet new gitignore`
|
||||
|
||||
# dotenv files
|
||||
.env
|
||||
|
||||
# User-specific files
|
||||
*.rsuser
|
||||
*.suo
|
||||
*.user
|
||||
*.userosscache
|
||||
*.sln.docstates
|
||||
|
||||
# User-specific files (MonoDevelop/Xamarin Studio)
|
||||
*.userprefs
|
||||
|
||||
# Mono auto generated files
|
||||
mono_crash.*
|
||||
|
||||
# Build results
|
||||
[Dd]ebug/
|
||||
[Dd]ebugPublic/
|
||||
[Rr]elease/
|
||||
[Rr]eleases/
|
||||
x64/
|
||||
x86/
|
||||
[Ww][Ii][Nn]32/
|
||||
[Aa][Rr][Mm]/
|
||||
[Aa][Rr][Mm]64/
|
||||
bld/
|
||||
[Bb]in/
|
||||
[Oo]bj/
|
||||
[Ll]og/
|
||||
[Ll]ogs/
|
||||
|
||||
# Visual Studio 2015/2017 cache/options directory
|
||||
.vs/
|
||||
# Uncomment if you have tasks that create the project's static files in wwwroot
|
||||
#wwwroot/
|
||||
|
||||
# Visual Studio 2017 auto generated files
|
||||
Generated\ Files/
|
||||
|
||||
# MSTest test Results
|
||||
[Tt]est[Rr]esult*/
|
||||
[Bb]uild[Ll]og.*
|
||||
|
||||
# NUnit
|
||||
*.VisualState.xml
|
||||
TestResult.xml
|
||||
nunit-*.xml
|
||||
|
||||
# Build Results of an ATL Project
|
||||
[Dd]ebugPS/
|
||||
[Rr]eleasePS/
|
||||
dlldata.c
|
||||
|
||||
# Benchmark Results
|
||||
BenchmarkDotNet.Artifacts/
|
||||
|
||||
# .NET
|
||||
project.lock.json
|
||||
project.fragment.lock.json
|
||||
artifacts/
|
||||
|
||||
# Tye
|
||||
.tye/
|
||||
|
||||
# ASP.NET Scaffolding
|
||||
ScaffoldingReadMe.txt
|
||||
|
||||
# StyleCop
|
||||
StyleCopReport.xml
|
||||
|
||||
# Files built by Visual Studio
|
||||
*_i.c
|
||||
*_p.c
|
||||
*_h.h
|
||||
*.ilk
|
||||
*.meta
|
||||
*.obj
|
||||
*.iobj
|
||||
*.pch
|
||||
*.pdb
|
||||
*.ipdb
|
||||
*.pgc
|
||||
*.pgd
|
||||
*.rsp
|
||||
# but not Directory.Build.rsp, as it configures directory-level build defaults
|
||||
!Directory.Build.rsp
|
||||
*.sbr
|
||||
*.tlb
|
||||
*.tli
|
||||
*.tlh
|
||||
*.tmp
|
||||
*.tmp_proj
|
||||
*_wpftmp.csproj
|
||||
*.log
|
||||
*.tlog
|
||||
*.vspscc
|
||||
*.vssscc
|
||||
.builds
|
||||
*.pidb
|
||||
*.svclog
|
||||
*.scc
|
||||
|
||||
# Chutzpah Test files
|
||||
_Chutzpah*
|
||||
|
||||
# Visual C++ cache files
|
||||
ipch/
|
||||
*.aps
|
||||
*.ncb
|
||||
*.opendb
|
||||
*.opensdf
|
||||
*.sdf
|
||||
*.cachefile
|
||||
*.VC.db
|
||||
*.VC.VC.opendb
|
||||
|
||||
# Visual Studio profiler
|
||||
*.psess
|
||||
*.vsp
|
||||
*.vspx
|
||||
*.sap
|
||||
|
||||
# Visual Studio Trace Files
|
||||
*.e2e
|
||||
|
||||
# TFS 2012 Local Workspace
|
||||
$tf/
|
||||
|
||||
# Guidance Automation Toolkit
|
||||
*.gpState
|
||||
|
||||
# ReSharper is a .NET coding add-in
|
||||
_ReSharper*/
|
||||
*.[Rr]e[Ss]harper
|
||||
*.DotSettings.user
|
||||
|
||||
# TeamCity is a build add-in
|
||||
_TeamCity*
|
||||
|
||||
# DotCover is a Code Coverage Tool
|
||||
*.dotCover
|
||||
|
||||
# AxoCover is a Code Coverage Tool
|
||||
.axoCover/*
|
||||
!.axoCover/settings.json
|
||||
|
||||
# Coverlet is a free, cross platform Code Coverage Tool
|
||||
coverage*.json
|
||||
coverage*.xml
|
||||
coverage*.info
|
||||
|
||||
# Visual Studio code coverage results
|
||||
*.coverage
|
||||
*.coveragexml
|
||||
|
||||
# NCrunch
|
||||
_NCrunch_*
|
||||
.*crunch*.local.xml
|
||||
nCrunchTemp_*
|
||||
|
||||
# MightyMoose
|
||||
*.mm.*
|
||||
AutoTest.Net/
|
||||
|
||||
# Web workbench (sass)
|
||||
.sass-cache/
|
||||
|
||||
# Installshield output folder
|
||||
[Ee]xpress/
|
||||
|
||||
# DocProject is a documentation generator add-in
|
||||
DocProject/buildhelp/
|
||||
DocProject/Help/*.HxT
|
||||
DocProject/Help/*.HxC
|
||||
DocProject/Help/*.hhc
|
||||
DocProject/Help/*.hhk
|
||||
DocProject/Help/*.hhp
|
||||
DocProject/Help/Html2
|
||||
DocProject/Help/html
|
||||
|
||||
# Click-Once directory
|
||||
publish/
|
||||
|
||||
# Publish Web Output
|
||||
*.[Pp]ublish.xml
|
||||
*.azurePubxml
|
||||
# Note: Comment the next line if you want to checkin your web deploy settings,
|
||||
# but database connection strings (with potential passwords) will be unencrypted
|
||||
*.pubxml
|
||||
*.publishproj
|
||||
|
||||
# Microsoft Azure Web App publish settings. Comment the next line if you want to
|
||||
# checkin your Azure Web App publish settings, but sensitive information contained
|
||||
# in these scripts will be unencrypted
|
||||
PublishScripts/
|
||||
|
||||
# NuGet Packages
|
||||
*.nupkg
|
||||
# NuGet Symbol Packages
|
||||
*.snupkg
|
||||
# The packages folder can be ignored because of Package Restore
|
||||
**/[Pp]ackages/*
|
||||
# except build/, which is used as an MSBuild target.
|
||||
!**/[Pp]ackages/build/
|
||||
# Uncomment if necessary however generally it will be regenerated when needed
|
||||
#!**/[Pp]ackages/repositories.config
|
||||
# NuGet v3's project.json files produces more ignorable files
|
||||
*.nuget.props
|
||||
*.nuget.targets
|
||||
|
||||
# Microsoft Azure Build Output
|
||||
csx/
|
||||
*.build.csdef
|
||||
|
||||
# Microsoft Azure Emulator
|
||||
ecf/
|
||||
rcf/
|
||||
|
||||
# Windows Store app package directories and files
|
||||
AppPackages/
|
||||
BundleArtifacts/
|
||||
Package.StoreAssociation.xml
|
||||
_pkginfo.txt
|
||||
*.appx
|
||||
*.appxbundle
|
||||
*.appxupload
|
||||
|
||||
# Visual Studio cache files
|
||||
# files ending in .cache can be ignored
|
||||
*.[Cc]ache
|
||||
# but keep track of directories ending in .cache
|
||||
!?*.[Cc]ache/
|
||||
|
||||
# Others
|
||||
ClientBin/
|
||||
~$*
|
||||
*~
|
||||
*.dbmdl
|
||||
*.dbproj.schemaview
|
||||
*.jfm
|
||||
*.pfx
|
||||
*.publishsettings
|
||||
orleans.codegen.cs
|
||||
|
||||
# Including strong name files can present a security risk
|
||||
# (https://github.com/github/gitignore/pull/2483#issue-259490424)
|
||||
#*.snk
|
||||
|
||||
# Since there are multiple workflows, uncomment next line to ignore bower_components
|
||||
# (https://github.com/github/gitignore/pull/1529#issuecomment-104372622)
|
||||
#bower_components/
|
||||
|
||||
# RIA/Silverlight projects
|
||||
Generated_Code/
|
||||
|
||||
# Backup & report files from converting an old project file
|
||||
# to a newer Visual Studio version. Backup files are not needed,
|
||||
# because we have git ;-)
|
||||
_UpgradeReport_Files/
|
||||
Backup*/
|
||||
UpgradeLog*.XML
|
||||
UpgradeLog*.htm
|
||||
ServiceFabricBackup/
|
||||
*.rptproj.bak
|
||||
|
||||
# SQL Server files
|
||||
*.mdf
|
||||
*.ldf
|
||||
*.ndf
|
||||
|
||||
# Business Intelligence projects
|
||||
*.rdl.data
|
||||
*.bim.layout
|
||||
*.bim_*.settings
|
||||
*.rptproj.rsuser
|
||||
*- [Bb]ackup.rdl
|
||||
*- [Bb]ackup ([0-9]).rdl
|
||||
*- [Bb]ackup ([0-9][0-9]).rdl
|
||||
|
||||
# Microsoft Fakes
|
||||
FakesAssemblies/
|
||||
|
||||
# GhostDoc plugin setting file
|
||||
*.GhostDoc.xml
|
||||
|
||||
# Node.js Tools for Visual Studio
|
||||
.ntvs_analysis.dat
|
||||
node_modules/
|
||||
|
||||
# Visual Studio 6 build log
|
||||
*.plg
|
||||
|
||||
# Visual Studio 6 workspace options file
|
||||
*.opt
|
||||
|
||||
# Visual Studio 6 auto-generated workspace file (contains which files were open etc.)
|
||||
*.vbw
|
||||
|
||||
# Visual Studio 6 auto-generated project file (contains which files were open etc.)
|
||||
*.vbp
|
||||
|
||||
# Visual Studio 6 workspace and project file (working project files containing files to include in project)
|
||||
*.dsw
|
||||
*.dsp
|
||||
|
||||
# Visual Studio 6 technical files
|
||||
*.ncb
|
||||
*.aps
|
||||
|
||||
# Visual Studio LightSwitch build output
|
||||
**/*.HTMLClient/GeneratedArtifacts
|
||||
**/*.DesktopClient/GeneratedArtifacts
|
||||
**/*.DesktopClient/ModelManifest.xml
|
||||
**/*.Server/GeneratedArtifacts
|
||||
**/*.Server/ModelManifest.xml
|
||||
_Pvt_Extensions
|
||||
|
||||
# Paket dependency manager
|
||||
.paket/paket.exe
|
||||
paket-files/
|
||||
|
||||
# FAKE - F# Make
|
||||
.fake/
|
||||
|
||||
# CodeRush personal settings
|
||||
.cr/personal
|
||||
|
||||
# Python Tools for Visual Studio (PTVS)
|
||||
__pycache__/
|
||||
*.pyc
|
||||
|
||||
# Cake - Uncomment if you are using it
|
||||
# tools/**
|
||||
# !tools/packages.config
|
||||
|
||||
# Tabs Studio
|
||||
*.tss
|
||||
|
||||
# Telerik's JustMock configuration file
|
||||
*.jmconfig
|
||||
|
||||
# BizTalk build output
|
||||
*.btp.cs
|
||||
*.btm.cs
|
||||
*.odx.cs
|
||||
*.xsd.cs
|
||||
|
||||
# OpenCover UI analysis results
|
||||
OpenCover/
|
||||
|
||||
# Azure Stream Analytics local run output
|
||||
ASALocalRun/
|
||||
|
||||
# MSBuild Binary and Structured Log
|
||||
*.binlog
|
||||
|
||||
# NVidia Nsight GPU debugger configuration file
|
||||
*.nvuser
|
||||
|
||||
# MFractors (Xamarin productivity tool) working folder
|
||||
.mfractor/
|
||||
|
||||
# Local History for Visual Studio
|
||||
.localhistory/
|
||||
|
||||
# Visual Studio History (VSHistory) files
|
||||
.vshistory/
|
||||
|
||||
# BeatPulse healthcheck temp database
|
||||
healthchecksdb
|
||||
|
||||
# Backup folder for Package Reference Convert tool in Visual Studio 2017
|
||||
MigrationBackup/
|
||||
|
||||
# Ionide (cross platform F# VS Code tools) working folder
|
||||
.ionide/
|
||||
|
||||
# Fody - auto-generated XML schema
|
||||
FodyWeavers.xsd
|
||||
|
||||
# VS Code files for those working on multiple tools
|
||||
.vscode/*
|
||||
!.vscode/settings.json
|
||||
!.vscode/tasks.json
|
||||
!.vscode/launch.json
|
||||
!.vscode/extensions.json
|
||||
*.code-workspace
|
||||
|
||||
# Local History for Visual Studio Code
|
||||
.history/
|
||||
|
||||
# Windows Installer files from build outputs
|
||||
*.cab
|
||||
*.msi
|
||||
*.msix
|
||||
*.msm
|
||||
*.msp
|
||||
|
||||
# JetBrains Rider
|
||||
*.sln.iml
|
||||
.idea/
|
||||
|
||||
##
|
||||
## Visual studio for Mac
|
||||
##
|
||||
|
||||
|
||||
# globs
|
||||
Makefile.in
|
||||
*.userprefs
|
||||
*.usertasks
|
||||
config.make
|
||||
config.status
|
||||
aclocal.m4
|
||||
install-sh
|
||||
autom4te.cache/
|
||||
*.tar.gz
|
||||
tarballs/
|
||||
test-results/
|
||||
|
||||
# content below from: https://github.com/github/gitignore/blob/main/Global/macOS.gitignore
|
||||
# General
|
||||
.DS_Store
|
||||
.AppleDouble
|
||||
.LSOverride
|
||||
|
||||
# Icon must end with two \r
|
||||
Icon
|
||||
|
||||
|
||||
# Thumbnails
|
||||
._*
|
||||
|
||||
# Files that might appear in the root of a volume
|
||||
.DocumentRevisions-V100
|
||||
.fseventsd
|
||||
.Spotlight-V100
|
||||
.TemporaryItems
|
||||
.Trashes
|
||||
.VolumeIcon.icns
|
||||
.com.apple.timemachine.donotpresent
|
||||
|
||||
# Directories potentially created on remote AFP share
|
||||
.AppleDB
|
||||
.AppleDesktop
|
||||
Network Trash Folder
|
||||
Temporary Items
|
||||
.apdisk
|
||||
|
||||
# content below from: https://github.com/github/gitignore/blob/main/Global/Windows.gitignore
|
||||
# Windows thumbnail cache files
|
||||
Thumbs.db
|
||||
ehthumbs.db
|
||||
ehthumbs_vista.db
|
||||
|
||||
# Dump file
|
||||
*.stackdump
|
||||
|
||||
# Folder config file
|
||||
[Dd]esktop.ini
|
||||
|
||||
# Recycle Bin used on file shares
|
||||
$RECYCLE.BIN/
|
||||
|
||||
# Windows Installer files
|
||||
*.cab
|
||||
*.msi
|
||||
*.msix
|
||||
*.msm
|
||||
*.msp
|
||||
|
||||
# Windows shortcuts
|
||||
*.lnk
|
||||
|
||||
# Vim temporary swap files
|
||||
*.swp
|
||||
@@ -0,0 +1,72 @@
|
||||
# ZB.MOM.WW.Health

Health-check libraries for the **ZB.MOM.WW SCADA family** (OtOpcUa, MxAccessGateway, ScadaBridge). These are **libraries, not a service** — each package is linked directly into the consuming application at build time. There is no central health process or network hop; probes run in-process alongside the application.

The library normalizes the three-tier health endpoint convention (`/health/ready`, `/health/active`, `/healthz`) and provides reusable probe implementations so the three sister projects share a common surface without duplicating probe logic.
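
As a rough illustration of the three-tier convention, the sketch below wires the three paths using only the stock ASP.NET Core health-check APIs. It does not use `MapZbHealth` or `ZbHealthTags`, whose signatures are not shown here. The tag strings (`live`, `ready`, `active`), the check names, and the meaning assigned to each tier are assumptions made for illustration, not the library's contract.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Register one placeholder check per tier. The tag names are hypothetical.
builder.Services.AddHealthChecks()
    .AddCheck("self", () => HealthCheckResult.Healthy(), tags: new[] { "live" })
    .AddCheck("dependencies", () => HealthCheckResult.Healthy("placeholder"), tags: new[] { "ready" })
    .AddCheck("active-node", () => HealthCheckResult.Healthy("placeholder"), tags: new[] { "active" });

var app = builder.Build();

// Each endpoint runs only the checks carrying its tier tag.
app.MapHealthChecks("/healthz", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("live"),
});
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("ready"),
});
app.MapHealthChecks("/health/active", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("active"),
});

app.Run();
```

Filtering by tag keeps a single check registry while letting each path answer a different question, which is presumably the kind of wiring a shared mapping extension would centralize.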

**Built at 0.1.0. NOT yet adopted by the three apps.** Adoption is tracked in `~/Desktop/scadaproj/components/health/GAPS.md`.

---

## Packages

| Package | Responsibilities | Key Dependencies |
|---|---|---|
| `ZB.MOM.WW.Health` | Core tier convention, `MapZbHealth` extension, canonical JSON writer (`ZbHealthWriter`), `IActiveNodeGate` seam, `GrpcDependencyHealthCheck` reachability probe, tier-tag constants (`ZbHealthTags`). No Akka or EF dependency. | `Microsoft.AspNetCore.App` (framework ref), `Grpc.Net.Client` |
| `ZB.MOM.WW.Health.Akka` | `AkkaClusterHealthCheck` with a configurable `AkkaClusterStatusPolicy` (presets: `Default` / `OtOpcUaCompat`), `ActiveNodeHealthCheck` with an optional role filter, and `AkkaActiveNodeGate` that backs `IActiveNodeGate` from cluster member state. | `ZB.MOM.WW.Health`, `Akka.Cluster` |
| `ZB.MOM.WW.Health.EntityFrameworkCore` | `DatabaseHealthCheck<TContext>` — resolves `IDbContextFactory<TContext>` when registered, else a scoped `TContext`; pool-safe. Default probe: `CanConnectAsync`. Optional `ProbeQuery` delegate for query-based validation. | `ZB.MOM.WW.Health`, `Microsoft.EntityFrameworkCore` |
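
To make the default `CanConnectAsync` probe in the last row concrete, here is a minimal sketch of a connectivity check built on the public EF Core and health-check abstractions. It is not the shipped `DatabaseHealthCheck<TContext>`: the class name `SketchDatabaseHealthCheck` is hypothetical, it only covers the `IDbContextFactory<TContext>` path, and it omits the scoped-context fallback and the `ProbeQuery` option.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical illustration only; not the library's implementation.
public sealed class SketchDatabaseHealthCheck<TContext> : IHealthCheck
    where TContext : DbContext
{
    private readonly IDbContextFactory<TContext> _factory;

    public SketchDatabaseHealthCheck(IDbContextFactory<TContext> factory) => _factory = factory;

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        try
        {
            // Create a short-lived context per probe and dispose it afterwards.
            await using var db = await _factory.CreateDbContextAsync(cancellationToken);
            var reachable = await db.Database.CanConnectAsync(cancellationToken);

            return reachable
                ? HealthCheckResult.Healthy("Database reachable.")
                : new HealthCheckResult(context.Registration.FailureStatus, "Database not reachable.");
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Report the failure as a health result instead of letting it escape the probe.
            return new HealthCheckResult(context.Registration.FailureStatus, "Database probe failed.", ex);
        }
    }
}
```

A check like this could be registered with `AddHealthChecks().AddCheck<SketchDatabaseHealthCheck<MyDbContext>>("database")`, where `MyDbContext` stands in for an application context registered through `AddDbContextFactory`.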

---

## Consumer matrix

| Consumer | `ZB.MOM.WW.Health` (core) | `ZB.MOM.WW.Health.Akka` | `ZB.MOM.WW.Health.EntityFrameworkCore` |
|---|:---:|:---:|:---:|
| **OtOpcUa** | yes | yes | yes |
| **MxAccessGateway** | yes | — | — |
| **ScadaBridge** | yes | yes | yes |

MxAccessGateway consumes the core package only — it has no Akka cluster and no EF DbContext. OtOpcUa and ScadaBridge consume all three packages.

---

## Build, test, and pack commands

```bash
# From ZB.MOM.WW.Health/

# Build
dotnet build ZB.MOM.WW.Health.slnx
dotnet build ZB.MOM.WW.Health.slnx -c Release

# Test (no external dependencies — no running Akka cluster, no database)
dotnet test ZB.MOM.WW.Health.slnx

# Pack (three .nupkg files land in artifacts/)
dotnet pack ZB.MOM.WW.Health.slnx -c Release -o ./artifacts
```

All three test assemblies run offline:

| Assembly | Tests |
|---|---|
| `ZB.MOM.WW.Health.Tests` | 20 |
| `ZB.MOM.WW.Health.Akka.Tests` | 32 |
| `ZB.MOM.WW.Health.EntityFrameworkCore.Tests` | 6 |
| **Total** | **58** |

`GeneratePackageOnBuild` is off — pack explicitly with the command above.

---

## Status

Built at **0.1.0** and published to the Gitea NuGet feed. **Not yet adopted** by the three apps — adoption is tracked in the component backlog:

- `~/Desktop/scadaproj/components/health/GAPS.md` — adoption order, effort, and risk

Design documentation:

- `~/Desktop/scadaproj/components/health/spec/SPEC.md` — normalized three-tier target
- `~/Desktop/scadaproj/components/health/shared-contract/ZB.MOM.WW.Health.md` — proposed API (aligned to shipped code)
- `~/Desktop/scadaproj/components/health/current-state/` — per-project current state (code-verified)
||||
@@ -0,0 +1,12 @@
|
||||
<Project>

  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <LangVersion>latest</LangVersion>
    <Version>0.1.0</Version>
    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
  </PropertyGroup>

</Project>
|
||||
@@ -0,0 +1,31 @@
|
||||
<Project>

  <PropertyGroup>
    <ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
  </PropertyGroup>

  <ItemGroup>
    <!-- Akka -->
    <PackageVersion Include="Akka.Cluster" Version="1.5.62" />
    <PackageVersion Include="Akka.TestKit.Xunit2" Version="1.5.62" />

    <!-- Health Checks / ASP.NET Core -->
    <PackageVersion Include="Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions" Version="10.0.7" />
    <PackageVersion Include="Microsoft.AspNetCore.Mvc.Testing" Version="10.0.7" />

    <!-- gRPC -->
    <PackageVersion Include="Grpc.Net.Client" Version="2.71.0" />

    <!-- Entity Framework Core -->
    <PackageVersion Include="Microsoft.EntityFrameworkCore" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.Sqlite" Version="10.0.7" />
    <PackageVersion Include="Microsoft.EntityFrameworkCore.InMemory" Version="10.0.7" />

    <!-- Test -->
    <PackageVersion Include="Microsoft.NET.Test.Sdk" Version="17.14.1" />
    <PackageVersion Include="xunit" Version="2.9.3" />
    <PackageVersion Include="xunit.runner.visualstudio" Version="3.1.4" />
    <PackageVersion Include="coverlet.collector" Version="6.0.4" />
  </ItemGroup>

</Project>
|
||||
@@ -0,0 +1,84 @@
|
||||
# ZB.MOM.WW.Health

Health-check libraries for the **ZB.MOM.WW SCADA family** (OtOpcUa, MxAccessGateway, ScadaBridge). These are **libraries, not a service**: each package is linked directly into the consuming application at build time. There is no central health process or network hop; probes run in-process alongside the application.

The library normalizes the three-tier health endpoint convention (`/health/ready`, `/health/active`, `/healthz`) and provides reusable probe implementations so the three sister projects share a common surface without duplicating probe logic.
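
As a rough illustration of the three-tier convention, the sketch below maps the three endpoints with the stock ASP.NET Core health-check middleware and tag predicates. It is not the library's `MapZbHealth` implementation. The tag values (`"ready"`, `"active"`), the helper names, and the reading of `/healthz` as a liveness probe that runs no checks are all assumptions made for this example.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Illustrative only: a hand-rolled stand-in, not the library's MapZbHealth.
public static class ThreeTierHealthSketch
{
    // Assumed tag values; the real ZbHealthTags constants may differ.
    public const string ReadyTag = "ready";
    public const string ActiveTag = "active";

    // Selects only the registrations that carry the given tier tag.
    public static Func<HealthCheckRegistration, bool> ForTag(string tag) =>
        registration => registration.Tags.Contains(tag);

    // Maps one endpoint per tier. services.AddHealthChecks() must already be registered.
    public static IEndpointRouteBuilder MapThreeTierHealthSketch(this IEndpointRouteBuilder endpoints)
    {
        endpoints.MapHealthChecks("/health/ready", new HealthCheckOptions { Predicate = ForTag(ReadyTag) });
        endpoints.MapHealthChecks("/health/active", new HealthCheckOptions { Predicate = ForTag(ActiveTag) });

        // Assumption: /healthz is a liveness probe, so it runs no checks at all.
        endpoints.MapHealthChecks("/healthz", new HealthCheckOptions { Predicate = _ => false });
        return endpoints;
    }
}
```

In a host this might be wired by registering checks with `builder.Services.AddHealthChecks().AddCheck(..., tags: new[] { "ready" })` and then calling `app.MapThreeTierHealthSketch()`.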
|
||||
|
||||
---

## Packages

| Package | Description | Key Dependencies |
|---|---|---|
| `ZB.MOM.WW.Health` | Core tiers, `MapZbHealth` extension, canonical JSON writer (`ZbHealthWriter`), `IActiveNodeGate` seam, `GrpcDependencyHealthCheck` reachability probe, and tier-tag constants (`ZbHealthTags`). No Akka or EF dependency. | `Microsoft.AspNetCore.App` (framework ref), `Grpc.Net.Client` |
| `ZB.MOM.WW.Health.Akka` | `AkkaClusterHealthCheck` with a configurable `AkkaClusterStatusPolicy` (presets: `Default` three-way / `OtOpcUaCompat` two-way), `ActiveNodeHealthCheck` with an optional role filter, and `AkkaActiveNodeGate` that backs `IActiveNodeGate` from the cluster member state. | `ZB.MOM.WW.Health`, `Akka.Cluster` |
| `ZB.MOM.WW.Health.EntityFrameworkCore` | `DatabaseHealthCheck<TContext>` with `CanConnectAsync` by default and an optional `ProbeQuery` delegate for custom connectivity validation. | `ZB.MOM.WW.Health`, `Microsoft.EntityFrameworkCore` |
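
The difference between the two `AkkaClusterStatusPolicy` presets is easiest to see as plain mapping functions. The sketch below restates them with locally declared stand-in enums so that it compiles without Akka or the health-check abstractions. `StatusPolicySketch` and `StrictJoin` are hypothetical names introduced for illustration; in the real package a custom mapping would be passed to the `AkkaClusterStatusPolicy` constructor instead.

```csharp
using System;

// Stand-ins for Akka.Cluster.MemberStatus and
// Microsoft.Extensions.Diagnostics.HealthChecks.HealthStatus, declared locally
// so the sketch has no package dependencies.
public enum MemberStatus { Joining, Up, Leaving, Exiting, Down, Removed, WeaklyUp }
public enum HealthStatus { Unhealthy, Degraded, Healthy }

public static class StatusPolicySketch
{
    // Three-way mapping, mirroring the Default preset.
    public static HealthStatus Default(MemberStatus status) => status switch
    {
        MemberStatus.Up or MemberStatus.Joining => HealthStatus.Healthy,
        MemberStatus.Leaving or MemberStatus.Exiting => HealthStatus.Degraded,
        _ => HealthStatus.Unhealthy,
    };

    // Two-way mapping, mirroring the OtOpcUaCompat preset: only Up is Healthy,
    // every other state collapses to Degraded.
    public static HealthStatus OtOpcUaCompat(MemberStatus status) =>
        status == MemberStatus.Up ? HealthStatus.Healthy : HealthStatus.Degraded;

    // Hypothetical project-specific override: treat Joining as Degraded,
    // otherwise defer to the three-way mapping.
    public static HealthStatus StrictJoin(MemberStatus status) =>
        status == MemberStatus.Joining ? HealthStatus.Degraded : Default(status);
}
```

The presets diverge on `Joining` (Healthy versus Degraded) and on `Down` or `Removed` (Unhealthy versus Degraded), which is worth keeping in mind when a consumer switches from one preset to the other.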
|
||||
|
||||
---

## Consumer Matrix

| Consumer | `ZB.MOM.WW.Health` (core) | `ZB.MOM.WW.Health.Akka` | `ZB.MOM.WW.Health.EntityFrameworkCore` |
|---|:---:|:---:|:---:|
| **OtOpcUa** | yes (+ `GrpcDependencyHealthCheck` for the MxAccessGateway channel) | yes | yes |
| **MxAccessGateway** | yes (+ `GrpcDependencyHealthCheck` for the x86 worker IPC) | no | no |
| **ScadaBridge** | yes | yes | yes |

MxAccessGateway consumes the core package only; it has no Akka cluster and no EF DbContext. OtOpcUa and ScadaBridge consume all three packages.
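
The two Akka consumers use the active-node probe in different modes: according to the package's doc comments, ScadaBridge follows the role-less pattern and OtOpcUa the role-filtered one. The sketch below restates that decision as a pure function over already-resolved cluster facts. `HealthStatus` is declared locally as a stand-in, and `ActiveNodeSketch` and the `"admin"` role in the usage note are hypothetical names chosen for illustration.

```csharp
#nullable enable
using System;

// Stand-in for Microsoft.Extensions.Diagnostics.HealthChecks.HealthStatus.
public enum HealthStatus { Unhealthy, Degraded, Healthy }

public static class ActiveNodeSketch
{
    // requiredRole == null selects role-less mode (whole-cluster leader);
    // any other value selects role-filtered mode (role leader).
    public static HealthStatus Evaluate(bool selfUp, bool isLeader, bool hasRole, string? requiredRole)
    {
        if (requiredRole is null)
        {
            // Role-less: only an Up node that is also the cluster leader is active.
            return selfUp && isLeader ? HealthStatus.Healthy : HealthStatus.Unhealthy;
        }

        // Role-filtered: a node without the role is not subject to the probe.
        if (!hasRole)
            return HealthStatus.Healthy;

        // Role member: Healthy only when it is the role leader, otherwise Degraded.
        return isLeader ? HealthStatus.Healthy : HealthStatus.Degraded;
    }
}
```

For example, `ActiveNodeSketch.Evaluate(true, false, false, null)` yields `Unhealthy` (an Up standby in role-less mode), while `ActiveNodeSketch.Evaluate(true, false, true, "admin")` yields `Degraded` (a role member that is not the role leader).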
|
||||
|
||||
---

## Versioning

All three packages are versioned in **lockstep** from `Directory.Build.props`. The current release is **0.1.0**. A single version bump in `Directory.Build.props` bumps all three packages simultaneously, so consumers should reference the same version for all ZB.MOM.WW.Health packages.
|
||||
|
||||
---

## Building and testing

```bash
# from ZB.MOM.WW.Health/
dotnet build ZB.MOM.WW.Health.slnx
dotnet test ZB.MOM.WW.Health.slnx
```

All three test assemblies run with `dotnet test` and require no external dependencies (no running Akka cluster, no database):

| Assembly | Tests |
|---|---|
| `ZB.MOM.WW.Health.Tests` | 20 |
| `ZB.MOM.WW.Health.Akka.Tests` | 32 |
| `ZB.MOM.WW.Health.EntityFrameworkCore.Tests` | 6 |
| **Total** | **58** |
|
||||
|
||||
---

## Packing

```bash
dotnet pack ZB.MOM.WW.Health.slnx -c Release -o ./artifacts
```

Produces three `.nupkg` files in `artifacts/`:

```
ZB.MOM.WW.Health.0.1.0.nupkg
ZB.MOM.WW.Health.Akka.0.1.0.nupkg
ZB.MOM.WW.Health.EntityFrameworkCore.0.1.0.nupkg
```

`GeneratePackageOnBuild` is off; pack explicitly as above.
|
||||
|
||||
---

## Status

**Built at 0.1.0. NOT yet adopted by the three apps.** Adoption is tracked in the component backlog:

- `~/Desktop/scadaproj/components/health/GAPS.md`

Design documentation lives alongside that backlog:

- `~/Desktop/scadaproj/components/health/spec/SPEC.md`: normalized three-tier target
- `~/Desktop/scadaproj/components/health/shared-contract/ZB.MOM.WW.Health.md`: proposed API
- `~/Desktop/scadaproj/components/health/current-state/`: per-project current state (code-verified)
|
||||
@@ -0,0 +1,12 @@
|
||||
<Solution>
|
||||
<Folder Name="/src/">
|
||||
<Project Path="src/ZB.MOM.WW.Health.Akka/ZB.MOM.WW.Health.Akka.csproj" />
|
||||
<Project Path="src/ZB.MOM.WW.Health.EntityFrameworkCore/ZB.MOM.WW.Health.EntityFrameworkCore.csproj" />
|
||||
<Project Path="src/ZB.MOM.WW.Health/ZB.MOM.WW.Health.csproj" />
|
||||
</Folder>
|
||||
<Folder Name="/tests/">
|
||||
<Project Path="tests/ZB.MOM.WW.Health.Akka.Tests/ZB.MOM.WW.Health.Akka.Tests.csproj" />
|
||||
<Project Path="tests/ZB.MOM.WW.Health.EntityFrameworkCore.Tests/ZB.MOM.WW.Health.EntityFrameworkCore.Tests.csproj" />
|
||||
<Project Path="tests/ZB.MOM.WW.Health.Tests/ZB.MOM.WW.Health.Tests.csproj" />
|
||||
</Folder>
|
||||
</Solution>
|
||||
@@ -0,0 +1,153 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
using Akka.Actor;
|
||||
using Akka.Cluster;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
|
||||
[assembly: InternalsVisibleTo("ZB.MOM.WW.Health.Akka.Tests")]
|
||||
|
||||
namespace ZB.MOM.WW.Health.Akka;
|
||||
|
||||
/// <summary>
|
||||
/// Pure decision function for the active / leader probe, factored out of
|
||||
/// <see cref="ActiveNodeHealthCheck"/> so the role-less and role-filtered matrices are exhaustively
|
||||
/// table-testable without forming a real cluster.
|
||||
/// </summary>
|
||||
internal static class ActiveNodeDecision
|
||||
{
|
||||
/// <summary>
|
||||
/// Maps the resolved cluster facts to a <see cref="HealthStatus"/>.
|
||||
/// </summary>
|
||||
/// <param name="selfUp">Whether the local node's member status is <c>Up</c>. Only consulted in role-less mode.</param>
|
||||
/// <param name="isLeader">
|
||||
/// Whether the local node is the leader: the cluster leader in role-less mode, or the
|
||||
/// role-singleton leader in role-filtered mode.
|
||||
/// </param>
|
||||
/// <param name="hasRole">
|
||||
/// Whether the local node carries <paramref name="requiredRole"/>. Ignored when
|
||||
/// <paramref name="requiredRole"/> is <c>null</c>.
|
||||
/// </param>
|
||||
/// <param name="requiredRole">
|
||||
/// The role to scope the check to, or <c>null</c> for the role-less (whole-cluster-leader) mode.
|
||||
/// </param>
|
||||
/// <returns>
|
||||
/// Role-less: Healthy iff the node is Up and the cluster leader, otherwise Unhealthy.
|
||||
/// Role-filtered: Healthy when the node lacks the role (probe irrelevant) or carries the role and
|
||||
/// is the role-singleton leader; Degraded when it carries the role but is not the leader.
|
||||
/// </returns>
|
||||
public static HealthStatus Evaluate(bool selfUp, bool isLeader, bool hasRole, string? requiredRole)
|
||||
{
|
||||
if (requiredRole is null)
|
||||
return selfUp && isLeader ? HealthStatus.Healthy : HealthStatus.Unhealthy;
|
||||
|
||||
if (!hasRole)
|
||||
return HealthStatus.Healthy;
|
||||
|
||||
return isLeader ? HealthStatus.Healthy : HealthStatus.Degraded;
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Health check that reports whether this node is the designated active / leader node.
|
||||
/// An optional role scopes the check to nodes carrying that role. Register to the
|
||||
/// <see cref="ZbHealthTags.Active"/> tag.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// The <see cref="ActorSystem"/> is resolved lazily from the service provider. If it is not yet
|
||||
/// available — e.g. during startup before Akka is initialised — the check returns
|
||||
/// <see cref="HealthStatus.Degraded"/> rather than throwing, so it is startup-safe.
|
||||
/// </remarks>
|
||||
public sealed class ActiveNodeHealthCheck : IHealthCheck
|
||||
{
|
||||
private readonly IServiceProvider _serviceProvider;
|
||||
private readonly string? _role;
|
||||
|
||||
/// <summary>
|
||||
/// Role-less constructor: Healthy when the node is <c>Up</c> and the cluster leader
|
||||
/// (ScadaBridge ActiveNode pattern); Unhealthy otherwise. Degraded when the ActorSystem /
|
||||
/// cluster is not yet ready.
|
||||
/// </summary>
|
||||
/// <param name="serviceProvider">
|
||||
/// The application service provider. The <see cref="ActorSystem"/> is resolved lazily so the
|
||||
/// check is startup-safe: if no <see cref="ActorSystem"/> is registered yet the result is Degraded.
|
||||
/// </param>
|
||||
public ActiveNodeHealthCheck(IServiceProvider serviceProvider)
|
||||
{
|
||||
_serviceProvider = serviceProvider ?? throw new ArgumentNullException(nameof(serviceProvider));
|
||||
_role = null;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Role-filtered constructor: Healthy when the node lacks <paramref name="role"/> or carries it
|
||||
/// and is the role-singleton leader; Degraded when it carries the role but is not the leader
|
||||
/// (OtOpcUa AdminRoleLeader pattern). Degraded when the ActorSystem / cluster is not yet ready.
|
||||
/// </summary>
|
||||
/// <param name="serviceProvider">
|
||||
/// The application service provider. The <see cref="ActorSystem"/> is resolved lazily so the
|
||||
/// check is startup-safe: if no <see cref="ActorSystem"/> is registered yet the result is Degraded.
|
||||
/// </param>
|
||||
/// <param name="role">The Akka cluster role to scope the check to.</param>
|
||||
public ActiveNodeHealthCheck(IServiceProvider serviceProvider, string role)
|
||||
{
|
||||
_serviceProvider = serviceProvider ?? throw new ArgumentNullException(nameof(serviceProvider));
|
||||
ArgumentException.ThrowIfNullOrWhiteSpace(role);
|
||||
_role = role;
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public Task<HealthCheckResult> CheckHealthAsync(
|
||||
HealthCheckContext context,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
var system = _serviceProvider.GetService<ActorSystem>();
|
||||
if (system is null)
|
||||
return Task.FromResult(HealthCheckResult.Degraded("ActorSystem not yet available."));
|
||||
|
||||
var cluster = Cluster.Get(system);
|
||||
var self = cluster.SelfMember;
|
||||
var selfUp = self.Status == MemberStatus.Up;
|
||||
|
||||
bool hasRole;
|
||||
bool isLeader;
|
||||
if (_role is null)
|
||||
{
|
||||
hasRole = false;
|
||||
var leader = cluster.State.Leader;
|
||||
isLeader = leader is not null && leader == self.Address;
|
||||
}
|
||||
else
|
||||
{
|
||||
hasRole = self.HasRole(_role);
|
||||
var roleLeader = cluster.State.RoleLeader(_role);
|
||||
isLeader = roleLeader is not null && roleLeader == self.Address;
|
||||
}
|
||||
|
||||
var health = ActiveNodeDecision.Evaluate(selfUp, isLeader, hasRole, _role);
|
||||
var description = DescribeResult(health, self.Status, selfUp, isLeader);
|
||||
var result = health switch
|
||||
{
|
||||
HealthStatus.Healthy => HealthCheckResult.Healthy(description),
|
||||
HealthStatus.Degraded => HealthCheckResult.Degraded(description),
|
||||
_ => HealthCheckResult.Unhealthy(description),
|
||||
};
|
||||
return Task.FromResult(result);
|
||||
}
|
||||
|
||||
private string DescribeResult(HealthStatus health, MemberStatus status, bool selfUp, bool isLeader)
|
||||
{
|
||||
if (_role is null)
|
||||
{
|
||||
if (health == HealthStatus.Healthy)
|
||||
return "Active node (cluster leader).";
|
||||
return selfUp && !isLeader
|
||||
? "Standby: node is Up but not the cluster leader."
|
||||
: $"Standby: node is not Up (status: {status}).";
|
||||
}
|
||||
|
||||
return health switch
|
||||
{
|
||||
HealthStatus.Healthy => $"Active for role '{_role}' (or not a role member).",
|
||||
_ => $"Role '{_role}' member but not leader.",
|
||||
};
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,50 @@
|
||||
using Akka.Actor;
|
||||
using Akka.Cluster;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Akka;
|
||||
|
||||
/// <summary>
|
||||
/// <see cref="IActiveNodeGate"/> implementation that computes <see cref="IsActiveNode"/> directly
|
||||
/// from the Akka cluster state (self member <c>Up</c> and the local node is the cluster leader).
|
||||
/// Register as a singleton.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// The <see cref="ActorSystem"/> is resolved lazily from the service provider; if it is not yet
|
||||
/// available — e.g. during startup before Akka is initialised — <see cref="IsActiveNode"/> returns
|
||||
/// <c>false</c> (the safe default during startup). This gate reads the cluster state directly and
|
||||
/// does not resolve <see cref="ActiveNodeHealthCheck"/> from DI.
|
||||
/// </remarks>
|
||||
public sealed class AkkaActiveNodeGate : IActiveNodeGate
|
||||
{
|
||||
private readonly IServiceProvider _serviceProvider;
|
||||
|
||||
/// <summary>Initializes a new <see cref="AkkaActiveNodeGate"/>.</summary>
|
||||
/// <param name="serviceProvider">
|
||||
/// The application service provider. The <see cref="ActorSystem"/> is resolved lazily; if it is
|
||||
/// not yet available <see cref="IsActiveNode"/> returns <c>false</c>.
|
||||
/// </param>
|
||||
public AkkaActiveNodeGate(IServiceProvider serviceProvider)
|
||||
{
|
||||
_serviceProvider = serviceProvider ?? throw new ArgumentNullException(nameof(serviceProvider));
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public bool IsActiveNode
|
||||
{
|
||||
get
|
||||
{
|
||||
var system = _serviceProvider.GetService<ActorSystem>();
|
||||
if (system is null)
|
||||
return false;
|
||||
|
||||
var cluster = Cluster.Get(system);
|
||||
var self = cluster.SelfMember;
|
||||
if (self.Status != MemberStatus.Up)
|
||||
return false;
|
||||
|
||||
var leader = cluster.State.Leader;
|
||||
return leader is not null && leader == self.Address;
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,56 @@
|
||||
using Akka.Actor;
|
||||
using Akka.Cluster;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Akka;
|
||||
|
||||
/// <summary>
|
||||
/// Health check that maps the local node's Akka cluster membership status to a
|
||||
/// <see cref="HealthStatus"/> through a configurable <see cref="AkkaClusterStatusPolicy"/>.
|
||||
/// Register to the <see cref="ZbHealthTags.Ready"/> tag (recommended <c>[ready, active]</c>).
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// The <see cref="ActorSystem"/> is resolved lazily from the service provider. If it is not yet
|
||||
/// available — e.g. during startup before Akka is initialised — the check returns
|
||||
/// <see cref="HealthStatus.Degraded"/> rather than throwing, so it is safe to register before Akka
|
||||
/// is fully up.
|
||||
/// </remarks>
|
||||
public sealed class AkkaClusterHealthCheck : IHealthCheck
|
||||
{
|
||||
private readonly IServiceProvider _serviceProvider;
|
||||
private readonly AkkaClusterStatusPolicy _policy;
|
||||
|
||||
/// <summary>Initializes a new <see cref="AkkaClusterHealthCheck"/>.</summary>
|
||||
/// <param name="serviceProvider">
|
||||
/// The application service provider. The <see cref="ActorSystem"/> is resolved lazily so the
|
||||
/// check is startup-safe: if no <see cref="ActorSystem"/> is registered yet the result is Degraded.
|
||||
/// </param>
|
||||
/// <param name="policy">The status-to-health mapping policy to apply.</param>
|
||||
public AkkaClusterHealthCheck(IServiceProvider serviceProvider, AkkaClusterStatusPolicy policy)
|
||||
{
|
||||
_serviceProvider = serviceProvider ?? throw new ArgumentNullException(nameof(serviceProvider));
|
||||
_policy = policy ?? throw new ArgumentNullException(nameof(policy));
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public Task<HealthCheckResult> CheckHealthAsync(
|
||||
HealthCheckContext context,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
var system = _serviceProvider.GetService<ActorSystem>();
|
||||
if (system is null)
|
||||
return Task.FromResult(HealthCheckResult.Degraded("ActorSystem not yet available."));
|
||||
|
||||
var status = Cluster.Get(system).SelfMember.Status;
|
||||
var health = _policy.Evaluate(status);
|
||||
var description = $"Akka cluster member status: {status}";
|
||||
var result = health switch
|
||||
{
|
||||
HealthStatus.Healthy => HealthCheckResult.Healthy(description),
|
||||
HealthStatus.Degraded => HealthCheckResult.Degraded(description),
|
||||
_ => HealthCheckResult.Unhealthy(description),
|
||||
};
|
||||
return Task.FromResult(result);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,56 @@
|
||||
using Akka.Cluster;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Akka;
|
||||
|
||||
/// <summary>
|
||||
/// Pure mapping from an Akka <see cref="MemberStatus"/> to a <see cref="HealthStatus"/>.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// Wraps a <see cref="Func{MemberStatus, HealthStatus}"/> so the decision logic is a deterministic,
|
||||
/// table-testable function — <see cref="AkkaClusterHealthCheck"/> only supplies the live cluster
|
||||
/// status. Two named presets reconcile the divergence between the existing ScadaBridge and OtOpcUa
|
||||
/// implementations; construct a custom instance for project-specific overrides.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class AkkaClusterStatusPolicy
|
||||
{
|
||||
private readonly Func<MemberStatus, HealthStatus> _evaluate;
|
||||
|
||||
/// <summary>Initializes a new <see cref="AkkaClusterStatusPolicy"/>.</summary>
|
||||
/// <param name="evaluate">The pure status-to-health mapping function.</param>
|
||||
public AkkaClusterStatusPolicy(Func<MemberStatus, HealthStatus> evaluate)
|
||||
{
|
||||
_evaluate = evaluate ?? throw new ArgumentNullException(nameof(evaluate));
|
||||
}
|
||||
|
||||
/// <summary>Applies the policy to the given member status.</summary>
|
||||
/// <param name="status">The local node's Akka cluster member status.</param>
|
||||
/// <returns>The mapped <see cref="HealthStatus"/>.</returns>
|
||||
public HealthStatus Evaluate(MemberStatus status) => _evaluate(status);
|
||||
|
||||
/// <summary>
|
||||
/// ScadaBridge origin: <c>Up</c>/<c>Joining</c> → Healthy, <c>Leaving</c>/<c>Exiting</c> →
|
||||
/// Degraded, everything else → Unhealthy. The convergence target for all projects.
|
||||
/// </summary>
|
||||
public static AkkaClusterStatusPolicy Default { get; } = new(static status => status switch
|
||||
{
|
||||
MemberStatus.Up or MemberStatus.Joining => HealthStatus.Healthy,
|
||||
MemberStatus.Leaving or MemberStatus.Exiting => HealthStatus.Degraded,
|
||||
_ => HealthStatus.Unhealthy,
|
||||
});
|
||||
|
||||
/// <summary>
|
||||
/// OtOpcUa origin: self-<c>Up</c>-among-reachable-members → Healthy, any non-<c>Up</c> state
|
||||
/// (including <c>Leaving</c>/<c>Exiting</c>/<c>Down</c>) → Degraded. Provided for backward
|
||||
/// compatibility during OtOpcUa's migration.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// The original OtOpcUa check scanned the reachable member set for self with
|
||||
/// <c>Status == Up</c>; any other state caused the scan to miss self and collapse to Degraded.
|
||||
/// This preset reproduces that behavior: only <see cref="MemberStatus.Up"/> is Healthy.
|
||||
/// </remarks>
|
||||
public static AkkaClusterStatusPolicy OtOpcUaCompat { get; } = new(static status =>
|
||||
status == MemberStatus.Up ? HealthStatus.Healthy : HealthStatus.Degraded);
|
||||
}
|
||||
@@ -0,0 +1,21 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>true</IsPackable>
|
||||
<PackageId>ZB.MOM.WW.Health.Akka</PackageId>
|
||||
<Authors>ZB.MOM.WW</Authors>
|
||||
<Description>Akka.Cluster health-check extensions for the ZB.MOM.WW SCADA family.</Description>
|
||||
<PackageTags>health-checks;akka;akka-cluster;scada;wonderware;zb-mom-ww</PackageTags>
|
||||
<PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-health</PackageProjectUrl>
|
||||
<RepositoryUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-health</RepositoryUrl>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\ZB.MOM.WW.Health\ZB.MOM.WW.Health.csproj" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="Akka.Cluster" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,111 @@
|
||||
using Microsoft.EntityFrameworkCore;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
|
||||
namespace ZB.MOM.WW.Health.EntityFrameworkCore;
|
||||
|
||||
/// <summary>
|
||||
/// Health check that verifies database reachability through an EF Core <typeparamref name="TContext"/>.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// The default probe calls
|
||||
/// <see cref="Microsoft.EntityFrameworkCore.Infrastructure.DatabaseFacade.CanConnectAsync(CancellationToken)"/>
|
||||
/// (the ScadaBridge pattern): <see cref="HealthStatus.Healthy"/> when it returns <c>true</c>,
|
||||
/// <see cref="HealthStatus.Unhealthy"/> when it returns <c>false</c> or throws. Supplying
|
||||
/// <see cref="DatabaseHealthCheckOptions{TContext}.ProbeQuery"/> swaps in a stricter query-based probe
|
||||
/// (the OtOpcUa "query <c>Deployments</c>" pattern): the result is <see cref="HealthStatus.Healthy"/>
|
||||
/// unless the delegate throws, in which case it is <see cref="HealthStatus.Unhealthy"/>. No exception
|
||||
/// escapes <see cref="CheckHealthAsync"/>.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// The context is resolved from the application <see cref="IServiceProvider"/>: an
|
||||
/// <see cref="IDbContextFactory{TContext}"/> is used when one is registered (each probe gets a fresh,
|
||||
/// disposed context); otherwise a scoped <typeparamref name="TContext"/> is resolved from a new DI
|
||||
/// scope. Recommended registration tag: <c>ZbHealthTags.Ready</c> (applied by the registrant).
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// The scoped-resolution path is safe for <c>AddDbContextPool</c>: disposing the
|
||||
/// <see cref="IServiceScope"/> returns the pooled context to the pool rather than destroying it,
|
||||
/// so no pooled instance is prematurely discarded.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
/// <typeparam name="TContext">The EF Core <see cref="DbContext"/> to probe.</typeparam>
|
||||
public sealed class DatabaseHealthCheck<TContext> : IHealthCheck
|
||||
where TContext : DbContext
|
||||
{
|
||||
private readonly IServiceProvider _serviceProvider;
|
||||
private readonly DatabaseHealthCheckOptions<TContext> _options;
|
||||
|
||||
/// <summary>Initializes a new <see cref="DatabaseHealthCheck{TContext}"/>.</summary>
|
||||
/// <param name="serviceProvider">
|
||||
/// Application service provider used to resolve <typeparamref name="TContext"/> — preferring a
|
||||
/// registered <see cref="IDbContextFactory{TContext}"/>, otherwise a scoped instance.
|
||||
/// </param>
|
||||
/// <param name="options">
|
||||
/// Probe override and timeout. When <c>null</c>, the default <c>CanConnectAsync</c> probe with a
|
||||
/// 10 s timeout is used.
|
||||
/// </param>
|
||||
public DatabaseHealthCheck(
|
||||
IServiceProvider serviceProvider,
|
||||
DatabaseHealthCheckOptions<TContext>? options = null)
|
||||
{
|
||||
_serviceProvider = serviceProvider ?? throw new ArgumentNullException(nameof(serviceProvider));
|
||||
_options = options ?? new DatabaseHealthCheckOptions<TContext>();
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public async Task<HealthCheckResult> CheckHealthAsync(
|
||||
HealthCheckContext context,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
|
||||
timeoutCts.CancelAfter(_options.Timeout);
|
||||
|
||||
try
|
||||
{
|
||||
var result = await ProbeAsync(timeoutCts.Token).ConfigureAwait(false);
|
||||
// Eagerly release the pending timer on the happy path so the OS timer
|
||||
// resource is not held for the full timeout duration.
|
||||
timeoutCts.CancelAfter(Timeout.InfiniteTimeSpan);
|
||||
return result;
|
||||
}
|
||||
catch (OperationCanceledException ex)
|
||||
when (timeoutCts.IsCancellationRequested && !cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
return HealthCheckResult.Unhealthy($"Database probe timed out after {_options.Timeout}.", ex);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
return HealthCheckResult.Unhealthy("Database connection failed.", ex);
|
||||
}
|
||||
}
|
||||
|
||||
private async Task<HealthCheckResult> ProbeAsync(CancellationToken cancellationToken)
|
||||
{
|
||||
var factory = _serviceProvider.GetService<IDbContextFactory<TContext>>();
|
||||
if (factory is not null)
|
||||
{
|
||||
await using var db = await factory.CreateDbContextAsync(cancellationToken).ConfigureAwait(false);
|
||||
return await RunProbeAsync(db, cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
await using var scope = _serviceProvider.CreateAsyncScope();
|
||||
var scoped = scope.ServiceProvider.GetRequiredService<TContext>();
|
||||
return await RunProbeAsync(scoped, cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
private async Task<HealthCheckResult> RunProbeAsync(TContext db, CancellationToken cancellationToken)
|
||||
{
|
||||
if (_options.ProbeQuery is { } probeQuery)
|
||||
{
|
||||
await probeQuery(db, cancellationToken).ConfigureAwait(false);
|
||||
return HealthCheckResult.Healthy("Database query probe succeeded.");
|
||||
}
|
||||
|
||||
var canConnect = await db.Database.CanConnectAsync(cancellationToken).ConfigureAwait(false);
|
||||
return canConnect
|
||||
? HealthCheckResult.Healthy("Database connection is available.")
|
||||
: HealthCheckResult.Unhealthy("Database connection failed.");
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,28 @@
|
||||
using Microsoft.EntityFrameworkCore;
|
||||
|
||||
namespace ZB.MOM.WW.Health.EntityFrameworkCore;
|
||||
|
||||
/// <summary>
|
||||
/// Options for <see cref="DatabaseHealthCheck{TContext}"/>.
|
||||
/// </summary>
|
||||
/// <typeparam name="TContext">The EF Core <see cref="DbContext"/> the probe runs against.</typeparam>
|
||||
public sealed class DatabaseHealthCheckOptions<TContext>
|
||||
where TContext : DbContext
|
||||
{
|
||||
/// <summary>
|
||||
/// Optional query-based probe that overrides the default
|
||||
/// <see cref="Microsoft.EntityFrameworkCore.Infrastructure.DatabaseFacade.CanConnectAsync(CancellationToken)"/>
|
||||
/// reachability check with stricter, query-level validation (the OtOpcUa "query <c>Deployments</c>"
|
||||
/// pattern). Throw to signal failure; return normally to signal success.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Example: <c>(db, ct) => db.Deployments.AsNoTracking().Take(1).ToListAsync(ct)</c>.
|
||||
/// When <c>null</c>, the default <c>CanConnectAsync</c> probe is used.
|
||||
/// </remarks>
|
||||
public Func<TContext, CancellationToken, Task>? ProbeQuery { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// Maximum time the probe may run before it is treated as a failure. Defaults to 10 seconds.
|
||||
/// </summary>
|
||||
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(10);
|
||||
}
|
||||
@@ -0,0 +1,21 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>true</IsPackable>
|
||||
<PackageId>ZB.MOM.WW.Health.EntityFrameworkCore</PackageId>
|
||||
<Authors>ZB.MOM.WW</Authors>
|
||||
<Description>Entity Framework Core health-check extensions for the ZB.MOM.WW SCADA family.</Description>
|
||||
<PackageTags>health-checks;entity-framework-core;efcore;scada;wonderware;zb-mom-ww</PackageTags>
|
||||
<PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-health</PackageProjectUrl>
|
||||
<RepositoryUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-health</RepositoryUrl>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\ZB.MOM.WW.Health\ZB.MOM.WW.Health.csproj" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="Microsoft.EntityFrameworkCore" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,80 @@
|
||||
using Microsoft.AspNetCore.Builder;
|
||||
using Microsoft.AspNetCore.Http;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
|
||||
namespace ZB.MOM.WW.Health;
|
||||
|
||||
/// <summary>
|
||||
/// Endpoint filter that gates a route to the active node. Resolves <see cref="IActiveNodeGate"/>
|
||||
/// from request services; when it is registered and reports a standby
|
||||
/// (<see cref="IActiveNodeGate.IsActiveNode"/> is <c>false</c>) the request is short-circuited with
|
||||
/// HTTP 503 and a <c>Retry-After</c> header. When no gate is registered (non-clustered host / tests)
|
||||
/// the request is served, preserving prior behaviour.
|
||||
/// </summary>
|
||||
public sealed class ActiveNodeGateEndpointFilter : IEndpointFilter
|
||||
{
|
||||
/// <summary>Default <c>Retry-After</c> value (seconds) advertised on a standby 503 response.</summary>
|
||||
private const int DefaultRetryAfterSeconds = 5;
|
||||
|
||||
private readonly int _retryAfterSeconds;
|
||||
|
||||
/// <summary>Initializes a new <see cref="ActiveNodeGateEndpointFilter"/> using the default 5 s retry-after.</summary>
|
||||
public ActiveNodeGateEndpointFilter()
|
||||
: this(DefaultRetryAfterSeconds)
|
||||
{
|
||||
}
|
||||
|
||||
/// <summary>Initializes a new <see cref="ActiveNodeGateEndpointFilter"/>.</summary>
|
||||
/// <param name="retryAfterSeconds">The <c>Retry-After</c> value (seconds) advertised on a standby 503 response.</param>
|
||||
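public ActiveNodeGateEndpointFilter(int retryAfterSeconds)
{
// Retry-After delay-seconds must be a non-negative integer, so reject negative values early.
ArgumentOutOfRangeException.ThrowIfNegative(retryAfterSeconds);
_retryAfterSeconds = retryAfterSeconds;
}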
|
||||
|
||||
/// <summary>
|
||||
/// Returns 503 (with <c>Retry-After</c>) when the resolved <see cref="IActiveNodeGate"/> reports
|
||||
/// a standby node; otherwise delegates to the next filter or endpoint handler.
|
||||
/// </summary>
|
||||
/// <param name="context">The endpoint filter invocation context.</param>
|
||||
/// <param name="next">The next filter or endpoint handler in the pipeline.</param>
|
||||
public async ValueTask<object?> InvokeAsync(
|
||||
EndpointFilterInvocationContext context,
|
||||
EndpointFilterDelegate next)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(context);
|
||||
ArgumentNullException.ThrowIfNull(next);
|
||||
|
||||
var httpContext = context.HttpContext;
|
||||
var gate = httpContext.RequestServices.GetService<IActiveNodeGate>();
|
||||
|
||||
if (gate is { IsActiveNode: false })
|
||||
{
|
||||
httpContext.Response.Headers.RetryAfter =
|
||||
_retryAfterSeconds.ToString(System.Globalization.CultureInfo.InvariantCulture);
|
||||
return Results.StatusCode(StatusCodes.Status503ServiceUnavailable);
|
||||
}
|
||||
|
||||
return await next(context);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Route convention that gates endpoint(s) to the active node, returning 503 on standby nodes.
|
||||
/// </summary>
|
||||
public static class ActiveNodeGateExtensions
|
||||
{
|
||||
/// <summary>
|
||||
/// Applies <see cref="ActiveNodeGateEndpointFilter"/> to the decorated endpoint(s): the route is
|
||||
/// served only when the DI-resolved <see cref="IActiveNodeGate"/> reports the node active, and
|
||||
/// returns 503 with a <c>Retry-After</c> header when the node is a standby.
|
||||
/// </summary>
|
||||
/// <param name="builder">The endpoint convention builder to decorate.</param>
|
||||
/// <param name="retryAfterSeconds">
|
||||
/// The <c>Retry-After</c> value (seconds) advertised on a standby 503 response. Defaults to 5.
|
||||
/// </param>
|
||||
/// <returns>The same <paramref name="builder"/> for chaining.</returns>
|
||||
public static IEndpointConventionBuilder RequireActiveNode(
|
||||
this IEndpointConventionBuilder builder,
|
||||
int retryAfterSeconds = 5)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(builder);
|
||||
return builder.AddEndpointFilter(new ActiveNodeGateEndpointFilter(retryAfterSeconds));
|
||||
}
|
||||
}
|
||||
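A minimal usage sketch for `RequireActiveNode`, assuming an ASP.NET Core minimal-API host. `FixedActiveNodeGate` and both routes are hypothetical and exist only for illustration; a real host would register a gate backed by its own leader election.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.Health;

var builder = WebApplication.CreateBuilder(args);

// Hypothetical gate: a fixed answer instead of real leader election.
builder.Services.AddSingleton<IActiveNodeGate>(new FixedActiveNodeGate(isActive: false));

var app = builder.Build();

// Gated route: with the gate above reporting standby, this answers 503 with "Retry-After: 10".
app.MapPost("/commands/start", () => Results.Accepted())
   .RequireActiveNode(retryAfterSeconds: 10);

// Ungated route: served on every node.
app.MapGet("/version", () => "1.0");

app.Run();

// Hypothetical implementation, not part of the package.
sealed class FixedActiveNodeGate(bool isActive) : IActiveNodeGate
{
    public bool IsActiveNode { get; } = isActive;
}
```

Going by the filter code above, a host that registers no `IActiveNodeGate` at all is treated as active: `GetService` returns `null`, the pattern `gate is { IsActiveNode: false }` does not match, and the request passes through.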
@@ -0,0 +1,88 @@
|
||||
using Grpc.Core;
|
||||
using Grpc.Net.Client;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
|
||||
namespace ZB.MOM.WW.Health;
|
||||
|
||||
/// <summary>
|
||||
/// Health check that verifies a downstream gRPC dependency is reachable over its
|
||||
/// <see cref="GrpcChannel"/>.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// The probe is injectable via <see cref="GrpcDependencyOptions.Probe"/>; the default drives the
|
||||
/// channel to a connected state with <see cref="GrpcChannel.ConnectAsync"/>. The result is
|
||||
/// <see cref="HealthStatus.Healthy"/> when the probe returns <c>true</c>, and
/// <see cref="HealthStatus.Unhealthy"/> when it returns <c>false</c>, throws an
/// <see cref="RpcException"/>, or exceeds <see cref="GrpcDependencyOptions.Timeout"/>.
/// Cancellation requested by the caller is not reported as a result; it propagates as an
/// <see cref="OperationCanceledException"/>.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Recommended registration tags: <see cref="ZbHealthTags.Ready"/> and
|
||||
/// <see cref="ZbHealthTags.Active"/> — a missing downstream gRPC dependency makes the node both
|
||||
/// not-ready and not-able-to-act. The registrant applies the tags.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class GrpcDependencyHealthCheck : IHealthCheck
|
||||
{
|
||||
private readonly GrpcChannel _channel;
|
||||
private readonly GrpcDependencyOptions _options;
|
||||
|
||||
/// <summary>Initializes a new <see cref="GrpcDependencyHealthCheck"/>.</summary>
|
||||
/// <param name="channel">The gRPC channel to the downstream dependency.</param>
|
||||
/// <param name="options">
|
||||
/// Probe, dependency name, and timeout. When <c>null</c>, defaults are used (the default probe is
|
||||
/// <see cref="GrpcChannel.ConnectAsync"/> with a 5 s timeout).
|
||||
/// </param>
|
||||
public GrpcDependencyHealthCheck(GrpcChannel channel, GrpcDependencyOptions? options = null)
|
||||
{
|
||||
_channel = channel ?? throw new ArgumentNullException(nameof(channel));
|
||||
_options = options ?? new GrpcDependencyOptions();
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public async Task<HealthCheckResult> CheckHealthAsync(
|
||||
HealthCheckContext context,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
var name = _options.DependencyName ?? "gRPC dependency";
|
||||
var probe = _options.Probe ?? DefaultProbeAsync;
|
||||
|
||||
using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
|
||||
timeoutCts.CancelAfter(_options.Timeout);
|
||||
|
||||
try
|
||||
{
|
||||
var reachable = await probe(_channel, timeoutCts.Token).ConfigureAwait(false);
|
||||
return reachable
|
||||
? HealthCheckResult.Healthy($"{name} is reachable.")
|
||||
: HealthCheckResult.Unhealthy($"{name} is unreachable.");
|
||||
}
|
||||
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
throw;
|
||||
}
|
||||
catch (RpcException ex) when (ex.StatusCode == StatusCode.Cancelled && cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
throw new OperationCanceledException(cancellationToken);
|
||||
}
|
||||
catch (RpcException ex)
|
||||
{
|
||||
return HealthCheckResult.Unhealthy($"{name} probe failed: {ex.Status.StatusCode}.", ex);
|
||||
}
|
||||
catch (OperationCanceledException ex) when (timeoutCts.IsCancellationRequested)
|
||||
{
|
||||
return HealthCheckResult.Unhealthy($"{name} probe timed out after {_options.Timeout}.", ex);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Default probe: connects the channel and reports reachability. Returns <c>true</c> once the
|
||||
/// channel reaches a connected state; surfaces failures as a thrown exception (handled by the caller).
|
||||
/// </summary>
|
||||
private static async Task<bool> DefaultProbeAsync(GrpcChannel channel, CancellationToken cancellationToken)
|
||||
{
|
||||
await channel.ConnectAsync(cancellationToken).ConfigureAwait(false);
|
||||
return true;
|
||||
}
|
||||
}
|
||||
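The cancellation handling in `CheckHealthAsync` rests on one pattern: link a timeout token to the caller's token, then use exception filters to tell "the caller gave up" apart from "our own deadline fired". The sketch below isolates that pattern using only the base class library. `TimeoutProbe` and `RunAsync` are hypothetical names and are not part of the package.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutProbe
{
    // Returns the probe's answer, false when the timeout fires,
    // and rethrows when the caller's own token was cancelled.
    public static async Task<bool> RunAsync(
        Func<CancellationToken, Task<bool>> probe,
        TimeSpan timeout,
        CancellationToken callerToken = default)
    {
        // The linked source is cancelled by the caller's token OR by CancelAfter.
        using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        timeoutCts.CancelAfter(timeout);

        try
        {
            return await probe(timeoutCts.Token).ConfigureAwait(false);
        }
        catch (OperationCanceledException) when (callerToken.IsCancellationRequested)
        {
            throw; // caller cancelled: propagate, do not report a result
        }
        catch (OperationCanceledException) when (timeoutCts.IsCancellationRequested)
        {
            return false; // only the timeout fired: treat as unreachable
        }
    }
}
```

The order of the two filters matters: when the caller cancels, both tokens report cancellation, so the caller check has to come first.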
@@ -0,0 +1,41 @@
|
||||
using Grpc.Net.Client;

namespace ZB.MOM.WW.Health;

/// <summary>
/// Options for <see cref="GrpcDependencyHealthCheck"/>.
/// </summary>
public sealed class GrpcDependencyOptions
{
    /// <summary>
    /// The reachability probe. Returns <c>true</c> when the dependency is reachable, <c>false</c>
    /// otherwise. When <c>null</c> the default probe is used: <see cref="GrpcChannel.ConnectAsync"/>,
    /// which drives the channel to the <see cref="Grpc.Core.ConnectivityState.Ready"/> state (or
    /// throws / cancels on failure). Override to perform a richer probe, e.g. a
    /// <c>grpc.health.v1.Health/Check</c> RPC returning <c>SERVING</c>.
    /// </summary>
    public Func<GrpcChannel, CancellationToken, Task<bool>>? Probe { get; set; }

    /// <summary>
    /// Human-readable name of the dependency, surfaced in the <c>HealthCheckResult</c> description.
    /// </summary>
    public string? DependencyName { get; set; }

    private TimeSpan _timeout = TimeSpan.FromSeconds(5);

    /// <summary>Maximum time the probe may take before it is treated as unreachable. Default 5 s.</summary>
    /// <exception cref="ArgumentOutOfRangeException">Thrown when set to a value &lt;= <see cref="TimeSpan.Zero"/>.</exception>
    public TimeSpan Timeout
    {
        get => _timeout;
        set
        {
            if (value <= TimeSpan.Zero)
            {
                throw new ArgumentOutOfRangeException(nameof(value), value, "Timeout must be greater than zero.");
            }

            _timeout = value;
        }
    }
}
|
||||
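A sketch of how a consumer might register the check with the recommended tags and swap in the richer `grpc.health.v1.Health/Check` probe that the `Probe` documentation mentions. Assumptions: the host also references the `Grpc.HealthCheck` package (for the generated `Grpc.Health.V1` client types), and the names `HistorianHealthRegistration`, `AddHistorianHealth`, `"historian"` and `"historian-grpc"` are hypothetical.

```csharp
using System;
using Grpc.Health.V1;
using Grpc.Net.Client;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.Health;

public static class HistorianHealthRegistration
{
    // Hypothetical helper: registers one gRPC dependency check on the given channel.
    public static IServiceCollection AddHistorianHealth(
        this IServiceCollection services,
        GrpcChannel channel)
    {
        var options = new GrpcDependencyOptions
        {
            DependencyName = "historian",
            Timeout = TimeSpan.FromSeconds(2),

            // Custom probe: healthy only when the server reports SERVING.
            Probe = async (ch, ct) =>
            {
                var client = new Grpc.Health.V1.Health.HealthClient(ch);
                var reply = await client.CheckAsync(new HealthCheckRequest(), cancellationToken: ct);
                return reply.Status == HealthCheckResponse.Types.ServingStatus.Serving;
            },
        };

        // The registrant applies the tags; the check itself is tag-agnostic.
        services.AddHealthChecks().AddCheck(
            "historian-grpc",
            new GrpcDependencyHealthCheck(channel, options),
            tags: new[] { ZbHealthTags.Ready, ZbHealthTags.Active });

        return services;
    }
}
```

An `RpcException` thrown by the custom probe (for example `Unavailable`) is handled by the same catch blocks in `CheckHealthAsync` as the default probe, so it surfaces as `Unhealthy` rather than as an unhandled exception.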
@@ -0,0 +1,20 @@
|
||||
namespace ZB.MOM.WW.Health;

/// <summary>
/// Single-property seam: is this node the active / leader node?
/// </summary>
/// <remarks>
/// Attach to endpoints or route groups via
/// <see cref="ActiveNodeGateExtensions.RequireActiveNode"/>. A standby node must not serve the
/// gated routes, so the filter returns HTTP 503 when <see cref="IsActiveNode"/> is <c>false</c>.
/// The implementation is supplied by the consumer: the <c>ZB.MOM.WW.Health.Akka</c> package ships
/// <c>AkkaActiveNodeGate</c> for clustered nodes; non-Akka hosts provide their own.
/// </remarks>
public interface IActiveNodeGate
{
    /// <summary>
    /// <c>true</c> when this node is the active node and may serve gated routes;
    /// <c>false</c> on a standby node.
    /// </summary>
    bool IsActiveNode { get; }
}
|
||||
@@ -0,0 +1,30 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>true</IsPackable>
|
||||
<PackageId>ZB.MOM.WW.Health</PackageId>
|
||||
<Authors>ZB.MOM.WW</Authors>
|
||||
<Description>Core ASP.NET Core health-check extensions for the ZB.MOM.WW SCADA family.</Description>
|
||||
<PackageTags>health-checks;aspnetcore;scada;wonderware;zb-mom-ww</PackageTags>
|
||||
<PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-health</PackageProjectUrl>
|
||||
<RepositoryUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-health</RepositoryUrl>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!--
|
||||
Microsoft.AspNetCore.App is a shared framework, not a NuGet package. It brings in the
|
||||
ASP.NET Core health-checks middleware, IHealthCheck, HealthCheckService, and the full
|
||||
Microsoft.Extensions.* surface. Referencing the shared framework is the supported path
|
||||
for net10.0 libraries that target ASP.NET Core.
|
||||
-->
|
||||
<FrameworkReference Include="Microsoft.AspNetCore.App" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!-- Abstractions for IHealthCheck / HealthCheckResult (also transitively provided by the
|
||||
framework ref above, but declared explicitly so the dependency is visible to consumers). -->
|
||||
<PackageReference Include="Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions" />
|
||||
<PackageReference Include="Grpc.Net.Client" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,85 @@
|
||||
using Microsoft.AspNetCore.Builder;
|
||||
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
|
||||
using Microsoft.AspNetCore.Routing;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
|
||||
namespace ZB.MOM.WW.Health;
|
||||
|
||||
/// <summary>
|
||||
/// Maps the canonical ZB.MOM.WW three-tier health endpoints in one call.
|
||||
/// </summary>
|
||||
public static class ZbHealthEndpointExtensions
|
||||
{
|
||||
/// <summary>
|
||||
/// Maps the three health tiers:
|
||||
/// <list type="bullet">
|
||||
/// <item><description><c>/health/ready</c> — runs only checks tagged <see cref="ZbHealthTags.Ready"/>.</description></item>
|
||||
/// <item><description><c>/health/active</c> — runs only checks tagged <see cref="ZbHealthTags.Active"/>.</description></item>
|
||||
/// <item><description><c>/healthz</c> — bare process liveness; runs no checks (always 200 while the process is up).</description></item>
|
||||
/// </list>
|
||||
/// All three are anonymous. Status mapping is the ASP.NET Core default:
|
||||
/// Healthy/Degraded → 200, Unhealthy → 503.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Does NOT call <c>services.AddHealthChecks()</c> — the caller registers probes and their tags.
|
||||
/// The readiness and active tiers use the canonical JSON writer
|
||||
/// (<see cref="ZbHealthWriter.WriteJsonAsync"/>) unless overridden via
|
||||
/// <see cref="ZbHealthEndpointOptions.ResponseWriter"/>. The liveness tier runs no checks and
|
||||
/// emits a minimal <c>200 OK</c> body.
|
||||
/// </remarks>
|
||||
/// <returns>
|
||||
/// The <see cref="IEndpointConventionBuilder"/> for the readiness (<c>/health/ready</c>) endpoint.
|
||||
/// A single tier is returned (rather than a composite) to keep the API simple; conventions
|
||||
/// applied to the result affect only the readiness endpoint.
|
||||
/// </returns>
|
||||
public static IEndpointConventionBuilder MapZbHealth(
|
||||
this IEndpointRouteBuilder endpoints,
|
||||
ZbHealthEndpointOptions? options = null)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(endpoints);
|
||||
options ??= new ZbHealthEndpointOptions();
|
||||
|
||||
var responseWriter = options.ResponseWriter ?? ZbHealthWriter.WriteJsonAsync;
|
||||
|
||||
var ready = endpoints.MapHealthChecks(options.ReadyPath, new HealthCheckOptions
|
||||
{
|
||||
Predicate = static c => c.Tags.Contains(ZbHealthTags.Ready),
|
||||
ResponseWriter = responseWriter,
|
||||
}).AllowAnonymous();
|
||||
|
||||
endpoints.MapHealthChecks(options.ActivePath, new HealthCheckOptions
|
||||
{
|
||||
Predicate = static c => c.Tags.Contains(ZbHealthTags.Active),
|
||||
ResponseWriter = responseWriter,
|
||||
}).AllowAnonymous();
|
||||
|
||||
// Liveness: run no checks. The endpoint returns 200 as long as the process can respond.
|
||||
// No JSON writer — the empty report would carry no useful data, so the framework default
|
||||
// (a minimal plain-text body) is sufficient.
|
||||
endpoints.MapHealthChecks(options.LivePath, new HealthCheckOptions
|
||||
{
|
||||
Predicate = static _ => false,
|
||||
}).AllowAnonymous();
|
||||
|
||||
return ready;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Maps the three health tiers, configuring options inline. See the other
|
||||
/// <see cref="MapZbHealth(IEndpointRouteBuilder, ZbHealthEndpointOptions?)"/> overload for tier semantics.
|
||||
/// </summary>
|
||||
/// <param name="endpoints">The endpoint route builder to map onto.</param>
|
||||
/// <param name="configure">Callback that mutates a fresh <see cref="ZbHealthEndpointOptions"/>.</param>
|
||||
/// <returns>The <see cref="IEndpointConventionBuilder"/> for the readiness endpoint.</returns>
|
||||
public static IEndpointConventionBuilder MapZbHealth(
|
||||
this IEndpointRouteBuilder endpoints,
|
||||
Action<ZbHealthEndpointOptions> configure)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(endpoints);
|
||||
ArgumentNullException.ThrowIfNull(configure);
|
||||
|
||||
var options = new ZbHealthEndpointOptions();
|
||||
configure(options);
|
||||
return endpoints.MapZbHealth(options);
|
||||
}
|
||||
}
|
||||
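A minimal wiring sketch for `MapZbHealth`, assuming a minimal-API host. The check names, their fixed results, and the `/livez` override are hypothetical; real checks would probe actual dependencies.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using ZB.MOM.WW.Health;

var builder = WebApplication.CreateBuilder(args);

// MapZbHealth does not register checks; the host does, and tags each one for a tier.
builder.Services.AddHealthChecks()
    .AddCheck("config-loaded",
        () => HealthCheckResult.Healthy("configuration loaded"),
        tags: new[] { ZbHealthTags.Ready })
    .AddCheck("leader",
        () => HealthCheckResult.Degraded("standby"),
        tags: new[] { ZbHealthTags.Active });

var app = builder.Build();

// Maps /health/ready, /health/active and (overridden here) /livez.
app.MapZbHealth(o => o.LivePath = "/livez");

app.Run();
```

With the default status mapping described above, the `Degraded` result on the active tier still answers 200; only `Unhealthy` produces 503. Conventions chained onto the return value of `MapZbHealth` apply to the readiness endpoint only.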
@@ -0,0 +1,27 @@
|
||||
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

namespace ZB.MOM.WW.Health;

/// <summary>
/// Options for <see cref="ZbHealthEndpointExtensions.MapZbHealth"/>. Lets callers override the
/// three tier route paths and the JSON response writer. The defaults match the ZB.MOM.WW health contract.
/// </summary>
public sealed class ZbHealthEndpointOptions
{
    /// <summary>Path for the readiness tier (runs only checks tagged <see cref="ZbHealthTags.Ready"/>).</summary>
    public string ReadyPath { get; set; } = "/health/ready";

    /// <summary>Path for the active-node tier (runs only checks tagged <see cref="ZbHealthTags.Active"/>).</summary>
    public string ActivePath { get; set; } = "/health/active";

    /// <summary>Path for the bare liveness tier (runs no checks; 200 while the process is up).</summary>
    public string LivePath { get; set; } = "/healthz";

    /// <summary>
    /// Response writer for the readiness and active tiers. Defaults to
    /// <see cref="ZbHealthWriter.WriteJsonAsync"/> (canonical JSON). The liveness tier runs no checks
    /// and emits a minimal body, so this writer is not applied to it.
    /// </summary>
    public Func<HttpContext, HealthReport, Task>? ResponseWriter { get; set; }
}
|
||||
@@ -0,0 +1,19 @@
|
||||
namespace ZB.MOM.WW.Health;

/// <summary>
/// Canonical health-check tag constants for the ZB.MOM.WW three-tier health pattern.
/// Use these when registering checks, e.g.
/// <c>AddCheck("db", check, tags: [ZbHealthTags.Ready])</c>, so that
/// <see cref="ZbHealthEndpointExtensions.MapZbHealth"/> routes each check to the right tier.
/// </summary>
public static class ZbHealthTags
{
    /// <summary>Readiness checks (dependencies needed before the node can serve traffic).</summary>
    public const string Ready = "ready";

    /// <summary>Active-node checks (leader / active-singleton gating).</summary>
    public const string Active = "active";

    /// <summary>Liveness: process is up. Reserved tag; the liveness endpoint runs no checks.</summary>
    public const string Live = "live";
}
|
||||
@@ -0,0 +1,77 @@
|
||||
using System.Text.Json;
|
||||
using System.Text.Json.Serialization;
|
||||
using Microsoft.AspNetCore.Http;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
|
||||
namespace ZB.MOM.WW.Health;
|
||||
|
||||
/// <summary>
|
||||
/// Canonical JSON response writer for the ZB.MOM.WW health endpoints.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Self-contained — it has no runtime dependency on <c>AspNetCore.HealthChecks.UI.Client</c>;
|
||||
/// the JSON shape is modelled after that library's <c>UIResponseWriter</c> output but written here
|
||||
/// with <see cref="System.Text.Json"/>. The body shape is:
|
||||
/// <code>
|
||||
/// {
|
||||
/// "status": "Healthy|Degraded|Unhealthy",
|
||||
/// "totalDurationMs": 12.34,
|
||||
/// "entries": {
|
||||
/// "&lt;name&gt;": { "status": "...", "description": "...", "durationMs": 1.23 }
|
||||
/// }
|
||||
/// }
|
||||
/// </code>
|
||||
/// The HTTP status code is left to the ASP.NET Core health-checks middleware (Healthy/Degraded → 200,
|
||||
/// Unhealthy → 503); this writer only renders the body and sets <c>Content-Type: application/json; charset=utf-8</c>.
|
||||
/// </remarks>
|
||||
public static class ZbHealthWriter
|
||||
{
|
||||
private static readonly JsonSerializerOptions SerializerOptions = new()
|
||||
{
|
||||
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
|
||||
DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
|
||||
};
|
||||
|
||||
/// <summary>
|
||||
/// Writes <paramref name="report"/> to the response as canonical ZB.MOM.WW health JSON.
|
||||
/// </summary>
|
||||
/// <param name="context">The current HTTP context. Its <see cref="HttpResponse"/> is written to.</param>
|
||||
/// <param name="report">The aggregated health report for the tier that ran.</param>
|
||||
public static async Task WriteJsonAsync(HttpContext context, HealthReport report)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(context);
|
||||
ArgumentNullException.ThrowIfNull(report);
|
||||
|
||||
context.Response.ContentType = "application/json; charset=utf-8";
|
||||
|
||||
var payload = new HealthReportDto
|
||||
{
|
||||
Status = report.Status.ToString(),
|
||||
TotalDurationMs = report.TotalDuration.TotalMilliseconds,
|
||||
Entries = report.Entries.ToDictionary(
|
||||
static e => e.Key,
|
||||
static e => new HealthEntryDto
|
||||
{
|
||||
Status = e.Value.Status.ToString(),
|
||||
Description = e.Value.Description,
|
||||
DurationMs = e.Value.Duration.TotalMilliseconds,
|
||||
}),
|
||||
};
|
||||
|
||||
await JsonSerializer.SerializeAsync(context.Response.Body, payload, SerializerOptions, context.RequestAborted).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
private sealed class HealthReportDto
|
||||
{
|
||||
public string Status { get; init; } = string.Empty;
|
||||
public double TotalDurationMs { get; init; }
|
||||
public Dictionary<string, HealthEntryDto> Entries { get; init; } = new();
|
||||
}
|
||||
|
||||
private sealed class HealthEntryDto
|
||||
{
|
||||
public string Status { get; init; } = string.Empty;
|
||||
public string? Description { get; init; }
|
||||
public double DurationMs { get; init; }
|
||||
}
|
||||
}
|
||||
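To make the body shape concrete, here is a library-free sketch that serializes a stand-in payload with the same two serializer settings (`CamelCase` naming and `WhenWritingNull`). The `Report` and `Entry` records and the sample values are hypothetical stand-ins for the private DTOs above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public static class HealthJsonSketch
{
    // Same two settings as ZbHealthWriter.SerializerOptions.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
    };

    // Hypothetical stand-ins for the writer's private DTOs.
    public sealed record Entry(string Status, string? Description, double DurationMs);
    public sealed record Report(string Status, double TotalDurationMs, Dictionary<string, Entry> Entries);

    public static string Render() =>
        JsonSerializer.Serialize(
            new Report("Healthy", 12.5, new Dictionary<string, Entry>
            {
                // Null description: omitted from the output by WhenWritingNull.
                ["db"] = new("Healthy", null, 1.25),
            }),
            Options);
}

// HealthJsonSketch.Render() returns:
// {"status":"Healthy","totalDurationMs":12.5,"entries":{"db":{"status":"Healthy","durationMs":1.25}}}
```

Two details follow from the settings: a check without a description has no `description` property at all, and check names keep their registered casing because the naming policy applies to properties, not to dictionary keys.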
@@ -0,0 +1,93 @@
|
||||
using Akka.Actor;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
using ZB.MOM.WW.Health.Akka;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Akka.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Table-driven tests for the pure <see cref="ActiveNodeDecision.Evaluate"/> helper covering both
|
||||
/// the role-less (ScadaBridge ActiveNode) and role-filtered (OtOpcUa AdminRoleLeader) matrices,
|
||||
/// plus the startup-safety null-guards on <see cref="ActiveNodeHealthCheck"/> and
|
||||
/// <see cref="AkkaActiveNodeGate"/> when no <see cref="ActorSystem"/> is registered.
|
||||
/// </summary>
|
||||
public sealed class ActiveNodeDecisionTests
|
||||
{
|
||||
// Role-less: requiredRole == null. hasRole is irrelevant. Healthy iff (selfUp && isLeader), else Unhealthy.
|
||||
public static IEnumerable<object?[]> RoleLessCases() => new[]
|
||||
{
|
||||
new object?[] { true, true, false, (string?)null, HealthStatus.Healthy },
|
||||
new object?[] { true, false, false, (string?)null, HealthStatus.Unhealthy },
|
||||
new object?[] { false, true, false, (string?)null, HealthStatus.Unhealthy },
|
||||
new object?[] { false, false, false, (string?)null, HealthStatus.Unhealthy },
|
||||
};
|
||||
|
||||
[Theory]
|
||||
[MemberData(nameof(RoleLessCases))]
|
||||
public void Evaluate_RoleLess(bool selfUp, bool isLeader, bool hasRole, string? requiredRole, HealthStatus expected)
|
||||
{
|
||||
Assert.Equal(expected, ActiveNodeDecision.Evaluate(selfUp, isLeader, hasRole, requiredRole));
|
||||
}
|
||||
|
||||
// Role-filtered: requiredRole != null.
|
||||
// lacks role -> Healthy (probe irrelevant for this node)
|
||||
// has role & is leader -> Healthy (selfUp is ignored — role-filtered mode only cares about leadership)
|
||||
// has role & not leader -> Degraded
|
||||
public static IEnumerable<object[]> RoleFilteredCases() => new[]
|
||||
{
|
||||
// node lacks the role -> Healthy regardless of selfUp / isLeader
|
||||
new object[] { true, true, false, "admin", HealthStatus.Healthy },
|
||||
new object[] { true, false, false, "admin", HealthStatus.Healthy },
|
||||
new object[] { false, false, false, "admin", HealthStatus.Healthy },
|
||||
// node carries the role and is leader -> Healthy (selfUp=true)
|
||||
new object[] { true, true, true, "admin", HealthStatus.Healthy },
|
||||
// node carries the role and is leader -> Healthy (selfUp=false: role-filtered mode ignores selfUp)
|
||||
new object[] { false, true, true, "admin", HealthStatus.Healthy },
|
||||
// node carries the role but is not leader -> Degraded
|
||||
new object[] { true, false, true, "admin", HealthStatus.Degraded },
|
||||
new object[] { false, false, true, "admin", HealthStatus.Degraded },
|
||||
};
|
||||
|
||||
[Theory]
|
||||
[MemberData(nameof(RoleFilteredCases))]
|
||||
public void Evaluate_RoleFiltered(bool selfUp, bool isLeader, bool hasRole, string? requiredRole, HealthStatus expected)
|
||||
{
|
||||
Assert.Equal(expected, ActiveNodeDecision.Evaluate(selfUp, isLeader, hasRole, requiredRole));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task HealthCheck_RoleLess_NoActorSystem_ReturnsDegraded()
|
||||
{
|
||||
var provider = new ServiceCollection().BuildServiceProvider();
|
||||
var check = new ActiveNodeHealthCheck(provider);
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(check));
|
||||
|
||||
Assert.Equal(HealthStatus.Degraded, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task HealthCheck_RoleFiltered_NoActorSystem_ReturnsDegraded()
|
||||
{
|
||||
var provider = new ServiceCollection().BuildServiceProvider();
|
||||
var check = new ActiveNodeHealthCheck(provider, "admin");
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(check));
|
||||
|
||||
Assert.Equal(HealthStatus.Degraded, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Gate_NoActorSystem_IsActiveNodeFalse()
|
||||
{
|
||||
var provider = new ServiceCollection().BuildServiceProvider();
|
||||
var gate = new AkkaActiveNodeGate(provider);
|
||||
|
||||
Assert.False(gate.IsActiveNode);
|
||||
}
|
||||
|
||||
private static HealthCheckContext NewContext(IHealthCheck check) => new()
|
||||
{
|
||||
Registration = new HealthCheckRegistration("active-node", check, HealthStatus.Unhealthy, tags: null),
|
||||
};
|
||||
}
|
||||
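The two test matrices above pin down the decision rule completely. The following is a sketch of a function that satisfies every row, written against a stand-in enum so it runs without the health-check packages. It is inferred from the tests only and is not the actual `ActiveNodeDecision.Evaluate` source; `NodeHealth` and `ActiveNodeDecisionSketch` are hypothetical names.

```csharp
#nullable enable

// Stand-in for Microsoft.Extensions.Diagnostics.HealthChecks.HealthStatus.
public enum NodeHealth { Unhealthy, Degraded, Healthy }

public static class ActiveNodeDecisionSketch
{
    public static NodeHealth Evaluate(bool selfUp, bool isLeader, bool hasRole, string? requiredRole)
    {
        // Role-less mode: the node must be Up and the leader; hasRole is ignored.
        if (requiredRole is null)
        {
            return selfUp && isLeader ? NodeHealth.Healthy : NodeHealth.Unhealthy;
        }

        // Role-filtered mode: a node without the role is not a candidate, so the probe passes.
        if (!hasRole)
        {
            return NodeHealth.Healthy;
        }

        // Node carries the role: only leadership matters; selfUp is ignored.
        return isLeader ? NodeHealth.Healthy : NodeHealth.Degraded;
    }
}
```

The asymmetry is visible here: a non-leader is `Unhealthy` in role-less mode but only `Degraded` in role-filtered mode, which under the default status mapping is the difference between 503 and 200.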
@@ -0,0 +1,77 @@
|
||||
using Akka.Cluster;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
using ZB.MOM.WW.Health.Akka;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Akka.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Table-driven tests for the pure status-mapping function inside <see cref="AkkaClusterStatusPolicy"/>.
|
||||
/// The two presets (<see cref="AkkaClusterStatusPolicy.Default"/> and
|
||||
/// <see cref="AkkaClusterStatusPolicy.OtOpcUaCompat"/>) are the convergence targets for ScadaBridge
|
||||
/// and OtOpcUa respectively; every <see cref="MemberStatus"/> is exercised so a drift in either
|
||||
/// preset fails loudly. Also covers the startup-safety null-guard on <see cref="AkkaClusterHealthCheck"/>.
|
||||
/// </summary>
|
||||
public sealed class AkkaClusterStatusPolicyTests
|
||||
{
|
||||
public static IEnumerable<object[]> DefaultCases() => new[]
|
||||
{
|
||||
new object[] { MemberStatus.Up, HealthStatus.Healthy },
|
||||
new object[] { MemberStatus.Joining, HealthStatus.Healthy },
|
||||
new object[] { MemberStatus.Leaving, HealthStatus.Degraded },
|
||||
new object[] { MemberStatus.Exiting, HealthStatus.Degraded },
|
||||
new object[] { MemberStatus.WeaklyUp, HealthStatus.Unhealthy },
|
||||
new object[] { MemberStatus.Down, HealthStatus.Unhealthy },
|
||||
new object[] { MemberStatus.Removed, HealthStatus.Unhealthy },
|
||||
new object[] { (MemberStatus)99, HealthStatus.Unhealthy }, // unknown / future status
|
||||
};
|
||||
|
||||
[Theory]
|
||||
[MemberData(nameof(DefaultCases))]
|
||||
public void Default_MapsEveryStatus(MemberStatus status, HealthStatus expected)
|
||||
{
|
||||
Assert.Equal(expected, AkkaClusterStatusPolicy.Default.Evaluate(status));
|
||||
}
|
||||
|
||||
public static IEnumerable<object[]> OtOpcUaCompatCases() => new[]
|
||||
{
|
||||
new object[] { MemberStatus.Up, HealthStatus.Healthy },
|
||||
new object[] { MemberStatus.Joining, HealthStatus.Degraded },
|
||||
new object[] { MemberStatus.Leaving, HealthStatus.Degraded },
|
||||
new object[] { MemberStatus.Exiting, HealthStatus.Degraded },
|
||||
new object[] { MemberStatus.WeaklyUp, HealthStatus.Degraded },
|
||||
new object[] { MemberStatus.Down, HealthStatus.Degraded },
|
||||
new object[] { MemberStatus.Removed, HealthStatus.Degraded },
|
||||
new object[] { (MemberStatus)99, HealthStatus.Degraded }, // unknown / future status
|
||||
};
|
||||
|
||||
[Theory]
|
||||
[MemberData(nameof(OtOpcUaCompatCases))]
|
||||
public void OtOpcUaCompat_OnlyUpIsHealthy(MemberStatus status, HealthStatus expected)
|
||||
{
|
||||
Assert.Equal(expected, AkkaClusterStatusPolicy.OtOpcUaCompat.Evaluate(status));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void CustomPolicy_UsesSuppliedFunc()
|
||||
{
|
||||
var policy = new AkkaClusterStatusPolicy(_ => HealthStatus.Unhealthy);
|
||||
Assert.Equal(HealthStatus.Unhealthy, policy.Evaluate(MemberStatus.Up));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task HealthCheck_NoActorSystem_ReturnsDegraded()
|
||||
{
|
||||
var provider = new ServiceCollection().BuildServiceProvider();
|
||||
var check = new AkkaClusterHealthCheck(provider, AkkaClusterStatusPolicy.Default);
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(check));
|
||||
|
||||
Assert.Equal(HealthStatus.Degraded, result.Status);
|
||||
}
|
||||
|
||||
private static HealthCheckContext NewContext(IHealthCheck check) => new()
|
||||
{
|
||||
Registration = new HealthCheckRegistration("akka-cluster", check, HealthStatus.Unhealthy, tags: null),
|
||||
};
|
||||
}
|
||||
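The two presets differ only in how they treat states other than `Up`. A sketch of mappings that satisfy both tables above, written with stand-in enums so it runs without Akka.NET. It is inferred from the tests and is not the actual `AkkaClusterStatusPolicy` source; `MemberState`, `NodeHealth` and `ClusterPolicySketch` are hypothetical names.

```csharp
// Stand-ins for Akka.Cluster.MemberStatus and HealthStatus.
public enum MemberState { Joining, Up, Leaving, Exiting, Down, Removed, WeaklyUp }
public enum NodeHealth { Unhealthy, Degraded, Healthy }

public static class ClusterPolicySketch
{
    // Default preset: Up and Joining are fine, a graceful exit is degraded,
    // everything else (including unknown future values) is unhealthy.
    public static NodeHealth Default(MemberState state) => state switch
    {
        MemberState.Up or MemberState.Joining => NodeHealth.Healthy,
        MemberState.Leaving or MemberState.Exiting => NodeHealth.Degraded,
        _ => NodeHealth.Unhealthy,
    };

    // OtOpcUaCompat preset: only Up is healthy; nothing is ever unhealthy.
    public static NodeHealth OtOpcUaCompat(MemberState state) =>
        state == MemberState.Up ? NodeHealth.Healthy : NodeHealth.Degraded;
}
```

Both sketches route unknown values through the catch-all branch, which is what the `(MemberStatus)99` rows in the tests exercise.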
@@ -0,0 +1,22 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>false</IsPackable>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="coverlet.collector" />
|
||||
<PackageReference Include="Microsoft.NET.Test.Sdk" />
|
||||
<PackageReference Include="xunit" />
|
||||
<PackageReference Include="xunit.runner.visualstudio" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<Using Include="Xunit" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\..\src\ZB.MOM.WW.Health.Akka\ZB.MOM.WW.Health.Akka.csproj" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,155 @@
|
||||
using Microsoft.Data.Sqlite;
|
||||
using Microsoft.EntityFrameworkCore;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
using ZB.MOM.WW.Health.EntityFrameworkCore;
|
||||
|
||||
namespace ZB.MOM.WW.Health.EntityFrameworkCore.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Verifies <see cref="DatabaseHealthCheck{TContext}"/> against a real SQLite database (in-memory,
|
||||
/// connection kept open) so the <c>CanConnectAsync</c> semantics exercise an actual provider:
|
||||
/// reachable → Healthy, unopenable connection → Unhealthy (no throw escapes), a custom
|
||||
/// <see cref="DatabaseHealthCheckOptions{TContext}.ProbeQuery"/> that queries → Healthy, a
|
||||
/// throwing <c>ProbeQuery</c> → Unhealthy, and a timed-out probe → Unhealthy. Both the
|
||||
/// <see cref="IDbContextFactory{TContext}"/> and the scoped-<c>TContext</c> resolution paths
|
||||
/// are covered.
|
||||
/// </summary>
|
||||
public sealed class DatabaseHealthCheckTests
|
||||
{
|
||||
/// <summary>A minimal context with one entity, used purely to drive provider behaviour.</summary>
|
||||
private sealed class WidgetContext : DbContext
|
||||
{
|
||||
public WidgetContext(DbContextOptions<WidgetContext> options) : base(options) { }
|
||||
|
||||
public DbSet<Widget> Widgets => Set<Widget>();
|
||||
}
|
||||
|
||||
private sealed class Widget
|
||||
{
|
||||
public int Id { get; set; }
|
||||
}
|
||||
|
||||
private static HealthCheckContext NewContext() => new()
|
||||
{
|
||||
Registration = new HealthCheckRegistration(
|
||||
"database",
|
||||
sp => throw new InvalidOperationException("not used"),
|
||||
HealthStatus.Unhealthy,
|
||||
tags: null),
|
||||
};
|
||||
|
||||
/// <summary>
|
||||
/// Builds a provider whose <see cref="WidgetContext"/> is backed by the supplied open
|
||||
/// SQLite connection (and creates the schema). When <paramref name="useFactory"/> is true the
|
||||
/// context is registered via <c>AddDbContextFactory</c>; otherwise via <c>AddDbContext</c> (scoped).
|
||||
/// </summary>
|
||||
private static ServiceProvider BuildProvider(SqliteConnection connection, bool useFactory)
|
||||
{
|
||||
connection.Open();
|
||||
|
||||
var services = new ServiceCollection();
|
||||
if (useFactory)
|
||||
{
|
||||
services.AddDbContextFactory<WidgetContext>(o => o.UseSqlite(connection));
|
||||
}
|
||||
else
|
||||
{
|
||||
services.AddDbContext<WidgetContext>(o => o.UseSqlite(connection));
|
||||
}
|
||||
|
||||
var provider = services.BuildServiceProvider();
|
||||
|
||||
using var scope = provider.CreateScope();
|
||||
scope.ServiceProvider.GetRequiredService<WidgetContext>().Database.EnsureCreated();
|
||||
|
||||
return provider;
|
||||
}
|
||||
|
||||
[Theory]
|
||||
[InlineData(true)]
|
||||
[InlineData(false)]
|
||||
public async Task ReachableContext_Healthy(bool useFactory)
|
||||
{
|
||||
using var connection = new SqliteConnection("DataSource=:memory:");
|
||||
await using var provider = BuildProvider(connection, useFactory);
|
||||
|
||||
var check = new DatabaseHealthCheck<WidgetContext>(provider);
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(), CancellationToken.None);
|
||||
|
||||
Assert.Equal(HealthStatus.Healthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task UnopenableConnection_Unhealthy_NoThrow()
|
||||
{
|
||||
// Point the context at a file path that cannot be opened (parent directory does not exist).
|
||||
var bogusPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"), "missing", "db.sqlite");
|
||||
|
||||
var services = new ServiceCollection();
|
||||
services.AddDbContext<WidgetContext>(o => o.UseSqlite($"DataSource={bogusPath};Mode=ReadWrite"));
|
||||
await using var provider = services.BuildServiceProvider();
|
||||
|
||||
var check = new DatabaseHealthCheck<WidgetContext>(provider);
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(), CancellationToken.None);
|
||||
|
||||
Assert.Equal(HealthStatus.Unhealthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task CustomProbeQuery_RunsQuery_Healthy()
|
||||
{
|
||||
using var connection = new SqliteConnection("DataSource=:memory:");
|
||||
await using var provider = BuildProvider(connection, useFactory: true);
|
||||
|
||||
var options = new DatabaseHealthCheckOptions<WidgetContext>
|
||||
{
|
||||
ProbeQuery = (ctx, ct) => ctx.Widgets.AsNoTracking().AnyAsync(ct),
|
||||
};
|
||||
var check = new DatabaseHealthCheck<WidgetContext>(provider, options);
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(), CancellationToken.None);
|
||||
|
||||
Assert.Equal(HealthStatus.Healthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ProbeQueryThrows_Unhealthy()
|
||||
{
|
||||
using var connection = new SqliteConnection("DataSource=:memory:");
|
||||
await using var provider = BuildProvider(connection, useFactory: false);
|
||||
|
||||
var options = new DatabaseHealthCheckOptions<WidgetContext>
|
||||
{
|
||||
// Use a faulted task rather than a synchronous throw to accurately model
|
||||
// async probe delegates that encounter an error.
|
||||
ProbeQuery = (_, _) => Task.FromException(new InvalidOperationException("boom")),
|
||||
};
|
||||
var check = new DatabaseHealthCheck<WidgetContext>(provider, options);
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(), CancellationToken.None);
|
||||
|
||||
Assert.Equal(HealthStatus.Unhealthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ProbeTimeout_Unhealthy()
|
||||
{
|
||||
using var connection = new SqliteConnection("DataSource=:memory:");
|
||||
await using var provider = BuildProvider(connection, useFactory: true);
|
||||
|
||||
// Use a very short timeout and a probe that blocks indefinitely (until cancelled).
|
||||
var options = new DatabaseHealthCheckOptions<WidgetContext>
|
||||
{
|
||||
Timeout = TimeSpan.FromMilliseconds(50),
|
||||
ProbeQuery = async (_, ct) => await Task.Delay(Timeout.Infinite, ct),
|
||||
};
|
||||
var check = new DatabaseHealthCheck<WidgetContext>(provider, options);
|
||||
|
||||
var result = await check.CheckHealthAsync(NewContext(), CancellationToken.None);
|
||||
|
||||
Assert.Equal(HealthStatus.Unhealthy, result.Status);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,23 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>false</IsPackable>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="coverlet.collector" />
|
||||
<PackageReference Include="Microsoft.NET.Test.Sdk" />
|
||||
<PackageReference Include="xunit" />
|
||||
<PackageReference Include="xunit.runner.visualstudio" />
|
||||
<PackageReference Include="Microsoft.EntityFrameworkCore.Sqlite" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<Using Include="Xunit" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\..\src\ZB.MOM.WW.Health.EntityFrameworkCore\ZB.MOM.WW.Health.EntityFrameworkCore.csproj" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,70 @@
|
||||
using System.Net;
|
||||
using Microsoft.AspNetCore.Builder;
|
||||
using Microsoft.AspNetCore.Hosting;
|
||||
using Microsoft.AspNetCore.TestHost;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using ZB.MOM.WW.Health;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Verifies <see cref="ActiveNodeGateExtensions.RequireActiveNode"/>: a decorated endpoint serves
|
||||
/// normally (200) when the resolved <see cref="IActiveNodeGate"/> reports the node active, and
|
||||
/// returns 503 with a <c>Retry-After</c> header when the node is a standby.
|
||||
/// </summary>
|
||||
public sealed class ActiveNodeGateTests
|
||||
{
|
||||
private sealed class FakeActiveNodeGate : IActiveNodeGate
|
||||
{
|
||||
public bool IsActiveNode { get; set; }
|
||||
}
|
||||
|
||||
private static async Task<HttpResponseMessage> CallAsync(bool isActive)
|
||||
{
|
||||
var builder = WebApplication.CreateBuilder();
|
||||
builder.WebHost.UseTestServer();
|
||||
builder.Services.AddSingleton<IActiveNodeGate>(new FakeActiveNodeGate { IsActiveNode = isActive });
|
||||
|
||||
await using var app = builder.Build();
|
||||
app.MapGet("/x", () => "ok").RequireActiveNode();
|
||||
await app.StartAsync();
|
||||
|
||||
var client = app.GetTestClient();
|
||||
return await client.GetAsync("/x");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ActiveNode_Returns200()
|
||||
{
|
||||
var response = await CallAsync(isActive: true);
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
Assert.Equal("ok", await response.Content.ReadAsStringAsync());
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task StandbyNode_Returns503_WithRetryAfterHeader()
|
||||
{
|
||||
var response = await CallAsync(isActive: false);
|
||||
|
||||
Assert.Equal(HttpStatusCode.ServiceUnavailable, response.StatusCode);
|
||||
Assert.True(
|
||||
response.Headers.Contains("Retry-After"),
|
||||
"Standby response must carry a Retry-After header.");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task NoGateRegistered_AllowsRequest()
|
||||
{
|
||||
// When no IActiveNodeGate is registered (non-clustered host / tests), the endpoint is served.
|
||||
var builder = WebApplication.CreateBuilder();
|
||||
builder.WebHost.UseTestServer();
|
||||
|
||||
await using var app = builder.Build();
|
||||
app.MapGet("/x", () => "ok").RequireActiveNode();
|
||||
await app.StartAsync();
|
||||
|
||||
var response = await app.GetTestClient().GetAsync("/x");
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
}
|
||||
}
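
The behaviour these tests pin down (serve when the node is active or when no gate is registered, otherwise answer 503 with `Retry-After`) could be implemented as a minimal-API endpoint filter. The sketch below is an assumption about the shape of such a filter, not the library's actual `RequireActiveNode` code: `IActiveNodeGateSketch`, `RequireActiveNodeSketch`, `ShouldServe`, and the 5-second `Retry-After` default are all hypothetical names and values chosen for illustration.

```csharp
using System.Globalization;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-in for the library's gate abstraction.
public interface IActiveNodeGateSketch
{
    bool IsActiveNode { get; }
}

public static class ActiveNodeGateSketch
{
    // A missing gate (non-clustered host) is treated the same as an active node.
    public static bool ShouldServe(IActiveNodeGateSketch? gate) =>
        gate is null || gate.IsActiveNode;

    // Hypothetical equivalent of RequireActiveNode, written as an endpoint filter.
    public static TBuilder RequireActiveNodeSketch<TBuilder>(
        this TBuilder builder, int retryAfterSeconds = 5)
        where TBuilder : IEndpointConventionBuilder
        => builder.AddEndpointFilter(async (context, next) =>
        {
            var http = context.HttpContext;
            var gate = http.RequestServices.GetService<IActiveNodeGateSketch>();

            if (ShouldServe(gate))
            {
                // Active node, or no gate registered: run the endpoint as usual.
                return await next(context);
            }

            // Standby node: tell the caller when to retry and short-circuit with 503.
            http.Response.Headers.RetryAfter =
                retryAfterSeconds.ToString(CultureInfo.InvariantCulture);
            return Results.StatusCode(StatusCodes.Status503ServiceUnavailable);
        });
}
```

Resolving the gate with `GetService` rather than `GetRequiredService` is what would make the "no gate registered" case pass through instead of throwing.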
|
||||
@@ -0,0 +1,107 @@
|
||||
using Grpc.Core;
|
||||
using Grpc.Net.Client;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
using ZB.MOM.WW.Health;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Verifies <see cref="GrpcDependencyHealthCheck"/> via an injected probe (no live gRPC server):
|
||||
/// probe-true → Healthy, probe-false → Unhealthy, and an <see cref="RpcException"/> from the probe
|
||||
/// → Unhealthy. The channel is constructed but never dialled because the probe is stubbed.
|
||||
/// </summary>
|
||||
public sealed class GrpcDependencyHealthCheckTests
|
||||
{
|
||||
private static readonly GrpcChannel Channel = GrpcChannel.ForAddress("http://localhost");
|
||||
|
||||
private static async Task<HealthCheckResult> RunAsync(
|
||||
GrpcDependencyOptions options, CancellationToken cancellationToken = default)
|
||||
{
|
||||
var check = new GrpcDependencyHealthCheck(Channel, options);
|
||||
var context = new HealthCheckContext
|
||||
{
|
||||
Registration = new HealthCheckRegistration("grpc-dep", check, HealthStatus.Unhealthy, tags: null),
|
||||
};
|
||||
return await check.CheckHealthAsync(context, cancellationToken);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ProbeReturnsTrue_Healthy()
|
||||
{
|
||||
var result = await RunAsync(new GrpcDependencyOptions
|
||||
{
|
||||
Probe = static (_, _) => Task.FromResult(true),
|
||||
});
|
||||
|
||||
Assert.Equal(HealthStatus.Healthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ProbeReturnsFalse_Unhealthy()
|
||||
{
|
||||
var result = await RunAsync(new GrpcDependencyOptions
|
||||
{
|
||||
Probe = static (_, _) => Task.FromResult(false),
|
||||
});
|
||||
|
||||
Assert.Equal(HealthStatus.Unhealthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ProbeThrowsRpcException_Unhealthy()
|
||||
{
|
||||
var result = await RunAsync(new GrpcDependencyOptions
|
||||
{
|
||||
Probe = static (_, _) => throw new RpcException(new Status(StatusCode.Unavailable, "down")),
|
||||
});
|
||||
|
||||
Assert.Equal(HealthStatus.Unhealthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DependencyName_AppearsInDescription()
|
||||
{
|
||||
var result = await RunAsync(new GrpcDependencyOptions
|
||||
{
|
||||
DependencyName = "mxaccessgw worker",
|
||||
Probe = static (_, _) => Task.FromResult(false),
|
||||
});
|
||||
|
||||
Assert.Equal(HealthStatus.Unhealthy, result.Status);
|
||||
Assert.Contains("mxaccessgw worker", result.Description);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ProbeExceedsTimeout_Unhealthy()
|
||||
{
|
||||
var result = await RunAsync(new GrpcDependencyOptions
|
||||
{
|
||||
Timeout = TimeSpan.FromMilliseconds(50),
|
||||
Probe = static async (_, ct) =>
|
||||
{
|
||||
await Task.Delay(Timeout.Infinite, ct);
|
||||
return true;
|
||||
},
|
||||
});
|
||||
|
||||
Assert.Equal(HealthStatus.Unhealthy, result.Status);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ExternalCancellation_Throws()
|
||||
{
|
||||
using var cts = new CancellationTokenSource();
|
||||
await cts.CancelAsync();
|
||||
|
||||
await Assert.ThrowsAnyAsync<OperationCanceledException>(() => RunAsync(
|
||||
new GrpcDependencyOptions
|
||||
{
|
||||
Probe = static async (_, ct) =>
|
||||
{
|
||||
await Task.Delay(Timeout.Infinite, ct);
|
||||
return true;
|
||||
},
|
||||
},
|
||||
cts.Token));
|
||||
}
|
||||
}
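
The last two tests separate two kinds of cancellation: the check's own timeout (reported as Unhealthy) and cancellation requested by the caller (allowed to propagate). One common way to get that split is a linked `CancellationTokenSource` with `CancelAfter`, plus an exception filter on the caller's token. The sketch below uses only the BCL and makes no claim about how `GrpcDependencyHealthCheck` is actually written; `TimedProbeSketch` and `ProbeOutcome` are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public enum ProbeOutcome { Healthy, Unhealthy }

public static class TimedProbeSketch
{
    public static async Task<ProbeOutcome> RunAsync(
        Func<CancellationToken, Task<bool>> probe,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        // The linked source fires on caller cancellation OR when the timeout elapses.
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        linked.CancelAfter(timeout);

        try
        {
            return await probe(linked.Token).ConfigureAwait(false)
                ? ProbeOutcome.Healthy
                : ProbeOutcome.Unhealthy;
        }
        catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
        {
            // Only the timeout fired: report it as a failed probe.
            return ProbeOutcome.Unhealthy;
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Probe faults (for example a transport error) also count as Unhealthy.
            return ProbeOutcome.Unhealthy;
        }
        // Caller-requested cancellation matches neither filter and propagates.
    }
}
```

Because the probe is invoked inside the `try`, a delegate that throws synchronously is handled the same way as one that returns a faulted task.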
|
||||
@@ -0,0 +1,92 @@
|
||||
using System.Net;
|
||||
using System.Net.Http.Json;
|
||||
using System.Text.Json;
|
||||
using Microsoft.AspNetCore.Builder;
|
||||
using Microsoft.AspNetCore.Hosting;
|
||||
using Microsoft.AspNetCore.TestHost;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using ZB.MOM.WW.Health;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Verifies the canonical JSON response writer (<see cref="ZbHealthWriter.WriteJsonAsync"/>):
|
||||
/// the JSON body shape, the <c>application/json</c> content type, and that the framework's
|
||||
/// status-to-HTTP mapping (Healthy/Degraded → 200, Unhealthy → 503) is preserved when the
|
||||
/// writer is wired onto the ready/active tiers by <see cref="ZbHealthEndpointExtensions.MapZbHealth"/>.
|
||||
/// </summary>
|
||||
public sealed class ResponseWriterTests
|
||||
{
|
||||
private sealed class StubHealthCheck : IHealthCheck
|
||||
{
|
||||
private readonly HealthCheckResult _result;
|
||||
|
||||
public StubHealthCheck(HealthStatus status, string? description = null) =>
|
||||
_result = new HealthCheckResult(status, description);
|
||||
|
||||
public Task<HealthCheckResult> CheckHealthAsync(
|
||||
HealthCheckContext context,
|
||||
CancellationToken cancellationToken = default) => Task.FromResult(_result);
|
||||
}
|
||||
|
||||
private static async Task<HttpResponseMessage> GetReadyAsync(
|
||||
HealthStatus status, string description = "db reachable")
|
||||
{
|
||||
var builder = WebApplication.CreateBuilder();
|
||||
builder.WebHost.UseTestServer();
|
||||
builder.Services.AddHealthChecks()
|
||||
.AddCheck("db", new StubHealthCheck(status, description), tags: new[] { ZbHealthTags.Ready });
|
||||
|
||||
await using var app = builder.Build();
|
||||
app.MapZbHealth();
|
||||
await app.StartAsync();
|
||||
|
||||
var client = app.GetTestClient();
|
||||
return await client.GetAsync("/health/ready");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReadyEndpoint_Healthy_WritesJsonBody_With200()
|
||||
{
|
||||
var response = await GetReadyAsync(HealthStatus.Healthy);
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
Assert.Equal("application/json", response.Content.Headers.ContentType?.MediaType);
|
||||
|
||||
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
|
||||
var root = doc.RootElement;
|
||||
|
||||
Assert.Equal("Healthy", root.GetProperty("status").GetString());
|
||||
Assert.Equal(JsonValueKind.Number, root.GetProperty("totalDurationMs").ValueKind);
|
||||
|
||||
var entries = root.GetProperty("entries");
|
||||
var db = entries.GetProperty("db");
|
||||
Assert.Equal("Healthy", db.GetProperty("status").GetString());
|
||||
Assert.Equal("db reachable", db.GetProperty("description").GetString());
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReadyEndpoint_Degraded_Returns200_WithDegradedStatus()
|
||||
{
|
||||
var response = await GetReadyAsync(HealthStatus.Degraded);
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
Assert.Equal("application/json", response.Content.Headers.ContentType?.MediaType);
|
||||
|
||||
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
|
||||
Assert.Equal("Degraded", doc.RootElement.GetProperty("status").GetString());
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReadyEndpoint_Unhealthy_Returns503_WithUnhealthyStatus()
|
||||
{
|
||||
var response = await GetReadyAsync(HealthStatus.Unhealthy);
|
||||
|
||||
Assert.Equal(HttpStatusCode.ServiceUnavailable, response.StatusCode);
|
||||
Assert.Equal("application/json", response.Content.Headers.ContentType?.MediaType);
|
||||
|
||||
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
|
||||
Assert.Equal("Unhealthy", doc.RootElement.GetProperty("status").GetString());
|
||||
}
|
||||
}
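
Taken together, the assertions above imply a response body with a top-level `status`, a numeric `totalDurationMs`, and an `entries` object keyed by check name whose values carry `status` and `description`. The sketch below writes exactly that shape with `Utf8JsonWriter`. It is an illustration of the asserted fields only: the real `ZbHealthWriter.WriteJsonAsync` may emit more, and `HealthJsonSketch` and `EntrySketch` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

public sealed record EntrySketch(string Status, string? Description);

public static class HealthJsonSketch
{
    public static string Write(
        string status,
        double totalDurationMs,
        IReadOnlyDictionary<string, EntrySketch> entries)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            writer.WriteStartObject();
            writer.WriteString("status", status);
            writer.WriteNumber("totalDurationMs", totalDurationMs);

            // One nested object per registered check, keyed by the check name.
            writer.WriteStartObject("entries");
            foreach (var (name, entry) in entries)
            {
                writer.WriteStartObject(name);
                writer.WriteString("status", entry.Status);
                writer.WriteString("description", entry.Description); // null becomes JSON null
                writer.WriteEndObject();
            }
            writer.WriteEndObject();

            writer.WriteEndObject();
        } // disposing the writer flushes it into the stream

        return Encoding.UTF8.GetString(stream.ToArray());
    }
}
```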
|
||||
@@ -0,0 +1,165 @@
|
||||
using System.Net;
|
||||
using Microsoft.AspNetCore.Builder;
|
||||
using Microsoft.AspNetCore.Hosting;
|
||||
using Microsoft.AspNetCore.TestHost;
|
||||
using Microsoft.Extensions.Diagnostics.HealthChecks;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using ZB.MOM.WW.Health;
|
||||
|
||||
namespace ZB.MOM.WW.Health.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Verifies the three-tier <see cref="ZbHealthEndpointExtensions.MapZbHealth"/> convention:
|
||||
/// each endpoint runs only the checks tagged for its tier, /healthz runs nothing, and the
|
||||
/// standard ASP.NET HealthChecks status-to-HTTP mapping (Healthy/Degraded → 200, Unhealthy → 503)
|
||||
/// holds per tier.
|
||||
/// </summary>
|
||||
public sealed class TierMappingTests
|
||||
{
|
||||
/// <summary>
|
||||
/// An <see cref="IHealthCheck"/> test double that records each invocation and returns a
|
||||
/// configurable result, so tests can assert which checks actually ran per tier.
|
||||
/// </summary>
|
||||
private sealed class RecordingHealthCheck : IHealthCheck
|
||||
{
|
||||
private readonly HealthStatus _status;
|
||||
private int _invocations;
|
||||
|
||||
public RecordingHealthCheck(HealthStatus status) => _status = status;
|
||||
|
||||
public int Invocations => Volatile.Read(ref _invocations);
|
||||
|
||||
public Task<HealthCheckResult> CheckHealthAsync(
|
||||
HealthCheckContext context,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
Interlocked.Increment(ref _invocations);
|
||||
return Task.FromResult(new HealthCheckResult(_status));
|
||||
}
|
||||
}
|
||||
|
||||
private static async Task<(HttpResponseMessage Response, RecordingHealthCheck Ready, RecordingHealthCheck Active)>
|
||||
RunAsync(string path, HealthStatus readyStatus = HealthStatus.Healthy, HealthStatus activeStatus = HealthStatus.Healthy)
|
||||
{
|
||||
var ready = new RecordingHealthCheck(readyStatus);
|
||||
var active = new RecordingHealthCheck(activeStatus);
|
||||
|
||||
var builder = WebApplication.CreateBuilder();
|
||||
builder.WebHost.UseTestServer();
|
||||
|
||||
builder.Services.AddHealthChecks()
|
||||
.AddCheck("ready-check", ready, tags: new[] { ZbHealthTags.Ready })
|
||||
.AddCheck("active-check", active, tags: new[] { ZbHealthTags.Active });
|
||||
|
||||
await using var app = builder.Build();
|
||||
app.MapZbHealth();
|
||||
await app.StartAsync();
|
||||
|
||||
var client = app.GetTestClient();
|
||||
var response = await client.GetAsync(path);
|
||||
return (response, ready, active);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReadyEndpoint_RunsOnlyReadyCheck()
|
||||
{
|
||||
var (response, ready, active) = await RunAsync("/health/ready");
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
Assert.Equal(1, ready.Invocations);
|
||||
Assert.Equal(0, active.Invocations);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ActiveEndpoint_RunsOnlyActiveCheck()
|
||||
{
|
||||
var (response, ready, active) = await RunAsync("/health/active");
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
Assert.Equal(0, ready.Invocations);
|
||||
Assert.Equal(1, active.Invocations);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task LivenessEndpoint_RunsNoChecks_AndReturns200()
|
||||
{
|
||||
var (response, ready, active) = await RunAsync("/healthz");
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
Assert.Equal(0, ready.Invocations);
|
||||
Assert.Equal(0, active.Invocations);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReadyEndpoint_Healthy_Returns200()
|
||||
{
|
||||
var (response, _, _) = await RunAsync("/health/ready", readyStatus: HealthStatus.Healthy);
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReadyEndpoint_Unhealthy_Returns503()
|
||||
{
|
||||
var (response, _, _) = await RunAsync("/health/ready", readyStatus: HealthStatus.Unhealthy);
|
||||
|
||||
Assert.Equal(HttpStatusCode.ServiceUnavailable, response.StatusCode);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ActiveEndpoint_Unhealthy_Returns503()
|
||||
{
|
||||
var (response, _, _) = await RunAsync("/health/active", activeStatus: HealthStatus.Unhealthy);
|
||||
|
||||
Assert.Equal(HttpStatusCode.ServiceUnavailable, response.StatusCode);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task LivenessEndpoint_UnaffectedByUnhealthyChecks()
|
||||
{
|
||||
// Even though every registered check is Unhealthy, /healthz runs none of them
|
||||
// (predicate _ => false) and stays 200 as long as the process is up.
|
||||
var (response, ready, active) = await RunAsync(
|
||||
"/healthz", readyStatus: HealthStatus.Unhealthy, activeStatus: HealthStatus.Unhealthy);
|
||||
|
||||
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
|
||||
Assert.Equal(0, ready.Invocations);
|
||||
Assert.Equal(0, active.Invocations);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Options_OverrideRoutePaths()
|
||||
{
|
||||
var ready = new RecordingHealthCheck(HealthStatus.Healthy);
|
||||
var active = new RecordingHealthCheck(HealthStatus.Healthy);
|
||||
|
||||
var builder = WebApplication.CreateBuilder();
|
||||
builder.WebHost.UseTestServer();
|
||||
builder.Services.AddHealthChecks()
|
||||
.AddCheck("ready-check", ready, tags: new[] { ZbHealthTags.Ready })
|
||||
.AddCheck("active-check", active, tags: new[] { ZbHealthTags.Active });
|
||||
|
||||
await using var app = builder.Build();
|
||||
app.MapZbHealth(new ZbHealthEndpointOptions
|
||||
{
|
||||
ReadyPath = "/custom/ready",
|
||||
ActivePath = "/custom/active",
|
||||
LivePath = "/custom/live",
|
||||
});
|
||||
await app.StartAsync();
|
||||
|
||||
var client = app.GetTestClient();
|
||||
|
||||
var readyResponse = await client.GetAsync("/custom/ready");
|
||||
Assert.Equal(HttpStatusCode.OK, readyResponse.StatusCode);
|
||||
Assert.Equal(1, ready.Invocations);
|
||||
Assert.Equal(0, active.Invocations);
|
||||
|
||||
var liveResponse = await client.GetAsync("/custom/live");
|
||||
Assert.Equal(HttpStatusCode.OK, liveResponse.StatusCode);
|
||||
|
||||
// The default paths must no longer be mapped when overridden.
|
||||
var defaultReady = await client.GetAsync("/health/ready");
|
||||
Assert.Equal(HttpStatusCode.NotFound, defaultReady.StatusCode);
|
||||
}
|
||||
}
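
The three-tier convention exercised above can be expressed with the stock ASP.NET Core health-check endpoint API: one `MapHealthChecks` call per path, each with a `Predicate` that selects checks by tag, and a liveness predicate that selects none. The sketch below is a guess at that wiring, not the source of `MapZbHealth`. The paths come from the tests, while `TieredHealthSketch`, `ForTag`, `NoChecks`, and the tag strings `"ready"` and `"active"` are assumptions (the actual values of `ZbHealthTags.Ready` and `ZbHealthTags.Active` are not shown here).

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class TieredHealthSketch
{
    public const string ReadyTag = "ready";   // assumed tag value
    public const string ActiveTag = "active"; // assumed tag value

    // Run only the checks carrying the given tag.
    public static HealthCheckOptions ForTag(string tag) => new()
    {
        Predicate = registration => registration.Tags.Contains(tag),
    };

    // Run nothing: an empty report is Healthy, so the endpoint answers 200
    // for as long as the process can serve requests.
    public static HealthCheckOptions NoChecks() => new()
    {
        Predicate = _ => false,
    };

    public static IEndpointRouteBuilder MapTieredHealthSketch(this IEndpointRouteBuilder endpoints)
    {
        endpoints.MapHealthChecks("/health/ready", ForTag(ReadyTag));
        endpoints.MapHealthChecks("/health/active", ForTag(ActiveTag));
        endpoints.MapHealthChecks("/healthz", NoChecks());
        return endpoints;
    }
}
```

The Healthy/Degraded to 200 and Unhealthy to 503 mapping asserted in the tests is the framework default for `HealthCheckOptions.ResultStatusCodes`, so the sketch does not set it.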
|
||||
@@ -0,0 +1,28 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>false</IsPackable>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="coverlet.collector" />
|
||||
<PackageReference Include="Microsoft.NET.Test.Sdk" />
|
||||
<PackageReference Include="xunit" />
|
||||
<PackageReference Include="xunit.runner.visualstudio" />
|
||||
<PackageReference Include="Microsoft.AspNetCore.Mvc.Testing" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<Using Include="Xunit" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!-- WebApplicationFactory requires the full ASP.NET Core shared framework -->
|
||||
<FrameworkReference Include="Microsoft.AspNetCore.App" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\..\src\ZB.MOM.WW.Health\ZB.MOM.WW.Health.csproj" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,482 @@
|
||||
## Ignore Visual Studio temporary files, build results, and
|
||||
## files generated by popular Visual Studio add-ons.
|
||||
##
|
||||
## Get latest from `dotnet new gitignore`
|
||||
|
||||
# dotenv files
|
||||
.env
|
||||
|
||||
# User-specific files
|
||||
*.rsuser
|
||||
*.suo
|
||||
*.user
|
||||
*.userosscache
|
||||
*.sln.docstates
|
||||
|
||||
# User-specific files (MonoDevelop/Xamarin Studio)
|
||||
*.userprefs
|
||||
|
||||
# Mono auto generated files
|
||||
mono_crash.*
|
||||
|
||||
# Build results
|
||||
[Dd]ebug/
|
||||
[Dd]ebugPublic/
|
||||
[Rr]elease/
|
||||
[Rr]eleases/
|
||||
x64/
|
||||
x86/
|
||||
[Ww][Ii][Nn]32/
|
||||
[Aa][Rr][Mm]/
|
||||
[Aa][Rr][Mm]64/
|
||||
bld/
|
||||
[Bb]in/
|
||||
[Oo]bj/
|
||||
[Ll]og/
|
||||
[Ll]ogs/
|
||||
|
||||
# Visual Studio 2015/2017 cache/options directory
|
||||
.vs/
|
||||
# Uncomment if you have tasks that create the project's static files in wwwroot
|
||||
#wwwroot/
|
||||
|
||||
# Visual Studio 2017 auto generated files
|
||||
Generated\ Files/
|
||||
|
||||
# MSTest test Results
|
||||
[Tt]est[Rr]esult*/
|
||||
[Bb]uild[Ll]og.*
|
||||
|
||||
# NUnit
|
||||
*.VisualState.xml
|
||||
TestResult.xml
|
||||
nunit-*.xml
|
||||
|
||||
# Build Results of an ATL Project
|
||||
[Dd]ebugPS/
|
||||
[Rr]eleasePS/
|
||||
dlldata.c
|
||||
|
||||
# Benchmark Results
|
||||
BenchmarkDotNet.Artifacts/
|
||||
|
||||
# .NET
|
||||
project.lock.json
|
||||
project.fragment.lock.json
|
||||
artifacts/
|
||||
|
||||
# Tye
|
||||
.tye/
|
||||
|
||||
# ASP.NET Scaffolding
|
||||
ScaffoldingReadMe.txt
|
||||
|
||||
# StyleCop
|
||||
StyleCopReport.xml
|
||||
|
||||
# Files built by Visual Studio
|
||||
*_i.c
|
||||
*_p.c
|
||||
*_h.h
|
||||
*.ilk
|
||||
*.meta
|
||||
*.obj
|
||||
*.iobj
|
||||
*.pch
|
||||
*.pdb
|
||||
*.ipdb
|
||||
*.pgc
|
||||
*.pgd
|
||||
*.rsp
|
||||
# but not Directory.Build.rsp, as it configures directory-level build defaults
|
||||
!Directory.Build.rsp
|
||||
*.sbr
|
||||
*.tlb
|
||||
*.tli
|
||||
*.tlh
|
||||
*.tmp
|
||||
*.tmp_proj
|
||||
*_wpftmp.csproj
|
||||
*.log
|
||||
*.tlog
|
||||
*.vspscc
|
||||
*.vssscc
|
||||
.builds
|
||||
*.pidb
|
||||
*.svclog
|
||||
*.scc
|
||||
|
||||
# Chutzpah Test files
|
||||
_Chutzpah*
|
||||
|
||||
# Visual C++ cache files
|
||||
ipch/
|
||||
*.aps
|
||||
*.ncb
|
||||
*.opendb
|
||||
*.opensdf
|
||||
*.sdf
|
||||
*.cachefile
|
||||
*.VC.db
|
||||
*.VC.VC.opendb
|
||||
|
||||
# Visual Studio profiler
|
||||
*.psess
|
||||
*.vsp
|
||||
*.vspx
|
||||
*.sap
|
||||
|
||||
# Visual Studio Trace Files
|
||||
*.e2e
|
||||
|
||||
# TFS 2012 Local Workspace
|
||||
$tf/
|
||||
|
||||
# Guidance Automation Toolkit
|
||||
*.gpState
|
||||
|
||||
# ReSharper is a .NET coding add-in
|
||||
_ReSharper*/
|
||||
*.[Rr]e[Ss]harper
|
||||
*.DotSettings.user
|
||||
|
||||
# TeamCity is a build add-in
|
||||
_TeamCity*
|
||||
|
||||
# DotCover is a Code Coverage Tool
|
||||
*.dotCover
|
||||
|
||||
# AxoCover is a Code Coverage Tool
|
||||
.axoCover/*
|
||||
!.axoCover/settings.json
|
||||
|
||||
# Coverlet is a free, cross platform Code Coverage Tool
|
||||
coverage*.json
|
||||
coverage*.xml
|
||||
coverage*.info
|
||||
|
||||
# Visual Studio code coverage results
|
||||
*.coverage
|
||||
*.coveragexml
|
||||
|
||||
# NCrunch
|
||||
_NCrunch_*
|
||||
.*crunch*.local.xml
|
||||
nCrunchTemp_*
|
||||
|
||||
# MightyMoose
|
||||
*.mm.*
|
||||
AutoTest.Net/
|
||||
|
||||
# Web workbench (sass)
|
||||
.sass-cache/
|
||||
|
||||
# Installshield output folder
|
||||
[Ee]xpress/
|
||||
|
||||
# DocProject is a documentation generator add-in
|
||||
DocProject/buildhelp/
|
||||
DocProject/Help/*.HxT
|
||||
DocProject/Help/*.HxC
|
||||
DocProject/Help/*.hhc
|
||||
DocProject/Help/*.hhk
|
||||
DocProject/Help/*.hhp
|
||||
DocProject/Help/Html2
|
||||
DocProject/Help/html
|
||||
|
||||
# Click-Once directory
|
||||
publish/
|
||||
|
||||
# Publish Web Output
|
||||
*.[Pp]ublish.xml
|
||||
*.azurePubxml
|
||||
# Note: Comment the next line if you want to checkin your web deploy settings,
|
||||
# but database connection strings (with potential passwords) will be unencrypted
|
||||
*.pubxml
|
||||
*.publishproj
|
||||
|
||||
# Microsoft Azure Web App publish settings. Comment the next line if you want to
|
||||
# checkin your Azure Web App publish settings, but sensitive information contained
|
||||
# in these scripts will be unencrypted
|
||||
PublishScripts/
|
||||
|
||||
# NuGet Packages
|
||||
*.nupkg
|
||||
# NuGet Symbol Packages
|
||||
*.snupkg
|
||||
# The packages folder can be ignored because of Package Restore
|
||||
**/[Pp]ackages/*
|
||||
# except build/, which is used as an MSBuild target.
|
||||
!**/[Pp]ackages/build/
|
||||
# Uncomment if necessary however generally it will be regenerated when needed
|
||||
#!**/[Pp]ackages/repositories.config
|
||||
# NuGet v3's project.json files produces more ignorable files
|
||||
*.nuget.props
|
||||
*.nuget.targets
|
||||
|
||||
# Microsoft Azure Build Output
|
||||
csx/
|
||||
*.build.csdef
|
||||
|
||||
# Microsoft Azure Emulator
|
||||
ecf/
|
||||
rcf/
|
||||
|
||||
# Windows Store app package directories and files
|
||||
AppPackages/
|
||||
BundleArtifacts/
|
||||
Package.StoreAssociation.xml
|
||||
_pkginfo.txt
|
||||
*.appx
|
||||
*.appxbundle
|
||||
*.appxupload
|
||||
|
||||
# Visual Studio cache files
|
||||
# files ending in .cache can be ignored
|
||||
*.[Cc]ache
|
||||
# but keep track of directories ending in .cache
|
||||
!?*.[Cc]ache/
|
||||
|
||||
# Others
|
||||
ClientBin/
|
||||
~$*
|
||||
*~
|
||||
*.dbmdl
|
||||
*.dbproj.schemaview
|
||||
*.jfm
|
||||
*.pfx
|
||||
*.publishsettings
|
||||
orleans.codegen.cs
|
||||
|
||||
# Including strong name files can present a security risk
|
||||
# (https://github.com/github/gitignore/pull/2483#issue-259490424)
|
||||
#*.snk
|
||||
|
||||
# Since there are multiple workflows, uncomment next line to ignore bower_components
|
||||
# (https://github.com/github/gitignore/pull/1529#issuecomment-104372622)
|
||||
#bower_components/
|
||||
|
||||
# RIA/Silverlight projects
|
||||
Generated_Code/
|
||||
|
||||
# Backup & report files from converting an old project file
|
||||
# to a newer Visual Studio version. Backup files are not needed,
|
||||
# because we have git ;-)
|
||||
_UpgradeReport_Files/
|
||||
Backup*/
|
||||
UpgradeLog*.XML
|
||||
UpgradeLog*.htm
|
||||
ServiceFabricBackup/
|
||||
*.rptproj.bak
|
||||
|
||||
# SQL Server files
|
||||
*.mdf
|
||||
*.ldf
|
||||
*.ndf
|
||||
|
||||
# Business Intelligence projects
|
||||
*.rdl.data
|
||||
*.bim.layout
|
||||
*.bim_*.settings
|
||||
*.rptproj.rsuser
|
||||
*- [Bb]ackup.rdl
|
||||
*- [Bb]ackup ([0-9]).rdl
|
||||
*- [Bb]ackup ([0-9][0-9]).rdl
|
||||
|
||||
# Microsoft Fakes
|
||||
FakesAssemblies/
|
||||
|
||||
# GhostDoc plugin setting file
|
||||
*.GhostDoc.xml
|
||||
|
||||
# Node.js Tools for Visual Studio
|
||||
.ntvs_analysis.dat
|
||||
node_modules/
|
||||
|
||||
# Visual Studio 6 build log
|
||||
*.plg
|
||||
|
||||
# Visual Studio 6 workspace options file
|
||||
*.opt
|
||||
|
||||
# Visual Studio 6 auto-generated workspace file (contains which files were open etc.)
|
||||
*.vbw
|
||||
|
||||
# Visual Studio 6 auto-generated project file (contains which files were open etc.)
|
||||
*.vbp
|
||||
|
||||
# Visual Studio 6 workspace and project file (working project files containing files to include in project)
|
||||
*.dsw
|
||||
*.dsp
|
||||
|
||||
# Visual Studio 6 technical files
|
||||
*.ncb
|
||||
*.aps
|
||||
|
||||
# Visual Studio LightSwitch build output
|
||||
**/*.HTMLClient/GeneratedArtifacts
|
||||
**/*.DesktopClient/GeneratedArtifacts
|
||||
**/*.DesktopClient/ModelManifest.xml
|
||||
**/*.Server/GeneratedArtifacts
|
||||
**/*.Server/ModelManifest.xml
|
||||
_Pvt_Extensions
|
||||
|
||||
# Paket dependency manager
|
||||
.paket/paket.exe
|
||||
paket-files/
|
||||
|
||||
# FAKE - F# Make
|
||||
.fake/
|
||||
|
||||
# CodeRush personal settings
|
||||
.cr/personal
|
||||
|
||||
# Python Tools for Visual Studio (PTVS)
|
||||
__pycache__/
|
||||
*.pyc
|
||||
|
||||
# Cake - Uncomment if you are using it
|
||||
# tools/**
|
||||
# !tools/packages.config
|
||||
|
||||
# Tabs Studio
|
||||
*.tss
|
||||
|
||||
# Telerik's JustMock configuration file
|
||||
*.jmconfig
|
||||
|
||||
# BizTalk build output
|
||||
*.btp.cs
|
||||
*.btm.cs
|
||||
*.odx.cs
|
||||
*.xsd.cs
|
||||
|
||||
# OpenCover UI analysis results
|
||||
OpenCover/
|
||||
|
||||
# Azure Stream Analytics local run output
|
||||
ASALocalRun/
|
||||
|
||||
# MSBuild Binary and Structured Log
|
||||
*.binlog
|
||||
|
||||
# NVidia Nsight GPU debugger configuration file
|
||||
*.nvuser
|
||||
|
||||
# MFractors (Xamarin productivity tool) working folder
|
||||
.mfractor/
|
||||
|
||||
# Local History for Visual Studio
|
||||
.localhistory/
|
||||
|
||||
# Visual Studio History (VSHistory) files
|
||||
.vshistory/
|
||||
|
||||
# BeatPulse healthcheck temp database
|
||||
healthchecksdb
|
||||
|
||||
# Backup folder for Package Reference Convert tool in Visual Studio 2017
|
||||
MigrationBackup/
|
||||
|
||||
# Ionide (cross platform F# VS Code tools) working folder
|
||||
.ionide/
|
||||
|
||||
# Fody - auto-generated XML schema
|
||||
FodyWeavers.xsd
|
||||
|
||||
# VS Code files for those working on multiple tools
|
||||
.vscode/*
|
||||
!.vscode/settings.json
|
||||
!.vscode/tasks.json
|
||||
!.vscode/launch.json
|
||||
!.vscode/extensions.json
|
||||
*.code-workspace
|
||||
|
||||
# Local History for Visual Studio Code
|
||||
.history/
|
||||
|
||||
# Windows Installer files from build outputs
|
||||
*.cab
|
||||
*.msi
|
||||
*.msix
|
||||
*.msm
|
||||
*.msp
|
||||
|
||||
# JetBrains Rider
|
||||
*.sln.iml
|
||||
.idea/
|
||||
|
||||
##
|
||||
## Visual studio for Mac
|
||||
##
|
||||
|
||||
|
||||
# globs
|
||||
Makefile.in
|
||||
*.userprefs
|
||||
*.usertasks
|
||||
config.make
|
||||
config.status
|
||||
aclocal.m4
|
||||
install-sh
|
||||
autom4te.cache/
|
||||
*.tar.gz
|
||||
tarballs/
|
||||
test-results/
|
||||
|
||||
# content below from: https://github.com/github/gitignore/blob/main/Global/macOS.gitignore
|
||||
# General
|
||||
.DS_Store
|
||||
.AppleDouble
|
||||
.LSOverride
|
||||
|
||||
# Icon must end with two \r
|
||||
Icon
|
||||
|
||||
|
||||
# Thumbnails
|
||||
._*
|
||||
|
||||
# Files that might appear in the root of a volume
|
||||
.DocumentRevisions-V100
|
||||
.fseventsd
|
||||
.Spotlight-V100
|
||||
.TemporaryItems
|
||||
.Trashes
|
||||
.VolumeIcon.icns
|
||||
.com.apple.timemachine.donotpresent
|
||||
|
||||
# Directories potentially created on remote AFP share
|
||||
.AppleDB
|
||||
.AppleDesktop
|
||||
Network Trash Folder
|
||||
Temporary Items
|
||||
.apdisk
|
||||
|
||||
# content below from: https://github.com/github/gitignore/blob/main/Global/Windows.gitignore
|
||||
# Windows thumbnail cache files
|
||||
Thumbs.db
|
||||
ehthumbs.db
|
||||
ehthumbs_vista.db
|
||||
|
||||
# Dump file
|
||||
*.stackdump
|
||||
|
||||
# Folder config file
|
||||
[Dd]esktop.ini
|
||||
|
||||
# Recycle Bin used on file shares
|
||||
$RECYCLE.BIN/
|
||||
|
||||
# Windows Installer files
|
||||
*.cab
|
||||
*.msi
|
||||
*.msix
|
||||
*.msm
|
||||
*.msp
|
||||
|
||||
# Windows shortcuts
|
||||
*.lnk
|
||||
|
||||
# Vim temporary swap files
|
||||
*.swp
|
||||
@@ -0,0 +1,74 @@
|
||||
# ZB.MOM.WW.Telemetry
|
||||
|
||||
Observability libraries for the **ZB.MOM.WW SCADA family** (OtOpcUa, MxAccessGateway, ScadaBridge). These are **libraries, not a service** — each package is linked directly into the consuming application at build time. There is no central telemetry process; instrumentation runs in-process alongside the application.
|
||||
|
||||
The library normalizes the three-project observability surface: a shared OpenTelemetry Resource driven by a single identity triple (`service.name` / `site.id` / `node.role`), standard instrumentation wiring, Prometheus and OTLP export, and a Serilog bootstrap with enrichers and `TraceContextEnricher` for trace↔log correlation.
|
||||
|
||||
**Built at 0.1.0. MxAccessGateway logging adopted (MEL → Serilog migration done on its own branch). OtOpcUa and ScadaBridge telemetry adoption is follow-on.** Adoption tracked in `~/Desktop/scadaproj/components/observability/GAPS.md`.
|
||||
|
||||
---
|
||||
|
||||
## Packages
|
||||
|
||||
| Package | Responsibilities | Key Dependencies |
|---|---|---|
| `ZB.MOM.WW.Telemetry` | `AddZbTelemetry`: wires OTel SDK (metrics + tracing), populates shared Resource (`service.name`, `service.namespace`, `service.version`, `site.id`, `node.role`, `host.name`), registers caller-supplied Meters/ActivitySources, adds standard instrumentation (ASP.NET Core, HttpClient, gRPC client, runtime, process), Prometheus always-on exporter, OTLP additive overlay. `app.MapZbMetrics()`: mounts `/metrics`. `ZbTelemetryOptions`: the single options object shared by both packages. | `Microsoft.AspNetCore.App` (framework ref), `OpenTelemetry.*` stack |
| `ZB.MOM.WW.Telemetry.Serilog` | `AddZbSerilog` (shared two-stage Serilog bootstrap): `ReadFrom.Configuration`-driven sinks, `MinimumLevel.Is(Information)` default (config-overridable), `SiteId`/`NodeRole`/`NodeHostname` enrichers from `ZbTelemetryOptions`, `TraceContextEnricher` (writes `trace_id`/`span_id` from `Activity.Current`), `ILogRedactor` seam (per-project sensitive-field redaction via `RedactionEnricher`). Does NOT freeze `Log.Logger`; safe for multi-host/test scenarios. | `ZB.MOM.WW.Telemetry`, `Serilog.*` stack |
|
||||
|
||||
---
|
||||
|
||||
## Consumer matrix
|
||||
|
||||
| Consumer | `ZB.MOM.WW.Telemetry` (core) | `ZB.MOM.WW.Telemetry.Serilog` |
|---|:---:|:---:|
| **OtOpcUa** | yes (after adoption) | yes (after adoption) |
| **MxAccessGateway** | yes (after adoption) | yes (MEL → Serilog adopted now) |
| **ScadaBridge** | yes (after adoption) | yes (after adoption) |
|
||||
|
||||
MxAccessGateway's logging adoption is the one in-pass migration. Full metrics/tracing wiring
|
||||
for all three apps is follow-on.
|
||||
|
||||
---
|
||||
|
||||
## Build, test, and pack commands
|
||||
|
||||
```bash
# From ZB.MOM.WW.Telemetry/

# Build
dotnet build ZB.MOM.WW.Telemetry.slnx
dotnet build ZB.MOM.WW.Telemetry.slnx -c Release

# Test (no external dependencies: no running OTel collector, no Serilog backend required)
dotnet test ZB.MOM.WW.Telemetry.slnx

# Pack (two .nupkg files land in artifacts/)
dotnet pack ZB.MOM.WW.Telemetry.slnx -c Release -o ./artifacts
```
|
||||
|
||||
All test assemblies run offline:
|
||||
|
||||
| Assembly | Tests |
|---|---|
| `ZB.MOM.WW.Telemetry.Tests` | 7 |
| `ZB.MOM.WW.Telemetry.Serilog.Tests` | 12 |
| **Total** | **19** |
|
||||
|
||||
`GeneratePackageOnBuild` is off — pack explicitly with the command above.
|
||||
|
||||
---
|
||||
|
||||
## Status
|
||||
|
||||
Built at **0.1.0** and published to the Gitea NuGet feed. MxAccessGateway logging (MEL → Serilog)
|
||||
adopted on its own branch. **OtOpcUa and ScadaBridge telemetry adoption not yet started** —
|
||||
tracked in the component backlog:
|
||||
|
||||
- `~/Desktop/scadaproj/components/observability/GAPS.md` — adoption order, effort, and risk
|
||||
|
||||
Design documentation:
|
||||
|
||||
- `~/Desktop/scadaproj/components/observability/spec/SPEC.md` — normalized observability target
|
||||
- `~/Desktop/scadaproj/components/observability/spec/METRIC-CONVENTIONS.md` — metric naming reference
|
||||
- `~/Desktop/scadaproj/components/observability/shared-contract/ZB.MOM.WW.Telemetry.md` — proposed shared-library API
|
||||
- `~/Desktop/scadaproj/components/observability/current-state/` — per-project current state (code-verified)
|
||||
@@ -0,0 +1,12 @@
|
||||
<Project>
|
||||
|
||||
<PropertyGroup>
|
||||
<TargetFramework>net10.0</TargetFramework>
|
||||
<Nullable>enable</Nullable>
|
||||
<ImplicitUsings>enable</ImplicitUsings>
|
||||
<LangVersion>latest</LangVersion>
|
||||
<Version>0.1.0</Version>
|
||||
<ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
|
||||
</PropertyGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,38 @@
|
||||
<Project>
|
||||
|
||||
<PropertyGroup>
|
||||
<ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!-- OpenTelemetry core + exporters -->
|
||||
<PackageVersion Include="OpenTelemetry.Extensions.Hosting" Version="1.15.3" />
|
||||
<PackageVersion Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.15.3-beta.1" />
|
||||
<PackageVersion Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.15.3" />
|
||||
|
||||
<!-- OpenTelemetry instrumentation libraries -->
|
||||
<PackageVersion Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.15.2" />
|
||||
<PackageVersion Include="OpenTelemetry.Instrumentation.Http" Version="1.15.1" />
|
||||
<PackageVersion Include="OpenTelemetry.Instrumentation.GrpcNetClient" Version="1.15.1-beta.1" />
|
||||
<PackageVersion Include="OpenTelemetry.Instrumentation.Runtime" Version="1.15.1" />
|
||||
<PackageVersion Include="OpenTelemetry.Instrumentation.Process" Version="1.15.1-beta.1" />
|
||||
|
||||
<!-- Serilog -->
|
||||
<PackageVersion Include="Serilog" Version="4.3.1" />
|
||||
<PackageVersion Include="Serilog.AspNetCore" Version="9.0.0" />
|
||||
<PackageVersion Include="Serilog.Extensions.Hosting" Version="9.0.0" />
|
||||
<PackageVersion Include="Serilog.Settings.Configuration" Version="9.0.0" />
|
||||
<PackageVersion Include="Serilog.Sinks.Console" Version="6.0.0" />
|
||||
<PackageVersion Include="Serilog.Sinks.File" Version="7.0.0" />
|
||||
<PackageVersion Include="Serilog.Sinks.OpenTelemetry" Version="4.2.0" />
|
||||
<PackageVersion Include="Serilog.Sinks.InMemory" Version="2.0.0" />
|
||||
|
||||
<!-- Test -->
|
||||
<PackageVersion Include="Microsoft.NET.Test.Sdk" Version="17.14.1" />
|
||||
<PackageVersion Include="xunit" Version="2.9.3" />
|
||||
<PackageVersion Include="xunit.runner.visualstudio" Version="3.1.4" />
|
||||
<PackageVersion Include="coverlet.collector" Version="6.0.4" />
|
||||
<PackageVersion Include="Microsoft.AspNetCore.Mvc.Testing" Version="10.0.7" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,153 @@
|
||||
# ZB.MOM.WW.Telemetry
|
||||
|
||||
Observability libraries for the **ZB.MOM.WW SCADA family** (OtOpcUa, MxAccessGateway, ScadaBridge). These are **libraries, not a service** — each package is linked directly into the consuming application at build time. There is no central telemetry process; all instrumentation runs in-process alongside the application.
|
||||
|
||||
The library normalizes the three-project observability surface: a shared OpenTelemetry Resource identity, standard instrumentation wiring, Prometheus and OTLP export, and a Serilog bootstrap with enrichers and trace↔log correlation — so metrics, traces, and log lines from the same node carry identical dimensions and can join up in any backend.
|
||||
|
||||
---
|
||||
|
||||
## Packages
|
||||
|
||||
| Package | Description | Key Dependencies |
|---|---|---|
| `ZB.MOM.WW.Telemetry` | `AddZbTelemetry` extension, `ZbTelemetryOptions`, shared OTel Resource builder (`ZbResource`), standard instrumentation (ASP.NET Core, HttpClient, gRPC client, runtime, process), Prometheus always-on exporter + OTLP opt-in overlay, `app.MapZbMetrics()` endpoint extension. | `Microsoft.AspNetCore.App` (framework ref), `OpenTelemetry.*` stack |
| `ZB.MOM.WW.Telemetry.Serilog` | `AddZbSerilog` extension, shared enrichers (`SiteId`/`NodeRole`/`NodeHostname`), `TraceContextEnricher` (writes `trace_id`/`span_id` from `Activity.Current` into every log event), `ILogRedactor` seam (per-project sensitive-field redaction), `RedactionEnricher`. | `ZB.MOM.WW.Telemetry`, `Serilog.*` stack |
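
The `ILogRedactor` seam listed above takes one method, `Redact(IDictionary<string, object?>)`. As a rough illustration of what a per-project implementation might look like, here is a minimal sketch that masks values by property name. `DenyListRedactor`, the mask string, and the property names are hypothetical; the interface is redeclared locally only so the sketch compiles on its own, and a real consumer would reference `ZB.MOM.WW.Telemetry.Serilog` instead.

```csharp
using System;
using System.Collections.Generic;

// Usage: mask two hypothetical sensitive fields and leave the rest alone.
var properties = new Dictionary<string, object?>
{
    ["Password"] = "hunter2",
    ["TagName"] = "Pump01.Speed",
};

new DenyListRedactor(new[] { "Password", "ApiKey" }).Redact(properties);

Console.WriteLine(properties["Password"]); // ***
Console.WriteLine(properties["TagName"]);  // Pump01.Speed

// Local copy of the interface shape from this change (assumption: unchanged in the package).
public interface ILogRedactor
{
    void Redact(IDictionary<string, object?> properties);
}

// Hypothetical redactor: replaces the value of any property whose name is on a deny list.
public sealed class DenyListRedactor : ILogRedactor
{
    private const string Mask = "***";
    private readonly HashSet<string> _sensitiveNames;

    public DenyListRedactor(IEnumerable<string> sensitiveNames) =>
        _sensitiveNames = new HashSet<string>(sensitiveNames, StringComparer.OrdinalIgnoreCase);

    public void Redact(IDictionary<string, object?> properties)
    {
        // Copy the keys first so the loop does not depend on whether the dictionary
        // implementation tolerates writes while it is being enumerated.
        foreach (var name in new List<string>(properties.Keys))
        {
            if (_sensitiveNames.Contains(name))
            {
                properties[name] = Mask;
            }
        }
    }
}
```

Matching on property name keeps the redactor independent of message templates; a project with structured command payloads would likely need something more targeted.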
|
||||
|
||||
---
|
||||
|
||||
## The unifying hinge
|
||||
|
||||
The single `ZbTelemetryOptions` object drives both packages. Its identity triple (`ServiceName` → OTel Resource `service.name`, `SiteId` → `site.id`, `NodeRole` → `node.role`) is applied once and flows automatically to **both** the OpenTelemetry Resource (so every metric and span carries it) **and** the Serilog enrichers (so every log event carries it). A metric, a span, and a log line emitted by the same node share identical `service.name`, `site.id`, and `node.role` dimensions, enabling cross-signal correlation in any backend (Grafana, Jaeger, Seq, Loki, etc.) without per-project bookkeeping.
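
One way to keep the two calls from drifting apart is to define the identity once and hand the same delegate to both extensions. The following is a sketch only: `AddZbSerilog` takes an `Action<ZbTelemetryOptions>` as shown in this change, while the assumption that `AddZbTelemetry` accepts the same delegate type on the same builder (and lives in the `ZB.MOM.WW.Telemetry` namespace) is inferred from the usage examples rather than confirmed. The `"Site:Id"` configuration key is likewise just the key used in those examples.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using ZB.MOM.WW.Telemetry;
using ZB.MOM.WW.Telemetry.Serilog;

var builder = WebApplication.CreateBuilder(args);

// One delegate describes the node identity; both packages receive identical values.
Action<ZbTelemetryOptions> identity = o =>
{
    o.ServiceName = "mxgateway";
    o.SiteId = builder.Configuration["Site:Id"];
    o.NodeRole = "standalone";
};

builder.AddZbTelemetry(identity); // assumed overload: Action<ZbTelemetryOptions>
builder.AddZbSerilog(identity);   // signature as declared in ZbSerilogExtensions

var app = builder.Build();
app.MapZbMetrics(); // Prometheus scrape endpoint at /metrics
app.Run();
```

Because each call builds its own `ZbTelemetryOptions` instance from the delegate, sharing the delegate is what keeps `service.name`, `site.id`, and `node.role` aligned across metrics, traces, and logs.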
|
||||
|
||||
---
|
||||
|
||||
## Consumer matrix
|
||||
|
||||
| Consumer | `ZB.MOM.WW.Telemetry` (core) | `ZB.MOM.WW.Telemetry.Serilog` |
|---|:---:|:---:|
| **OtOpcUa** | yes | yes |
| **MxAccessGateway** | yes | yes (logging adopted: MEL → Serilog migration done) |
| **ScadaBridge** | yes | yes |
|
||||
|
||||
All three apps consume both packages after adoption. MxAccessGateway's MEL→Serilog migration
|
||||
is the one in-pass adoption completed on its own branch; OtOpcUa and ScadaBridge adoption is
|
||||
follow-on (tracked in `components/observability/GAPS.md`).
|
||||
|
||||
---
|
||||
|
||||
## OTel signals
|
||||
|
||||
`AddZbTelemetry` wires the metrics and traces signals in a single call; logs are wired by the companion `AddZbSerilog` call:

| Signal | What is wired |
|---|---|
| **Metrics** | App Meters (via `options.Meters[]`) + standard: ASP.NET Core, HttpClient, .NET runtime, process. Exported via Prometheus (always on) with OTLP as an additive overlay. |
| **Traces** | App ActivitySources (via `options.ActivitySources[]`) + standard: ASP.NET Core, HttpClient, gRPC client. Exported via OTLP when `Exporter = ZbExporter.Otlp`. |
| **Logs** | Wired by `AddZbSerilog` (companion call). Serilog is the logging pipeline; logs are exported to OpenTelemetry via `Serilog.Sinks.OpenTelemetry` when `Exporter = ZbExporter.Otlp`. |
|
||||
|
||||
Trace↔log correlation is automatic: `TraceContextEnricher` reads `Activity.Current` for each log event and attaches `trace_id` and `span_id`, so log events produced inside a traced request carry the same trace and span identifiers that the trace backend records. When no activity is current (startup, background work, untraced paths), the enricher adds nothing.
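
The correlation hinges on `Activity.Current`, which is ambient state from `System.Diagnostics` rather than anything Serilog-specific. The sketch below uses only the base class library to show when that value exists and what the two identifiers look like as strings; the source name `Demo.TraceContext` and the `Describe` helper are hypothetical and not part of the library.

```csharp
using System;
using System.Diagnostics;

// Without a listener, ActivitySource.StartActivity returns null, so register one
// that listens to every source and records all data.
using var listener = new ActivityListener
{
    ShouldListenTo = _ => true,
    Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(listener);

using var source = new ActivitySource("Demo.TraceContext"); // hypothetical source name

// Outside any span there is nothing to stamp onto a log event.
Console.WriteLine(Describe());

using (source.StartActivity("handle-request"))
{
    // Inside the span, the same values a trace exporter would record are available.
    Console.WriteLine(Describe());
}

static string Describe()
{
    var activity = Activity.Current;
    return activity is null
        ? "no active span: nothing to attach"
        : $"trace_id={activity.TraceId} span_id={activity.SpanId}";
}
```

With the default W3C id format, the trace id renders as 32 lowercase hex characters and the span id as 16, which is the form most trace backends expect when joining logs to spans.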
|
||||
|
||||
---
|
||||
|
||||
## Exporter options
|
||||
|
||||
Prometheus is **always wired** for metrics regardless of the `Exporter` setting. OTLP is an
|
||||
additive overlay — set `Exporter = ZbExporter.Otlp` and `OtlpEndpoint` to push to a collector
|
||||
in addition to the scrape endpoint.
|
||||
|
||||
```csharp
// Prometheus only (default; scrape /metrics)
builder.AddZbTelemetry(o =>
{
    o.ServiceName = "mxgateway";
    o.SiteId = config["Site:Id"];
    o.NodeRole = "standalone";
    o.Meters = ["ZB.MOM.WW.MxGateway"];
});

// OTLP overlay (metrics + traces pushed to collector; /metrics still active)
builder.AddZbTelemetry(o =>
{
    o.ServiceName = "mxgateway";
    o.SiteId = config["Site:Id"];
    o.NodeRole = "standalone";
    o.Meters = ["ZB.MOM.WW.MxGateway"];
    o.Exporter = ZbExporter.Otlp;
    o.OtlpEndpoint = "http://collector:4317";
});

// Mount the Prometheus scrape endpoint (call after app.UseRouting())
app.MapZbMetrics(); // → /metrics
```
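
The exporter rule can be easy to misread: per the `ZbSerilogConfig` documentation in this change, `OtlpEndpoint` is only the address used once OTLP is selected and is not an independent switch. The standalone sketch below models that decision. `DemoExporter`, `ExportPlan`, and `Plan` are hypothetical stand-ins (the real `ZbExporter` members other than `Otlp` are not shown in this change), and the fallback to an exporter default when the endpoint is empty mirrors the Serilog sink wiring only.

```csharp
using System;

var endpointOnly = Plan(DemoExporter.PrometheusOnly, "http://collector:4317");
var overlay = Plan(DemoExporter.Otlp, "http://collector:4317");

Console.WriteLine($"endpoint only -> OTLP push: {endpointOnly.OtlpPush}");  // False
Console.WriteLine($"Exporter=Otlp -> OTLP push: {overlay.OtlpPush}");       // True
Console.WriteLine($"scrape endpoint still on: {overlay.PrometheusScrape}"); // True

static ExportPlan Plan(DemoExporter exporter, string? otlpEndpoint)
{
    // OTLP is enabled by the exporter selection alone.
    var otlpPush = exporter == DemoExporter.Otlp;

    // The endpoint only matters once OTLP is selected; null here means
    // "leave the exporter's own default address in place".
    var endpoint = otlpPush && !string.IsNullOrEmpty(otlpEndpoint) ? otlpEndpoint : null;

    // Prometheus scraping is always wired, regardless of the exporter setting.
    return new ExportPlan(PrometheusScrape: true, OtlpPush: otlpPush, OtlpEndpoint: endpoint);
}

// Hypothetical stand-in for ZbExporter; only the Otlp member is known from the source.
enum DemoExporter { PrometheusOnly, Otlp }

record ExportPlan(bool PrometheusScrape, bool OtlpPush, string? OtlpEndpoint);
```

Treating the endpoint as data rather than as a toggle means a leftover `OtlpEndpoint` value in configuration cannot silently start pushing telemetry on any signal.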
|
||||
|
||||
```csharp
// Serilog bootstrap (same options object drives enrichers)
builder.AddZbSerilog(o =>
{
    o.ServiceName = "mxgateway";
    o.SiteId = config["Site:Id"];
    o.NodeRole = "standalone";
});
```
|
||||
|
||||
---
|
||||
|
||||
## Building and testing
|
||||
|
||||
```bash
# from ZB.MOM.WW.Telemetry/
dotnet build ZB.MOM.WW.Telemetry.slnx
dotnet test ZB.MOM.WW.Telemetry.slnx
```
|
||||
|
||||
All test assemblies run with no external dependencies (no running OTel collector, no Serilog
|
||||
backend):
|
||||
|
||||
| Assembly | Tests |
|---|---|
| `ZB.MOM.WW.Telemetry.Tests` | 7 |
| `ZB.MOM.WW.Telemetry.Serilog.Tests` | 12 |
| **Total** | **19** |
|
||||
|
||||
---
|
||||
|
||||
## Packing
|
||||
|
||||
```bash
dotnet pack ZB.MOM.WW.Telemetry.slnx -c Release -o ./artifacts
```

Produces two `.nupkg` files in `artifacts/`:

```
ZB.MOM.WW.Telemetry.0.1.0.nupkg
ZB.MOM.WW.Telemetry.Serilog.0.1.0.nupkg
```
|
||||
|
||||
`GeneratePackageOnBuild` is off — pack explicitly as above. Both packages are versioned
|
||||
lockstep from `Directory.Build.props`.
|
||||
|
||||
---
|
||||
|
||||
## Status
|
||||
|
||||
**Built at 0.1.0. MxAccessGateway logging adopted (MEL → Serilog migration, on its own branch).
|
||||
Broader OtOpcUa and ScadaBridge telemetry adoption deferred.** Adoption is tracked in the
|
||||
component backlog:
|
||||
|
||||
- `~/Desktop/scadaproj/components/observability/GAPS.md`
|
||||
|
||||
Design documentation lives alongside that backlog:
|
||||
|
||||
- `~/Desktop/scadaproj/components/observability/spec/SPEC.md` — normalized observability target
|
||||
- `~/Desktop/scadaproj/components/observability/spec/METRIC-CONVENTIONS.md` — metric naming reference
|
||||
- `~/Desktop/scadaproj/components/observability/shared-contract/ZB.MOM.WW.Telemetry.md` — proposed API
|
||||
- `~/Desktop/scadaproj/components/observability/current-state/` — per-project current state (code-verified)
|
||||
@@ -0,0 +1,10 @@
|
||||
<Solution>
|
||||
<Folder Name="/src/">
|
||||
<Project Path="src/ZB.MOM.WW.Telemetry.Serilog/ZB.MOM.WW.Telemetry.Serilog.csproj" />
|
||||
<Project Path="src/ZB.MOM.WW.Telemetry/ZB.MOM.WW.Telemetry.csproj" />
|
||||
</Folder>
|
||||
<Folder Name="/tests/">
|
||||
<Project Path="tests/ZB.MOM.WW.Telemetry.Serilog.Tests/ZB.MOM.WW.Telemetry.Serilog.Tests.csproj" />
|
||||
<Project Path="tests/ZB.MOM.WW.Telemetry.Tests/ZB.MOM.WW.Telemetry.Tests.csproj" />
|
||||
</Folder>
|
||||
</Solution>
|
||||
@@ -0,0 +1,17 @@
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
/// <summary>
|
||||
/// Seam for project-specific log-event redaction. The shared library applies this via
|
||||
/// <see cref="RedactionEnricher"/>; each project provides its own implementation that knows which
|
||||
/// fields (by property name) or which command payloads must not leave the process in log events.
|
||||
/// If no <see cref="ILogRedactor"/> is registered in DI, <see cref="RedactionEnricher"/> is a no-op.
|
||||
/// </summary>
|
||||
public interface ILogRedactor
|
||||
{
|
||||
/// <summary>
|
||||
/// Inspects and mutates the supplied log-event <paramref name="properties"/> in place — remove
|
||||
/// or replace any sensitive values. Called on every log event before it reaches any sink.
|
||||
/// </summary>
|
||||
/// <param name="properties">The mutable property dictionary for the current log event.</param>
|
||||
void Redact(IDictionary<string, object?> properties);
|
||||
}
|
||||
@@ -0,0 +1,82 @@
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Serilog.Core;
|
||||
using Serilog.Events;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
/// <summary>
|
||||
/// Applies a registered <see cref="ILogRedactor"/> to every Serilog log event. Registered
|
||||
/// automatically by <see cref="ZbSerilogExtensions.AddZbSerilog"/>. The enricher resolves
|
||||
/// <see cref="ILogRedactor"/> from DI on first use (lazy, to avoid a circular-DI problem during
|
||||
/// Serilog's bootstrap); if none is registered it is permanently inert — no DI call per event.
|
||||
/// Resolution is thread-safe: <see cref="LazyThreadSafetyMode.ExecutionAndPublication"/> ensures
|
||||
/// exactly one DI lookup regardless of how many logging threads race to the first event.
|
||||
/// </summary>
|
||||
public sealed class RedactionEnricher : ILogEventEnricher
|
||||
{
|
||||
private readonly Lazy<ILogRedactor?> _redactor;
|
||||
|
||||
/// <summary>
|
||||
/// Creates the enricher bound to a service provider from which the project-supplied
|
||||
/// <see cref="ILogRedactor"/> is resolved lazily on first use (thread-safe).
|
||||
/// </summary>
|
||||
/// <param name="serviceProvider">Provider used to resolve a registered <see cref="ILogRedactor"/>.</param>
|
||||
public RedactionEnricher(IServiceProvider serviceProvider)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(serviceProvider);
|
||||
_redactor = new Lazy<ILogRedactor?>(
|
||||
() => serviceProvider.GetService<ILogRedactor>(),
|
||||
LazyThreadSafetyMode.ExecutionAndPublication);
|
||||
}
|
||||
|
||||
/// <summary>
/// Hands a snapshot of the log event's properties to the registered <see cref="ILogRedactor"/>
/// (scalar values are unwrapped to their raw objects, other values are passed as-is), then
/// writes back any values the redactor changed or added and removes any properties it deleted.
/// No-op when no redactor is registered.
/// </summary>
/// <param name="logEvent">The log event to redact.</param>
/// <param name="propertyFactory">Factory used to materialize replacement properties.</param>
public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
{
    ArgumentNullException.ThrowIfNull(logEvent);
    ArgumentNullException.ThrowIfNull(propertyFactory);

    var redactor = ResolveRedactor();
    if (redactor is null)
    {
        return;
    }

    var snapshot = new Dictionary<string, object?>(logEvent.Properties.Count);
    foreach (var property in logEvent.Properties)
    {
        snapshot[property.Key] = property.Value is ScalarValue scalar
            ? scalar.Value
            : property.Value;
    }

    redactor.Redact(snapshot);

    // Honour removals: ILogRedactor may delete keys, so drop them from the event too.
    // ToList() copies the keys before the event's property collection is mutated
    // (System.Linq comes from ImplicitUsings).
    foreach (var removedKey in logEvent.Properties.Keys.Where(key => !snapshot.ContainsKey(key)).ToList())
    {
        logEvent.RemovePropertyIfPresent(removedKey);
    }

    foreach (var entry in snapshot)
    {
        if (HasChanged(logEvent, entry.Key, entry.Value))
        {
            logEvent.AddOrUpdateProperty(
                propertyFactory.CreateProperty(entry.Key, entry.Value));
        }
    }
}
|
||||
|
||||
private ILogRedactor? ResolveRedactor() => _redactor.Value;
|
||||
|
||||
private static bool HasChanged(LogEvent logEvent, string key, object? newValue)
|
||||
{
|
||||
if (!logEvent.Properties.TryGetValue(key, out var existing))
|
||||
{
|
||||
// Redactor added a brand-new property.
|
||||
return true;
|
||||
}
|
||||
|
||||
var existingValue = existing is ScalarValue scalar ? scalar.Value : existing;
|
||||
return !Equals(existingValue, newValue);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,43 @@
|
||||
using System.Diagnostics;
|
||||
using Serilog.Core;
|
||||
using Serilog.Events;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
/// <summary>
|
||||
/// Stamps <c>trace_id</c> and <c>span_id</c> from <see cref="Activity.Current"/> onto every Serilog
|
||||
/// log event, enabling a log line to be correlated back to its originating trace in a backend.
|
||||
/// When <see cref="Activity.Current"/> is null (no active span — background services, startup,
|
||||
/// non-traced paths) the enricher emits nothing; it does NOT inject empty strings or zero values.
|
||||
/// </summary>
|
||||
public sealed class TraceContextEnricher : ILogEventEnricher
|
||||
{
|
||||
/// <summary>Serilog property name carrying the W3C trace id.</summary>
|
||||
public const string TraceIdPropertyName = "trace_id";
|
||||
|
||||
/// <summary>Serilog property name carrying the W3C span id.</summary>
|
||||
public const string SpanIdPropertyName = "span_id";
|
||||
|
||||
/// <summary>
|
||||
/// Adds <c>trace_id</c>/<c>span_id</c> properties from <see cref="Activity.Current"/> when an
|
||||
/// activity is active; otherwise leaves the event untouched.
|
||||
/// </summary>
|
||||
/// <param name="logEvent">The log event to enrich.</param>
|
||||
/// <param name="propertyFactory">Factory used to create the trace-context properties.</param>
|
||||
public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(logEvent);
|
||||
ArgumentNullException.ThrowIfNull(propertyFactory);
|
||||
|
||||
var activity = Activity.Current;
|
||||
if (activity is null)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
logEvent.AddPropertyIfAbsent(
|
||||
propertyFactory.CreateProperty(TraceIdPropertyName, activity.TraceId.ToString()));
|
||||
logEvent.AddPropertyIfAbsent(
|
||||
propertyFactory.CreateProperty(SpanIdPropertyName, activity.SpanId.ToString()));
|
||||
}
|
||||
}
|
||||
+33
@@ -0,0 +1,33 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>true</IsPackable>
|
||||
<PackageId>ZB.MOM.WW.Telemetry.Serilog</PackageId>
|
||||
<Authors>ZB.MOM.WW</Authors>
|
||||
<Description>Serilog structured logging extensions for the ZB.MOM.WW SCADA family. Provides a shared two-stage Serilog bootstrap (AddZbSerilog), enrichers for SiteId/NodeRole/NodeHostname, a TraceContextEnricher for trace_id/span_id correlation with OpenTelemetry spans, and an ILogRedactor seam for per-project sensitive-field redaction.</Description>
|
||||
<PackageTags>opentelemetry;observability;serilog;logging;tracing;enrichers;scada;wonderware;zb-mom-ww</PackageTags>
|
||||
<PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-telemetry</PackageProjectUrl>
|
||||
<RepositoryUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-telemetry</RepositoryUrl>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="Serilog" />
|
||||
<PackageReference Include="Serilog.AspNetCore" />
|
||||
<PackageReference Include="Serilog.Extensions.Hosting" />
|
||||
<PackageReference Include="Serilog.Settings.Configuration" />
|
||||
<PackageReference Include="Serilog.Sinks.Console" />
|
||||
<PackageReference Include="Serilog.Sinks.File" />
|
||||
<PackageReference Include="Serilog.Sinks.OpenTelemetry" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\ZB.MOM.WW.Telemetry\ZB.MOM.WW.Telemetry.csproj" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<AssemblyAttribute Include="System.Runtime.CompilerServices.InternalsVisibleTo">
|
||||
<_Parameter1>ZB.MOM.WW.Telemetry.Serilog.Tests</_Parameter1>
|
||||
</AssemblyAttribute>
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,27 @@
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
/// <summary>
|
||||
/// Canonical Serilog property name constants for the identity enrichers stamped by
|
||||
/// <see cref="ZbSerilogExtensions.AddZbSerilog"/>. Use these constants — not literal strings —
|
||||
/// when querying properties in sinks or tests. Each property mirrors a shared OTel Resource
|
||||
/// attribute so logs and metrics/traces from the same node carry identical dimensions.
|
||||
/// </summary>
|
||||
public static class ZbLogEnricherNames
|
||||
{
|
||||
/// <summary>
|
||||
/// Serilog property: physical or logical site identifier. Matches OTel Resource <c>site.id</c>.
|
||||
/// </summary>
|
||||
public const string SiteId = "SiteId";
|
||||
|
||||
/// <summary>
|
||||
/// Serilog property: node function (<c>central</c>, <c>site</c>, <c>hub</c>, <c>standalone</c>).
|
||||
/// Matches OTel Resource <c>node.role</c>.
|
||||
/// </summary>
|
||||
public const string NodeRole = "NodeRole";
|
||||
|
||||
/// <summary>
|
||||
/// Serilog property: machine name (<see cref="System.Environment.MachineName"/>).
|
||||
/// Matches OTel Resource <c>host.name</c>. Populated automatically — not a caller-supplied option.
|
||||
/// </summary>
|
||||
public const string NodeHostname = "NodeHostname";
|
||||
}
|
||||
@@ -0,0 +1,152 @@
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
using Serilog;
|
||||
using Serilog.Configuration;
|
||||
using Serilog.Sinks.OpenTelemetry;
|
||||
using ZB.MOM.WW.Telemetry;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
/// <summary>
|
||||
/// Reusable seam that applies the shared ZB.MOM.WW logging configuration (identity enrichers,
|
||||
/// trace-context correlation, redaction, and OTel log export) to a
|
||||
/// <see cref="LoggerConfiguration"/>. Shared by <see cref="ZbSerilogExtensions.AddZbSerilog"/>
|
||||
/// and unit tests so both exercise an identical enricher/sink set.
|
||||
/// Internal to keep the public NuGet surface minimal; exposed to the test assembly via
|
||||
/// <c>[assembly: InternalsVisibleTo("ZB.MOM.WW.Telemetry.Serilog.Tests")]</c>.
|
||||
/// </summary>
|
||||
internal static class ZbSerilogConfig
|
||||
{
|
||||
/// <summary>
|
||||
/// Applies the shared logging configuration to <paramref name="loggerConfiguration"/>: the
/// identity enrichers (<see cref="ZbLogEnricherNames.SiteId"/> and
/// <see cref="ZbLogEnricherNames.NodeRole"/> from <paramref name="options"/>, and
/// <see cref="ZbLogEnricherNames.NodeHostname"/> from
/// <see cref="System.Environment.MachineName"/>, which is automatic and never a caller-supplied
/// option), the <see cref="TraceContextEnricher"/>, and the OTLP log sink when
/// <see cref="ZbExporter.Otlp"/> is selected. <c>SiteId</c>/<c>NodeRole</c> are stamped only when
/// the option is non-null/non-empty, mirroring the shared OTel Resource omission rules. This
/// overload passes no service provider, so the redaction enricher is not added.
|
||||
/// </summary>
|
||||
/// <param name="loggerConfiguration">The Serilog configuration to enrich.</param>
|
||||
/// <param name="options">The telemetry options describing the service identity.</param>
|
||||
/// <returns>The same <paramref name="loggerConfiguration"/> for chaining.</returns>
|
||||
public static LoggerConfiguration Apply(
|
||||
LoggerConfiguration loggerConfiguration,
|
||||
ZbTelemetryOptions options) =>
|
||||
Apply(loggerConfiguration, options, serviceProvider: null);
|
||||
|
||||
/// <summary>
|
||||
/// Overload of <see cref="Apply(LoggerConfiguration, ZbTelemetryOptions)"/> that additionally
|
||||
/// wires the service-provider-dependent stages — the redaction enricher (which lazily resolves
|
||||
/// a registered <c>ILogRedactor</c>). When <paramref name="serviceProvider"/> is null, only the
|
||||
/// provider-independent enrichers are applied.
|
||||
/// </summary>
|
||||
/// <param name="loggerConfiguration">The Serilog configuration to enrich.</param>
|
||||
/// <param name="options">The telemetry options describing the service identity.</param>
|
||||
/// <param name="serviceProvider">
|
||||
/// Provider used to lazily resolve project-supplied seams (e.g. <c>ILogRedactor</c>);
|
||||
/// may be null in tests or pipelines without DI.
|
||||
/// </param>
|
||||
/// <returns>The same <paramref name="loggerConfiguration"/> for chaining.</returns>
|
||||
public static LoggerConfiguration Apply(
|
||||
LoggerConfiguration loggerConfiguration,
|
||||
ZbTelemetryOptions options,
|
||||
IServiceProvider? serviceProvider)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(loggerConfiguration);
|
||||
ArgumentNullException.ThrowIfNull(options);
|
||||
|
||||
LoggerEnrichmentConfiguration enrich = loggerConfiguration.Enrich;
|
||||
|
||||
if (!string.IsNullOrEmpty(options.SiteId))
|
||||
{
|
||||
enrich.WithProperty(ZbLogEnricherNames.SiteId, options.SiteId);
|
||||
}
|
||||
|
||||
if (!string.IsNullOrEmpty(options.NodeRole))
|
||||
{
|
||||
enrich.WithProperty(ZbLogEnricherNames.NodeRole, options.NodeRole);
|
||||
}
|
||||
|
||||
enrich.WithProperty(ZbLogEnricherNames.NodeHostname, Environment.MachineName);
|
||||
|
||||
enrich.With(new TraceContextEnricher());
|
||||
|
||||
if (serviceProvider is not null)
|
||||
{
|
||||
enrich.With(new RedactionEnricher(serviceProvider));
|
||||
}
|
||||
|
||||
ApplyOpenTelemetryExport(loggerConfiguration, options);
|
||||
|
||||
return loggerConfiguration;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Adds a <c>WriteTo.OpenTelemetry</c> log sink when an OTLP exporter is explicitly
|
||||
/// selected (<see cref="ZbTelemetryOptions.Exporter"/> = <see cref="ZbExporter.Otlp"/>).
|
||||
/// <see cref="ZbTelemetryOptions.OtlpEndpoint"/> is the address used when OTLP is selected
|
||||
/// — it is NOT an independent enable. This matches the core OTel path behaviour so that
|
||||
/// an endpoint-only config (without <c>Exporter=Otlp</c>) exports nothing to OTLP on any
|
||||
/// signal. The sink carries the same Resource attributes as <c>ZbResource</c>
|
||||
/// (<c>service.name</c>/<c>service.namespace</c>/<c>service.version</c>/
|
||||
/// <c>service.instance.id</c>/<c>site.id</c>/<c>node.role</c>/<c>host.name</c>) so logs
|
||||
/// correlate with metrics and traces in the backend.
|
||||
/// </summary>
|
||||
private static void ApplyOpenTelemetryExport(
|
||||
LoggerConfiguration loggerConfiguration,
|
||||
ZbTelemetryOptions options)
|
||||
{
|
||||
if (options.Exporter != ZbExporter.Otlp)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
var resourceAttributes = BuildResourceAttributes(options);
|
||||
|
||||
loggerConfiguration.WriteTo.OpenTelemetry(sink =>
|
||||
{
|
||||
if (!string.IsNullOrEmpty(options.OtlpEndpoint))
|
||||
{
|
||||
sink.Endpoint = options.OtlpEndpoint;
|
||||
}
|
||||
|
||||
sink.Protocol = OtlpProtocol.Grpc;
|
||||
sink.ResourceAttributes = resourceAttributes;
|
||||
});
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Builds the OTLP Resource-attribute map mirroring <c>ZbResource</c>. Null/empty optional
|
||||
/// attributes are omitted, matching the shared Resource's omission rules. The
|
||||
/// <c>service.instance.id</c> is sourced from <see cref="ZbResource.InstanceId"/> — the
|
||||
/// same deterministic <c>MachineName:ProcessId</c> value used by the OTel SDK path — so
|
||||
/// all three signals carry an identical instance identifier. Internal so it can be asserted
|
||||
/// by the test assembly without being part of the public NuGet API.
|
||||
/// </summary>
|
||||
internal static IDictionary<string, object> BuildResourceAttributes(ZbTelemetryOptions options)
|
||||
{
|
||||
var attributes = new Dictionary<string, object>
|
||||
{
|
||||
["service.name"] = options.ServiceName,
|
||||
["service.namespace"] = options.ServiceNamespace,
|
||||
["service.instance.id"] = ZbResource.InstanceId,
|
||||
["host.name"] = Environment.MachineName,
|
||||
};
|
||||
|
||||
if (!string.IsNullOrEmpty(options.ServiceVersion))
|
||||
{
|
||||
attributes["service.version"] = options.ServiceVersion;
|
||||
}
|
||||
|
||||
if (!string.IsNullOrEmpty(options.SiteId))
|
||||
{
|
||||
attributes["site.id"] = options.SiteId;
|
||||
}
|
||||
|
||||
if (!string.IsNullOrEmpty(options.NodeRole))
|
||||
{
|
||||
attributes["node.role"] = options.NodeRole;
|
||||
}
|
||||
|
||||
return attributes;
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,84 @@
|
||||
using Microsoft.Extensions.Hosting;
|
||||
using Serilog;
|
||||
using Serilog.Events;
|
||||
using ZB.MOM.WW.Telemetry;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
/// <summary>
|
||||
/// Extension point for configuring the shared Serilog application logger on an
|
||||
/// <see cref="IHostApplicationBuilder"/>. Wires config-driven sinks
|
||||
/// (<c>ReadFrom.Configuration</c>), an explicit minimum level (<c>Serilog:MinimumLevel</c>,
|
||||
/// default <see cref="LogEventLevel.Information"/>), and the shared enricher/redaction/OTel-export
|
||||
/// set via <see cref="ZbSerilogConfig"/>. Does NOT configure OTel metrics/traces — call
|
||||
/// <c>AddZbTelemetry</c> in the core package for that.
|
||||
///
|
||||
/// <para>
|
||||
/// This method intentionally does <strong>not</strong> set the process-global
|
||||
/// <see cref="Log.Logger"/> (via <c>CreateBootstrapLogger</c> or otherwise). Mutating
|
||||
/// process-global state in a shared library causes "logger is already frozen" exceptions
|
||||
/// when multiple hosts are built in the same process (integration tests, multi-host apps).
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// Apps that need a pre-<c>Build()</c> bootstrap logger to capture early startup exceptions
|
||||
/// should set <see cref="Log.Logger"/> themselves in <c>Program.cs</c> before calling
|
||||
/// <c>AddZbSerilog</c>:
|
||||
/// <code>
|
||||
/// Log.Logger = new LoggerConfiguration().WriteTo.Console().CreateBootstrapLogger();
|
||||
/// // ... then build the host ...
|
||||
/// builder.AddZbSerilog(o => { ... });
|
||||
/// </code>
|
||||
/// This keeps global-state mutation firmly in the application, not the library.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
public static class ZbSerilogExtensions
|
||||
{
|
||||
/// <summary>
|
||||
/// Registers the Serilog application logger in DI. Wires configuration-driven sinks
|
||||
/// (<c>ReadFrom.Configuration</c>), a code default of <see cref="LogEventLevel.Information"/>
|
||||
/// that config can override via <c>Serilog:MinimumLevel</c> or namespace overrides, plus
|
||||
/// the identity enrichers (<c>SiteId</c>/<c>NodeRole</c> from <paramref name="configure"/>,
|
||||
/// <c>NodeHostname</c> = <see cref="System.Environment.MachineName"/>).
|
||||
///
|
||||
/// <para>
|
||||
/// This method does <strong>not</strong> set the process-global <see cref="Log.Logger"/>.
|
||||
/// <c>preserveStaticLogger: true</c> is passed to <c>AddSerilog</c> so the static logger
|
||||
/// is left entirely untouched — safe to call multiple times in the same process (integration
|
||||
/// tests, multi-host scenarios) without hitting "logger is already frozen".
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// If early-startup bootstrap logging is required (before <c>Build()</c>), set
|
||||
/// <c>Log.Logger = new LoggerConfiguration().WriteTo.Console().CreateBootstrapLogger();</c>
|
||||
/// in <c>Program.cs</c> before calling this method. That decision belongs to the
|
||||
/// application, not the shared library.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
/// <param name="builder">The host application builder.</param>
|
||||
/// <param name="configure">Populates the <see cref="ZbTelemetryOptions"/>.</param>
|
||||
public static IHostApplicationBuilder AddZbSerilog(
|
||||
this IHostApplicationBuilder builder,
|
||||
Action<ZbTelemetryOptions> configure)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(builder);
|
||||
ArgumentNullException.ThrowIfNull(configure);
|
||||
|
||||
var options = new ZbTelemetryOptions();
|
||||
configure(options);
|
||||
|
||||
// Register the application logger in DI only. preserveStaticLogger: true ensures
|
||||
// AddSerilog does NOT freeze or replace Log.Logger — critical for multi-host
|
||||
// processes (integration tests etc.) where AddZbSerilog may be called more than once.
|
||||
builder.Services.AddSerilog(
|
||||
(serviceProvider, loggerConfiguration) =>
|
||||
{
|
||||
loggerConfiguration
|
||||
.MinimumLevel.Is(LogEventLevel.Information)
|
||||
.ReadFrom.Configuration(builder.Configuration);
|
||||
|
||||
ZbSerilogConfig.Apply(loggerConfiguration, options, serviceProvider);
|
||||
},
|
||||
preserveStaticLogger: true);
|
||||
|
||||
return builder;
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,34 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>true</IsPackable>
|
||||
<PackageId>ZB.MOM.WW.Telemetry</PackageId>
|
||||
<Authors>ZB.MOM.WW</Authors>
|
||||
<Description>Core OpenTelemetry extensions for the ZB.MOM.WW SCADA family. Wires the OTel SDK (metrics + tracing), populates a shared Resource (service.name/site.id/node.role), registers standard instrumentation (ASP.NET Core, HttpClient, gRPC client, runtime, process), and maps a Prometheus /metrics endpoint. OTLP exporter opt-in overlay included.</Description>
|
||||
<PackageTags>opentelemetry;observability;metrics;tracing;prometheus;otlp;aspnetcore;scada;wonderware;zb-mom-ww</PackageTags>
|
||||
<PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-telemetry</PackageProjectUrl>
|
||||
<RepositoryUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-telemetry</RepositoryUrl>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<!--
|
||||
Microsoft.AspNetCore.App is a shared framework, not a NuGet package. It supplies the
|
||||
ASP.NET Core types this library builds on (endpoint routing for MapZbMetrics, instrumentation, etc.).
|
||||
Referencing the shared framework is the supported path for net10.0 libraries that
|
||||
target ASP.NET Core.
|
||||
-->
|
||||
<FrameworkReference Include="Microsoft.AspNetCore.App" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="OpenTelemetry.Extensions.Hosting" />
|
||||
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" />
|
||||
<PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" />
|
||||
<PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" />
|
||||
<PackageReference Include="OpenTelemetry.Instrumentation.Http" />
|
||||
<PackageReference Include="OpenTelemetry.Instrumentation.GrpcNetClient" />
|
||||
<PackageReference Include="OpenTelemetry.Instrumentation.Runtime" />
|
||||
<PackageReference Include="OpenTelemetry.Instrumentation.Process" />
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -0,0 +1,22 @@
|
||||
using Microsoft.AspNetCore.Builder;
|
||||
using Microsoft.AspNetCore.Routing;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry;
|
||||
|
||||
/// <summary>
|
||||
/// Endpoint extension for mounting the Prometheus <c>/metrics</c> scrape endpoint.
|
||||
/// </summary>
|
||||
public static class ZbMetricsEndpointExtensions
|
||||
{
|
||||
/// <summary>
|
||||
/// Mounts the Prometheus <c>/metrics</c> scrape endpoint. The Prometheus exporter is
|
||||
/// always registered by <c>AddZbTelemetry</c>, so this works for any
|
||||
/// <see cref="ZbTelemetryOptions.Exporter"/> value. Call after <c>app.UseRouting()</c>.
|
||||
/// </summary>
|
||||
/// <param name="endpoints">The endpoint route builder.</param>
|
||||
public static IEndpointConventionBuilder MapZbMetrics(this IEndpointRouteBuilder endpoints)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(endpoints);
|
||||
return endpoints.MapPrometheusScrapingEndpoint();
|
||||
}
|
||||
}
|
||||
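The three extension points in this diff (`AddZbSerilog`, `AddZbTelemetry`, `MapZbMetrics`) are meant to be combined in an application's `Program.cs`. The following is a minimal sketch of that wiring, not code from this change. It assumes .NET 8 or later (where `WebApplicationBuilder` implements `IHostApplicationBuilder`), and the Meter and ActivitySource names are hypothetical placeholders.

```csharp
// Program.cs: hypothetical minimal web host using the extension methods from this diff.
using System;
using Microsoft.AspNetCore.Builder;
using ZB.MOM.WW.Telemetry;
using ZB.MOM.WW.Telemetry.Serilog;

var builder = WebApplication.CreateBuilder(args);

// One delegate shared by both calls so logs, metrics, and traces describe the same identity.
Action<ZbTelemetryOptions> configure = o =>
{
    o.ServiceName = "otopcua";
    o.SiteId = "s1";                          // illustrative value
    o.NodeRole = "central";
    o.Meters = ["OtOpcUa.Sessions"];          // hypothetical Meter name
    o.ActivitySources = ["OtOpcUa.Server"];   // hypothetical ActivitySource name
};

builder.AddZbSerilog(configure);    // Serilog logger in DI only; Log.Logger is not touched
builder.AddZbTelemetry(configure);  // OTel metrics + traces; Prometheus exporter always wired

var app = builder.Build();
app.MapZbMetrics();                 // Prometheus scrape endpoint
app.Run();
```

Sharing one delegate is a design choice of this sketch; the two methods each build their own `ZbTelemetryOptions` instance, so passing different values to each would compile but would break the identity parity the tests in this change check for.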
@@ -0,0 +1,65 @@
|
||||
using System.Collections.Generic;
|
||||
using OpenTelemetry.Resources;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry;
|
||||
|
||||
/// <summary>
|
||||
/// Builds the shared OpenTelemetry ResourceBuilder from <see cref="ZbTelemetryOptions"/>.
|
||||
/// Used internally by <c>AddZbTelemetry</c> so metrics, traces, and logs carry an identical
|
||||
/// Resource. Exposed for tests and custom pipelines.
|
||||
/// </summary>
|
||||
public static class ZbResource
|
||||
{
|
||||
/// <summary>
|
||||
/// Service instance identifier that is stable for the lifetime of the process. Formatted as
|
||||
/// <c>MachineName:ProcessId</c> so that every signal (metrics, traces, logs) from the same
|
||||
/// process carries the exact same <c>service.instance.id</c>, enabling cross-signal
|
||||
/// correlation. The value still changes on restart, because the process ID changes.
|
||||
/// </summary>
|
||||
public static string InstanceId =>
|
||||
$"{System.Environment.MachineName}:{System.Environment.ProcessId}";
|
||||
|
||||
/// <summary>
|
||||
/// Returns a <see cref="ResourceBuilder"/> pre-populated with <c>service.name</c>,
|
||||
/// <c>service.namespace</c>, <c>service.version</c>, <c>service.instance.id</c>,
|
||||
/// <c>site.id</c>, <c>node.role</c>, and <c>host.name</c> (always
|
||||
/// <see cref="System.Environment.MachineName"/>). Optional attributes are omitted when
|
||||
/// unset (<c>site.id</c> and <c>node.role</c> also when empty).
|
||||
/// </summary>
|
||||
/// <param name="options">The telemetry options describing the service identity.</param>
|
||||
public static ResourceBuilder Build(ZbTelemetryOptions options) =>
|
||||
Configure(ResourceBuilder.CreateDefault(), options);
|
||||
|
||||
/// <summary>
|
||||
/// Applies the shared ZB.MOM.WW Resource attributes to an existing <see cref="ResourceBuilder"/>.
|
||||
/// Internal seam so the <c>AddZbTelemetry</c> pipeline produces a Resource identical to
|
||||
/// <see cref="Build"/>.
|
||||
/// </summary>
|
||||
internal static ResourceBuilder Configure(ResourceBuilder builder, ZbTelemetryOptions options)
|
||||
{
|
||||
builder.AddService(
|
||||
serviceName: options.ServiceName,
|
||||
serviceNamespace: options.ServiceNamespace,
|
||||
serviceVersion: options.ServiceVersion,
|
||||
autoGenerateServiceInstanceId: false,
|
||||
serviceInstanceId: InstanceId);
|
||||
|
||||
var attributes = new List<KeyValuePair<string, object>>
|
||||
{
|
||||
new("host.name", System.Environment.MachineName),
|
||||
};
|
||||
|
||||
if (!string.IsNullOrEmpty(options.SiteId))
|
||||
{
|
||||
attributes.Add(new("site.id", options.SiteId));
|
||||
}
|
||||
|
||||
if (!string.IsNullOrEmpty(options.NodeRole))
|
||||
{
|
||||
attributes.Add(new("node.role", options.NodeRole));
|
||||
}
|
||||
|
||||
builder.AddAttributes(attributes);
|
||||
return builder;
|
||||
}
|
||||
}
|
||||
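The omission rule in `ZbResource.Configure` (optional identity attributes are dropped when null or empty, while `host.name` and `service.instance.id` are always present) can be illustrated without the OpenTelemetry packages. The sketch below is a stand-in under that assumption: `SketchAttributes` is a hypothetical helper, not part of the library, and it uses a plain dictionary instead of a `ResourceBuilder`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the attribute logic: same omission rule, no OTel dependency.
static Dictionary<string, object> SketchAttributes(string serviceName, string? siteId, string? nodeRole)
{
    var attributes = new Dictionary<string, object>
    {
        ["service.name"] = serviceName,
        // Same MachineName:ProcessId format as ZbResource.InstanceId.
        ["service.instance.id"] = $"{Environment.MachineName}:{Environment.ProcessId}",
        ["host.name"] = Environment.MachineName,
    };

    // Optional attributes are added only when they carry a value.
    if (!string.IsNullOrEmpty(siteId)) attributes["site.id"] = siteId;
    if (!string.IsNullOrEmpty(nodeRole)) attributes["node.role"] = nodeRole;
    return attributes;
}

var attrs = SketchAttributes("mxgateway", siteId: "", nodeRole: "central");
Console.WriteLine(attrs.ContainsKey("site.id"));    // False: empty string is treated like null
Console.WriteLine(attrs.ContainsKey("node.role"));  // True
```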
@@ -0,0 +1,136 @@
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Hosting;
|
||||
using OpenTelemetry.Metrics;
|
||||
using OpenTelemetry.Trace;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry;
|
||||
|
||||
/// <summary>
|
||||
/// Extension point for configuring the OpenTelemetry metrics + traces bootstrap on an
|
||||
/// <see cref="IHostApplicationBuilder"/> (or directly on an <see cref="IServiceCollection"/>).
|
||||
/// Wires the shared Resource, standard instrumentation, the app's own Meters and
|
||||
/// ActivitySources, and the selected exporter. Does NOT configure Serilog.
|
||||
/// </summary>
|
||||
public static class ZbTelemetryExtensions
|
||||
{
|
||||
/// <summary>
|
||||
/// Configures the OpenTelemetry MeterProvider and TracerProvider with the shared Resource,
|
||||
/// standard instrumentation (ASP.NET Core, HttpClient, gRPC client, runtime, process), the
|
||||
/// app's own Meters and ActivitySources, and the selected exporter.
|
||||
/// </summary>
|
||||
/// <param name="builder">The host application builder.</param>
|
||||
/// <param name="configure">Populates the <see cref="ZbTelemetryOptions"/>.</param>
|
||||
public static IHostApplicationBuilder AddZbTelemetry(
|
||||
this IHostApplicationBuilder builder,
|
||||
Action<ZbTelemetryOptions> configure)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(builder);
|
||||
ArgumentNullException.ThrowIfNull(configure);
|
||||
builder.Services.AddZbTelemetry(BuildOptions(configure));
|
||||
return builder;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// <see cref="IServiceCollection"/> overload for contexts where
|
||||
/// <see cref="IHostApplicationBuilder"/> is not available. Requires the caller to supply a
|
||||
/// pre-built <see cref="ZbTelemetryOptions"/>.
|
||||
/// </summary>
|
||||
/// <param name="services">The service collection.</param>
|
||||
/// <param name="options">The fully-populated telemetry options.</param>
|
||||
public static IServiceCollection AddZbTelemetry(
|
||||
this IServiceCollection services,
|
||||
ZbTelemetryOptions options)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(services);
|
||||
ArgumentNullException.ThrowIfNull(options);
|
||||
|
||||
services.AddOpenTelemetry()
|
||||
.ConfigureResource(rb => ZbResource.Configure(rb, options))
|
||||
.WithMetrics(metrics =>
|
||||
{
|
||||
foreach (var meter in options.Meters)
|
||||
{
|
||||
metrics.AddMeter(meter);
|
||||
}
|
||||
|
||||
metrics
|
||||
.AddAspNetCoreInstrumentation()
|
||||
.AddHttpClientInstrumentation()
|
||||
.AddRuntimeInstrumentation()
|
||||
.AddProcessInstrumentation();
|
||||
|
||||
ApplyMetricsExporter(metrics, options);
|
||||
})
|
||||
.WithTracing(tracing =>
|
||||
{
|
||||
foreach (var source in options.ActivitySources)
|
||||
{
|
||||
tracing.AddSource(source);
|
||||
}
|
||||
|
||||
tracing
|
||||
.AddAspNetCoreInstrumentation()
|
||||
.AddHttpClientInstrumentation()
|
||||
.AddGrpcClientInstrumentation();
|
||||
|
||||
ApplyTracingExporter(tracing, options);
|
||||
});
|
||||
|
||||
return services;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// IServiceCollection overload that accepts a configure delegate (convenience for callers
|
||||
/// that only have an <see cref="IServiceCollection"/> but prefer the lambda form).
|
||||
/// </summary>
|
||||
/// <param name="services">The service collection.</param>
|
||||
/// <param name="configure">Populates the <see cref="ZbTelemetryOptions"/>.</param>
|
||||
public static IServiceCollection AddZbTelemetry(
|
||||
this IServiceCollection services,
|
||||
Action<ZbTelemetryOptions> configure) =>
|
||||
services.AddZbTelemetry(BuildOptions(configure));
|
||||
|
||||
private static ZbTelemetryOptions BuildOptions(Action<ZbTelemetryOptions> configure)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(configure);
|
||||
var options = new ZbTelemetryOptions();
|
||||
configure(options);
|
||||
if (string.IsNullOrWhiteSpace(options.ServiceName))
|
||||
{
|
||||
throw new ArgumentException(
|
||||
"ZbTelemetryOptions.ServiceName is required (e.g. \"otopcua\").",
|
||||
nameof(configure));
|
||||
}
|
||||
return options;
|
||||
}
|
||||
|
||||
private static void ApplyMetricsExporter(MeterProviderBuilder metrics, ZbTelemetryOptions options)
|
||||
{
|
||||
// Prometheus is always wired so that /metrics and MapZbMetrics() work regardless of
|
||||
// the exporter setting. OTLP is an additive overlay when explicitly requested.
|
||||
metrics.AddPrometheusExporter();
|
||||
if (options.Exporter == ZbExporter.Otlp)
|
||||
{
|
||||
metrics.AddOtlpExporter(o => ConfigureOtlp(o, options));
|
||||
}
|
||||
}
|
||||
|
||||
private static void ApplyTracingExporter(TracerProviderBuilder tracing, ZbTelemetryOptions options)
|
||||
{
|
||||
// Prometheus is metrics-only; traces have no Prometheus path. Only OTLP exports traces.
|
||||
if (options.Exporter == ZbExporter.Otlp)
|
||||
{
|
||||
tracing.AddOtlpExporter(o => ConfigureOtlp(o, options));
|
||||
}
|
||||
}
|
||||
|
||||
private static void ConfigureOtlp(
|
||||
OpenTelemetry.Exporter.OtlpExporterOptions otlp,
|
||||
ZbTelemetryOptions options)
|
||||
{
|
||||
if (!string.IsNullOrEmpty(options.OtlpEndpoint))
|
||||
{
|
||||
otlp.Endpoint = new Uri(options.OtlpEndpoint);
|
||||
}
|
||||
}
|
||||
}
|
||||
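`ConfigureOtlp` passes `OtlpEndpoint` straight to `new Uri(...)`, which throws `UriFormatException` for malformed input, and it silently keeps the exporter default when the value is null or empty. If earlier, clearer validation were wanted, one possible shape is sketched below. `ParseOtlpEndpoint` is a hypothetical helper that does not exist in this change; the http/https restriction is an assumption of the sketch.

```csharp
using System;

// Hypothetical helper: null/empty means "keep the exporter's default endpoint";
// anything else must be an absolute http(s) URI.
static Uri? ParseOtlpEndpoint(string? endpoint)
{
    if (string.IsNullOrEmpty(endpoint))
    {
        return null;
    }

    // The scheme check also rejects inputs such as "collector:4317", which would
    // otherwise parse as an absolute URI with the scheme "collector".
    if (Uri.TryCreate(endpoint, UriKind.Absolute, out var uri)
        && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps))
    {
        return uri;
    }

    throw new ArgumentException(
        $"OtlpEndpoint '{endpoint}' is not an absolute http(s) URI.", nameof(endpoint));
}

Console.WriteLine(ParseOtlpEndpoint("http://collector:4317")?.Port);  // prints 4317
Console.WriteLine(ParseOtlpEndpoint(null) is null);                   // prints True
```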
@@ -0,0 +1,76 @@
|
||||
namespace ZB.MOM.WW.Telemetry;
|
||||
|
||||
/// <summary>
|
||||
/// Selects how instrumentation data is exported.
|
||||
/// </summary>
|
||||
public enum ZbExporter
|
||||
{
|
||||
/// <summary>
|
||||
/// Prometheus scrape endpoint (default). Call <c>app.MapZbMetrics()</c> to mount <c>/metrics</c>.
|
||||
/// </summary>
|
||||
Prometheus,
|
||||
|
||||
/// <summary>
|
||||
/// OTLP gRPC export for metrics and traces, in addition to the always-registered Prometheus exporter. Set <see cref="ZbTelemetryOptions.OtlpEndpoint"/>
|
||||
/// (e.g. <c>"http://collector:4317"</c>).
|
||||
/// </summary>
|
||||
Otlp,
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Options for <c>AddZbTelemetry</c> and <c>AddZbSerilog</c>. The identity properties feed the shared OpenTelemetry Resource; the rest select meters, activity sources, and the exporter.
|
||||
/// </summary>
|
||||
public sealed class ZbTelemetryOptions
|
||||
{
|
||||
/// <summary>
|
||||
/// Required. Short lower-case app identifier — e.g. <c>"otopcua"</c>, <c>"mxgateway"</c>,
|
||||
/// <c>"scadabridge"</c>. Populates OTel Resource <c>service.name</c>.
|
||||
/// </summary>
|
||||
public string ServiceName { get; set; } = "";
|
||||
|
||||
/// <summary>
|
||||
/// Fleet-wide namespace. Default <c>"ZB.MOM.WW"</c>. Do not override per-app.
|
||||
/// Populates OTel Resource <c>service.namespace</c>.
|
||||
/// </summary>
|
||||
public string ServiceNamespace { get; set; } = "ZB.MOM.WW";
|
||||
|
||||
/// <summary>
|
||||
/// Optional. Populate from <c>AssemblyInformationalVersion</c>.
|
||||
/// Populates OTel Resource <c>service.version</c>.
|
||||
/// </summary>
|
||||
public string? ServiceVersion { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// Optional. Physical or logical site identifier.
|
||||
/// Populates OTel Resource <c>site.id</c>.
|
||||
/// </summary>
|
||||
public string? SiteId { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// Optional. Node function: <c>"central"</c>, <c>"site"</c>, <c>"hub"</c>, <c>"standalone"</c>.
|
||||
/// Populates OTel Resource <c>node.role</c>.
|
||||
/// </summary>
|
||||
public string? NodeRole { get; set; }
|
||||
|
||||
/// <summary>
|
||||
/// App-specific Meter names to register with the OTel MeterProvider. Standard instrumentation
|
||||
/// meters are added automatically (ASP.NET Core, HttpClient, runtime, process).
|
||||
/// </summary>
|
||||
public string[] Meters { get; set; } = [];
|
||||
|
||||
/// <summary>
|
||||
/// App-specific ActivitySource names to register with the OTel TracerProvider.
|
||||
/// </summary>
|
||||
public string[] ActivitySources { get; set; } = [];
|
||||
|
||||
/// <summary>
|
||||
/// Export path. Default Prometheus (scrape only); <see cref="ZbExporter.Otlp"/> additionally pushes metrics and traces to a collector.
|
||||
/// </summary>
|
||||
public ZbExporter Exporter { get; set; } = ZbExporter.Prometheus;
|
||||
|
||||
/// <summary>
|
||||
/// Used when <see cref="Exporter"/> = <see cref="ZbExporter.Otlp"/>. Not validated: if null or empty, the OTLP exporter keeps its own default endpoint.
|
||||
/// OTLP gRPC endpoint, e.g. <c>"http://collector:4317"</c>.
|
||||
/// </summary>
|
||||
public string? OtlpEndpoint { get; set; }
|
||||
}
|
||||
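The `ServiceVersion` documentation suggests populating the value from `AssemblyInformationalVersion`. A minimal sketch of reading that attribute with reflection follows. `ReadInformationalVersion` is a hypothetical helper name, and using the entry assembly is an assumption; an application may prefer a specific assembly instead.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: returns the informational version of an assembly, or null when the
// assembly is null or carries no AssemblyInformationalVersionAttribute.
static string? ReadInformationalVersion(Assembly? assembly) =>
    assembly?
        .GetCustomAttribute<AssemblyInformationalVersionAttribute>()?
        .InformationalVersion;

// GetEntryAssembly can return null (for example under some test hosts), hence the fallback.
var version = ReadInformationalVersion(Assembly.GetEntryAssembly());
Console.WriteLine(version ?? "(no informational version)");
```

The result could then be assigned inside the configure delegate, for example `o.ServiceVersion = version;`. SDK-style builds often append a `+<commit>` suffix to this value; whether to keep it is left to the application.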
@@ -0,0 +1,75 @@
|
||||
using Serilog;
|
||||
using Serilog.Events;
|
||||
using Serilog.Sinks.InMemory;
|
||||
using ZB.MOM.WW.Telemetry;
|
||||
using ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog.Tests;
|
||||
|
||||
public sealed class EnricherTests
|
||||
{
|
||||
private static string ScalarValue(LogEvent logEvent, string propertyName)
|
||||
{
|
||||
Assert.True(
|
||||
logEvent.Properties.TryGetValue(propertyName, out var value),
|
||||
$"expected property '{propertyName}' to be present");
|
||||
var scalar = Assert.IsType<ScalarValue>(value);
|
||||
return scalar.Value?.ToString() ?? "";
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Identity_enrichers_stamp_SiteId_NodeRole_and_NodeHostname()
|
||||
{
|
||||
var sink = new InMemorySink();
|
||||
var options = new ZbTelemetryOptions
|
||||
{
|
||||
ServiceName = "otopcua",
|
||||
SiteId = "s1",
|
||||
NodeRole = "Central",
|
||||
};
|
||||
|
||||
var loggerConfig = new LoggerConfiguration();
|
||||
ZbSerilogConfig.Apply(loggerConfig, options);
|
||||
using var logger = loggerConfig
|
||||
.WriteTo.Sink(sink)
|
||||
.CreateLogger();
|
||||
|
||||
logger.Information("hello");
|
||||
|
||||
var logEvent = Assert.Single(sink.LogEvents);
|
||||
Assert.Equal("s1", ScalarValue(logEvent, ZbLogEnricherNames.SiteId));
|
||||
Assert.Equal("Central", ScalarValue(logEvent, ZbLogEnricherNames.NodeRole));
|
||||
Assert.Equal(
|
||||
Environment.MachineName,
|
||||
ScalarValue(logEvent, ZbLogEnricherNames.NodeHostname));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Null_SiteId_and_NodeRole_are_suppressed_but_NodeHostname_is_always_present()
|
||||
{
|
||||
var sink = new InMemorySink();
|
||||
var options = new ZbTelemetryOptions
|
||||
{
|
||||
ServiceName = "otopcua",
|
||||
SiteId = null,
|
||||
NodeRole = null,
|
||||
};
|
||||
|
||||
var loggerConfig = new LoggerConfiguration();
|
||||
ZbSerilogConfig.Apply(loggerConfig, options);
|
||||
using var logger = loggerConfig
|
||||
.WriteTo.Sink(sink)
|
||||
.CreateLogger();
|
||||
|
||||
logger.Information("hello");
|
||||
|
||||
var logEvent = Assert.Single(sink.LogEvents);
|
||||
Assert.False(logEvent.Properties.ContainsKey(ZbLogEnricherNames.SiteId),
|
||||
"SiteId should be absent when null");
|
||||
Assert.False(logEvent.Properties.ContainsKey(ZbLogEnricherNames.NodeRole),
|
||||
"NodeRole should be absent when null");
|
||||
Assert.Equal(
|
||||
Environment.MachineName,
|
||||
ScalarValue(logEvent, ZbLogEnricherNames.NodeHostname));
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,102 @@
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Hosting;
|
||||
using Serilog;
|
||||
using ZB.MOM.WW.Telemetry;
|
||||
using ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Regression tests for the process-global-state hazard: AddZbSerilog must not set or
|
||||
/// freeze Log.Logger. When multiple hosts are built in the same process (integration
|
||||
/// tests, multi-host apps) AddZbSerilog must be callable repeatedly without throwing
|
||||
/// "The logger is already frozen".
|
||||
/// </summary>
|
||||
public sealed class MultiHostTests
|
||||
{
|
||||
[Fact]
|
||||
public void AddZbSerilog_called_twice_in_same_process_does_not_throw()
|
||||
{
|
||||
// Arrange + Act: build two completely independent hosts in the same test process.
|
||||
// Prior to the fix, the second call to AddZbSerilog would crash with
|
||||
// "The logger is already frozen" because Stage-1 set the process-global Log.Logger.
|
||||
var exception = Record.Exception(() =>
|
||||
{
|
||||
var builder1 = Host.CreateApplicationBuilder();
|
||||
builder1.AddZbSerilog(o =>
|
||||
{
|
||||
o.ServiceName = "host-one";
|
||||
o.SiteId = "s1";
|
||||
o.NodeRole = "central";
|
||||
});
|
||||
using var host1 = builder1.Build();
|
||||
|
||||
var builder2 = Host.CreateApplicationBuilder();
|
||||
builder2.AddZbSerilog(o =>
|
||||
{
|
||||
o.ServiceName = "host-two";
|
||||
o.SiteId = "s2";
|
||||
o.NodeRole = "site";
|
||||
});
|
||||
using var host2 = builder2.Build();
|
||||
});
|
||||
|
||||
Assert.Null(exception);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void AddZbSerilog_does_not_mutate_global_Log_Logger()
|
||||
{
|
||||
// Capture whatever the static logger is before calling AddZbSerilog.
|
||||
var loggerBefore = Log.Logger;
|
||||
|
||||
var builder = Host.CreateApplicationBuilder();
|
||||
builder.AddZbSerilog(o =>
|
||||
{
|
||||
o.ServiceName = "no-global-state";
|
||||
});
|
||||
using var host = builder.Build();
|
||||
|
||||
// AddZbSerilog must leave Log.Logger exactly as it was found.
|
||||
// (ReferenceEquals is the right check — it must be the *same* instance, not
|
||||
// just an equivalent one, so we know the library never touched the static field.)
|
||||
Assert.True(
|
||||
ReferenceEquals(loggerBefore, Log.Logger),
|
||||
"AddZbSerilog must not replace or freeze the global Log.Logger");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void AddZbSerilog_each_host_resolves_its_own_DI_ILogger()
|
||||
{
|
||||
// Both hosts must resolve a working Serilog ILogger from DI independently —
|
||||
// neither host's logger is the process-global Log.Logger.
|
||||
var builder1 = Host.CreateApplicationBuilder();
|
||||
builder1.AddZbSerilog(o => { o.ServiceName = "host-a"; });
|
||||
using var host1 = builder1.Build();
|
||||
|
||||
var builder2 = Host.CreateApplicationBuilder();
|
||||
builder2.AddZbSerilog(o => { o.ServiceName = "host-b"; });
|
||||
using var host2 = builder2.Build();
|
||||
|
||||
var logger1 = host1.Services.GetRequiredService<ILogger>();
|
||||
var logger2 = host2.Services.GetRequiredService<ILogger>();
|
||||
|
||||
// Both are non-null and independently functional.
|
||||
Assert.NotNull(logger1);
|
||||
Assert.NotNull(logger2);
|
||||
|
||||
// They are distinct instances (each host has its own application logger).
|
||||
Assert.False(
|
||||
ReferenceEquals(logger1, logger2),
|
||||
"each host must have its own DI-registered ILogger instance");
|
||||
|
||||
// Neither matches the global Log.Logger — the library must not have promoted
|
||||
// a DI logger to process-global state.
|
||||
Assert.False(
|
||||
ReferenceEquals(logger1, Log.Logger),
|
||||
"host-a's DI logger must not be the global Log.Logger");
|
||||
Assert.False(
|
||||
ReferenceEquals(logger2, Log.Logger),
|
||||
"host-b's DI logger must not be the global Log.Logger");
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,204 @@
|
||||
using Microsoft.Extensions.Configuration;
|
||||
using Microsoft.Extensions.DependencyInjection;
|
||||
using Microsoft.Extensions.Hosting;
|
||||
using Serilog;
|
||||
using Serilog.Core;
|
||||
using Serilog.Events;
|
||||
using Serilog.Sinks.InMemory;
|
||||
using ZB.MOM.WW.Telemetry;
|
||||
using ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog.Tests;
|
||||
|
||||
public sealed class RedactionTests
|
||||
{
|
||||
private const string Masked = "***";
|
||||
|
||||
private sealed class FakeRedactor : ILogRedactor
|
||||
{
|
||||
public void Redact(IDictionary<string, object?> properties)
|
||||
{
|
||||
if (properties.ContainsKey("apiKey"))
|
||||
{
|
||||
properties["apiKey"] = Masked;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private static string? ScalarOrNull(LogEvent logEvent, string propertyName) =>
|
||||
logEvent.Properties.TryGetValue(propertyName, out var value) && value is ScalarValue scalar
|
||||
? scalar.Value?.ToString()
|
||||
: null;
|
||||
|
||||
[Fact]
|
||||
public void Registered_redactor_masks_sensitive_property()
|
||||
{
|
||||
var serviceProvider = new ServiceCollection()
|
||||
.AddSingleton<ILogRedactor>(new FakeRedactor())
|
||||
.BuildServiceProvider();
|
||||
|
||||
var sink = new InMemorySink();
|
||||
var options = new ZbTelemetryOptions { ServiceName = "mxgateway" };
|
||||
|
||||
var loggerConfig = new LoggerConfiguration();
|
||||
ZbSerilogConfig.Apply(loggerConfig, options, serviceProvider);
|
||||
using Logger logger = loggerConfig.WriteTo.Sink(sink).CreateLogger();
|
||||
|
||||
logger.Information("authenticating {apiKey}", "mxgw_secret");
|
||||
|
||||
var logEvent = Assert.Single(sink.LogEvents);
|
||||
Assert.Equal(Masked, ScalarOrNull(logEvent, "apiKey"));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void No_redactor_registered_is_a_no_op()
|
||||
{
|
||||
var serviceProvider = new ServiceCollection().BuildServiceProvider();
|
||||
|
||||
var sink = new InMemorySink();
|
||||
var options = new ZbTelemetryOptions { ServiceName = "mxgateway" };
|
||||
|
||||
var loggerConfig = new LoggerConfiguration();
|
||||
ZbSerilogConfig.Apply(loggerConfig, options, serviceProvider);
|
||||
using Logger logger = loggerConfig.WriteTo.Sink(sink).CreateLogger();
|
||||
|
||||
logger.Information("authenticating {apiKey}", "mxgw_secret");
|
||||
|
||||
var logEvent = Assert.Single(sink.LogEvents);
|
||||
Assert.Equal("mxgw_secret", ScalarOrNull(logEvent, "apiKey"));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void AddZbSerilog_with_otlp_options_builds_without_error()
|
||||
{
|
||||
var builder = Host.CreateApplicationBuilder();
|
||||
|
||||
builder.AddZbSerilog(o =>
|
||||
{
|
||||
o.ServiceName = "mxgateway";
|
||||
o.SiteId = "s1";
|
||||
o.NodeRole = "central";
|
||||
o.Exporter = ZbExporter.Otlp;
|
||||
o.OtlpEndpoint = "http://localhost:4317";
|
||||
});
|
||||
|
||||
using var host = builder.Build();
|
||||
|
||||
// Serilog.ILogger is registered by AddSerilog — not Microsoft.Extensions.Logging.ILogger.
|
||||
var logger = host.Services.GetRequiredService<ILogger>();
|
||||
logger.Information("otlp wiring smoke test");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildResourceAttributes_contains_required_keys_and_optional_keys_when_set()
|
||||
{
|
||||
var options = new ZbTelemetryOptions
|
||||
{
|
||||
ServiceName = "mxgateway",
|
||||
ServiceNamespace = "ZB.MOM.WW",
|
||||
SiteId = "site-a",
|
||||
NodeRole = "central",
|
||||
};
|
||||
|
||||
var attributes = ZbSerilogConfig.BuildResourceAttributes(options);
|
||||
|
||||
// Required keys always present.
|
||||
Assert.True(attributes.ContainsKey("service.name"), "service.name must be present");
|
||||
Assert.True(attributes.ContainsKey("service.namespace"), "service.namespace must be present");
|
||||
Assert.True(attributes.ContainsKey("host.name"), "host.name must be present");
|
||||
|
||||
// service.instance.id must be present and match ZbResource.InstanceId (parity with OTel SDK path).
|
||||
Assert.True(attributes.ContainsKey("service.instance.id"), "service.instance.id must be present");
|
||||
Assert.Equal(ZbResource.InstanceId, attributes["service.instance.id"]);
|
||||
|
||||
// Optional keys present when options supply them.
|
||||
Assert.True(attributes.ContainsKey("site.id"), "site.id must be present when SiteId is set");
|
||||
Assert.True(attributes.ContainsKey("node.role"), "node.role must be present when NodeRole is set");
|
||||
|
||||
Assert.Equal("mxgateway", attributes["service.name"]);
|
||||
Assert.Equal("ZB.MOM.WW", attributes["service.namespace"]);
|
||||
Assert.Equal(Environment.MachineName, attributes["host.name"]);
|
||||
Assert.Equal("site-a", attributes["site.id"]);
|
||||
Assert.Equal("central", attributes["node.role"]);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildResourceAttributes_omits_optional_keys_when_not_set()
|
||||
{
|
||||
var options = new ZbTelemetryOptions
|
||||
{
|
||||
ServiceName = "mxgateway",
|
||||
SiteId = null,
|
||||
NodeRole = null,
|
||||
};
|
||||
|
||||
var attributes = ZbSerilogConfig.BuildResourceAttributes(options);
|
||||
|
||||
Assert.False(attributes.ContainsKey("site.id"), "site.id must be absent when SiteId is null");
|
||||
Assert.False(attributes.ContainsKey("node.role"), "node.role must be absent when NodeRole is null");
|
||||
// service.instance.id is always present regardless of optional fields.
|
||||
Assert.True(attributes.ContainsKey("service.instance.id"), "service.instance.id must always be present");
|
||||
Assert.Equal(ZbResource.InstanceId, attributes["service.instance.id"]);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Fix 1 — Symmetric OTLP trigger: the Serilog path must only activate the OTel log sink
|
||||
/// when <c>Exporter == ZbExporter.Otlp</c>, NOT merely when <c>OtlpEndpoint</c> is set.
|
||||
/// This matches the core OTel metrics/traces path that ignores a bare endpoint without
|
||||
/// <c>Exporter=Otlp</c>.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public void ApplyOpenTelemetryExport_does_not_activate_when_only_endpoint_is_set()
|
||||
{
|
||||
// Arrange: set OtlpEndpoint but leave Exporter at the default (not Otlp).
|
||||
var options = new ZbTelemetryOptions
|
||||
{
|
||||
ServiceName = "mxgateway",
|
||||
OtlpEndpoint = "http://localhost:4317",
|
||||
// Exporter is intentionally left at its default, ZbExporter.Prometheus.
|
||||
};
|
||||
|
||||
// Act: apply the shared Serilog config. If the bug were present, Apply would register
|
||||
// the OpenTelemetry sink and try to export to localhost:4317. This test does not
|
||||
// inspect the LoggerConfiguration's internal sink list; it uses a weaker proxy:
|
||||
// the exporter is confirmed to be non-OTLP here, and building and using the logger
|
||||
// below must not throw.
|
||||
Assert.NotEqual(ZbExporter.Otlp, options.Exporter);
|
||||
|
||||
// Building the logger with only OtlpEndpoint set (no Exporter=Otlp) must not throw
|
||||
// and must not attempt any OTLP connection — the sink should simply be absent.
|
||||
var exception = Record.Exception(() =>
|
||||
{
|
||||
var loggerConfig = new LoggerConfiguration();
|
||||
ZbSerilogConfig.Apply(loggerConfig, options);
|
||||
using var logger = loggerConfig.CreateLogger();
|
||||
logger.Information("no otlp sink expected");
|
||||
});
|
||||
|
||||
Assert.Null(exception);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void ApplyOpenTelemetryExport_activates_when_Exporter_is_Otlp()
|
||||
{
|
||||
// Arrange: Exporter explicitly set to Otlp (no endpoint — exporter registered but won't connect).
|
||||
var options = new ZbTelemetryOptions
|
||||
{
|
||||
ServiceName = "mxgateway",
|
||||
Exporter = ZbExporter.Otlp,
|
||||
// OtlpEndpoint intentionally left null — we test the trigger, not the connection.
|
||||
};
|
||||
|
||||
// Act + Assert: must not throw (the sink is registered but won't connect in tests).
|
||||
var exception = Record.Exception(() =>
|
||||
{
|
||||
var loggerConfig = new LoggerConfiguration();
|
||||
ZbSerilogConfig.Apply(loggerConfig, options);
|
||||
using var logger = loggerConfig.CreateLogger();
|
||||
logger.Information("otlp sink registered");
|
||||
});
|
||||
|
||||
Assert.Null(exception);
|
||||
}
|
||||
}
|
||||
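`FakeRedactor` above shows the shape of the redaction contract: a method that receives the event's properties as a mutable `IDictionary<string, object?>` and overwrites sensitive values in place. A slightly more general version of the same idea is sketched below without the Serilog or library dependencies. `RedactKeys` and the key names other than `apiKey` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generalisation of FakeRedactor: mask every listed key that is present.
static void RedactKeys(IDictionary<string, object?> properties, IReadOnlyCollection<string> sensitiveKeys)
{
    foreach (var key in sensitiveKeys)
    {
        if (properties.ContainsKey(key))
        {
            properties[key] = "***";
        }
    }
}

var properties = new Dictionary<string, object?>
{
    ["apiKey"] = "mxgw_secret",
    ["user"] = "operator1",
};

RedactKeys(properties, new[] { "apiKey", "password" });
Console.WriteLine($"{properties["apiKey"]} {properties["user"]}");  // prints "*** operator1"
```

Keys that are listed but absent (here `password`) are not added, which matches the `ContainsKey` guard in `FakeRedactor`.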
@@ -0,0 +1,71 @@
|
||||
using System.Diagnostics;
|
||||
using Serilog;
|
||||
using Serilog.Core;
|
||||
using Serilog.Events;
|
||||
using Serilog.Sinks.InMemory;
|
||||
using ZB.MOM.WW.Telemetry;
|
||||
using ZB.MOM.WW.Telemetry.Serilog;
|
||||
|
||||
namespace ZB.MOM.WW.Telemetry.Serilog.Tests;
|
||||
|
||||
public sealed class TraceContextEnricherTests
|
||||
{
|
||||
private const string SourceName = "ZB.MOM.WW.Telemetry.Serilog.Tests.TraceContext";
|
||||
|
||||
private static Logger BuildLogger(InMemorySink sink) =>
|
||||
new LoggerConfiguration()
|
||||
.Enrich.With(new TraceContextEnricher())
|
||||
.WriteTo.Sink(sink)
|
||||
.CreateLogger();
|
||||
|
||||
private static string? ScalarOrNull(LogEvent logEvent, string propertyName) =>
|
||||
logEvent.Properties.TryGetValue(propertyName, out var value) && value is ScalarValue scalar
|
||||
? scalar.Value?.ToString()
|
||||
: null;
|
||||
|
||||
[Fact]
|
||||
public void Active_activity_stamps_trace_id_and_span_id()
|
||||
{
|
||||
using var listener = new ActivityListener
|
||||
{
|
||||
ShouldListenTo = source => source.Name == SourceName,
|
||||
Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
|
||||
ActivitySamplingResult.AllDataAndRecorded,
|
||||
};
|
||||
ActivitySource.AddActivityListener(listener);
|
||||
|
||||
using var activitySource = new ActivitySource(SourceName);
|
||||
var sink = new InMemorySink();
|
||||
using var logger = BuildLogger(sink);
|
||||
|
||||
using var activity = activitySource.StartActivity("unit-test");
|
||||
Assert.NotNull(activity);
|
||||
Assert.NotNull(Activity.Current);
|
||||
|
||||
// Capture IDs before the log call so assertions are not sensitive to activity
|
||||
// lifecycle — Activity.Current may differ after the log call returns.
|
||||
var expectedTraceId = activity.TraceId.ToString();
|
||||
var expectedSpanId = activity.SpanId.ToString();
|
||||
|
||||
logger.Information("traced");
|
||||
|
||||
var logEvent = Assert.Single(sink.LogEvents);
|
||||
Assert.Equal(expectedTraceId, ScalarOrNull(logEvent, "trace_id"));
|
||||
Assert.Equal(expectedSpanId, ScalarOrNull(logEvent, "span_id"));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void No_active_activity_omits_trace_id_and_span_id()
|
||||
{
|
||||
Assert.Null(Activity.Current);
|
||||
|
||||
var sink = new InMemorySink();
|
||||
using var logger = BuildLogger(sink);
|
||||
|
||||
logger.Information("untraced");
|
||||
|
||||
var logEvent = Assert.Single(sink.LogEvents);
|
||||
Assert.False(logEvent.Properties.ContainsKey("trace_id"));
|
||||
Assert.False(logEvent.Properties.ContainsKey("span_id"));
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,23 @@
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <IsPackable>false</IsPackable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="coverlet.collector" />
    <PackageReference Include="Microsoft.NET.Test.Sdk" />
    <PackageReference Include="xunit" />
    <PackageReference Include="xunit.runner.visualstudio" />
    <PackageReference Include="Serilog.Sinks.InMemory" />
  </ItemGroup>

  <ItemGroup>
    <Using Include="Xunit" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\..\src\ZB.MOM.WW.Telemetry.Serilog\ZB.MOM.WW.Telemetry.Serilog.csproj" />
  </ItemGroup>

</Project>
@@ -0,0 +1,143 @@
using System.Diagnostics.Metrics;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using ZB.MOM.WW.Telemetry;

namespace ZB.MOM.WW.Telemetry.Tests;

public sealed class AddZbTelemetryTests
{
    // Fix #2: empty ServiceName must throw ArgumentException --------------------------

    [Fact]
    public void AddZbTelemetry_Throws_WhenServiceNameIsEmpty()
    {
        var builder = WebApplication.CreateBuilder();
        var ex = Assert.Throws<ArgumentException>(() =>
            builder.AddZbTelemetry(o =>
            {
                o.ServiceName = ""; // explicitly empty
            }));
        Assert.Equal("configure", ex.ParamName);
    }

    [Fact]
    public void AddZbTelemetry_Throws_WhenServiceNameIsWhitespace()
    {
        var builder = WebApplication.CreateBuilder();
        var ex = Assert.Throws<ArgumentException>(() =>
            builder.AddZbTelemetry(o =>
            {
                o.ServiceName = " ";
            }));
        Assert.Equal("configure", ex.ParamName);
    }

    // Fix #1: Prometheus coexists with OTLP — /metrics must still serve under Otlp exporter

    [Fact]
    public async Task AddZbTelemetry_OtlpExporter_StillServesPrometheusEndpoint()
    {
        var builder = WebApplication.CreateBuilder();
        builder.WebHost.UseUrls("http://127.0.0.1:0");
        builder.AddZbTelemetry(o =>
        {
            o.ServiceName = "telemetry-test";
            o.Exporter = ZbExporter.Otlp;
            // OtlpEndpoint intentionally left null — exporter will be registered but won't
            // connect anywhere; we are only verifying Prometheus remains present.
            o.Meters = ["Test.OtlpCoexist.Meter"];
        });

        await using var app = builder.Build();
        app.MapZbMetrics();

        await app.StartAsync();

        var address = app.Urls.First();
        using var client = new HttpClient { BaseAddress = new Uri(address) };

        var response = await client.GetAsync("/metrics");

        Assert.Equal(System.Net.HttpStatusCode.OK, response.StatusCode);
        Assert.Equal("text/plain", response.Content.Headers.ContentType?.MediaType);

        await app.StopAsync();
    }

    // Existing test ---------------------------------------------------------------

    [Fact]
    public void AddZbTelemetry_ExportsAppMeter_WithSharedResource()
    {
        // 1.15.x note: AddInMemoryExporter moved out of the core OpenTelemetry assembly into a
        // separate OpenTelemetry.Exporter.InMemory package (not referenced here). We attach a
        // BaseExporter<Metric> directly instead — it both collects metric names and exposes the
        // MeterProvider Resource via ParentProvider.GetResource().
        var capture = new CapturingMetricExporter();

        var builder = WebApplication.CreateBuilder();
        builder.AddZbTelemetry(o =>
        {
            o.ServiceName = "t";
            o.SiteId = "site-test";
            o.NodeRole = "central";
            o.Meters = ["Test.Meter"];
        });

        // Compose a capturing reader onto the pipeline AddZbTelemetry already registered.
        builder.Services.ConfigureOpenTelemetryMeterProvider(b =>
            b.AddReader(new PeriodicExportingMetricReader(capture)
            {
                TemporalityPreference = MetricReaderTemporalityPreference.Cumulative,
            }));

        // Create the meter + instrument BEFORE the provider is built so the MeterProvider's
        // listener subscribes to it during construction.
        using var meter = new Meter("Test.Meter");
        var counter = meter.CreateCounter<long>("test.events.count");

        using var app = builder.Build();

        var meterProvider = app.Services.GetRequiredService<MeterProvider>();
        counter.Add(1);
        meterProvider.ForceFlush();

        // The app's meter was registered and its instrument was collected through the pipeline.
        Assert.Contains("test.events.count", capture.MetricNames);

        // The exported metric carries the shared Resource (identical to ZbResource.Build).
        Assert.NotNull(capture.CapturedResource);
        var attrs = capture.CapturedResource!.Attributes.ToDictionary(a => a.Key, a => a.Value);
        Assert.Equal("t", attrs["service.name"]);
        Assert.Equal("ZB.MOM.WW", attrs["service.namespace"]);
        Assert.Equal("site-test", attrs["site.id"]);
        Assert.Equal("central", attrs["node.role"]);
        Assert.Equal(Environment.MachineName, attrs["host.name"]);
    }

    /// <summary>
    /// Collects exported metric names and captures the MeterProvider Resource on first export so
    /// the test can assert the pipeline wired both the app meter and the shared Resource.
    /// </summary>
    private sealed class CapturingMetricExporter : BaseExporter<Metric>
    {
        public List<string> MetricNames { get; } = [];
        public Resource? CapturedResource { get; private set; }

        public override ExportResult Export(in Batch<Metric> batch)
        {
            CapturedResource ??= ParentProvider?.GetResource();
            foreach (var metric in batch)
            {
                MetricNames.Add(metric.Name);
            }

            return ExportResult.Success;
        }
    }
}
@@ -0,0 +1,38 @@
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;
using ZB.MOM.WW.Telemetry;

namespace ZB.MOM.WW.Telemetry.Tests;

public sealed class MapZbMetricsTests
{
    [Fact]
    public async Task MapZbMetrics_ServesPrometheusEndpoint()
    {
        var builder = WebApplication.CreateBuilder();
        builder.WebHost.UseUrls("http://127.0.0.1:0");
        builder.AddZbTelemetry(o =>
        {
            o.ServiceName = "t";
            o.Exporter = ZbExporter.Prometheus;
            o.Meters = ["Test.Meter"];
        });

        await using var app = builder.Build();
        app.MapZbMetrics();

        await app.StartAsync();

        var address = app.Urls.First();
        using var client = new HttpClient { BaseAddress = new Uri(address) };

        var response = await client.GetAsync("/metrics");

        Assert.Equal(System.Net.HttpStatusCode.OK, response.StatusCode);
        Assert.NotNull(response.Content.Headers.ContentType);
        Assert.Equal("text/plain", response.Content.Headers.ContentType!.MediaType);

        await app.StopAsync();
    }
}
@@ -0,0 +1,28 @@
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <IsPackable>false</IsPackable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="coverlet.collector" />
    <PackageReference Include="Microsoft.NET.Test.Sdk" />
    <PackageReference Include="xunit" />
    <PackageReference Include="xunit.runner.visualstudio" />
    <PackageReference Include="Microsoft.AspNetCore.Mvc.Testing" />
  </ItemGroup>

  <ItemGroup>
    <Using Include="Xunit" />
  </ItemGroup>

  <ItemGroup>
    <!-- WebApplicationFactory requires the full ASP.NET Core shared framework -->
    <FrameworkReference Include="Microsoft.AspNetCore.App" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\..\src\ZB.MOM.WW.Telemetry\ZB.MOM.WW.Telemetry.csproj" />
  </ItemGroup>

</Project>
@@ -0,0 +1,73 @@
using OpenTelemetry.Resources;
using ZB.MOM.WW.Telemetry;

namespace ZB.MOM.WW.Telemetry.Tests;

public sealed class ZbResourceTests
{
    [Fact]
    public void Build_PopulatesAllResourceAttributes()
    {
        var options = new ZbTelemetryOptions
        {
            ServiceName = "otopcua",
            ServiceNamespace = "ZB.MOM.WW",
            ServiceVersion = "1.2.3",
            SiteId = "site-7",
            NodeRole = "central",
        };

        var resource = ZbResource.Build(options).Build();
        var attributes = resource.Attributes.ToDictionary(a => a.Key, a => a.Value);

        Assert.Equal("otopcua", attributes["service.name"]);
        Assert.Equal("ZB.MOM.WW", attributes["service.namespace"]);
        Assert.Equal("1.2.3", attributes["service.version"]);
        Assert.Equal("site-7", attributes["site.id"]);
        Assert.Equal("central", attributes["node.role"]);
        Assert.Equal(Environment.MachineName, attributes["host.name"]);
        // service.instance.id must be the deterministic MachineName:ProcessId — NOT a random GUID.
        Assert.Equal(ZbResource.InstanceId, attributes["service.instance.id"]);
    }

    [Fact]
    public void Build_OmitsOptionalAttributes_WhenNull()
    {
        var options = new ZbTelemetryOptions
        {
            ServiceName = "mxgateway",
            // ServiceVersion / SiteId / NodeRole left null
        };

        var resource = ZbResource.Build(options).Build();
        var keys = resource.Attributes.Select(a => a.Key).ToHashSet();

        Assert.Contains("service.name", keys);
        Assert.Contains("service.namespace", keys);
        Assert.Contains("host.name", keys);
        // service.instance.id is always present (deterministic, not optional).
        Assert.Contains("service.instance.id", keys);
        Assert.DoesNotContain("service.version", keys);
        Assert.DoesNotContain("site.id", keys);
        Assert.DoesNotContain("node.role", keys);
    }

    [Fact]
    public void InstanceId_is_deterministic_MachineName_colon_ProcessId()
    {
        // InstanceId must be stable within the process and follow the MachineName:ProcessId format.
        var expected = $"{Environment.MachineName}:{Environment.ProcessId}";
        Assert.Equal(expected, ZbResource.InstanceId);
        // Calling it twice returns the same value (no random component).
        Assert.Equal(ZbResource.InstanceId, ZbResource.InstanceId);
    }

    [Fact]
    public void InstanceId_does_not_contain_a_random_guid()
    {
        // The old OTel SDK default was a random GUID; the deterministic id must NOT be a GUID.
        Assert.False(
            Guid.TryParse(ZbResource.InstanceId, out _),
            $"service.instance.id must not be a GUID; got '{ZbResource.InstanceId}'");
    }
}
@@ -19,6 +19,9 @@ specs and analyses that *drive* changes made in the individual repos.
|---|---|---|---|---|
| Auth (login / identity / authz) | Draft | OtOpcUa, MxAccessGateway, ScadaBridge | Path to shared code (`ZB.MOM.WW.Auth`) | [`auth/`](auth/) |
| UI Theme (layout / tokens / components) | Draft | OtOpcUa, MxAccessGateway, ScadaBridge | Path to shared code (`ZB.MOM.WW.Theme`) | [`ui-theme/`](ui-theme/) |
| Health (readiness / liveness / active-node) | Draft | OtOpcUa, MxAccessGateway, ScadaBridge | Shared `ZB.MOM.WW.Health` lib (3 packages) | [`health/`](health/) |
| Observability (metrics / traces / logs) | Draft | OtOpcUa, MxAccessGateway, ScadaBridge | Shared `ZB.MOM.WW.Telemetry` lib (2 packages) | [`observability/`](observability/) |
| Audit (event model + writer seam) | Draft | OtOpcUa, MxAccessGateway, ScadaBridge | Path to shared code (`ZB.MOM.WW.Audit`) | [`audit/`](audit/) |

> Add a row when you start normalizing a new component. Status: `Draft` → `Reviewed` → `Adopting` → `Converged`.

@@ -0,0 +1,114 @@
# Audit — gaps & adoption backlog

Divergence of each project from [`spec/SPEC.md`](spec/SPEC.md), and the ordered backlog to
reach the shared `ZB.MOM.WW.Audit` library. Status legend: ⛔ gap · 🟡 partial · ✅ matches.

> **Adoption is deferred this round.** The library is being designed (shared contract in
> [`shared-contract/ZB.MOM.WW.Audit.md`](shared-contract/ZB.MOM.WW.Audit.md)) but is not yet
> wired into any app — exactly where `ZB.MOM.WW.Auth` and `ZB.MOM.WW.Theme` sit today.
> The items below are the follow-on work; each lands as a separate PR per project.

## Divergence vs spec

### §1 Canonical record (`AuditEvent`)

| Canonical field | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| `EventId` (Guid, required) | ✅ — idempotency key; buffer key + filtered-unique DB index | ⛔ — no event key; only an `AUTOINCREMENT` rowid (`AuditId`) | ✅ — direct |
| `OccurredAtUtc` (DateTimeOffset, required) | 🟡 — `DateTime` UTC; widen at mapping boundary | 🟡 — `DateTimeOffset` but store-assigned (not caller-supplied); direct after widening | 🟡 — `DateTime` UTC-forced; widen at mapping boundary |
| `Actor` (string, required) | ✅ — direct (`AuditEvent.Actor` → `ConfigAuditLog.Principal`) | 🟡 — `KeyId` nullable; keyless events (`init-db`/`list-keys`) need a `"system"`/`"cli"` fallback | 🟡 — nullable on system-originated rows; fallback needed |
| `Action` (string, required) | 🟡 — `Action` field exists, but persisted as `"{Category}:{Action}"` composite in `EventType`; canonical keeps them separate | ✅ — `EventType` literal direct | 🟡 — derived as `{Channel}.{Kind}` (e.g. `ApiOutbound.ApiCall`) |
| `Outcome` (AuditOutcome, required) | ⛔ **NEW** — derived from `EventType` vocabulary; not stored today | ⛔ **NEW** — derived: `constraint-denied`→`Denied`, else `Success` | ⛔ **NEW** — derived from `Status` (+`InboundAuthFailure` Kind→`Denied`) |
| `Category` (string?) | ✅ — `AuditEvent.Category` (e.g. `"Config"`) | ⛔ — no field; constant `"ApiKey"` at mapping | ✅ — `Channel` |
| `Target` (string?) | ⛔ — no dedicated field; closest is `DetailsJson` | ⛔ — embedded in `Details` text (`commandKind`/`target`) | ✅ — direct |
| `SourceNode` (string?) | ✅ — `SourceNode` (logical cluster node / host name, NOT an OPC UA NodeId) | 🟡 — `RemoteAddress`; dashboard path only (null on CLI/constraint paths) | ✅ — direct |
| `CorrelationId` (Guid?) | ✅ — direct (`CorrelationId.Value`) | ⛔ — not captured today; left null | ✅ — direct |
| `DetailsJson` (string?) | ✅ — direct (JSON CHECK constraint enforced) | 🟡 — `Details` is a plain string, not JSON; wrap or store as-is | 🟡 — ~15 rich/plumbing fields serialize here at the cross-project reporting boundary |

### §2 `IAuditWriter` seam

| | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| Named seam | ⛔ — no `IAuditWriter`; `AuditWriterActor` is the sink, consumed directly via Akka messaging | ⛔ — `IApiKeyAuditStore` (narrow, two-method) is the seam; no general `IAuditWriter` | ✅ — `IAuditWriter` with `WriteAsync(AuditEvent, CancellationToken)` signature; "failures must NEVER abort the user-facing action" contract; best-effort |
| Best-effort / never throws | 🟡 — the actor drops a failed flush (best-effort), but the seam is not a typed interface a caller can inject independently | ⛔ — no contract; `AppendAsync` may propagate | ✅ |
| Record type at the seam | 🟡 — OtOpcUa's own `AuditEvent` (8 fields, with Commons value-types `NodeId`/`CorrelationId`) | ⛔ — `ApiKeyAuditEntry` (4 fields) | 🟡 — ScadaBridge's ~25-field `AuditEvent` (rich record; adoption = keep own record, adopt canonical interface name + `AuditOutcome`) |

### §3 `IAuditRedactor` seam

| | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| Named seam | ⛔ — no redactor; no payload filtering today | ⛔ — no redactor; safety by construction (entry type cannot carry a secret) | ✅ — `IAuditPayloadFilter` (`AuditEvent Apply(AuditEvent)`, pure/never-throws/over-redacts); **only the name differs** from canonical `IAuditRedactor` |
| Over-redacts on failure | ⛔ — n/a | ⛔ — n/a | ✅ — `SafeDefaultAuditPayloadFilter` is the reference |

### §4 `AuditOutcome` — the new normalized field

`Outcome` is a **genuinely new field** across all three projects. No app stores it today;
each encodes it implicitly. All three must derive and emit it at adoption:

→ **Gap O1 (OtOpcUa):** derive from `EventType` vocabulary — `OpcUaAccessDenied` /
`CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`. No `Failure`
value exists in OtOpcUa's vocabulary today (failed flushes are dropped, not emitted), so
OtOpcUa will produce only `Success` / `Denied` until/unless failure events are added.

→ **Gap O2 (MxGateway):** derive — `constraint-denied` → `Denied`; all others → `Success`.
No `Failure` events are emitted today.

→ **Gap O3 (ScadaBridge):** derive from `AuditStatus` — `Delivered` → `Success`;
`Failed` / `Parked` / `Discarded` → `Failure`; `Kind = InboundAuthFailure` → `Denied`.
In-flight states (`Submitted` / `Forwarded` / `Attempted`) collapse to the last-known
terminal state when projecting; `Skipped` is excluded from the canonical projection.

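The three derivations in §4 can be sketched as pure mapping functions. This is a minimal illustration under stated assumptions, not the shipped `ZB.MOM.WW.Audit` code: the `AuditOutcome` values and the source vocabularies come from Gaps O1 to O3, while the `OutcomeDerivation` class, its method names, and the local `AuditStatus` / `AuditKind` enums are hypothetical stand-ins for each project's own types. Two further assumptions are labeled in the comments: OtOpcUa event types not named in Gap O1 are treated as `Success`, and `InboundAuthFailure` takes precedence over `Status`.

```csharp
using System;

// Canonical enum as described in the spec (Success | Failure | Denied).
public enum AuditOutcome { Success, Failure, Denied }

// Hypothetical stand-ins for ScadaBridge's own enums; only members named in the text are listed.
public enum AuditStatus { Submitted, Forwarded, Attempted, Delivered, Failed, Parked, Discarded, Skipped }
public enum AuditKind { ApiCall, InboundAuthFailure }

public static class OutcomeDerivation
{
    // Gap O1: OtOpcUa derives from its EventType vocabulary.
    // Assumption: the input is the bare event name, and anything not listed as a denial is Success.
    public static AuditOutcome FromOtOpcUaEventType(string eventType) =>
        eventType is "OpcUaAccessDenied" or "CrossClusterNamespaceAttempt"
            ? AuditOutcome.Denied
            : AuditOutcome.Success;

    // Gap O2: MxGateway maps constraint-denied to Denied and everything else to Success.
    public static AuditOutcome FromMxGatewayEventType(string eventType) =>
        eventType == "constraint-denied" ? AuditOutcome.Denied : AuditOutcome.Success;

    // Gap O3: ScadaBridge derives from Status, with Kind = InboundAuthFailure mapped to Denied.
    // Assumption: the Kind check wins over Status. Returns null for in-flight states and Skipped,
    // which this sketch leaves to the caller (the text collapses in-flight rows to the last-known
    // terminal state and excludes Skipped from the projection).
    public static AuditOutcome? FromScadaBridge(AuditStatus status, AuditKind kind)
    {
        if (kind == AuditKind.InboundAuthFailure)
        {
            return AuditOutcome.Denied;
        }

        switch (status)
        {
            case AuditStatus.Delivered:
                return AuditOutcome.Success;
            case AuditStatus.Failed:
            case AuditStatus.Parked:
            case AuditStatus.Discarded:
                return AuditOutcome.Failure;
            default:
                return null;
        }
    }
}
```

Keeping the derivation in one pure function per project means the open storage decision (a new `Outcome` column versus deriving at read time) does not change the mapping itself.
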
### §5 `Actor` → Auth principal

At adoption, every emit site should supply the `ZB.MOM.WW.Auth` principal as `Actor`
(string). The library carries no Auth dependency — `Actor` is a plain `string` — but the
handshake with Auth is the semantic goal (closes the loop).

→ **Gap P1 (all 3):** at adoption, update emit sites to populate `Actor` from the Auth
principal (LDAP user / API-key name). Auth adoption (#8 in `components/auth/GAPS.md`) is a
prerequisite for the full story; until then, use the existing actor string.

### §6 OtOpcUa two-producer problem

OtOpcUa has **two writers to `ConfigAuditLog`**: the structured Akka `AuditEvent` path AND
older SQL stored procedures that `INSERT` directly (bare `EventType`, NULL `EventId` /
`CorrelationId`, populated `ClusterId` / `GenerationId`). Normalization targets the
structured path only; the SP path stays per-project.

→ **Gap Q1 (OtOpcUa):** decide at adoption whether to route SP events through the actor
or leave them non-idempotent. Also: the `ClusterId`-filter / actor-never-sets-`ClusterId`
mismatch (Admin UI `ClusterAudit.razor` filters by `ClusterId`, but the actor path sets
`NodeId` not `ClusterId`, so structured rows are invisible to the cluster view). Fix when
normalizing the query surface.

## Adoption backlog (ordered)

| # | Item | Projects | Priority | Effort | Risk | Notes |
|---|---|---|---|---|---|---|
| 1 | **OtOpcUa:** rename `AuditWriterActor` → implements `IAuditWriter`; replace `Commons/Messages/Audit/AuditEvent.cs` with canonical record; add `Outcome` derivation at every emit site (Gap O1) | OtOpcUa | Med | M | Med | Actor internals (batching / dedup / flush triggers) stay bespoke; only the seam type and record change. Commons value-types `NodeId`/`CorrelationId` bridged at construction. |
| 2 | **MxGateway:** map `IApiKeyAuditStore` / `ApiKeyAuditEntry` / `ApiKeyAuditRecord` → `IAuditWriter` / `AuditEvent`; generate `EventId` per write; add `"system"`/`"cli"` Actor fallback; constant `Category = "ApiKey"`; `constraint-denied`→`Outcome.Denied` (Gaps O2, record gaps) | MxGateway | Low | S | Med | ⚠ **COORDINATE** — a parallel session is editing this repo for the MEL→Serilog migration (Health/Telemetry normalization). Do NOT start until the Serilog session has landed (or is explicitly fenced off); the two efforts share `Security/Authentication/` DI wiring. |
| 3 | **ScadaBridge:** rename `IAuditPayloadFilter` → `IAuditRedactor` (or alias during transition); adopt canonical `AuditOutcome` enum (Gap O3); confirm writer contract matches (already byte-for-byte) | ScadaBridge | Low | S | High | **"Align, don't replace."** Blast radius is HIGH — `IAuditPayloadFilter` is used across the entire pipeline (site, central, wiring). Rename + alias only; no transport/storage/record change. `DefaultAuditPayloadFilter` / `SafeDefaultAuditPayloadFilter` implementations unchanged. |
| 4 | **All:** populate `Actor` from `ZB.MOM.WW.Auth` principal at emit sites (Gap P1) | All 3 | Low | S | Low | **Prerequisite:** Auth adoption per `components/auth/GAPS.md` #8. Until Auth is adopted, leave the existing actor string as-is. |
| 5 | **OtOpcUa:** reconcile two-producer problem — decide SP path routing + fix `ClusterId`-filter / actor mismatch in `ClusterAudit.razor` (Gap Q1) | OtOpcUa | Low | S | Low | Normalization does not unify the SP path; this is a reconcile item to decide and document. The mismatch means structured `AuditEvent` rows are currently invisible to the cluster-scoped view. |
| 6 | **MxGateway:** add `CorrelationId` capture at constraint denial + dashboard paths; structured `Target` from `Details` text (currently embedded as a plain string in `ConstraintEnforcer`) | MxGateway | Low | S | Low | Nice-to-have parity; not required for adoption. `CorrelationId` and `Target` canonical fields left null until this is done. |

**Sequencing:** #3 (ScadaBridge rename) is self-contained and changes no behavior, but it
carries the backlog's only High risk rating because of its blast radius; do it first or last,
depending on blast-radius appetite. #1 (OtOpcUa) is medium effort but independent; it
can start once the shared library is built. #2 (MxGateway) is the smallest code change but
has the highest **coordination dependency**: gate it on the Serilog migration landing first.
#4 (Actor→Auth) is blocked on Auth adoption and is the last to close. #5 and #6 are cleanup
items with no bearing on shared-library adoption.

Each adoption lands as an opt-in version bump per project behind the seam; the shared library
is consumed but the bespoke transport/storage/UI for each project is not touched.

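Backlog item #2 (the MxGateway mapping) can be sketched as one conversion at the seam. This is an illustration under stated assumptions, not the planned implementation: the `AuditEvent` field list follows the §1 table and `ApiKeyAuditEntry` follows the MxGateway current-state notes, but both records are re-declared locally so the block stands alone, and `MxGatewayAuditMapping`, `ToCanonical`, and the `fallbackActor` parameter are hypothetical names. The sketch stores `Details` as-is in `DetailsJson`, which is only one of the two options still listed as open.

```csharp
#nullable enable
using System;

public enum AuditOutcome { Success, Failure, Denied }

// Canonical record shape taken from the §1 field table (re-declared here for the sketch).
public sealed record AuditEvent(
    Guid EventId,
    DateTimeOffset OccurredAtUtc,
    string Actor,
    string Action,
    AuditOutcome Outcome,
    string? Category = null,
    string? Target = null,
    string? SourceNode = null,
    Guid? CorrelationId = null,
    string? DetailsJson = null);

// Append-side shape MxGateway callers pass today (re-declared here for the sketch).
public sealed record ApiKeyAuditEntry(string? KeyId, string EventType, string? RemoteAddress, string? Details);

public static class MxGatewayAuditMapping
{
    // Hypothetical helper: builds the canonical event from a legacy entry.
    public static AuditEvent ToCanonical(
        ApiKeyAuditEntry entry,
        DateTimeOffset occurredAtUtc,
        string fallbackActor = "system") =>
        new AuditEvent(
            EventId: Guid.NewGuid(),                // generated per write; no event key exists today
            OccurredAtUtc: occurredAtUtc,
            Actor: entry.KeyId ?? fallbackActor,    // keyless events need a "system"/"cli" fallback
            Action: entry.EventType,                // EventType literal maps directly
            Outcome: entry.EventType == "constraint-denied"
                ? AuditOutcome.Denied
                : AuditOutcome.Success,
            Category: "ApiKey",                     // constant at the mapping boundary
            SourceNode: entry.RemoteAddress,        // null on CLI / constraint paths
            DetailsJson: entry.Details);            // stored as-is: an open decision, not settled
}
```

`Target` and `CorrelationId` stay null here, matching backlog item #6, which defers both for MxGateway.
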
## Decisions still open

- ScadaBridge `IAuditPayloadFilter` → `IAuditRedactor`: outright rename vs. transitional alias
  (both are valid; alias reduces blast radius in the short term).
- MxGateway `Details` plain string → `DetailsJson`: store as-is or wrap in a JSON object at
  the mapping boundary.
- `AuditOutcome` column in OtOpcUa storage: add a new `Outcome` column to `ConfigAuditLog`
  or fold into `DetailsJson` / derive at read time (schema change vs. runtime cost).
- OtOpcUa SP path: route through the actor path (unified producer) or leave as a bespoke
  secondary writer with its own column conventions (separate reconcile effort).
@@ -0,0 +1,72 @@
# Audit (who-did-what)

Status: **Draft**. Normalized component — path to shared code. Goal: converge the three
sister projects onto a canonical `AuditEvent` record + `AuditOutcome` enum + two thin seams
(`IAuditWriter`, `IAuditRedactor`), proposed as the `ZB.MOM.WW.Audit` library, while each
project keeps its own transport, storage, domain vocabulary, and redaction policy.

- The one target: [`spec/SPEC.md`](spec/SPEC.md)
- Canonical event model + field reference: [`spec/EVENT-MODEL.md`](spec/EVENT-MODEL.md)
- The proposed shared library: [`shared-contract/ZB.MOM.WW.Audit.md`](shared-contract/ZB.MOM.WW.Audit.md)
- Divergences + backlog: [`GAPS.md`](GAPS.md)
- Current state, per project: [`current-state/`](current-state/)

## Why audit is a strong normalization candidate

All three projects record a structured who-did-what trail with an actor identity, an action
verb, and a timestamp. Two (OtOpcUa + ScadaBridge) already have a named `AuditEvent` record
with an `EventId` idempotency key, `Actor`, and `CorrelationId`. ScadaBridge already ships
**both** canonical seams under slightly different names (`IAuditWriter` is byte-for-byte the
spec; `IAuditPayloadFilter` is the canonical `IAuditRedactor`). OtOpcUa's record is almost
field-for-field aligned. MxGateway has a narrow API-key-lifecycle log that maps cleanly.

The one new field across all three is `AuditOutcome` — no project stores it explicitly today;
each encodes it implicitly and derives it at adoption. This is the bulk of the per-project
work. Transport, storage, domain vocabulary, and redaction policy are **not** unified — each
project keeps its own bespoke implementation behind the seam.

**Audit closes the loop on Auth.** Every audit row's `Actor` is exactly the identity that the
`ZB.MOM.WW.Auth` component normalizes (LDAP/GLAuth principal, API-key name). The library keeps
`Actor` as a plain `string` (no Auth dependency), but at adoption each emit site supplies the
Auth principal.

**`IAuditRedactor` naming is aligned with Telemetry's `ILogRedactor`** — same shape and naming
discipline so a future `ZB.MOM.WW.Hosting` aggregator wires both redactors with one mental
model — but there is no cross-package dependency between the two libraries.

## Status by project

| Project | Audit today | Seams present | `AuditOutcome` | Adoption status |
|---|---|---|---|---|
| **OtOpcUa** | Akka cluster-broadcast `AuditEvent` → cluster-singleton `AuditWriterActor` (batch 500/5 s, two-layer dedup) over EF `ConfigAuditLog` (SQL Server). Also a legacy SQL stored-procedure write path (bare `EventType`, NULL `EventId`). Admin UI page `ClusterAudit.razor`. | No named `IAuditWriter` seam; no redactor seam. | Not stored — encoded in `EventType` strings (`OpcUaAccessDenied`/`CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`). | Not started |
| **MxAccessGateway** | Single SQLite-backed `IApiKeyAuditStore` / `ApiKeyAuditEntry` — key lifecycle (CLI + dashboard) + constraint denials only. No authn events persisted; no production read consumer. | Narrow custom seam (`IApiKeyAuditStore`); no general `IAuditWriter`; redaction is by-construction (secret never enters the record type). | Not stored — derived: `constraint-denied` → `Denied`; all others → `Success`. | Not started |
| **ScadaBridge** | Full pipeline: site SQLite hot-path (`SqliteAuditWriter` + ring-buffer fallback) → Akka `ClusterClient` forwarder → central MS SQL (ingest / reconcile / purge / partition maintenance). Rich ~25-field `AuditEvent` record. CLI `export`/`verify-chain`; Blazor audit UI. | ✅ `IAuditWriter` (matches canonical contract word-for-word); ✅ `IAuditPayloadFilter` (= canonical `IAuditRedactor`, identical signature, pure/never-throws/over-redacts). | Not stored explicitly — derived from `Status` (`Delivered`→`Success`; `Failed`/`Parked`/`Discarded`→`Failure`; `Kind = InboundAuthFailure`→`Denied`). | Not started (align, don't replace) |

See each project's `current-state/<project>/CURRENT-STATE.md` for code-verified detail and
adoption plan:

- [`current-state/otopcua/CURRENT-STATE.md`](current-state/otopcua/CURRENT-STATE.md)
- [`current-state/mxaccessgw/CURRENT-STATE.md`](current-state/mxaccessgw/CURRENT-STATE.md)
- [`current-state/scadabridge/CURRENT-STATE.md`](current-state/scadabridge/CURRENT-STATE.md)

## Normalized vs. left per-project

**Normalized (the shared `ZB.MOM.WW.Audit` library):** the canonical `AuditEvent` record
(5 required fields + 4 optional common + `DetailsJson` extension bag); the `AuditOutcome`
enum (`Success | Failure | Denied`); the `IAuditWriter` seam (best-effort, never throws to
caller); the `IAuditRedactor` seam (pure, never throws, over-redacts on failure); shipped
helpers (`NoOpAuditWriter`, `CompositeAuditWriter`, `RedactingAuditWriter`,
`NullAuditRedactor`, `TruncatingAuditRedactor`). Library has no Akka / EF / SQLite / Serilog
dependency; its only non-BCL dependency is `Microsoft.Extensions.DependencyInjection.Abstractions`.

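The normalized surface described above can be sketched in a few declarations. This is a hedged illustration, not the library's actual source: the field names and types follow the canonical field table in `GAPS.md`, `WriteAsync(AuditEvent, CancellationToken)` and `AuditEvent Apply(AuditEvent)` are the signatures quoted for ScadaBridge's matching seams, and the `Task` return type is an assumption. `DropDetailsRedactor`, `ListAuditWriter`, and `RedactingWriterSketch` are hypothetical names invented here to show how the two seams compose; they are not the shipped helpers.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public enum AuditOutcome { Success, Failure, Denied }

// 5 required fields, 4 optional common fields, plus the DetailsJson extension bag.
public sealed record AuditEvent(
    Guid EventId,
    DateTimeOffset OccurredAtUtc,
    string Actor,
    string Action,
    AuditOutcome Outcome,
    string? Category = null,
    string? Target = null,
    string? SourceNode = null,
    Guid? CorrelationId = null,
    string? DetailsJson = null);

public interface IAuditWriter
{
    // Best-effort: implementations must not throw to the caller (Task return type assumed).
    Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken);
}

public interface IAuditRedactor
{
    // Pure, never throws, over-redacts on failure.
    AuditEvent Apply(AuditEvent auditEvent);
}

// Hypothetical redaction policy: drop the whole details bag.
public sealed class DropDetailsRedactor : IAuditRedactor
{
    public AuditEvent Apply(AuditEvent auditEvent) => auditEvent with { DetailsJson = null };
}

// Hypothetical in-memory writer standing in for a project's own storage.
public sealed class ListAuditWriter : IAuditWriter
{
    public List<AuditEvent> Events { get; } = new();

    public Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken)
    {
        Events.Add(auditEvent);
        return Task.CompletedTask;
    }
}

// Hypothetical decorator: redact first, then write, and swallow failures so that
// an audit problem never aborts the user-facing action.
public sealed class RedactingWriterSketch : IAuditWriter
{
    private readonly IAuditRedactor _redactor;
    private readonly IAuditWriter _inner;

    public RedactingWriterSketch(IAuditRedactor redactor, IAuditWriter inner)
    {
        _redactor = redactor;
        _inner = inner;
    }

    public async Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken)
    {
        try
        {
            await _inner.WriteAsync(_redactor.Apply(auditEvent), cancellationToken);
        }
        catch (Exception)
        {
            // Best-effort by contract: the failure is dropped here (a real writer might log it).
        }
    }
}
```

Because both seams take and return the same record, a redactor can be layered in front of any writer without the writer knowing, which is what lets redaction policy stay per-project.
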
**Left per-project (each project keeps these behind the seam):** transport and storage (Akka
singleton + EF/SQL Server; SQLite; site-SQLite + central MS SQL + forwarding/reconcile
pipeline); domain vocabulary (`EventType` strings / API-key event-type literals / `Channel` +
`Kind` + `Status` enums); query, CLI, and UI surfaces (`ClusterAudit.razor`; `ListRecentAsync`;
`export` / `verify-chain`; Blazor audit pages); redaction *policy* (which fields/payloads are
sensitive — only the `IAuditRedactor` *seam* is shared).

> **Adoption is deferred this round.** The `ZB.MOM.WW.Audit` library is being designed and
> the shared contract defined, but none of the three apps wire it in yet — exactly where
> `ZB.MOM.WW.Auth` and `ZB.MOM.WW.Theme` sit today. The per-project adoption backlog is in
> [`GAPS.md`](GAPS.md).
@@ -0,0 +1,118 @@
# Audit — current state: MxAccessGateway (`mxaccessgw`)

Repo: `~/Desktop/MxAccessGateway` (Gitea `mxaccessgw`). Stack: .NET 10 gateway (x64) + x86/net48 worker.
Audit lives entirely in the **gateway** (.NET 10); the worker records nothing.
All paths relative to repo root; audit code under `src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/`. Verified 2026-06-01.

This is the **narrowest** of the three implementations: a single SQLite-backed append-only log scoped
to **API-key lifecycle and constraint denials**. There is no general-purpose audit abstraction, no
separate redaction seam, and no CorrelationId. Read-back exists but has no production consumer today.

## How it works today

The audit log is one seam, `IApiKeyAuditStore`
(`src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/IApiKeyAuditStore.cs:6`), with exactly two
operations: `AppendAsync(ApiKeyAuditEntry, ...)` (`IApiKeyAuditStore.cs:14`) and
`ListRecentAsync(int count, ...)` (`IApiKeyAuditStore.cs:22`). Single implementation,
`SqliteApiKeyAuditStore` (`SqliteApiKeyAuditStore.cs:5`), registered as a singleton in
`AuthStoreServiceCollectionExtensions.cs:23` alongside the rest of the auth stores.

- **Append-side shape:** callers pass `ApiKeyAuditEntry(string? KeyId, string EventType, string? RemoteAddress, string? Details)`
  (`ApiKeyAuditEntry.cs:3`). The store sets the timestamp itself — `AppendAsync` writes
  `created_utc = DateTimeOffset.UtcNow.ToString("O")` (`SqliteApiKeyAuditStore.cs:20`), so the caller
  cannot supply the time and there is **no idempotency/event key** (the only identity is the DB
  `AUTOINCREMENT` rowid).
- **Read-side shape:** `ListRecentAsync` returns `ApiKeyAuditRecord(long AuditId, string? KeyId, string EventType, string? RemoteAddress, DateTimeOffset CreatedUtc, string? Details)`
  (`ApiKeyAuditRecord.cs:3`), ordered `audit_id DESC LIMIT $count` (`SqliteApiKeyAuditStore.cs:38-42`),
  returning `[]` for `count <= 0` (`SqliteApiKeyAuditStore.cs:29-32`).
- **Storage:** SQLite, the same gateway-owned auth DB (`AuthSqliteConnectionFactory`, WAL; default
`C:\ProgramData\MxGateway\gateway-auth.db`). Table `api_key_audit` is created by
|
||||
`SqliteAuthStoreMigrator.cs:95-102` — `audit_id INTEGER PRIMARY KEY AUTOINCREMENT, key_id TEXT NULL,
|
||||
event_type TEXT NOT NULL, remote_address TEXT NULL, created_utc TEXT NOT NULL, details TEXT NULL`,
|
||||
plus index `ix_api_key_audit_key_id_created_utc` (`SqliteAuthStoreMigrator.cs:107-108`). Table name
|
||||
constant `SqliteAuthSchema.ApiKeyAuditTable = "api_key_audit"` (`SqliteAuthSchema.cs:11`). The log is
|
||||
append-only: there is no update/delete/prune path.
|
||||
- **Producers (three, all in the gateway):**
|
||||
- **Admin CLI** `ApiKeyAdminCliRunner` — its private `AppendAuditAsync` (`ApiKeyAdminCliRunner.cs:153`)
|
||||
always passes `RemoteAddress: null` (`ApiKeyAdminCliRunner.cs:163`). Event types:
|
||||
`"init-db"` (`:48`), `"create-key"` (`:74`), `"list-keys"` (`:83`),
|
||||
`"revoke-key"` with details `revoked`/`not-found-or-already-revoked` (`:102`),
|
||||
`"rotate-key"` with details `rotated`/`not-found` (`:121`).
|
||||
- **Dashboard** `DashboardApiKeyManagementService` — its `AppendAuditAsync` (`:197`) captures
|
||||
`RemoteAddress: httpContextAccessor.HttpContext?.Connection.RemoteIpAddress?.ToString()` (`:207`).
|
||||
Event types: `"dashboard-create-key"` (`:62`), `"dashboard-revoke-key"` (`:103`, details
|
||||
`revoked`/`not-found-or-already-revoked`), `"dashboard-rotate-key"` (`:145`, details `rotated`/`not-found`),
|
||||
`"dashboard-delete-key"` (`:187`, details `deleted`/`not-found-or-active`).
|
||||
- **Constraint denials** `ConstraintEnforcer.RecordDenialAsync` (`ConstraintEnforcer.cs:117`) writes
|
||||
`EventType: "constraint-denied"`, `RemoteAddress: null`, and `Details:
|
||||
$"{commandKind}: {target}: {failure.ConstraintName}: {failure.Message}"` (`ConstraintEnforcer.cs:124-129`).
|
||||
This is the only "denial" event in the log.
|
||||
- **No authn events.** The verifier (`ApiKeyVerifier`) and the gRPC authorization interceptor
|
||||
(`GatewayGrpcAuthorizationInterceptor`) do **not** write to the audit store — authentication
|
||||
success/failure and `Unauthenticated`/`PermissionDenied` outcomes are surfaced as gRPC statuses and
|
||||
(per policy) discriminated for logging, but are not persisted as audit rows. So in practice the log
|
||||
records **key lifecycle (CLI + dashboard) + constraint denials**, not per-request authn outcomes.
|
||||
- **No separate redaction seam — scrubbing is structural, in the store/entry shape.** There is no
|
||||
redactor, scrubber, sanitizer, or masking helper. Safety comes from *what the entry type can carry*:
|
||||
`ApiKeyAuditEntry` has no field for a secret, and every caller passes only a `KeyId` (the public
|
||||
key identifier, never the secret), an event-type literal, and short hand-built `Details` strings —
|
||||
the secret/pepper never enters the audit path. This aligns with the repo policy that "API keys,
|
||||
passwords, `WriteSecured` payloads, and `AuthenticateUser` credentials must never reach logs"
|
||||
(`CLAUDE.md:79`). Net: redaction is by construction, not a pluggable seam.
|
||||
- **Read-back has no production consumer.** `ListRecentAsync` is called only by tests
|
||||
(`SqliteAuthStoreTests`, `ApiKeyAdminCliRunnerTests`). The dashboard `ApiKeysPage.razor` mentions the
|
||||
audit log only in a delete-confirmation string (`ApiKeysPage.razor:321`) — it does **not** render it.
|
||||
There is no UI or RPC that surfaces audit history today.
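
To make the append/read shape concrete, here is an in-memory stand-in for the store. It is not `SqliteApiKeyAuditStore`: the two record shapes are copied from the notes above, but the class name `InMemoryApiKeyAuditStore`, the `CancellationToken` parameter (the real trailing parameters are elided above), and the return types are assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var store = new InMemoryApiKeyAuditStore();
await store.AppendAsync(new ApiKeyAuditEntry(null, "init-db", null, null));
await store.AppendAsync(new ApiKeyAuditEntry("key-1", "create-key", null, null));
var recent = await store.ListRecentAsync(1);
Console.WriteLine(recent[0].EventType); // create-key

public sealed record ApiKeyAuditEntry(
    string? KeyId, string EventType, string? RemoteAddress, string? Details);

public sealed record ApiKeyAuditRecord(
    long AuditId, string? KeyId, string EventType, string? RemoteAddress,
    DateTimeOffset CreatedUtc, string? Details);

// Hypothetical stand-in that mirrors the behaviours described above.
public sealed class InMemoryApiKeyAuditStore
{
    private readonly object _gate = new();
    private readonly List<ApiKeyAuditRecord> _rows = new();
    private long _nextId; // plays the role of the AUTOINCREMENT rowid

    public Task AppendAsync(ApiKeyAuditEntry entry, CancellationToken ct = default)
    {
        lock (_gate)
        {
            // The store, not the caller, assigns the id and the timestamp.
            _rows.Add(new ApiKeyAuditRecord(
                ++_nextId, entry.KeyId, entry.EventType, entry.RemoteAddress,
                DateTimeOffset.UtcNow, entry.Details));
        }
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<ApiKeyAuditRecord>> ListRecentAsync(
        int count, CancellationToken ct = default)
    {
        // Non-positive counts yield an empty list.
        if (count <= 0)
            return Task.FromResult<IReadOnlyList<ApiKeyAuditRecord>>(
                Array.Empty<ApiKeyAuditRecord>());

        lock (_gate)
        {
            // Newest first, like "ORDER BY audit_id DESC LIMIT $count".
            IReadOnlyList<ApiKeyAuditRecord> result =
                _rows.OrderByDescending(r => r.AuditId).Take(count).ToList();
            return Task.FromResult(result);
        }
    }
}
```

Because the entry type has no field that could hold a secret, the "redaction by construction" property falls out of the type shape rather than any scrubbing code.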
|
||||
|
||||
## Mapping to the canonical record
|
||||
|
||||
Target: `ZB.MOM.WW.Audit`'s `AuditEvent { Guid EventId; DateTimeOffset OccurredAtUtc; string Actor;
|
||||
string Action; AuditOutcome Outcome; string? Category; string? Target; string? SourceNode;
|
||||
Guid? CorrelationId; string? DetailsJson; }` with `AuditOutcome ∈ { Success, Failure, Denied }`.
|
||||
|
||||
| `AuditEvent` field | Source today | Mapping note |
|
||||
|---|---|---|
|
||||
| `EventId` (Guid, required) | — none — | **Must be generated** at write time. `ApiKeyAuditRecord` has only the autoincrement `AuditId` (`ApiKeyAuditRecord.cs:4`); no idempotency key exists. |
|
||||
| `OccurredAtUtc` (required) | `CreatedUtc` (`ApiKeyAuditRecord.cs:8`), set as `DateTimeOffset.UtcNow` in the store (`SqliteApiKeyAuditStore.cs:20`) | Direct. Note: time is store-assigned today, not caller-supplied. |
|
||||
| `Actor` (required) | `KeyId` (`ApiKeyAuditRecord.cs:5`) | Nullable today (`init-db`/`list-keys` pass `null`); the canonical `Actor` is required, so a fallback (e.g. `"system"`/`"cli"`) is needed for keyless events. |
|
||||
| `Action` (required) | `EventType` (`ApiKeyAuditRecord.cs:6`) | Direct. CLI vocab: `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key`; dashboard vocab: `dashboard-create-key`, `dashboard-revoke-key`, `dashboard-rotate-key`, `dashboard-delete-key`; plus `constraint-denied`. |
|
||||
| `Outcome` (required) | derived | `constraint-denied` → `Denied`; everything else → `Success` (no `Failure` events are emitted today). |
|
||||
| `Category` | — none — | Constant `"ApiKey"`. |
|
||||
| `Target` | — none as a field — | No structured target. (`ConstraintEnforcer` does embed `commandKind`/`target` inside `Details` text, but there is no dedicated column.) |
|
||||
| `SourceNode` | `RemoteAddress` (`ApiKeyAuditRecord.cs:7`) | Direct; populated only on the dashboard path (`DashboardApiKeyManagementService.cs:207`), `null` on CLI/constraint paths. |
|
||||
| `CorrelationId` | — none — | Not captured today. |
|
||||
| `DetailsJson` | `Details` (`ApiKeyAuditRecord.cs:9`) | Today this is a **plain string**, not JSON; either store as-is in `DetailsJson` or wrap as a small JSON object. |
|
||||
|
||||
---
|
||||
|
||||
## Adoption plan → `ZB.MOM.WW.Audit`
|
||||
|
||||
**Effort: LOW.** The seam is tiny (one interface, two methods, one record pair) and the data already
|
||||
maps cleanly onto `AuditEvent`. Concretely:
|
||||
|
||||
1. **Adapter, not rewrite.** Map `IApiKeyAuditStore` → the shared `IAuditWriter`, and
|
||||
`ApiKeyAuditEntry`/`ApiKeyAuditRecord` → `AuditEvent`, using the table above: generate a new
|
||||
`EventId` Guid per write; `KeyId → Actor` (with a `"system"` fallback for null); `EventType → Action`;
|
||||
`CreatedUtc → OccurredAtUtc`; `RemoteAddress → SourceNode`; `constraint-denied → Outcome.Denied`,
|
||||
else `Success`; constant `Category = "ApiKey"`; `Details → DetailsJson`. The three producers
|
||||
(`ApiKeyAdminCliRunner`, `DashboardApiKeyManagementService`, `ConstraintEnforcer`) keep their call
|
||||
sites — only the injected type changes.
|
||||
2. **Redaction stays by-construction.** No separate redactor needs porting; just preserve the rule that
|
||||
callers never put secrets in `DetailsJson` (mirrors `CLAUDE.md:79`). The shared writer can keep its
|
||||
own redaction policy as a defence-in-depth layer.
|
||||
3. **Read-back is free to drop or defer.** `ListRecentAsync` has no production consumer, so the adapter
|
||||
need not implement a shared query API on day one; only tests (`SqliteAuthStoreTests`, `ApiKeyAdminCliRunnerTests`) exercise it.
|
||||
4. **No new dimensions required.** `CorrelationId` and a structured `Target` are absent today and are
|
||||
*not* in scope to add as part of adoption (descriptive parity only); the canonical record simply
|
||||
leaves them `null`.
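
The mapping in step 1 could look roughly like the sketch below. The `"system"` fallback, the `"ApiKey"` category, and the `constraint-denied → Denied` rule come from the plan above; the helper name `ApiKeyAuditMapping`, the `message` JSON property used to wrap the plain-text `Details`, and the sample detail string are hypothetical.

```csharp
#nullable enable
using System;
using System.Text.Json;

var evt = ApiKeyAuditMapping.ToAuditEvent(new ApiKeyAuditRecord(
    7, null, "constraint-denied", null, DateTimeOffset.UtcNow,
    "Write: Tank1.Level: MaxValue: too high")); // made-up detail text
Console.WriteLine($"{evt.Actor} {evt.Outcome} {evt.Category}"); // system Denied ApiKey

public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action,
    AuditOutcome Outcome, string? Category, string? Target, string? SourceNode,
    Guid? CorrelationId, string? DetailsJson);

public sealed record ApiKeyAuditRecord(
    long AuditId, string? KeyId, string EventType, string? RemoteAddress,
    DateTimeOffset CreatedUtc, string? Details);

public static class ApiKeyAuditMapping
{
    public static AuditEvent ToAuditEvent(ApiKeyAuditRecord r) => new(
        EventId: Guid.NewGuid(),            // no idempotency key exists today
        OccurredAtUtc: r.CreatedUtc,
        Actor: r.KeyId ?? "system",         // fallback for keyless events
        Action: r.EventType,
        Outcome: r.EventType == "constraint-denied"
            ? AuditOutcome.Denied
            : AuditOutcome.Success,
        Category: "ApiKey",
        Target: null,                       // not captured today
        SourceNode: r.RemoteAddress,
        CorrelationId: null,                // not captured today
        // One of the two options named above: wrap the plain string as JSON.
        DetailsJson: r.Details is null
            ? null
            : JsonSerializer.Serialize(new { message = r.Details }));
}
```

Generating `EventId` at mapping time means a retried write gets a new id, so this adapter alone does not give MxGateway the idempotency that the other two projects have.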
|
||||
|
||||
**Coordination risk — sequence against the health/observability work.** A parallel session is actively
|
||||
editing **this same repo** (`mxaccessgw`) for the MEL → Serilog logging migration
|
||||
(`ZB.MOM.WW.Health` + `ZB.MOM.WW.Telemetry` normalization). Because audit adoption here also touches the
|
||||
gateway's `Security/Authentication/` wiring (DI registration in `AuthStoreServiceCollectionExtensions.cs`,
|
||||
and the three producer call sites), the two efforts can collide on the same files and on logging-pipeline
|
||||
DI. **Do not start MxGateway audit adoption until the Serilog migration in this repo has landed (or is
|
||||
explicitly fenced off)**, and confirm with the orchestrator that the logging session is not mid-flight in
|
||||
`Security/` before opening a PR. The audit and logging seams are conceptually independent (audit = durable
|
||||
SQLite record of who-did-what; logging = operational telemetry), but they share the gateway's startup/DI
|
||||
surface, so they must be merged in a defined order rather than in parallel.
|
||||
@@ -0,0 +1,140 @@
|
||||
# Audit — current state: OtOpcUa
|
||||
|
||||
Repo: `~/Desktop/OtOpcUa` (Gitea `lmxopcua`). Stack: .NET 10, Akka.NET cluster, EF Core + SQL Server.
|
||||
All paths below are relative to the repo root. Verified against source on 2026-06-01.
|
||||
|
||||
OtOpcUa already has a structured, idempotent audit pipeline: a cluster-broadcast `AuditEvent`
|
||||
message, a cluster-singleton writer actor that batches and bulk-inserts, and an append-only
|
||||
`ConfigAuditLog` EF entity with two-layer dedup. There is **also** a second, older write path —
|
||||
SQL stored procedures that `INSERT dbo.ConfigAuditLog` directly — so the table has two
|
||||
producers with slightly different column conventions (see §1).
|
||||
|
||||
## 1. How it works today
|
||||
|
||||
**Record shape** — `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs:9-17`:
|
||||
a sealed record `AuditEvent(Guid EventId, string Category, string Action, string Actor,
|
||||
DateTime OccurredAtUtc, string? DetailsJson, NodeId SourceNode, CorrelationId CorrelationId)`.
|
||||
`NodeId` and `CorrelationId` are Commons value-types — `NodeId` wraps a string (the *logical
|
||||
cluster node / host name*, explicitly **not** an OPC UA NodeId per its XML doc,
|
||||
`src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/NodeId.cs:3-8`); `CorrelationId` wraps a `Guid`
|
||||
(`src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/CorrelationId.cs:3`).
|
||||
|
||||
**Transport** — `AuditEvent` is an Akka message meant to be sent to the `AuditWriterActor`
|
||||
**cluster singleton** (`AuditEvent.cs:6` describes it as "cluster-broadcast … consumed by the
|
||||
`AuditWriterActor` singleton"). The singleton is registered through Akka.Hosting at
|
||||
`src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ServiceCollectionExtensions.cs:68-75`
|
||||
(`WithSingleton<AuditWriterActorKey>(AuditWriterSingletonName, …)`). Any cluster member can
|
||||
emit an `AuditEvent`; the singleton is the one sink that persists it.
|
||||
|
||||
**Storage** — EF entity `ConfigAuditLog`
|
||||
(`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs:7-44`): append-only
|
||||
("Grants revoked for UPDATE/DELETE on all principals", `ConfigAuditLog.cs:4-5`). Columns:
|
||||
`AuditId` (identity PK), `Timestamp` (default `SYSUTCDATETIME()`), `Principal`, `EventType`,
|
||||
`ClusterId?`, `NodeId?`, `GenerationId?`, `DetailsJson?`, `EventId?` (Guid), `CorrelationId?`
|
||||
(Guid). Mapping/constraints in `OtOpcUaConfigDbContext.cs:429-463`: `DetailsJson` must be valid
|
||||
JSON (`CK_ConfigAuditLog_DetailsJson_IsJson`, lines 435-436); `Principal`/`EventType`/`ClusterId`/`NodeId`
|
||||
length-capped (lines 441-444); supporting indexes `IX_ConfigAuditLog_Cluster_Time` (lines 449-451)
|
||||
and `IX_ConfigAuditLog_Generation` (lines 452-454).
|
||||
|
||||
**Writer / batching** — `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`:
|
||||
a `ReceiveActor` with `FlushBatchSize = 500` (line 25) and `FlushInterval = 5s` (line 26).
|
||||
It buffers events in a `Dictionary<Guid, AuditEvent>` keyed by `EventId` (line 30), flushing
|
||||
when the buffer hits 500 (line 60), when the 5s periodic timer fires (`PreStart`, lines 50-53),
|
||||
or on `PreRestart`/`PostStop` (lines 96-107) so a supervisor swap or coordinated shutdown does
|
||||
not lose the buffer. `FlushBuffer` (lines 63-93) snapshots and clears the buffer, then for each
|
||||
event constructs a `ConfigAuditLog` row (lines 75-84): `Timestamp = OccurredAtUtc`,
|
||||
`Principal = Actor`, `EventType = $"{Category}:{Action}"`, `NodeId = SourceNode.Value`,
|
||||
`DetailsJson`, `EventId`, `CorrelationId = CorrelationId.Value`. A failed flush is logged and the
|
||||
batch is **dropped** (`catch` at lines 89-92) — best-effort, no retry/dead-letter.
|
||||
|
||||
**Dedup / idempotency (two layers)** — described at `AuditWriterActor.cs:17-21`:
|
||||
1. *In-buffer* — duplicate `EventId`s within a batch collapse via the dictionary (last-write-wins;
|
||||
`HandleEvent`, lines 55-61).
|
||||
2. *Database* — a **filtered unique index** `UX_ConfigAuditLog_EventId` (`OtOpcUaConfigDbContext.cs:459-462`,
|
||||
`IsUnique()` + `HasFilter("[EventId] IS NOT NULL")`) gives cross-restart safety: a retry of an
|
||||
already-flushed batch hits the constraint, the duplicate insert is dropped, and the rest of the
|
||||
batch survives. `EventId`/`CorrelationId` are nullable so legacy/backfill rows (NULL) don't
|
||||
collide — confirmed in the entity XML (`ConfigAuditLog.cs:33-43`) and migration
|
||||
`Migrations/20260526105027_AddConfigAuditLogEventIdColumns.cs:26-31`.
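
The batching and in-buffer dedup behaviour can be illustrated without Akka. The sketch below is a plain class, not `AuditWriterActor`: the names `AuditBuffer` and `BufferedEvent` are hypothetical, the timer and `PreRestart`/`PostStop` triggers are reduced to an explicit `Flush()` call, and the database-level unique index is not modelled.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var flushed = new List<IReadOnlyList<BufferedEvent>>();
var buffer = new AuditBuffer(flushBatchSize: 2, flush: batch => flushed.Add(batch));

var id = Guid.NewGuid();
buffer.Add(new BufferedEvent(id, "Config", "Edit"));
buffer.Add(new BufferedEvent(id, "Config", "Publish"));          // same EventId: replaces
buffer.Add(new BufferedEvent(Guid.NewGuid(), "Config", "Edit")); // 2 distinct ids: flush
Console.WriteLine(flushed.Count);    // 1
Console.WriteLine(flushed[0].Count); // 2

public sealed record BufferedEvent(Guid EventId, string Category, string Action);

public sealed class AuditBuffer
{
    private readonly Dictionary<Guid, BufferedEvent> _buffer = new();
    private readonly int _flushBatchSize;
    private readonly Action<IReadOnlyList<BufferedEvent>> _flush;

    public AuditBuffer(int flushBatchSize, Action<IReadOnlyList<BufferedEvent>> flush)
    {
        _flushBatchSize = flushBatchSize;
        _flush = flush;
    }

    public void Add(BufferedEvent evt)
    {
        // Keyed by EventId, so a duplicate overwrites (last-write-wins).
        _buffer[evt.EventId] = evt;
        if (_buffer.Count >= _flushBatchSize) Flush();
    }

    // Also the hook a timer or shutdown path would call.
    public void Flush()
    {
        if (_buffer.Count == 0) return;

        // Snapshot and clear before writing, as described above.
        var snapshot = _buffer.Values.ToList();
        _buffer.Clear();
        try
        {
            _flush(snapshot);
        }
        catch (Exception)
        {
            // Best-effort: a failed flush drops the batch (no retry here).
        }
    }
}
```

Clearing the buffer before the write is what makes a failed flush lossy: the snapshot is the only copy once `Clear()` has run.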
|
||||
|
||||
**Scope** — two producers, two conventions:
|
||||
- **Akka `AuditEvent` path** (the structured one): config writes + authorization checks. The
|
||||
EventType vocabulary lives in the entity XML doc (`ConfigAuditLog.cs:18`): `DraftCreated |
|
||||
DraftEdited | Published | RolledBack | NodeApplied | CredentialAdded | CredentialDisabled |
|
||||
ClusterCreated | NodeAdded | ExternalIdReleased | CrossClusterNamespaceAttempt |
|
||||
OpcUaAccessDenied | …`. Note the access-denied / cross-cluster entries are authz-check events,
|
||||
not config writes.
|
||||
- **SQL stored-procedure path** (older, still present): several SPs `INSERT dbo.ConfigAuditLog`
|
||||
directly — e.g. `Published`/`RolledBack`/`NodeApplied`/`ExternalIdReleased`/`CrossClusterNamespaceAttempt`
|
||||
in `Migrations/20260417215224_StoredProcedures.cs:151,217,351,407,504`. These use `SUSER_SNAME()`
|
||||
as `Principal`, set `ClusterId`/`GenerationId`, write a **bare** `EventType` (no `Category:Action`
|
||||
split), and leave `EventId`/`CorrelationId` NULL.
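
Any reader of `ConfigAuditLog` has to cope with both `EventType` conventions. A small sketch of how the two forms could be told apart; the helper `EventTypeConvention` is hypothetical and assumes the separator is a single `:` as in the actor's `$"{Category}:{Action}"` composition.

```csharp
#nullable enable
using System;

var (category, action) = EventTypeConvention.Split("Config:Edit");
Console.WriteLine($"{category} / {action}"); // Config / Edit
Console.WriteLine(EventTypeConvention.Split("Published").Category is null); // True

public static class EventTypeConvention
{
    // Actor path: composed form.
    public static string Compose(string category, string action) => $"{category}:{action}";

    // Accepts both forms; a bare EventType (stored-procedure path) has no category.
    public static (string? Category, string Action) Split(string eventType)
    {
        int i = eventType.IndexOf(':');
        if (i < 0) return (null, eventType);
        return (eventType[..i], eventType[(i + 1)..]);
    }
}
```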
|
||||
|
||||
**Query / UI** — the only read surface is the Admin UI page
|
||||
`src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Clusters/ClusterAudit.razor`
|
||||
(`@page "/clusters/{ClusterId}/audit"`, `[Authorize]`, lines 1-2). It reads the latest
|
||||
`PageSize = 200` rows (line 69) **filtered by `ClusterId`**, newest-first (`OnInitializedAsync`,
|
||||
lines 74-82), and renders Timestamp / Principal / Event(Type) / Node / Correlation(first 8 hex) /
|
||||
Details columns (lines 38-58). Tested in
|
||||
`tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AuditWriterActorTests.cs`: count-threshold
|
||||
flush (lines 26-41), in-buffer dedup of duplicate EventIds (lines 45-62), `PostStop` flush
|
||||
(lines 66-81), and the column mapping incl. `EventType == "Config:Edit"` and `NodeId == "node-a"`
|
||||
(lines 85-104).
|
||||
|
||||
> Load-bearing gotcha: the actor path **never sets `ClusterId`** (`AuditWriterActor.cs:75-84`), but the UI filters
|
||||
> on `ClusterId` (`ClusterAudit.razor:78`). So today the cluster-scoped view surfaces the
|
||||
> stored-procedure rows; structured `AuditEvent` rows written by the actor (which carry the host in
|
||||
> `NodeId`, not `ClusterId`) won't appear under a cluster. Worth flagging during normalization.
|
||||
|
||||
## 2. Mapping to the canonical `AuditEvent`
|
||||
|
||||
Target = `ZB.MOM.WW.Audit.AuditEvent` (built in parallel). OtOpcUa's existing `AuditEvent` is
|
||||
already almost field-for-field aligned; the only synthesized field is `Outcome`.
|
||||
|
||||
| Canonical field | OtOpcUa source | Mapping |
|
||||
|---|---|---|
|
||||
| `Guid EventId` | `AuditEvent.EventId` | Direct. Already the idempotency key (buffer key + `UX_ConfigAuditLog_EventId`). |
|
||||
| `DateTimeOffset OccurredAtUtc` | `AuditEvent.OccurredAtUtc` (`DateTime`) | Direct; widen `DateTime`(UTC) → `DateTimeOffset`. |
|
||||
| `string Actor` | `AuditEvent.Actor` | Direct (→ `ConfigAuditLog.Principal`). At Auth adoption this becomes the `ZB.MOM.WW.Auth` principal. |
|
||||
| `string Action` | `AuditEvent.Action` (+ `Category`) | Direct. Today persisted as `"{Category}:{Action}"` in `EventType`; canonical keeps `Action` and `Category` separate. |
|
||||
| `AuditOutcome Outcome` | *(none)* | **Derived** from the EventType vocabulary, not stored today. `OpcUaAccessDenied`/`CrossClusterNamespaceAttempt` → `Denied`; the config-write verbs → `Success`. No explicit `Failure` value exists yet (a failed flush is dropped, not recorded as an event). |
|
||||
| `string? Category` | `AuditEvent.Category` | Direct (e.g. `"Config"`). |
|
||||
| `string? Target` | *(none)* | No dedicated field today; the closest is `SourceNode`→`NodeId` (the acting host) or details. Leave null or carry the affected object in `DetailsJson`. |
|
||||
| `string? SourceNode` | `AuditEvent.SourceNode` (`NodeId.Value`) | Direct — the logical cluster node / host name (NOT an OPC UA NodeId). Currently lands in `ConfigAuditLog.NodeId`. |
|
||||
| `Guid? CorrelationId` | `AuditEvent.CorrelationId` (`CorrelationId.Value`) | Direct. |
|
||||
| `string? DetailsJson` | `AuditEvent.DetailsJson` | Direct; carries everything else (incl. `ClusterId`/`GenerationId`, which today are separate columns on the SP path). |
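
The two non-trivial rows of the table (`Outcome` and `OccurredAtUtc`) could be handled as sketched below. The helper name `OtOpcUaProjection` is hypothetical, and `Widen` assumes the incoming `DateTime` already represents UTC, so it stamps the kind rather than converting.

```csharp
#nullable enable
using System;

var occurred = new DateTime(2026, 6, 1, 12, 0, 0, DateTimeKind.Utc);
Console.WriteLine(OtOpcUaProjection.DeriveOutcome("OpcUaAccessDenied")); // Denied
Console.WriteLine(OtOpcUaProjection.DeriveOutcome("Config:Edit"));       // Success
Console.WriteLine(OtOpcUaProjection.Widen(occurred).Offset);             // 00:00:00

public enum AuditOutcome { Success, Failure, Denied }

public static class OtOpcUaProjection
{
    // Derives the outcome from the event name; a "Category:" prefix is ignored.
    public static AuditOutcome DeriveOutcome(string eventType)
    {
        string name = eventType[(eventType.LastIndexOf(':') + 1)..];
        return name switch
        {
            "OpcUaAccessDenied" or "CrossClusterNamespaceAttempt" => AuditOutcome.Denied,
            _ => AuditOutcome.Success, // no Failure events exist today
        };
    }

    // DateTime (UTC) -> DateTimeOffset with a zero offset.
    public static DateTimeOffset Widen(DateTime occurredAtUtc) =>
        new DateTimeOffset(DateTime.SpecifyKind(occurredAtUtc, DateTimeKind.Utc));
}
```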
|
||||
|
||||
## 3. Adoption plan → `ZB.MOM.WW.Audit`
|
||||
|
||||
**Effort: MEDIUM.** OtOpcUa is the *donor* design for the canonical record, so most of the work is
|
||||
re-pointing types and bridging two persistence conventions, not redesigning the pipeline.
|
||||
|
||||
**Replace with the shared library:**
|
||||
- `Commons/Messages/Audit/AuditEvent.cs` → the canonical `ZB.MOM.WW.Audit.AuditEvent`. Add the new
|
||||
`Outcome` field (derive it at every emit site from the EventType vocabulary, e.g.
|
||||
`OpcUaAccessDenied → Denied`); keep `Category`/`Action`/`SourceNode`/`CorrelationId` as-is. Decide
|
||||
whether `SourceNode`/`CorrelationId` carry the Commons value-types or the canonical primitives at
|
||||
the seam (likely a thin adapter at construction).
|
||||
- `AuditWriterActor` → implement the library's `IAuditWriter` (keep the actor as OtOpcUa's
|
||||
Akka-cluster-singleton transport/batching adapter behind that seam; the 500/5s batching,
|
||||
PreRestart/PostStop flush, and two-layer dedup stay bespoke per §"left per-project").
|
||||
|
||||
**Keep bespoke (thin adapter only):**
|
||||
- Transport — the cluster-broadcast → singleton `AuditWriterActor`, batching, and flush triggers.
|
||||
- Storage — the `ConfigAuditLog` EF entity, indexes, and `UX_ConfigAuditLog_EventId` idempotency
|
||||
index. Map the canonical record onto the existing columns; add an `Outcome` column (or fold it into
|
||||
`EventType`/`DetailsJson` if a schema change is undesirable). `ClusterId`/`GenerationId` remain
|
||||
OtOpcUa-specific columns fed via `DetailsJson` or kept as side columns.
|
||||
- Domain vocabulary — the EventType strings (`DraftCreated`, `Published`, `OpcUaAccessDenied`, …)
|
||||
and the `Category:Action` composition convention.
|
||||
- Query/UI — `ClusterAudit.razor` and its `ClusterId` filter.
|
||||
|
||||
**Reconcile, not extract:**
|
||||
- The **two producers** (Akka `AuditEvent` path vs. SQL stored-procedure `INSERT`s using
|
||||
`SUSER_SNAME()`). The SP path bypasses the canonical record entirely and writes a different
|
||||
column convention (bare `EventType`, NULL `EventId`/`CorrelationId`, populated
|
||||
`ClusterId`/`GenerationId`). Adopting the library does not by itself unify these; either route the
|
||||
SP events through the actor or accept that SP rows stay non-idempotent and absent from the
|
||||
`EventId` dedup guarantee. Flag for the normalization spec.
|
||||
- The **`ClusterId`-filter / actor-never-sets-`ClusterId`** mismatch noted in §1 — fix when the
|
||||
query surface is normalized so structured `AuditEvent` rows are discoverable by cluster.
|
||||
@@ -0,0 +1,162 @@
|
||||
# Audit — current state: ScadaBridge
|
||||
|
||||
Repo: `~/Desktop/ScadaBridge`. Stack: .NET 10, Akka.NET; solution `ZB.MOM.WW.ScadaBridge.slnx`.
|
||||
Audit code centers on the dedicated `ZB.MOM.WW.ScadaBridge.AuditLog` project, with the shared
|
||||
record + seams living in `ZB.MOM.WW.ScadaBridge.Commons`. All paths relative to repo root.
|
||||
Verified 2026-06-01.
|
||||
|
||||
**By far the largest audit implementation in the family** — a full who-did-what pipeline
|
||||
across a site SQLite hot-path and a central MS SQL store, with forwarding, reconciliation,
|
||||
purge, partition maintenance, redaction, CLI export, hash-chain verify (v1 stub), and a Blazor
|
||||
UI. **Key finding: ScadaBridge is already at the target.** It already has an `IAuditWriter`
|
||||
best-effort seam (near-identical to the canonical contract) and an `IAuditPayloadFilter`
|
||||
redaction seam (= the library's `IAuditRedactor`, just renamed). Adoption is *align, don't
|
||||
replace* — mostly naming alignment; the enormous transport/storage/CLI/UI stays bespoke.
|
||||
|
||||
## 1. How it works today
|
||||
|
||||
### The record — `AuditEvent` (~25 fields)
|
||||
|
||||
`src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Audit/AuditEvent.cs:22` — a `sealed record`,
|
||||
append-only, "single source of truth for AuditLog (#23) rows." Far richer than the canonical
|
||||
10-field event. Notable fields:
|
||||
|
||||
- Identity / correlation: `EventId` (idempotency key, `:25`), `CorrelationId` (per-op
|
||||
lifecycle, `:68`), `ExecutionId` (per-run, `:75`), `ParentExecutionId` (spawner link, `:82`).
|
||||
- Classification: `Channel` (`:62`), `Kind` (`:65`), `Status` (`:109`) — the domain enums (below).
|
||||
- Provenance: `SourceSiteId` (`:85`), `SourceNode` (`:94`, stamped from `INodeIdentityProvider`),
|
||||
`SourceInstanceId` (`:97`), `SourceScript` (`:100`), `Actor` (`:103`), `Target` (`:106`).
|
||||
- Outcome detail: `HttpStatus` (`:112`), `DurationMs` (`:115`), `ErrorMessage` (`:118`),
|
||||
`ErrorDetail` (`:121`).
|
||||
- Payload: `RequestSummary` / `ResponseSummary` (truncated+redacted, `:124`/`:127`),
|
||||
`PayloadTruncated` (`:130`), `Extra` (free-form JSON, `:133`).
|
||||
- Lifecycle plumbing: `IngestedAtUtc` (null on site, stamped at central ingest, `:52`),
|
||||
`ForwardState` (site-only, null on central, `:136`).
|
||||
|
||||
**UTC-forcing init-setters.** `OccurredAtUtc` (`:39`) and `IngestedAtUtc` (`:52`) keep a backing
|
||||
field and call `DateTime.SpecifyKind(value, DateTimeKind.Utc)` on assignment, so a value built
|
||||
from a literal or rehydrated from a SQL Server `datetime2` column (which strips `Kind` on the
|
||||
wire) cannot leak downstream as `Unspecified`/local. The record uses `DateTime` (not
|
||||
`DateTimeOffset`) deliberately, to match the partitioned `datetime2` column shape (`:9-21`).
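
The UTC-forcing pattern looks roughly like this. The record name `SiteAuditEvent` is hypothetical and only the two timestamp properties are shown; the real `AuditEvent` carries many more fields.

```csharp
using System;

// A literal built without a kind has Kind = Unspecified ...
var evt = new SiteAuditEvent { OccurredAtUtc = new DateTime(2026, 6, 1, 8, 30, 0) };
// ... but the init-setter stamps it as UTC without shifting the clock value.
Console.WriteLine(evt.OccurredAtUtc.Kind); // Utc
Console.WriteLine(evt.OccurredAtUtc.Hour); // 8

public sealed record SiteAuditEvent
{
    private readonly DateTime _occurredAtUtc;
    private readonly DateTime? _ingestedAtUtc;

    public DateTime OccurredAtUtc
    {
        get => _occurredAtUtc;
        init => _occurredAtUtc = DateTime.SpecifyKind(value, DateTimeKind.Utc);
    }

    // Null on the site side; stamped at central ingest.
    public DateTime? IngestedAtUtc
    {
        get => _ingestedAtUtc;
        init => _ingestedAtUtc = value.HasValue
            ? DateTime.SpecifyKind(value.Value, DateTimeKind.Utc)
            : (DateTime?)null;
    }
}
```

`SpecifyKind` relabels the value rather than converting it, so the pattern is only safe because every producer is expected to supply UTC in the first place.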
|
||||
|
||||
### Domain vocabulary — four enums
|
||||
|
||||
`src/ZB.MOM.WW.ScadaBridge.Commons/Types/Enums/`:
|
||||
|
||||
- `AuditChannel.cs:7` — trust boundary crossed: `ApiOutbound`, `DbOutbound`, `Notification`,
|
||||
`ApiInbound`.
|
||||
- `AuditKind.cs:8` — specific event within a channel: `ApiCall`, `ApiCallCached`, `DbWrite`,
|
||||
`DbWriteCached`, `NotifySend`, `NotifyDeliver`, `InboundRequest`, `InboundAuthFailure`,
|
||||
`CachedSubmit`, `CachedResolve`. Cached variants emit multiple rows per operation.
|
||||
- `AuditStatus.cs:8` — lifecycle status of the row: `Submitted`, `Forwarded`, `Attempted`,
|
||||
`Delivered`, `Failed`, `Parked`, `Discarded`, `Skipped`.
|
||||
- `AuditForwardState.cs:9` — site-local forwarding state (central rows leave null): `Pending`,
|
||||
`Forwarded`, `Reconciled`. The site retention purge MUST NOT drop a `Pending` row.
|
||||
|
||||
### The writer seam — `IAuditWriter` (best-effort, never aborts the action)
|
||||
|
||||
`src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Services/IAuditWriter.cs:10` — boundary-side
|
||||
abstraction: `Task WriteAsync(AuditEvent evt, CancellationToken ct = default)` (`:18`). The
|
||||
contract is explicit and matches the canonical seam almost word-for-word: **"Failures must NEVER
|
||||
abort the user-facing action"** (`:8`), best-effort, "implementations must swallow/log internal
|
||||
failures rather than propagating them to the calling boundary code" (`:13-14`).
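
One way to honour the "never abort the user-facing action" rule is a decorator that swallows and reports failures. This is a sketch only: `BestEffortAuditWriter`, `ThrowingWriter`, and the two-field `AuditEvent` below are hypothetical stand-ins, not ScadaBridge types.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

IAuditWriter writer = new BestEffortAuditWriter(
    new ThrowingWriter(),
    ex => Console.WriteLine($"audit write failed: {ex.Message}"));

await writer.WriteAsync(new AuditEvent(Guid.NewGuid(), "ApiCall"));
Console.WriteLine("user-facing action continues");

public sealed record AuditEvent(Guid EventId, string Kind);

public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}

// Simulates a failing sink.
public sealed class ThrowingWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) =>
        Task.FromException(new InvalidOperationException("disk full"));
}

public sealed class BestEffortAuditWriter : IAuditWriter
{
    private readonly IAuditWriter _inner;
    private readonly Action<Exception> _onFailure;

    public BestEffortAuditWriter(IAuditWriter inner, Action<Exception> onFailure)
    {
        _inner = inner;
        _onFailure = onFailure;
    }

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        try
        {
            await _inner.WriteAsync(evt, ct).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            // Log and swallow: the caller never sees the failure.
            _onFailure(ex);
        }
    }
}
```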
|
||||
|
||||
### The redaction seam — `IAuditPayloadFilter` (pure, never throws)
|
||||
|
||||
`src/ZB.MOM.WW.ScadaBridge.AuditLog/Payload/IAuditPayloadFilter.cs:22` — `AuditEvent Apply(
|
||||
AuditEvent rawEvent)` (`:30`). Filters an event between construction and persistence:
|
||||
truncates oversized payloads, redacts headers/body/SQL params, sets `PayloadTruncated`.
|
||||
**Pure function** returning a filtered COPY via `with` expressions, and **MUST NOT throw** —
|
||||
on internal failure it over-redacts and increments the `AuditRedactionFailure` health metric
|
||||
(`:11-20`, `:26-28`). This is exactly the canonical `IAuditRedactor` under a different name.
|
||||
Two implementations: `DefaultAuditPayloadFilter.cs:56` (full truncation + header/body/SQL
|
||||
redaction with live options) and `SafeDefaultAuditPayloadFilter.cs:19` (always-safe fallback —
|
||||
header-only redaction, over-redacts on parse failure, `:42-59`).
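
A pure, non-throwing filter in the style described above might be sketched as follows. `TruncatingFilter`, the three-field `AuditEvent`, and the `"[redacted]"` placeholder are hypothetical; the health-metric increment on failure is omitted.

```csharp
#nullable enable
using System;

var filter = new TruncatingFilter(maxLength: 8);
var raw = new AuditEvent(Guid.NewGuid(), "0123456789", false);
var filtered = filter.Apply(raw);
Console.WriteLine(filtered.RequestSummary);   // 01234567
Console.WriteLine(filtered.PayloadTruncated); // True
Console.WriteLine(raw.RequestSummary);        // 0123456789

public sealed record AuditEvent(Guid EventId, string? RequestSummary, bool PayloadTruncated);

public interface IAuditPayloadFilter
{
    AuditEvent Apply(AuditEvent rawEvent);
}

public sealed class TruncatingFilter : IAuditPayloadFilter
{
    private readonly int _maxLength;

    public TruncatingFilter(int maxLength) => _maxLength = maxLength;

    public AuditEvent Apply(AuditEvent rawEvent)
    {
        try
        {
            string? summary = rawEvent.RequestSummary;
            if (summary is null || summary.Length <= _maxLength) return rawEvent;

            // "with" returns a copy; the input event is never mutated.
            return rawEvent with
            {
                RequestSummary = summary[.._maxLength],
                PayloadTruncated = true,
            };
        }
        catch (Exception)
        {
            // On any internal failure, over-redact instead of throwing.
            return rawEvent with { RequestSummary = "[redacted]", PayloadTruncated = true };
        }
    }
}
```

Failing closed (over-redacting) keeps the event in the log while guaranteeing that a filter bug cannot leak a payload.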
|
||||
|
||||
### Transport / storage / pipeline — stays per-project
|
||||
|
||||
The `ZB.MOM.WW.ScadaBridge.AuditLog` project is split into `Site/`, `Central/`, `Payload/`, and
|
||||
`Configuration/`. This is the bespoke half and is **not** a candidate for extraction; cited here
|
||||
only to show the scale around the common core:
|
||||
|
||||
- **Site hot-path:** `Site/SqliteAuditWriter.cs:32` (`IAuditWriter` over an owned `SqliteConnection`
|
||||
fed by a bounded `Channel<T>` drained on a background task, so script-thread callers never block
|
||||
on disk I/O; first-write-wins on duplicate `EventId`). `Site/FallbackAuditWriter.cs:28` composes
|
||||
the SQLite writer with a drop-oldest `RingBufferFallback` so a primary failure never bubbles out.
|
||||
`Site/Telemetry/` forwards rows to central over Akka `ClusterClient`.
|
||||
- **Central ingest/store:** `Central/CentralAuditWriter.cs:40` (`ICentralAuditWriter`, direct MS SQL
|
||||
write for central-originated events, per-call EF scope, idempotent `InsertIfNotExistsAsync`,
|
||||
swallows every exception per "alog.md §13"). `Central/AuditLogIngestActor.cs:46` batches site
|
||||
telemetry; `Central/SiteAuditReconciliationActor.cs:68` periodically pulls to catch dropped
|
||||
forwards; `Central/AuditLogPurgeActor.cs:58` enforces retention; `Central/AuditLogPartitionMaintenanceService.cs:55`
|
||||
manages the partitioned table.
|
||||
- **CLI:** `CLI/Commands/AuditCommands.cs:12` builds `export` (`:137`, formats `csv`/`jsonl`/`parquet`)
|
||||
and `verify-chain` (`:226`). Hash-chain verify is currently a **v1 no-op stub** —
|
||||
`CLI/Commands/AuditVerifyChainHelpers.cs:6-10` ("v1 is a no-op").
|
||||
- **UI:** Blazor pages under `CentralUI/Components/Pages/Audit/` (e.g. `AuditLogPage.razor:1`,
|
||||
gated by `[Authorize(Policy = AuthorizationPolicies.OperationalAudit)]`) plus drill-down
|
||||
components in `CentralUI/Components/Audit/`.
|
||||
- **Wiring:** `AuditLog/ServiceCollectionExtensions.cs:59` `AddAuditLog(...)`, `:316`
|
||||
`AddAuditLogCentralMaintenance(...)`.
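
The site hot-path idea (non-blocking enqueue, background drain, first-write-wins on `EventId`) can be sketched with `System.Threading.Channels`. This is an illustration under assumptions, not `SqliteAuditWriter`: `ChannelAuditWriter` and `SiteEvent` are hypothetical, persistence is a callback, and an in-memory `HashSet` stands in for whatever the real writer uses to ignore duplicate ids.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

var sink = new List<string>();
var writer = new ChannelAuditWriter(capacity: 16, persist: evt => sink.Add(evt.Kind));

var id = Guid.NewGuid();
writer.TryEnqueue(new SiteEvent(id, "ApiCall"));
writer.TryEnqueue(new SiteEvent(id, "ApiCallCached")); // duplicate EventId: ignored
await writer.CompleteAsync();
Console.WriteLine(string.Join(",", sink)); // ApiCall

public sealed record SiteEvent(Guid EventId, string Kind);

public sealed class ChannelAuditWriter
{
    private readonly Channel<SiteEvent> _channel;
    private readonly HashSet<Guid> _seen = new();
    private readonly Action<SiteEvent> _persist;
    private readonly Task _drain;

    public ChannelAuditWriter(int capacity, Action<SiteEvent> persist)
    {
        _persist = persist;
        // Default full mode: TryWrite returns false when the channel is full.
        _channel = Channel.CreateBounded<SiteEvent>(
            new BoundedChannelOptions(capacity) { SingleReader = true });
        _drain = Task.Run(() => DrainAsync());
    }

    // Never blocks the caller; a false result means the event was not queued.
    public bool TryEnqueue(SiteEvent evt) => _channel.Writer.TryWrite(evt);

    // Stops accepting events and waits for the drain loop to finish.
    public async Task CompleteAsync()
    {
        _channel.Writer.Complete();
        await _drain.ConfigureAwait(false);
    }

    private async Task DrainAsync()
    {
        await foreach (SiteEvent evt in _channel.Reader.ReadAllAsync())
        {
            if (!_seen.Add(evt.EventId)) continue; // first-write-wins
            try
            {
                _persist(evt);
            }
            catch (Exception)
            {
                // Best-effort: a persistence failure must not kill the loop.
            }
        }
    }
}
```

In the real pipeline a `false` from the enqueue step is where a fallback such as the drop-oldest ring buffer would take over.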
|
||||
|
||||
## 2. Mapping to the canonical record
|
||||
|
||||
Target (`ZB.MOM.WW.Audit`, being built): `record AuditEvent { Guid EventId; DateTimeOffset
|
||||
OccurredAtUtc; string Actor; string Action; AuditOutcome Outcome; string? Category; string?
|
||||
Target; string? SourceNode; Guid? CorrelationId; string? DetailsJson; }`. ScadaBridge's record is
|
||||
a strict superset — the canonical fields map directly; the rich extras collapse into `DetailsJson`.
|
||||
|
||||
| Canonical field | ScadaBridge source | Notes |
|
||||
|---|---|---|
|
||||
| `EventId` (Guid) | `AuditEvent.EventId` | Direct; same idempotency-key role. |
|
||||
| `OccurredAtUtc` (DateTimeOffset) | `AuditEvent.OccurredAtUtc` (`DateTime`, UTC-forced) | Type bridge `DateTime`(Utc)↔`DateTimeOffset`; semantics identical. |
|
||||
| `Actor` (string) | `AuditEvent.Actor` (nullable) | Direct; ScadaBridge allows null (system-originated rows). |
|
||||
| `Action` (string) | `AuditEvent.Kind` (+`Channel`) | Derive a stable action string, e.g. `{Channel}.{Kind}` (`ApiOutbound.ApiCall`). |
|
||||
| `Outcome` (Success/Failure/Denied) | `AuditEvent.Status` | `Delivered`→Success; `Failed`/`Parked`/`Discarded`→Failure; `InboundAuthFailure`(Kind)→Denied; in-flight `Submitted`/`Forwarded`/`Attempted` collapse to the last-known terminal state when projecting. |
|
||||
| `Category` (string?) | `AuditEvent.Channel` | The coarse bucket; pairs with `Action` above. |
|
||||
| `Target` (string?) | `AuditEvent.Target` | Direct. |
|
||||
| `SourceNode` (string?) | `AuditEvent.SourceNode` | Direct (`node-a`/`central-b`/…). |
|
||||
| `CorrelationId` (Guid?) | `AuditEvent.CorrelationId` | Direct (per-op lifecycle id). |
|
||||
| `DetailsJson` (string?) | `ExecutionId`, `ParentExecutionId`, `SourceSiteId`, `SourceInstanceId`, `SourceScript`, `HttpStatus`, `DurationMs`, `ErrorMessage`, `ErrorDetail`, `RequestSummary`, `ResponseSummary`, `PayloadTruncated`, `Extra`, `IngestedAtUtc`, `ForwardState` | The ~15 rich/plumbing fields serialize into the canonical `DetailsJson` extension. |
|
||||
|
||||
The canonical record is a lossy *projection* of ScadaBridge's — fine for cross-project
|
||||
reporting, but ScadaBridge keeps its full record as the storage shape (the partitioned SQL
|
||||
schema, forwarding state, and reconciliation all depend on the extra columns).
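
The `Action` and `Outcome` rows of the table could be projected as in this sketch. Enum member names are taken from the notes above, but their declaration order here is arbitrary; `ScadaBridgeProjection` is hypothetical, and giving the `InboundAuthFailure` kind precedence over the status is an assumption. Statuses the table does not assign a terminal outcome return `null`, leaving the "last-known terminal state" lookup to the caller.

```csharp
#nullable enable
using System;

Console.WriteLine(ScadaBridgeProjection.ToAction(AuditChannel.ApiOutbound, AuditKind.ApiCall));
// ApiOutbound.ApiCall
Console.WriteLine(ScadaBridgeProjection.ToOutcome(AuditKind.InboundAuthFailure, AuditStatus.Failed));
// Denied
Console.WriteLine(ScadaBridgeProjection.ToOutcome(AuditKind.ApiCall, AuditStatus.Submitted) is null);
// True

public enum AuditOutcome { Success, Failure, Denied }
public enum AuditChannel { ApiOutbound, DbOutbound, Notification, ApiInbound }
public enum AuditKind
{
    ApiCall, ApiCallCached, DbWrite, DbWriteCached, NotifySend, NotifyDeliver,
    InboundRequest, InboundAuthFailure, CachedSubmit, CachedResolve
}
public enum AuditStatus
{
    Submitted, Forwarded, Attempted, Delivered, Failed, Parked, Discarded, Skipped
}

public static class ScadaBridgeProjection
{
    // Stable action string of the form {Channel}.{Kind}.
    public static string ToAction(AuditChannel channel, AuditKind kind) => $"{channel}.{kind}";

    // Null means "no terminal outcome from this row alone".
    public static AuditOutcome? ToOutcome(AuditKind kind, AuditStatus status)
    {
        if (kind == AuditKind.InboundAuthFailure) return AuditOutcome.Denied;

        return status switch
        {
            AuditStatus.Delivered => AuditOutcome.Success,
            AuditStatus.Failed or AuditStatus.Parked or AuditStatus.Discarded
                => AuditOutcome.Failure,
            _ => (AuditOutcome?)null, // Submitted / Forwarded / Attempted / Skipped
        };
    }
}
```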
|
||||
|
||||
## 3. Adoption plan → `ZB.MOM.WW.Audit`
|
||||
|
||||
**Posture: align, don't replace.** ScadaBridge is the reference implementation the shared
|
||||
library is being extracted *from*; it already has both seams. Adoption is mostly renaming and
|
||||
contract-confirmation, with a deliberately small touched surface and a large blast radius if
|
||||
done carelessly. **Priority: LOW. Blast radius: HIGH.**
|
||||
|
||||
**Align (small, naming-level):**
|
||||
- **Rename the redaction seam to match the contract.** `IAuditPayloadFilter` → adopt
|
||||
`ZB.MOM.WW.Audit.IAuditRedactor` (`AuditEvent Apply(AuditEvent)` — identical signature and
|
||||
pure/never-throws contract). Either alias `IAuditPayloadFilter : IAuditRedactor` during
|
||||
transition or rename outright; `DefaultAuditPayloadFilter` / `SafeDefaultAuditPayloadFilter`
|
||||
implement it unchanged. See [`../../shared-contract/`](../../shared-contract/).
|
||||
- **Confirm the writer contract matches.** `IAuditWriter.WriteAsync(AuditEvent, CancellationToken
|
||||
= default)` is already byte-for-byte the canonical signature, and the "never abort the
|
||||
user-facing action" wording matches. The only delta is the **record type**: the library's
|
||||
`IAuditWriter` is typed on the *canonical* 10-field `AuditEvent`, while ScadaBridge's is typed on
|
||||
its ~25-field record. Resolve by either (a) keeping ScadaBridge's writer on its own rich record
|
||||
and adopting only the library's *interface name + outcome enum*, or (b) having the shared seam be
|
||||
generic over the event type. **Recommended: (a)** — adopt the canonical `AuditOutcome` enum and
|
||||
the interface naming, but keep the bespoke `AuditEvent` as ScadaBridge's storage record, since the
|
||||
whole transport/partition/forwarding layer is built on its extra columns. (Best-practice fit: this
|
||||
is the minimal-coupling option — share the contract, not the schema.)
|
||||
|
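The transitional-alias option for the redaction seam might look like the sketch below. It assumes a trimmed stand-in `AuditEvent` and a hypothetical `ExampleFilter`; the real `DefaultAuditPayloadFilter` behaviour is not reproduced. The point is only that an empty derived interface lets existing implementations satisfy the new seam without code changes.

```csharp
using System;

// Usage: an implementation written against the old name is consumed via the new seam.
IAuditRedactor redactor = new ExampleFilter();
var safe = redactor.Apply(new AuditEvent { Actor = "alice", Target = "secret-endpoint" });
Console.WriteLine(safe.Target);

// Trimmed stand-in for the event record.
public sealed record AuditEvent
{
    public required string Actor { get; init; }
    public string? Target { get; init; }
}

// The shared seam, as given in the contract.
public interface IAuditRedactor
{
    AuditEvent Apply(AuditEvent rawEvent);
}

// Transitional alias: adds no members, so every existing filter is already a redactor.
public interface IAuditPayloadFilter : IAuditRedactor { }

// Hypothetical filter standing in for an existing implementation.
public sealed class ExampleFilter : IAuditPayloadFilter
{
    public AuditEvent Apply(AuditEvent rawEvent) => rawEvent with { Target = "[redacted]" };
}
```

Once all call sites depend on `IAuditRedactor`, the alias can be deleted in a follow-up change.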
||||
**Keep bespoke (the large, untouched majority):**
|
||||
- The entire `Site/` (SQLite hot-path + ring-buffer fallback + telemetry forwarder) and `Central/`
|
||||
(ingest / reconcile / purge / partition maintenance) pipeline.
|
||||
- The `AuditEvent` rich record itself, the four domain enums (`AuditChannel`/`AuditKind`/
|
||||
`AuditStatus`/`AuditForwardState`), CLI `export`/`verify-chain`, and the Blazor audit UI.
|
||||
- The redaction *policy* (`DefaultAuditPayloadFilter` options, per-target overrides) — only the
|
||||
interface name is shared, not the implementation.
|
||||
|
||||
**Net:** ScadaBridge converges by renaming one interface and adopting the canonical `AuditOutcome`
|
||||
enum + the `Kind`/`Channel`→`Action`/`Category` and `…`→`DetailsJson` projection for any
|
||||
cross-project reporting. No transport, storage, CLI, or UI is replaced. Sequencing and the
|
||||
cross-project gap list live in [`../../GAPS.md`](../../GAPS.md); the canonical target is
|
||||
[`../../spec/SPEC.md`](../../spec/SPEC.md).
|
||||
@@ -0,0 +1,153 @@
|
||||
# Proposed shared library: `ZB.MOM.WW.Audit`
|
||||
|
||||
A contract on paper — the public surface to extract so the three projects stop
|
||||
re-implementing audit-event capture with incompatible shapes. Realizes
|
||||
[`../spec/SPEC.md`](../spec/SPEC.md).
|
||||
**Not yet created.** Reference implementations already exist: ScadaBridge's
|
||||
`IAuditWriter`/`IAuditPayloadFilter` (already at target shape), mxaccessgw
|
||||
structured-log audit trail, OtOpcUa admin-UI audit log.
|
||||
|
||||
## Package (.NET 10)
|
||||
|
||||
```
|
||||
ZB.MOM.WW.Audit # the single package: event record, seams, helpers, DI wiring
|
||||
```
|
||||
|
||||
Single package, single DLL. Only non-BCL dependency:
|
||||
`Microsoft.Extensions.DependencyInjection.Abstractions` (for `AddZbAudit`).
|
||||
Published to the Gitea NuGet feed; SemVer.
|
||||
|
||||
| Package (→ DLL) | Transitive deps | OtOpcUa | mxaccessgw | ScadaBridge |
|
||||
|---|---|---|---|---|
|
||||
| `ZB.MOM.WW.Audit` | `Microsoft.Extensions.DependencyInjection.Abstractions` | ✅ | ✅ | ✅ |
|
||||
|
||||
All three audit-emitting processes are .NET 10; the x86/net48 mxaccessgw worker does
|
||||
no audit emission, so net48 multi-targeting is **not** required.
|
||||
|
||||
## `AuditEvent` record and `AuditOutcome` enum
|
||||
|
||||
```csharp
|
||||
public sealed record AuditEvent {
|
||||
public required Guid EventId { get; init; }
|
||||
public required DateTimeOffset OccurredAtUtc { get; init; } // normalized to UTC on assignment
|
||||
public required string Actor { get; init; }
|
||||
public required string Action { get; init; }
|
||||
public required AuditOutcome Outcome { get; init; }
|
||||
public string? Category { get; init; }
|
||||
public string? Target { get; init; }
|
||||
public string? SourceNode { get; init; }
|
||||
public Guid? CorrelationId { get; init; }
|
||||
public string? DetailsJson { get; init; }
|
||||
}
|
||||
|
||||
public enum AuditOutcome { Success, Failure, Denied }
|
||||
```
|
||||
|
||||
`OccurredAtUtc` is the only field with a normalization contract: any value assigned
|
||||
is coerced to UTC (via `ToUniversalTime()`). All other fields are caller-supplied and
|
||||
carried through without transformation by the library internals.
|
||||
|
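One way to honour the normalize-on-assignment contract is a custom `init` accessor over a backing field. This is a sketch under that assumption; the record is trimmed to two fields and the backing-field name is illustrative.

```csharp
using System;

// Usage: a +02:00 timestamp is stored as the same instant at offset zero.
var local = new DateTimeOffset(2026, 6, 1, 12, 0, 0, TimeSpan.FromHours(2));
var evt = new AuditEvent { EventId = Guid.NewGuid(), OccurredAtUtc = local };
Console.WriteLine(evt.OccurredAtUtc.Offset);

// Trimmed stand-in: only the fields needed to show the normalization.
public sealed record AuditEvent
{
    private readonly DateTimeOffset _occurredAtUtc;

    public required Guid EventId { get; init; }

    public required DateTimeOffset OccurredAtUtc
    {
        get => _occurredAtUtc;
        // Coerce any assigned value to UTC; the instant itself is unchanged.
        init => _occurredAtUtc = value.ToUniversalTime();
    }
}
```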
||||
## Seams
|
||||
|
||||
### `IAuditWriter`
|
||||
|
||||
```csharp
|
||||
public interface IAuditWriter
|
||||
{
|
||||
Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
|
||||
}
|
||||
```
|
||||
|
||||
**Hard contract:**
|
||||
- Best-effort delivery. The implementation **MUST swallow all internal failures** and
|
||||
**MUST NOT throw** to the caller. A write that fails silently is preferable to
|
||||
a write that crashes the calling thread or kills a request pipeline.
|
||||
- `CancellationToken` is respected for cooperative cancellation but a cancellation
|
||||
does not constitute a contract violation; the implementation may choose to complete
|
||||
a partially-written event anyway.
|
||||
|
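The never-throws rule can be illustrated with a guard decorator around any inner writer. This is a sketch, not a shipped type: `BestEffortAuditWriter`, `ThrowingWriter`, and the `DroppedCount` counter are hypothetical, and the event record is a trimmed stand-in.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Usage: the inner writer fails, the caller carries on.
var safe = new BestEffortAuditWriter(new ThrowingWriter());
await safe.WriteAsync(new AuditEvent(Guid.NewGuid(), "alice", "Published"));
Console.WriteLine($"dropped={safe.DroppedCount}");

public sealed record AuditEvent(Guid EventId, string Actor, string Action); // trimmed stand-in

public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}

// Hypothetical decorator: swallows every failure of the inner writer.
public sealed class BestEffortAuditWriter : IAuditWriter
{
    private readonly IAuditWriter _inner;
    private int _dropped;

    public BestEffortAuditWriter(IAuditWriter inner) => _inner = inner;

    public int DroppedCount => Volatile.Read(ref _dropped);

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        try
        {
            // The call sits inside the try so synchronous throws are caught too.
            await _inner.WriteAsync(evt, ct).ConfigureAwait(false);
        }
        catch (Exception)
        {
            // Includes OperationCanceledException: cancellation is not surfaced either.
            Interlocked.Increment(ref _dropped);
        }
    }
}

// Test double that always fails.
public sealed class ThrowingWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) =>
        throw new InvalidOperationException("store offline");
}
```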
||||
### `IAuditRedactor`
|
||||
|
||||
```csharp
|
||||
public interface IAuditRedactor
|
||||
{
|
||||
AuditEvent Apply(AuditEvent rawEvent);
|
||||
}
|
||||
```
|
||||
|
||||
**Hard contract:**
|
||||
- Pure function (no I/O, no side effects).
|
||||
- **MUST NOT throw.** On any internal failure the implementation must over-redact
|
||||
(e.g. replace the affected field with a sentinel such as `"[redacted]"`) rather
|
||||
than propagate the exception. Lossier output is always preferable to a thrown
|
||||
exception reaching the caller.
|
||||
|
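The over-redact-on-failure rule can be sketched as a wrapper around a per-project policy delegate. The names `FailSafeRedactor` and the choice of which fields to blank (`Target` and `DetailsJson`) are assumptions for illustration; the sketch also assumes a non-null input event.

```csharp
using System;

// Usage: the policy throws, so the free-form fields are replaced by the sentinel.
var redactor = new FailSafeRedactor(_ => throw new FormatException("unparseable payload"));
var result = redactor.Apply(new AuditEvent
{
    Actor = "alice",
    Target = "POST /orders",
    DetailsJson = "{\"token\":\"abc\"}",
});
Console.WriteLine($"{result.Actor} {result.Target} {result.DetailsJson}");

// Trimmed stand-in for the event record.
public sealed record AuditEvent
{
    public required string Actor { get; init; }
    public string? Target { get; init; }
    public string? DetailsJson { get; init; }
}

public interface IAuditRedactor
{
    AuditEvent Apply(AuditEvent rawEvent);
}

// Hypothetical wrapper: runs a policy, falls back to a strictly safer event.
public sealed class FailSafeRedactor : IAuditRedactor
{
    private readonly Func<AuditEvent, AuditEvent> _policy;

    public FailSafeRedactor(Func<AuditEvent, AuditEvent> policy) => _policy = policy;

    public AuditEvent Apply(AuditEvent rawEvent)
    {
        try
        {
            return _policy(rawEvent);
        }
        catch (Exception)
        {
            // Over-redact: a JSON string literal keeps DetailsJson valid JSON.
            return rawEvent with { Target = "[redacted]", DetailsJson = "\"[redacted]\"" };
        }
    }
}
```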
||||
## Shipped helpers (concrete)
|
||||
|
||||
### Redactors
|
||||
|
||||
| Type | Behaviour |
|
||||
|---|---|
|
||||
| `NullAuditRedactor` | Identity — returns the event unchanged. Registered as the default by `AddZbAudit`. |
|
||||
| `TruncatingAuditRedactor` | Caps `DetailsJson` and `Target` to a configurable maximum length and appends a marker (e.g. `"…"`) when truncated. Never throws. Configured via `TruncatingAuditRedactorOptions`. |
|
||||
| `TruncatingAuditRedactorOptions` | Options record for `TruncatingAuditRedactor`: `MaxDetailsJsonLength`, `MaxTargetLength`, `TruncationMarker`. |
|
||||
|
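A minimal sketch of the truncating redactor described in the table follows. The option names come from the table; the default values, the decision not to count the marker against the maximum, and the trimmed `AuditEvent` are assumptions.

```csharp
using System;

// Usage: an 8-character target capped at 5 characters plus the marker.
var redactor = new TruncatingAuditRedactor(new TruncatingAuditRedactorOptions { MaxTargetLength = 5 });
var capped = redactor.Apply(new AuditEvent { Actor = "alice", Target = "abcdefgh", DetailsJson = "{}" });
Console.WriteLine(capped.Target);

// Trimmed stand-in for the event record.
public sealed record AuditEvent
{
    public required string Actor { get; init; }
    public string? Target { get; init; }
    public string? DetailsJson { get; init; }
}

public interface IAuditRedactor
{
    AuditEvent Apply(AuditEvent rawEvent);
}

public sealed record TruncatingAuditRedactorOptions
{
    public int MaxDetailsJsonLength { get; init; } = 4096; // assumed default
    public int MaxTargetLength { get; init; } = 256;       // assumed default
    public string TruncationMarker { get; init; } = "…";
}

public sealed class TruncatingAuditRedactor : IAuditRedactor
{
    private readonly TruncatingAuditRedactorOptions _options;

    public TruncatingAuditRedactor(TruncatingAuditRedactorOptions options) => _options = options;

    // Has no throwing path for a non-null event: Cap only slices and concatenates.
    public AuditEvent Apply(AuditEvent rawEvent) => rawEvent with
    {
        Target = Cap(rawEvent.Target, _options.MaxTargetLength),
        DetailsJson = Cap(rawEvent.DetailsJson, _options.MaxDetailsJsonLength),
    };

    private string? Cap(string? value, int max)
    {
        if (value is null || value.Length <= max) return value;
        // Math.Max guards against a negative configured maximum.
        return value.Substring(0, Math.Max(0, max)) + _options.TruncationMarker;
    }
}
```

Cutting `DetailsJson` mid-string generally leaves invalid JSON; how the shipped helper reconciles that with stores that enforce valid JSON is not stated here, so this sketch leaves it open.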
||||
### Writers
|
||||
|
||||
| Type | Behaviour |
|
||||
|---|---|
|
||||
| `NoOpAuditWriter` | Discards every event. Registered as the default by `AddZbAudit`; consumer replaces with a real writer. |
|
||||
| `CompositeAuditWriter` | Fan-out: forwards each event to an ordered list of inner `IAuditWriter` instances. A failing inner writer is swallowed (per the `IAuditWriter` contract) — it does **not** abort the remaining writers in the list. |
|
||||
| `RedactingAuditWriter` | Decorator: calls `IAuditRedactor.Apply` on the event, then delegates the redacted event to an inner `IAuditWriter`. Separates the redaction concern from any concrete writer. |
|
||||
|
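The fan-out behaviour of `CompositeAuditWriter` can be sketched as a sequential loop with a per-writer guard. Sequential (rather than parallel) dispatch is an assumption, and `ThrowingWriter` / `ListWriter` are hypothetical test doubles.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Usage: the first writer fails, the second still receives the event.
var sink = new ListWriter();
var composite = new CompositeAuditWriter(new IAuditWriter[] { new ThrowingWriter(), sink });
await composite.WriteAsync(new AuditEvent(Guid.NewGuid(), "alice", "Published"));
Console.WriteLine(sink.Events.Count);

public sealed record AuditEvent(Guid EventId, string Actor, string Action); // trimmed stand-in

public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}

public sealed class CompositeAuditWriter : IAuditWriter
{
    private readonly IReadOnlyList<IAuditWriter> _writers;

    public CompositeAuditWriter(IEnumerable<IAuditWriter> writers) =>
        _writers = new List<IAuditWriter>(writers);

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        foreach (var writer in _writers)
        {
            try
            {
                await writer.WriteAsync(evt, ct).ConfigureAwait(false);
            }
            catch (Exception)
            {
                // Swallowed so the remaining writers still run.
            }
        }
    }
}

// Test double that always fails.
public sealed class ThrowingWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) =>
        throw new InvalidOperationException("store offline");
}

// Test double that records what it receives.
public sealed class ListWriter : IAuditWriter
{
    public List<AuditEvent> Events { get; } = new();

    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        Events.Add(evt);
        return Task.CompletedTask;
    }
}
```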
||||
## DI wiring
|
||||
|
||||
```csharp
|
||||
public static IServiceCollection AddZbAudit(this IServiceCollection services);
|
||||
```
|
||||
|
||||
Registers defaults via `TryAdd` so any prior consumer registration wins:
|
||||
|
||||
- `IAuditRedactor` → `NullAuditRedactor` (singleton)
|
||||
- `IAuditWriter` → `NoOpAuditWriter` (singleton)
|
||||
|
||||
A consumer that registers its own `IAuditWriter` (e.g. a Serilog-backed writer or a
|
||||
`CompositeAuditWriter`) before or after calling `AddZbAudit` will see its registration
|
||||
respected. `AddZbAudit` does **not** clear or override existing registrations.
|
||||
|
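The registration behaviour can be sketched with `TryAddSingleton`. The types below are trimmed stand-ins and `ConsoleAuditWriter` is hypothetical; running the usage lines needs the `Microsoft.Extensions.DependencyInjection` package for `ServiceCollection` and `BuildServiceProvider`, which the library itself would not reference. Registration after `AddZbAudit` works because the container resolves the last registration for a service type.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;

// Usage: defaults first, then a consumer writer registered afterwards.
var services = new ServiceCollection();
services.AddZbAudit();
services.AddSingleton<IAuditWriter, ConsoleAuditWriter>();
using var provider = services.BuildServiceProvider();
var writerName = provider.GetRequiredService<IAuditWriter>().GetType().Name;
var redactorName = provider.GetRequiredService<IAuditRedactor>().GetType().Name;
Console.WriteLine($"{writerName} {redactorName}");

public sealed record AuditEvent(Guid EventId, string Actor, string Action); // trimmed stand-in

public interface IAuditWriter { Task WriteAsync(AuditEvent evt, CancellationToken ct = default); }
public interface IAuditRedactor { AuditEvent Apply(AuditEvent rawEvent); }

public sealed class NullAuditRedactor : IAuditRedactor
{
    public AuditEvent Apply(AuditEvent rawEvent) => rawEvent; // identity
}

public sealed class NoOpAuditWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) => Task.CompletedTask;
}

// Hypothetical consumer writer.
public sealed class ConsoleAuditWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        Console.WriteLine($"{evt.Actor} {evt.Action}");
        return Task.CompletedTask;
    }
}

public static class ZbAuditServiceCollectionExtensions
{
    public static IServiceCollection AddZbAudit(this IServiceCollection services)
    {
        // TryAdd* is a no-op when the service type is already registered.
        services.TryAddSingleton<IAuditRedactor, NullAuditRedactor>();
        services.TryAddSingleton<IAuditWriter, NoOpAuditWriter>();
        return services;
    }
}
```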
||||
## Relationship to Telemetry (`ILogRedactor`)
|
||||
|
||||
`IAuditRedactor` mirrors Telemetry.Serilog's `ILogRedactor` in shape and naming — same
|
||||
single-method contract, same "pure, must not throw, over-redact on failure" semantics —
|
||||
so that a future `ZB.MOM.WW.Hosting` aggregator package can wire both behind a single
|
||||
configuration surface without an impedance mismatch.
|
||||
|
||||
`ZB.MOM.WW.Audit` has **no dependency** on `ZB.MOM.WW.Telemetry` or any Serilog package.
|
||||
The alignment is intentional design convergence; the independence is a hard boundary.
|
||||
|
||||
## What stays in each consumer
|
||||
|
||||
OtOpcUa: admin-UI audit sink (Blazor event handler → `IAuditWriter`), `Category`
|
||||
constants specific to OPC UA operations.
|
||||
|
||||
mxaccessgw: gRPC interceptor that captures actor/action from call metadata; constraint-aware
|
||||
`Category` tagging; `DetailsJson` serialization of gateway-specific payloads.
|
||||
|
||||
ScadaBridge: site-scoped `SourceNode` population; `ManagementActor` enforcement callbacks;
|
||||
`IAuditPayloadFilter` → `IAuditRedactor` migration (shape is already equivalent — adoption
|
||||
is a near-zero-effort rename).
|
||||
|
||||
## Open contract questions
|
||||
|
||||
1. **Batching**: a `WriteBatchAsync(IEnumerable<AuditEvent>, CancellationToken)` overload on
|
||||
`IAuditWriter` may be warranted once a database-backed writer is in use. Defer until
|
||||
the first consumer demonstrates the need; batching can be added without breaking the
|
||||
existing single-event surface.
|
||||
2. **Structured `DetailsJson`**: confirm whether callers should supply raw JSON strings or
|
||||
whether a typed `TDetails` generic overload (serialized internally) is cleaner. The
|
||||
current `string?` keeps the library dependency-free but shifts serialization to the caller.
|
||||
3. **`CompositeAuditWriter` error policy**: decide whether per-writer failure should be
|
||||
observable (e.g. an optional `ILogger<CompositeAuditWriter>`) or always silently dropped.
|
||||
Logging the failure is diagnostic-friendly but adds a logging dependency.
|
||||
|
||||
See [`../GAPS.md`](../GAPS.md) for the adoption order and effort/risk.
|
||||
@@ -0,0 +1,94 @@
|
||||
# Canonical event model (standardized)
|
||||
|
||||
Status: **Standardized**. The org-wide audit record + outcome enum every sister project maps onto.
|
||||
This is the reference companion to [`SPEC.md`](SPEC.md) (mirroring auth's `CANONICAL-ROLES.md` /
|
||||
theme's `DESIGN-TOKENS.md`): the field-by-field canonical record, the `AuditOutcome` definition and
|
||||
the app states that map onto each value, and the full per-project mapping table. The shared library
|
||||
defines exactly this record; each project **projects its native record onto it** at the seam.
|
||||
|
||||
## The canonical record
|
||||
|
||||
```csharp
|
||||
namespace ZB.MOM.WW.Audit;
|
||||
|
||||
public sealed record AuditEvent
|
||||
{
|
||||
// REQUIRED core — who / what / when / outcome
|
||||
public required Guid EventId { get; init; } // idempotency key
|
||||
public required DateTimeOffset OccurredAtUtc { get; init; } // normalized to UTC
|
||||
public required string Actor { get; init; } // who — = ZB.MOM.WW.Auth principal at adoption
|
||||
public required string Action { get; init; } // what — verb / event-type string
|
||||
public required AuditOutcome Outcome { get; init; } // Success | Failure | Denied
|
||||
|
||||
// OPTIONAL common
|
||||
public string? Category { get; init; } // subsystem / grouping bucket
|
||||
public string? Target { get; init; } // on-what (resource / method / connection)
|
||||
public string? SourceNode { get; init; } // emitting logical node / host
|
||||
public Guid? CorrelationId { get; init; } // join to originating request / workflow
|
||||
|
||||
// EXTENSION — everything project-specific, as JSON
|
||||
public string? DetailsJson { get; init; }
|
||||
}
|
||||
|
||||
public enum AuditOutcome { Success, Failure, Denied }
|
||||
```
|
||||
|
||||
### Field-by-field
|
||||
|
||||
| Field | Req? | Type | Meaning | Notes |
|
||||
|---|:-:|---|---|---|
|
||||
| `EventId` | yes | `Guid` | Idempotency key | Backs at-least-once transports: OtOpcUa's filtered-unique `EventId` index, ScadaBridge's first-write-wins. MxGateway has none today → **generate at write time**. |
|
||||
| `OccurredAtUtc` | yes | `DateTimeOffset` | When it happened, UTC | MxGateway already uses `DateTimeOffset`. OtOpcUa / ScadaBridge store UTC-forced `DateTime` and widen at the mapping boundary. |
|
||||
| `Actor` | yes | `string` | Who acted | SHOULD be the `ZB.MOM.WW.Auth` principal ([`SPEC.md`](SPEC.md) §4). Kept a `string` (no Auth dependency). Keyless events use a `"system"` / `"cli"` fallback rather than empty. |
|
||||
| `Action` | yes | `string` | What was done (verb / event-type) | Carries each app's domain verb: OtOpcUa `EventType`, MxGateway `EventType`, ScadaBridge `{Channel}.{Kind}`. |
|
||||
| `Outcome` | yes | `AuditOutcome` | Success / Failure / Denied | **New normalized field — no app stores it today; each derives it** (see below). |
|
||||
| `Category` | no | `string?` | Coarse subsystem / grouping | OtOpcUa `Category` (`"Config"`); MxGateway constant `"ApiKey"`; ScadaBridge `Channel`. |
|
||||
| `Target` | no | `string?` | The object acted on | ScadaBridge `Target` (direct). OtOpcUa / MxGateway have no dedicated field → null or fold into `DetailsJson`. |
|
||||
| `SourceNode` | no | `string?` | Emitting logical node / host | OtOpcUa `SourceNode` (a logical node name, **not** an OPC UA NodeId); ScadaBridge `SourceNode`; MxGateway `RemoteAddress`. |
|
||||
| `CorrelationId` | no | `Guid?` | Join to originating request / workflow | OtOpcUa / ScadaBridge direct; MxGateway has none today (left null). |
|
||||
| `DetailsJson` | no | `string?` | Extension bag — all project-specific data | Must be valid JSON where stored (OtOpcUa enforces this with a CHECK constraint). Absorbs each app's surplus columns. |
|
||||
|
||||
## `AuditOutcome` — definition and app-state mapping
|
||||
|
||||
Three values, deliberately minimal — enough to normalize denials and failures without importing any
|
||||
app's full taxonomy. `Outcome` is **derived** at each emit site (no app persists it today; OtOpcUa
|
||||
encodes it implicitly in `EventType`, MxGateway in the event-type literal, ScadaBridge in `Status`):
|
||||
|
||||
| `AuditOutcome` | Meaning | OtOpcUa (`EventType`) | MxGateway (event type) | ScadaBridge (`AuditStatus` / `AuditKind`) |
|
||||
|---|---|---|---|---|
|
||||
| **`Success`** | The action completed | config-write verbs — `DraftCreated`, `DraftEdited`, `Published`, `RolledBack`, `NodeApplied`, `CredentialAdded`, `ClusterCreated`, `NodeAdded`, `ExternalIdReleased`, … | key-lifecycle — `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key` + all `dashboard-*` | `Status = Delivered` |
|
||||
| **`Failure`** | The action was attempted and failed | *(none today — a failed actor flush is dropped, not recorded as an event)* | *(none emitted today)* | `Status ∈ { Failed, Parked, Discarded }` |
|
||||
| **`Denied`** | The action was rejected by authorization / policy | `OpcUaAccessDenied`, `CrossClusterNamespaceAttempt` | `constraint-denied` | `Kind = InboundAuthFailure` |
|
||||
|
||||
Notes:
|
||||
|
||||
- **OtOpcUa has no `Failure` source.** Its vocabulary only distinguishes success-verbs from
|
||||
access-denials; an internal write failure is dropped (best-effort), not emitted as an event. So
|
||||
OtOpcUa produces only `Success` / `Denied` until/unless it adds failure events.
|
||||
- **MxGateway emits only `Success` / `Denied`** today (no failure events; authentication
|
||||
success/failure is surfaced as gRPC status, not persisted — see its current-state doc).
|
||||
- **ScadaBridge in-flight states** (`Submitted` / `Forwarded` / `Attempted`) are not terminal; when
|
||||
projecting to a single `Outcome` they collapse to the last-known terminal state. `Skipped` is not a
|
||||
user-facing outcome and is excluded from the canonical projection.
|
||||
|
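For the two apps that encode outcome in an event-type string, the derivation could be sketched as below. The `Denied` literals come from the mapping table; treating every other OtOpcUa `EventType` as `Success` is an assumption (the table's verb list is open-ended), and `OutcomeDerivation` is a hypothetical helper name.

```csharp
using System;

// Usage: one denial and one success per app.
var opcDenied = OutcomeDerivation.FromOtOpcUaEventType("OpcUaAccessDenied");
var opcOk     = OutcomeDerivation.FromOtOpcUaEventType("Published");
var mxDenied  = OutcomeDerivation.FromMxGatewayEventType("constraint-denied");
var mxOk      = OutcomeDerivation.FromMxGatewayEventType("rotate-key");
Console.WriteLine($"{opcDenied} {opcOk} {mxDenied} {mxOk}");

public enum AuditOutcome { Success, Failure, Denied }

public static class OutcomeDerivation   // hypothetical helper name
{
    // OtOpcUa: the two access-denial event types map to Denied.
    // Assumption: everything else is a success verb, since no failure events exist today.
    public static AuditOutcome FromOtOpcUaEventType(string eventType) => eventType switch
    {
        "OpcUaAccessDenied" or "CrossClusterNamespaceAttempt" => AuditOutcome.Denied,
        _ => AuditOutcome.Success,
    };

    // MxGateway: constraint-denied maps to Denied, anything else to Success.
    public static AuditOutcome FromMxGatewayEventType(string eventType) =>
        eventType == "constraint-denied" ? AuditOutcome.Denied : AuditOutcome.Success;
}
```

Neither function can return `Failure`, which matches the note that both apps emit only `Success` / `Denied` today.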
||||
## Per-project mapping table (canonical ← native record)
|
||||
|
||||
Consolidated from the three current-state docs. "Direct" = field exists with the same role; the
|
||||
right-hand notes flag the type bridges and synthesized fields.
|
||||
|
||||
| Canonical field | OtOpcUa `AuditEvent` (8 fields) | MxGateway `ApiKeyAuditRecord` (6 fields) | ScadaBridge `AuditEvent` (~25 fields) |
|
||||
|---|---|---|---|
|
||||
| `EventId` | `EventId` — direct (idempotency key) | **generate** new `Guid` (only `AuditId` rowid exists) | `EventId` — direct |
|
||||
| `OccurredAtUtc` | `OccurredAtUtc` (`DateTime` UTC) → widen | `CreatedUtc` (store-assigned `DateTimeOffset`) — direct | `OccurredAtUtc` (`DateTime` UTC-forced) → widen |
|
||||
| `Actor` | `Actor` — direct | `KeyId` (nullable → `"system"`/`"cli"` fallback) | `Actor` (nullable on system rows) |
|
||||
| `Action` | `Action` (persisted as `"{Category}:{Action}"`) | `EventType` — direct | `{Channel}.{Kind}` (e.g. `ApiOutbound.ApiCall`) |
|
||||
| `Outcome` | **derive** from `EventType` | **derive**: `constraint-denied`→`Denied`, else `Success` | **derive** from `Status` (+`InboundAuthFailure`→`Denied`) |
|
||||
| `Category` | `Category` (`"Config"`) | constant `"ApiKey"` | `Channel` |
|
||||
| `Target` | — none — (null or via `DetailsJson`) | — none — (`commandKind`/`target` embedded in `Details` text) | `Target` — direct |
|
||||
| `SourceNode` | `SourceNode` (logical node, `NodeId.Value`) | `RemoteAddress` (dashboard path only) | `SourceNode` — direct |
|
||||
| `CorrelationId` | `CorrelationId` (`CorrelationId.Value`) — direct | — none — | `CorrelationId` — direct |
|
||||
| `DetailsJson` | `DetailsJson` — direct (also `ClusterId`/`GenerationId` on the SP path) | `Details` (plain string → store as-is or wrap) | the ~15 rich/plumbing fields (`ExecutionId`, `SourceSiteId`, `HttpStatus`, `DurationMs`, `ErrorMessage`, `RequestSummary`, `ResponseSummary`, `PayloadTruncated`, `Extra`, `ForwardState`, …) serialize here |
|
||||
|
||||
The canonical record is a **lossy projection**: it is sufficient for cross-project reporting, but each
|
||||
project keeps its native record as the storage shape — ScadaBridge especially, whose partitioned SQL
|
||||
schema, forwarding state, and reconciliation depend on the extra columns ([`SPEC.md`](SPEC.md) §5).
|
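As an illustration of projecting a native record onto the canonical one, the MxGateway column of the table might be implemented roughly as follows. The field names of `ApiKeyAuditRecord` come from the table, but their CLR types, the `{"text": …}` wrapper shape for `Details`, the choice of `"system"` over `"cli"` as the fallback actor, and the `MxGatewayProjection` helper are all assumptions.

```csharp
using System;
using System.Text.Json;

// Usage: a keyless denial recorded on the dashboard path.
var native = new ApiKeyAuditRecord(42, DateTimeOffset.UtcNow, null, "constraint-denied", "10.0.0.5", "blocked");
var canonical = MxGatewayProjection.ToCanonical(native);
Console.WriteLine($"{canonical.Actor} {canonical.Outcome} {canonical.DetailsJson}");

public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent
{
    public required Guid EventId { get; init; }
    public required DateTimeOffset OccurredAtUtc { get; init; }
    public required string Actor { get; init; }
    public required string Action { get; init; }
    public required AuditOutcome Outcome { get; init; }
    public string? Category { get; init; }
    public string? Target { get; init; }
    public string? SourceNode { get; init; }
    public Guid? CorrelationId { get; init; }
    public string? DetailsJson { get; init; }
}

// Stand-in for the native record; the types are assumed.
public sealed record ApiKeyAuditRecord(
    long AuditId, DateTimeOffset CreatedUtc, string? KeyId,
    string EventType, string? RemoteAddress, string? Details);

public static class MxGatewayProjection   // hypothetical helper name
{
    public static AuditEvent ToCanonical(ApiKeyAuditRecord r) => new()
    {
        EventId = Guid.NewGuid(),                        // no native idempotency key
        OccurredAtUtc = r.CreatedUtc.ToUniversalTime(),
        Actor = r.KeyId ?? "system",                     // keyless fallback
        Action = r.EventType,
        Outcome = r.EventType == "constraint-denied" ? AuditOutcome.Denied : AuditOutcome.Success,
        Category = "ApiKey",
        Target = null,                                   // no dedicated native field
        SourceNode = r.RemoteAddress,
        CorrelationId = null,                            // none today
        // Wrap the plain-text details so the result is valid JSON.
        DetailsJson = r.Details is null ? null : JsonSerializer.Serialize(new { text = r.Details }),
    };
}
```

Because `EventId` is generated at projection time, projecting the same native row twice yields two different ids; a real adapter would need to generate it once at write time, as the field table says.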
||||
@@ -0,0 +1,146 @@
|
||||
# Audit — normalized target spec
|
||||
|
||||
Status: **Draft**. The single design the sister projects converge on. Derived from the three
|
||||
code-verified current-state docs (`../current-state/`) and the locked design
|
||||
(`../../../docs/plans/2026-06-01-audit-component-design.md`). Goal is *path to shared code*
|
||||
(`../shared-contract/ZB.MOM.WW.Audit.md`), so each normalized section maps to a shared library seam.
|
||||
|
||||
## 0. Normalized vs left-per-project
|
||||
|
||||
**Normalized here** (the shared `ZB.MOM.WW.Audit` library):
|
||||
|
||||
- **The canonical `AuditEvent` record** — required core (`EventId`, `OccurredAtUtc`, `Actor`,
|
||||
`Action`, `Outcome`) + optional common (`Category`, `Target`, `SourceNode`, `CorrelationId`) +
|
||||
the `DetailsJson` extension bag. The full field-by-field reference is [`EVENT-MODEL.md`](EVENT-MODEL.md).
|
||||
- **`AuditOutcome`** — the 3-value `Success | Failure | Denied` enum (§3). This is a *new*
|
||||
normalized field every app derives; see [`EVENT-MODEL.md`](EVENT-MODEL.md) for the per-app derivation.
|
||||
- **The two seams** — `IAuditWriter` (best-effort, never throws to caller, §1) and `IAuditRedactor`
|
||||
(pure, never throws, over-redacts on failure, §2).
|
||||
|
||||
**Explicitly NOT normalized** (domain-specific / divergent — keep per project):
|
||||
|
||||
- **Transport & storage** — OtOpcUa's Akka cluster-broadcast → singleton `AuditWriterActor` (batch
|
||||
500 / 5 s, two-layer dedup) over `ConfigAuditLog`; MxGateway's SQLite `IApiKeyAuditStore` append +
|
||||
list-recent; ScadaBridge's site-SQLite hot-path → central MS SQL ingest / reconcile / purge /
|
||||
partition-maintenance / hash-chain pipeline. The shared core carries no Akka / EF / SQLite /
|
||||
Serilog dependency; its only non-BCL dependency is `Microsoft.Extensions.DependencyInjection.Abstractions`
|
||||
(for `AddZbAudit`).
|
||||
- **Domain vocabulary** — ScadaBridge's `Channel` / `Kind` / `Status` / `ForwardState` enums and
|
||||
OtOpcUa's `EventType` strings (`DraftCreated`, `Published`, `OpcUaAccessDenied`, …). These map
|
||||
*into* `Action` / `Category` / `Outcome` / `DetailsJson`; they do not leak into the shared type.
|
||||
- **Query / CLI / UI / export** surfaces (OtOpcUa `ClusterAudit.razor`; ScadaBridge `export` /
|
||||
`verify-chain` CLI + Blazor audit pages; MxGateway's unused `ListRecentAsync`).
|
||||
- **Each app's redaction *policy*** — *which* fields/commands/payloads are sensitive. Only the
|
||||
`IAuditRedactor` *seam* is shared; the `Default` / `Safe` filter behaviour stays per-project.
|
||||
|
||||
> **Scope of the producer path.** OtOpcUa has **two producers** writing the same `ConfigAuditLog`
|
||||
> table — the structured Akka `AuditEvent` path *and* older SQL stored procedures that `INSERT`
|
||||
> directly (`SUSER_SNAME()`, bare `EventType`, NULL `EventId`). Normalization targets the
|
||||
> **structured producer path** (the one that builds an `AuditEvent`), not every SQL insert; the SP
|
||||
> path stays per-project and is a reconcile item, not an extraction item (`../GAPS.md`).
|
||||
|
||||
## 1. The writer contract — `IAuditWriter` (best-effort)
|
||||
|
||||
```csharp
|
||||
public interface IAuditWriter
|
||||
{
|
||||
Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
|
||||
}
|
||||
```
|
||||
|
||||
Audit is a side-channel, never on the critical path. The hard rule:
|
||||
|
||||
- **`WriteAsync` MUST NOT throw to the caller.** An implementation swallows/logs its own internal
|
||||
failures; a failed write **must never abort the user-facing action** it is recording. (ScadaBridge's
|
||||
seam already states this almost word-for-word: "Failures must NEVER abort the user-facing action.")
|
||||
- Idempotency is carried by `EventId`, so retries and at-least-once transports are safe (OtOpcUa's
|
||||
filtered-unique `EventId` index and ScadaBridge's first-write-wins are both honoured by this key).
|
||||
- Delivery is at-most-once *as a contract* — a writer MAY drop on failure (OtOpcUa drops a failed
|
||||
batch; ScadaBridge's ring-buffer fallback drops oldest). Durability is a per-project transport
|
||||
decision, not part of this seam.
|
||||
|
||||
Shipped helpers (the only concrete writers): `NoOpAuditWriter` (discards — tests / disabled audit),
|
||||
`CompositeAuditWriter` (fans out to N writers; **one writer throwing does not stop the others**), and
|
||||
`RedactingAuditWriter` (decorator: applies the redactor, then delegates to an inner writer).
|
||||
|
||||
## 2. The redactor contract — `IAuditRedactor` (never throws)
|
||||
|
||||
```csharp
|
||||
public interface IAuditRedactor
|
||||
{
|
||||
AuditEvent Apply(AuditEvent rawEvent);
|
||||
}
|
||||
```
|
||||
|
||||
A pure projection from a raw event to a safe one, applied between event construction and the writer
|
||||
chain. The hard rule:
|
||||
|
||||
- **`Apply` MUST NOT throw.** On any internal failure it **over-redacts** (returns a strictly safer
|
||||
event) rather than propagating — a redactor that throws would either crash the audit path or leak
|
||||
the unredacted event. (ScadaBridge's `SafeDefaultAuditPayloadFilter` is the reference: header-only
|
||||
redaction, over-redacts on parse failure.)
|
||||
- It is a **pure function** returning a filtered *copy* (via `with`); it does not mutate the input or
|
||||
perform I/O.
|
||||
|
||||
The seam is **aligned-but-independent** with Telemetry's `ILogRedactor` — same shape and naming
|
||||
discipline so a future `ZB.MOM.WW.Hosting` aggregator wires both with one mental model — but there is
|
||||
**no cross-package dependency**. Shipped helpers: `NullAuditRedactor` (identity — the default when no
|
||||
policy is configured) and `TruncatingAuditRedactor` (caps `DetailsJson` / `Target` to a configured
|
||||
max + sets a truncation marker; never throws). The *secret-field policy* (which fields/commands are
|
||||
sensitive) stays per-project via composition.
|
||||
|
||||
## 3. `AuditOutcome` — the new normalized field
|
||||
|
||||
`Outcome` is in the **required core**, but **no app stores it today** — each encodes outcome
|
||||
implicitly and must **derive** it at adoption (this is the one genuinely new field):
|
||||
|
||||
- **OtOpcUa** — derived from the `EventType` vocabulary (`OpcUaAccessDenied` /
|
||||
`CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`).
|
||||
- **MxGateway** — `constraint-denied` → `Denied`; key-lifecycle events → `Success`.
|
||||
- **ScadaBridge** — `AuditStatus` → `Outcome` (`Delivered` → `Success`; `Failed` / `Parked` /
|
||||
`Discarded` → `Failure`; `InboundAuthFailure` kind → `Denied`).
|
||||
|
||||
The three values normalize denials and failures across the family without importing any app's full
|
||||
taxonomy. The enum definition and the complete state-by-state mapping live in [`EVENT-MODEL.md`](EVENT-MODEL.md).
|
||||
|
||||
## 4. The hinge — audit closes the loop on Auth
|
||||
|
||||
Every audit row's `Actor` is the *who*, which is exactly the identity the **Auth** component already
|
||||
normalizes (LDAP/GLAuth principal, API-key name). Auth is the read side ("who is this and what may
|
||||
they do"); audit is the write side ("who did what"). The spec ties them by stating:
|
||||
|
||||
- **`Actor` SHOULD be the `ZB.MOM.WW.Auth` principal** at adoption time.
|
||||
- But `Actor` is **kept as a plain `string`** in the contract, so the library carries **no dependency
|
||||
on `ZB.MOM.WW.Auth`**. (MxGateway's keyless events — `init-db` / `list-keys` — supply a `"system"` /
|
||||
`"cli"` fallback rather than leaving the required field empty.)
|
||||
|
||||
This mirrors Auth's own decision to keep audit *read* inside `OBSERVE` and audit *export* inside
|
||||
`ADMINISTER` rather than minting a separate auditor role: the two components share a vocabulary, not a
|
||||
dependency.
|
||||
|
||||
## 5. ScadaBridge is already at the target
|
||||
|
||||
ScadaBridge already ships **both** seams: an `IAuditWriter` whose best-effort contract matches
|
||||
word-for-word, and an `IAuditPayloadFilter` that *is* the canonical `IAuditRedactor` under a different
|
||||
name (identical `AuditEvent Apply(AuditEvent)` signature, pure / never-throws / over-redacts). The
|
||||
library essentially **lifts ScadaBridge's seams**.
|
||||
|
||||
The one real (non-naming) decision is the **writer's record type**: the canonical `IAuditWriter` is
|
||||
typed on the 10-field `AuditEvent`; ScadaBridge's writer is typed on its ~25-field record.
|
||||
|
||||
> **Resolution (recommended):** share the **interface *name* + the `AuditOutcome` enum**, not the
|
||||
> record schema. ScadaBridge keeps its rich ~25-field record as its **storage shape** (its whole
|
||||
> transport / partition / forwarding / reconciliation layer is built on the extra columns), and maps
|
||||
> to the canonical 10-field record **only at cross-app reporting boundaries**. This is the
|
||||
> minimal-coupling option — share the contract, not the schema — and avoids making the shared seam
|
||||
> generic over the event type. ScadaBridge therefore converges by **renaming one interface** and
|
||||
> adopting `AuditOutcome`, with no transport / storage / CLI / UI change.
|
||||
|
||||
## 6. Acceptance (what "converged" means)
|
||||
|
||||
A project is converged when: (a) its structured audit-producer path constructs the canonical
|
||||
`AuditEvent` (with `Outcome` derived per §3) and persists via an implementation of `IAuditWriter`;
|
||||
(b) any redaction runs through an `IAuditRedactor`; (c) `Actor` carries the `ZB.MOM.WW.Auth` principal
|
||||
where one exists (string fallback otherwise). Its transport, storage, domain vocabulary, query
|
||||
surfaces, and redaction *policy* all stay unchanged. Per-project deltas and the adoption backlog are in
|
||||
[`../GAPS.md`](../GAPS.md); the proposed library API is [`../shared-contract/ZB.MOM.WW.Audit.md`](../shared-contract/ZB.MOM.WW.Audit.md).
|
||||
@@ -0,0 +1,133 @@
|
||||
# Health — gaps & adoption backlog
|
||||
|
||||
Divergence of each project from [`spec/SPEC.md`](spec/SPEC.md), and the ordered backlog to
|
||||
reach the shared `ZB.MOM.WW.Health` library. Status legend: ⛔ gap · 🟡 partial · ✅ matches.
|
||||
|
||||
## Divergence vs spec
|
||||
|
||||
### §1 Endpoint tiers
|
||||
|
||||
| Spec tier | OtOpcUa | MxAccessGateway | ScadaBridge |
|
||||
|---|---|---|---|
|
||||
| `/health/ready` (tag `ready`) | ✅ present | ⛔ absent | ✅ present (name-predicate) |
|
||||
| `/health/active` (tag `active`) | ✅ present | ⛔ absent | ✅ present (name-predicate) |
|
||||
| `/healthz` (bare process liveness) | ✅ present | ⛔ absent | ⛔ absent |
|
||||
| `/health/live` (non-standard) | — | ⛔ present (hardcoded `"Healthy"`, bypasses health-check pipeline) | — |
|
||||
|
||||
→ **Gap T1 (P1):** MxAccessGateway has no standard health tiers. The existing `/health/live`
|
||||
`MapGet` lambda must be replaced by `app.MapZbHealth()` + real probes.
|
||||
→ **Gap T2:** ScadaBridge lacks `/healthz`. `MapZbHealth()` adds it automatically.
|
||||
→ **Gap T3:** MxAccessGateway's `/health/live` uses a raw `MapGet` that bypasses the ASP.NET Core
|
||||
health-check middleware — it does not participate in `IHealthCheckPublisher`, `HealthReport`, or
|
||||
UI integration. Must be removed.
|
||||
|
||||
### §2 Probe coverage
|
||||
|
||||
| Probe | OtOpcUa | MxAccessGateway | ScadaBridge |
|
||||
|---|---|---|---|
|
||||
| Database connectivity | ✅ `DatabaseHealthCheck` (query probe) | ⛔ none | ✅ `DatabaseHealthCheck` (`CanConnectAsync`) |
|
||||
| Akka cluster membership | ✅ `AkkaClusterHealthCheck` (2-way) | n/a (no Akka) | ✅ `AkkaClusterHealthCheck` (3-way) |
|
||||
| Active / leader node | ✅ `AdminRoleLeaderHealthCheck` (role-filtered) | n/a | ✅ `ActiveNodeHealthCheck` (role-less) |
|
||||
| Downstream gRPC dependency | ⛔ none | ⛔ none | ⛔ none |
|
||||
|
||||
→ **Gap P1 (P1):** MxAccessGateway has zero probes — `AddHealthChecks()` at
|
||||
`GatewayApplication.cs:61` is dead code. Minimum viable: a `GrpcDependencyHealthCheck`
|
||||
targeting the x86 worker IPC channel.
|
||||
→ **Gap P2:** No project probes its downstream gRPC dependency. OtOpcUa should probe the
|
||||
MxAccessGateway channel; MxAccessGateway should probe the worker IPC.
|
||||
→ **Gap P3:** Dead `AddHealthChecks()` in MxAccessGateway (`GatewayApplication.cs:61`) should be
|
||||
removed or replaced — it currently implies health checks are configured when they are not.
|
||||
|
||||
### §3 Akka status-policy divergence
|
||||
|
||||
| Aspect | OtOpcUa | ScadaBridge |
|
||||
|---|---|---|
|
||||
| Probe implementation | Scans `State.Members` for self by address | Reads `SelfMember.Status` directly |
|
||||
| Joining status | Degraded (not in Members as Up) | Healthy |
|
||||
| Leaving/Exiting status | Degraded | Degraded |
|
||||
| Other (Removed, Down…) | Degraded | Unhealthy |
|
||||
| ActorSystem null guard | — (none; `ActorSystem` injected directly) | ✅ Degraded if null |
|
||||
|
||||
The two implementations classify `Joining` differently: ScadaBridge reports it as Healthy, whereas
|
||||
OtOpcUa reports Degraded, because a self member whose status is `Joining` is not found as `Up`
|
||||
in the member scan. They also differ on `Removed`/`Down`: ScadaBridge reports Unhealthy, OtOpcUa
|
||||
Degraded.
|
||||
|
||||
The shared `ZB.MOM.WW.Health.Akka.AkkaClusterHealthCheck` ships two presets to preserve both
|
||||
behaviors rather than forcing one onto the other:
|
||||
- **Default** — ScadaBridge's three-way policy (`Up`/`Joining`=Healthy, `Leaving`/`Exiting`=Degraded,
|
||||
else Unhealthy)
|
||||
- **OtOpcUaCompat** — OtOpcUa's self-Up-among-members scan (found Up=Healthy, not found=Degraded)
|
||||
|
||||
→ **Gap A1:** OtOpcUa adopts the `OtOpcUaCompat` preset; ScadaBridge adopts the `Default` preset.
|
||||
Both preserve existing behavior without forcing convergence on a single policy.
|
||||
→ **Gap A2:** OtOpcUa's `AkkaClusterHealthCheck` injects `ActorSystem` directly (no null guard).
|
||||
The shared implementation injects via `AkkaHostedService` for startup safety.
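
The two presets described in this section can be sketched as plain status-mapping functions. This is a minimal illustration, not the shipped `ZB.MOM.WW.Health.Akka` code: the `AkkaStatusPolicy` class is hypothetical, and the two enums are local stand-ins that only mirror the member names of Akka.NET's `MemberStatus` and ASP.NET Core's `HealthStatus` so the sketch compiles without either package.

```csharp
// Local stand-ins (assumption): mirror Akka.Cluster.MemberStatus and
// Microsoft.Extensions.Diagnostics.HealthChecks.HealthStatus by name only.
public enum MemberStatus { Joining, WeaklyUp, Up, Leaving, Exiting, Down, Removed }
public enum HealthStatus { Unhealthy, Degraded, Healthy }

// Hypothetical helper; the real library may structure its presets differently.
public static class AkkaStatusPolicy
{
    // Default preset: ScadaBridge's three-way policy, driven by the self member's status.
    public static HealthStatus Default(MemberStatus self) => self switch
    {
        MemberStatus.Up or MemberStatus.Joining => HealthStatus.Healthy,
        MemberStatus.Leaving or MemberStatus.Exiting => HealthStatus.Degraded,
        _ => HealthStatus.Unhealthy,
    };

    // OtOpcUaCompat preset: Healthy only when self is found Up among the members.
    // A null argument models "self not found in the member list"; that case and
    // every non-Up status are Degraded, so this preset never returns Unhealthy.
    public static HealthStatus OtOpcUaCompat(MemberStatus? selfInMembers) =>
        selfInMembers == MemberStatus.Up ? HealthStatus.Healthy : HealthStatus.Degraded;
}
```

The only inputs on which the presets disagree are `Joining` (Healthy vs. Degraded) and the catch-all statuses such as `Down` and `Removed` (Unhealthy vs. Degraded), which matches the divergence table above.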
|
||||
|
||||
### §4 Database probe technique
|
||||
|
||||
| Aspect | OtOpcUa | ScadaBridge |
|
||||
|---|---|---|
|
||||
| Probe method | `db.Deployments.AsNoTracking().Take(1).ToListAsync()` (query) | `_dbContext.Database.CanConnectAsync()` (connection only) |
|
||||
| Injection style | `IDbContextFactory<T>` (pooled, safe for concurrent probes) | `DbContext` directly (scoped, requires care in background use) |
|
||||
| Schema verification | ✅ implies schema is applied | ⛔ connection only |
|
||||
|
||||
→ **Gap D1:** `ZB.MOM.WW.Health.EntityFrameworkCore.DatabaseHealthCheck<TContext>` uses
|
||||
`CanConnectAsync` as the default (ScadaBridge behavior). An optional `ProbeQuery` delegate covers
|
||||
OtOpcUa's stricter approach. Both apps retain their existing probe semantics; neither is forced
|
||||
to change unless desired.
|
||||
→ **Gap D2:** ScadaBridge injects `DbContext` directly; the shared probe should use
|
||||
`IDbContextFactory<TContext>` for safe reuse from a background-service health-check context.
|
||||
ScadaBridge's DI registration will need updating on adoption.
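
A minimal sketch of a probe shaped like Gaps D1 and D2 describe: `CanConnectAsync` by default, an optional probe-query delegate for the stricter check, and `IDbContextFactory<TContext>` injection. The class name `SketchDatabaseHealthCheck` and its constructor shape are assumptions for illustration and are not taken from the shipped `ZB.MOM.WW.Health.EntityFrameworkCore` package; the EF Core and health-check calls themselves are the standard ones.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical sketch; the real DatabaseHealthCheck<TContext> may differ.
public sealed class SketchDatabaseHealthCheck<TContext> : IHealthCheck
    where TContext : DbContext
{
    private readonly IDbContextFactory<TContext> _factory;
    private readonly Func<TContext, CancellationToken, Task>? _probeQuery;

    public SketchDatabaseHealthCheck(
        IDbContextFactory<TContext> factory,
        Func<TContext, CancellationToken, Task>? probeQuery = null)
    {
        _factory = factory;
        _probeQuery = probeQuery;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        try
        {
            // A fresh context per probe avoids sharing a scoped DbContext
            // with whatever else runs in the background.
            await using var db = await _factory.CreateDbContextAsync(cancellationToken);

            if (_probeQuery is not null)
            {
                // Stricter path: a real query, which also exercises the schema.
                await _probeQuery(db, cancellationToken);
                return HealthCheckResult.Healthy("Probe query succeeded");
            }

            // Default path: connection only.
            return await db.Database.CanConnectAsync(cancellationToken)
                ? HealthCheckResult.Healthy("Database reachable")
                : HealthCheckResult.Unhealthy("Database unreachable");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Database probe failed", ex);
        }
    }
}
```

Under this shape, OtOpcUa's existing behavior would correspond to passing a delegate such as `(db, ct) => db.Deployments.AsNoTracking().Take(1).ToListAsync(ct)`, while ScadaBridge would pass nothing and keep the connection-only check.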
|
||||
|
||||
### §5 Active-node / leader check
|
||||
|
||||
| Aspect | OtOpcUa | ScadaBridge |
|
||||
|---|---|---|
|
||||
| Probe type | `AdminRoleLeaderHealthCheck` (role-filtered: `"admin"`) | `ActiveNodeHealthCheck` (role-less; Up + leader) |
|
||||
| Non-role-bearing node | Healthy immediately | n/a (all central nodes have no role filter) |
|
||||
| Leader status | Healthy | Healthy |
|
||||
| Non-leader (standby) | Degraded | Unhealthy |
|
||||
| `IActiveNodeGate` backing | Not present | `ActiveNodeGate` (separate type, duplicated logic) |
|
||||
|
||||
→ **Gap L1:** `ZB.MOM.WW.Health.Akka.ActiveNodeHealthCheck` with an optional `RoleFilter`
|
||||
parameter unifies both behaviors. OtOpcUa passes `RoleFilter = "admin"` (role-filtered);
|
||||
ScadaBridge uses no role filter.
|
||||
→ **Gap L2:** ScadaBridge's `ActiveNodeGate` duplicates `ActiveNodeHealthCheck` logic. The shared
|
||||
`IActiveNodeGate` seam + a backing singleton eliminates the duplication.
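
The unification in Gap L1 can be pictured as a single decision function with an optional role filter. This is a sketch of the behavior described in the table above, not the shipped `ActiveNodeHealthCheck`: the `ActiveNodePolicy` class, its parameter names, and the local `HealthStatus` stand-in are all assumptions.

```csharp
// Local stand-in (assumption): mirrors ASP.NET Core's HealthStatus by name only.
public enum HealthStatus { Unhealthy, Degraded, Healthy }

// Hypothetical helper illustrating the two behaviors behind one optional filter.
public static class ActiveNodePolicy
{
    public static HealthStatus Evaluate(
        string? roleFilter, bool selfIsUp, bool selfHasRole, bool selfIsLeader)
    {
        if (roleFilter is null)
        {
            // Role-less (ScadaBridge): active means Up and cluster leader;
            // a standby node is Unhealthy.
            return selfIsUp && selfIsLeader ? HealthStatus.Healthy : HealthStatus.Unhealthy;
        }

        // Role-filtered (OtOpcUa): nodes without the role are never gated.
        // selfIsUp is not consulted here, matching the three-path logic described for OtOpcUa.
        if (!selfHasRole)
        {
            return HealthStatus.Healthy;
        }

        // Role-bearing nodes: the role leader is Healthy, a standby is Degraded.
        return selfIsLeader ? HealthStatus.Healthy : HealthStatus.Degraded;
    }
}
```

The standby outcome is the one place the two modes differ in severity (Unhealthy vs. Degraded), which matters because ASP.NET Core returns HTTP 200 for Degraded by default.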
|
||||
|
||||
### §6 Response writer
|
||||
|
||||
| | OtOpcUa | MxAccessGateway | ScadaBridge |
|
||||
|---|---|---|---|
|
||||
| Writer | Default (plain-text/JSON) | Bespoke `GatewayHealthReply` JSON | `UIResponseWriter.WriteHealthCheckUIResponse` |
|
||||
|
||||
→ **Gap W1:** the shared `ZB.MOM.WW.Health` package ships a canonical JSON response writer
|
||||
(lifting the `HealthChecks.UI.Client` style to the default). All three projects pick it up by
calling `MapZbHealth()`; no per-project writer wiring is needed.
|
||||
|
||||
### §7 Endpoint authentication
|
||||
|
||||
Both OtOpcUa and ScadaBridge expose health endpoints without authentication (`AllowAnonymous` or
|
||||
open by default). MxAccessGateway's `/health/live` has no authentication requirement. The spec
|
||||
canonizes this: health tiers are `AllowAnonymous`; `MapZbHealth()` applies `AllowAnonymous` by
|
||||
default.
|
||||
|
||||
No gap — consistent across all three. `MapZbHealth()` should document and enforce this default.
|
||||
|
||||
## Adoption backlog (ordered)
|
||||
|
||||
| # | Item | Projects | Priority | Effort | Risk | Notes |
|
||||
|---|---|---|---|---|---|---|
|
||||
| 1 | MxAccessGateway: replace the hardcoded `/health/live` handler, register `GrpcDependencyHealthCheck` (worker IPC) against the currently dead `AddHealthChecks()` call, and map tiers via `MapZbHealth()` | MxGateway | P1 | S | Low | Gap T1, T3, P1, P3; no probes/tiers today, highest delta |
|
||||
| 2 | OtOpcUa: replace 3 bespoke checks with shared probes (`AkkaClusterHealthCheck` OtOpcUaCompat + `ActiveNodeHealthCheck` role-filtered + `DatabaseHealthCheck<T>` ProbeQuery) | OtOpcUa | P2 | S | Low | Gap A1, D1, L1 |
|
||||
| 3 | ScadaBridge: replace 3 bespoke checks with shared probes (Default policy + role-less Active + `CanConnectAsync`) + add `/healthz` + unify `ActiveNodeGate` with `IActiveNodeGate` seam | ScadaBridge | P2 | S | Low | Gap T2, A1, D2, L1, L2 |
|
||||
| 4 | OtOpcUa + MxAccessGateway: add `GrpcDependencyHealthCheck` for downstream gRPC channel | OtOpcUa, MxGateway | P2 | S | Low | Gap P2 — closes the silent-gateway-down scenario |
|
||||
| 5 | All: adopt canonical response writer (switch from per-project writers to `MapZbHealth` default) | all 3 | P3 | XS | Low | Gap W1 — mechanical; bundled with #1–3 |
|
||||
| 6 | DB injection style: switch ScadaBridge from injected `DbContext` to `IDbContextFactory<T>` | ScadaBridge | P3 | XS | Low | Gap D2 — background-service safety |
|
||||
|
||||
**Note: adoption items #1–6 are all follow-on tasks.** They are tracked here as the backlog for
|
||||
after `ZB.MOM.WW.Health` @ 0.1.0 is published. The library build itself (nupkgs, tests) is a
|
||||
separate task. This is consistent with how `ZB.MOM.WW.Auth` and `ZB.MOM.WW.Theme` are structured:
|
||||
the library is built first; adoption by the three apps is the next step.
|
||||
|
||||
@@ -0,0 +1,88 @@
|
||||
# Health (readiness / liveness / active-node)
|
||||
|
||||
Second normalized component under the operability cluster. **Goal: path to shared code** — converge
|
||||
the three sister projects onto a common three-tier health endpoint convention and a set of shared
|
||||
probe implementations, proposed as the `ZB.MOM.WW.Health` library set (3 packages), while each
|
||||
project keeps its own probe registration and orchestrator wiring.
|
||||
|
||||
- The one target: [`spec/SPEC.md`](spec/SPEC.md)
|
||||
- The proposed shared library: [`shared-contract/ZB.MOM.WW.Health.md`](shared-contract/ZB.MOM.WW.Health.md)
|
||||
- Divergences + backlog: [`GAPS.md`](GAPS.md)
|
||||
- Current state, per project: [`current-state/`](current-state/)
|
||||
|
||||
## Why health is a strong normalization candidate
|
||||
|
||||
Both OtOpcUa and ScadaBridge trace their health-check structure to the same "ScadaLink three-tier
|
||||
pattern" (`HealthEndpoints.cs:13` says so explicitly) but have already diverged in probe logic,
|
||||
status semantics, response writer, and endpoint registration style. MxAccessGateway has no shared
|
||||
ancestry here — it has a single hardcoded `/health/live` endpoint with no real probes at all.
|
||||
The common core (three tiers, database probe, Akka cluster probe, active-node probe) is
|
||||
re-implemented twice and absent once. Shared probe implementations with configurable policies
|
||||
close the gap without forcing identical behavior onto projects with legitimately different cluster
|
||||
semantics.
|
||||
|
||||
## Status by project
|
||||
|
||||
| Project | Endpoints today | Probes today | Response writer | `/healthz` | `IActiveNodeGate` | Adoption status |
|
||||
|---|---|---|---|---|---|---|
|
||||
| **OtOpcUa** | `/health/ready`, `/health/active`, `/healthz` | Database (query), AkkaCluster (2-way), AdminRoleLeader (role-filtered) | Default (plain-text/JSON) | ✅ present | — | Not started |
|
||||
| **MxAccessGateway** | `/health/live` only (raw `MapGet`; hardcoded `"Healthy"`) | **None** (`AddHealthChecks()` called but unused) | Bespoke `GatewayHealthReply` JSON | ⛔ absent | — | Not started |
|
||||
| **ScadaBridge** | `/health/ready`, `/health/active` | Database (`CanConnectAsync`), AkkaCluster (3-way), ActiveNode (role-less) | `HealthChecks.UI.Client` JSON | ⛔ absent | `ActiveNodeGate` (backs Inbound API 503 gate) | Not started |
|
||||
|
||||
See each project's [`current-state/<project>/CURRENT-STATE.md`](current-state/) for the
|
||||
code-verified detail and its adoption plan.
|
||||
|
||||
## Normalized vs. left per-project
|
||||
|
||||
**Normalized (the shared target):**
|
||||
|
||||
- Three-tier endpoint convention: `/health/ready` (tag `ready`), `/health/active` (tag `active`),
|
||||
`/healthz` (bare liveness). Mapped by `app.MapZbHealth()` from `ZB.MOM.WW.Health`.
|
||||
- Canonical JSON response writer (lifted from `HealthChecks.UI.Client` style; no per-project
|
||||
writer wiring needed).
|
||||
- `IActiveNodeGate` seam — generalized from ScadaBridge's `ActiveNodeGate`; wired into `MapZbHealth`
|
||||
for automatic active-tier response.
|
||||
- `GrpcDependencyHealthCheck` — reachability probe for a downstream gRPC dependency (covers
|
||||
OtOpcUa → MxAccessGateway channel and MxAccessGateway → worker IPC).
|
||||
- `AkkaClusterHealthCheck` (in `ZB.MOM.WW.Health.Akka`) with a configurable status policy.
|
||||
Default = ScadaBridge's three-way policy; `OtOpcUaCompat` preset preserves OtOpcUa's two-way
|
||||
self-Up-among-members scan.
|
||||
- `ActiveNodeHealthCheck` (in `ZB.MOM.WW.Health.Akka`) with an optional role filter. Role-less =
|
||||
ScadaBridge's behavior (Up + cluster leader); role-filtered = OtOpcUa's `AdminRoleLeader`
|
||||
behavior.
|
||||
- `DatabaseHealthCheck<TContext>` (in `ZB.MOM.WW.Health.EntityFrameworkCore`) with default
|
||||
`CanConnectAsync` and an optional `ProbeQuery` delegate.
|
||||
- `AllowAnonymous` on all three tiers by default (consistent across all three projects today).
|
||||
|
||||
**Left per-project (not forced together):**
|
||||
|
||||
- Which probes each app registers, their names, and which tags they carry.
|
||||
- Orchestrator / Traefik wiring (sidecars, route rules, upstreams).
|
||||
- ScadaBridge's `HealthMonitoring/` distributed aggregation pipeline (`SiteHealthCollector`,
|
||||
`CentralHealthAggregator`, `HealthReportSender`, etc.) — domain-specific, no shared-library
|
||||
equivalent.
|
||||
- MxAccessGateway's `GatewayHealthReply` metadata (`DefaultBackend`, `WorkerProtocolVersion`) —
|
||||
keep as a bespoke `/info` endpoint.
|
||||
- The x86 worker process — out of process and out of scope; the gateway-side
|
||||
`GrpcDependencyHealthCheck` observes it indirectly.
|
||||
|
||||
## Package structure
|
||||
|
||||
`ZB.MOM.WW.Health` ships as three dependency-split packages:
|
||||
|
||||
| Package | Contents | Consumers |
|
||||
|---|---|---|
|
||||
| `ZB.MOM.WW.Health` | Core tiers, `MapZbHealth`, canonical writer, `IActiveNodeGate`, `GrpcDependencyHealthCheck` | All three |
|
||||
| `ZB.MOM.WW.Health.Akka` | `AkkaClusterHealthCheck` + status presets, `ActiveNodeHealthCheck` + role filter | OtOpcUa, ScadaBridge |
|
||||
| `ZB.MOM.WW.Health.EntityFrameworkCore` | `DatabaseHealthCheck<TContext>` + optional probe delegate | OtOpcUa, ScadaBridge |
|
||||
|
||||
MxAccessGateway consumes the core package only (no Akka, no EF). OtOpcUa and ScadaBridge consume
|
||||
all three.
|
||||
|
||||
## Component status
|
||||
|
||||
**Status: Draft — library built at 0.1.0.** Spec and shared-contract written; current-state docs
|
||||
verified; GAPS backlog populated. Library implemented and packed at
|
||||
[`../../ZB.MOM.WW.Health/`](../../ZB.MOM.WW.Health/) (3 packages, 58 tests;
|
||||
`ZB.MOM.WW.Health`, `ZB.MOM.WW.Health.Akka`, `ZB.MOM.WW.Health.EntityFrameworkCore`).
|
||||
Adoption by the three apps is the next follow-on tracked in [`GAPS.md`](GAPS.md).
|
||||
@@ -0,0 +1,133 @@
|
||||
# Health — current state: MxAccessGateway
|
||||
|
||||
Repo: `~/Desktop/MxAccessGateway`. Stack: .NET 10 gateway (x64) + .NET 4.8 worker (x86), gRPC;
|
||||
solution `src/MxGateway.sln`.
|
||||
Health code lives in `src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs`. All paths relative
|
||||
to repo root.
|
||||
Verified 2026-06-01.
|
||||
|
||||
**Summary: bare liveness only.** MxAccessGateway has a single `/health/live` endpoint that returns
|
||||
a hardcoded `GatewayHealthReply` JSON object. `AddHealthChecks()` is called at startup but is
|
||||
entirely unused — no `IHealthCheck` implementations are registered, `MapHealthChecks` is never
|
||||
called, and there is no readiness or active-node tier. The net48 x86 worker process has no HTTP
|
||||
server and therefore no health endpoint of any kind.
|
||||
|
||||
## 1. Endpoint wiring
|
||||
|
||||
`src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs`:
|
||||
|
||||
- `:61` — `builder.Services.AddHealthChecks()` is called in the DI registration block. **This call
|
||||
is dead**: no `.AddCheck<T>()` call follows it, no `MapHealthChecks` is ever called. The
|
||||
framework registers the health-check infrastructure but nothing is wired through it.
|
||||
- `:139–145` — `MapGatewayEndpoints` maps a raw `endpoints.MapGet("/health/live", ...)` (not
|
||||
`MapHealthChecks`). The handler is an inline lambda that returns `Results.Ok(new GatewayHealthReply(...))`
|
||||
with a hardcoded `Status: "Healthy"`:
|
||||
|
||||
```csharp
|
||||
endpoints.MapGet(
|
||||
"/health/live",
|
||||
() => Results.Ok(new GatewayHealthReply(
|
||||
Status: "Healthy",
|
||||
DefaultBackend: GatewayContractInfo.DefaultBackendName,
|
||||
WorkerProtocolVersion: GatewayContractInfo.WorkerProtocolVersion)))
|
||||
.WithName("LiveHealth");
|
||||
```
|
||||
|
||||
This endpoint always returns HTTP 200 `{"Status":"Healthy",...}` as long as the process is alive.
|
||||
It carries no authentication requirement (no `[Authorize]` or `.RequireAuthorization()`).
|
||||
|
||||
## 2. Response shape
|
||||
|
||||
`GatewayHealthReply` is a record with three fields:
|
||||
- `Status` — always `"Healthy"` (hardcoded string, not the ASP.NET Core `HealthStatus` enum)
|
||||
- `DefaultBackend` — value of `GatewayContractInfo.DefaultBackendName` (the configured backend
|
||||
name, useful for confirming which gateway instance a probe hit)
|
||||
- `WorkerProtocolVersion` — value of `GatewayContractInfo.WorkerProtocolVersion` (the gRPC
|
||||
protocol version the gateway expects from the worker, useful for version-skew detection)
|
||||
|
||||
The response is not `HealthChecks.UI.Client` JSON and is not the standard ASP.NET Core health
|
||||
response shape. It is a bespoke JSON record.
|
||||
|
||||
## 3. Probes
|
||||
|
||||
None. There is no `IHealthCheck` registered. The `/health/live` response does not reflect:
|
||||
|
||||
- Whether the SQLite auth-store is reachable
|
||||
- Whether any active MXAccess session is functional
|
||||
- Whether the x86 worker named-pipe IPC is connected or the worker process is alive
|
||||
- Whether the gRPC service is actually accepting calls
|
||||
|
||||
The endpoint is purely a process liveness indicator.
|
||||
|
||||
## 4. Tier coverage
|
||||
|
||||
| Tier | Endpoint | Status |
|
||||
|---|---|---|
|
||||
| Process liveness | `/health/live` (raw `MapGet`) | ✅ present (but non-standard) |
|
||||
| Readiness | `/health/ready` | ⛔ absent |
|
||||
| Active node | `/health/active` | ⛔ absent (not Akka-based; not applicable as-is) |
|
||||
| `healthz` convention | `/healthz` | ⛔ absent |
|
||||
|
||||
MxAccessGateway is not an Akka.NET application — it has no cluster, no leader election, and no
|
||||
active-node concept. The "active" tier in the shared spec translates here to "is the worker process
|
||||
connected and the gRPC service ready to accept calls?" rather than cluster leadership.
|
||||
|
||||
## 5. x86 worker
|
||||
|
||||
`ZB.MOM.WW.MxGateway.Worker` is a .NET 4.8 console application communicating with the gateway
|
||||
over Windows named-pipe IPC. It has no HTTP server, no health endpoint, and no exposure to any
|
||||
probe mechanism. Its liveness must be inferred indirectly — either via the gateway process
|
||||
monitoring it (not currently implemented) or via the `GrpcDependencyHealthCheck` the gateway
|
||||
could use to probe the IPC channel.
|
||||
|
||||
## 6. Notable gaps
|
||||
|
||||
- `AddHealthChecks()` at `:61` is dead code. No `IHealthCheck` is ever registered via this call.
|
||||
- `/health/live` uses `MapGet` (a raw minimal-API handler) rather than `MapHealthChecks`. It
|
||||
bypasses the ASP.NET Core health-check middleware entirely, which means it does not participate
|
||||
in the standard health-check pipeline (no `IHealthCheckPublisher`, no `HealthReport`, no UI
|
||||
integration).
|
||||
- The hardcoded `"Healthy"` status means the endpoint cannot reflect real probe results even if
|
||||
probes were added later — the handler must be replaced, not just supplemented.
|
||||
- No readiness gating: orchestrators (Kubernetes, Traefik) that rely on `/health/ready` returning
  503 until the process is actually ready will instead receive 404 from MxAccessGateway today
  (the route does not exist), or an unconditional 200 if pointed at `/health/live`.
|
||||
|
||||
---
|
||||
|
||||
## Adoption plan → `ZB.MOM.WW.Health`
|
||||
|
||||
**Replace `/health/live` + wire the shared tiers:**
|
||||
|
||||
The `AddHealthChecks()` call at `GatewayApplication.cs:61` is already present — it just needs
|
||||
probes registered against it. The raw `MapGet("/health/live", ...)` handler at `:139–145` must be
|
||||
removed and replaced with `app.MapZbHealth()` from `ZB.MOM.WW.Health`.
|
||||
|
||||
Steps:
|
||||
|
||||
1. **Remove** the inline `MapGet("/health/live", ...)` lambda (`:139–145`). The `GatewayHealthReply`
|
||||
record and `DefaultBackend`/`WorkerProtocolVersion` metadata can be surfaced differently (e.g., a
|
||||
`/info` endpoint or as custom data on the health response).
|
||||
2. **Register a `GrpcDependencyHealthCheck`** (from `ZB.MOM.WW.Health`) that probes the
|
||||
named-pipe IPC channel to the x86 worker. Tag `["ready"]`. This replaces the hardcoded
|
||||
liveness-only response with a real probe that reflects whether the worker is reachable.
|
||||
3. **Optionally add a `GrpcDependencyHealthCheck`** for any downstream gRPC dependency (e.g., the
|
||||
Galaxy Repository connection) if the gateway is expected to be healthy only when those dependencies are
|
||||
reachable. Tag `["ready"]`.
|
||||
4. **Call `app.MapZbHealth()`** — this maps `/health/ready` (tag `ready`), `/health/active` (tag
|
||||
`active`; initially empty — no active-node concept in MxGateway), and `/healthz` (bare liveness).
|
||||
The `/healthz` endpoint takes over the semantic role that `/health/live` serves today.
|
||||
5. **Do not add `ZB.MOM.WW.Health.Akka`** — MxAccessGateway has no Akka dependency. The consumer
|
||||
matrix in the design specifies MxGateway uses the core package only.
|
||||
|
||||
**Keep bespoke:**
|
||||
|
||||
- The `WorkerProtocolVersion` / `DefaultBackend` metadata from `GatewayHealthReply` is
|
||||
MxAccessGateway-specific; keep it as a separate `/info` endpoint or embed it as `Data` on a
|
||||
custom probe rather than normalizing it into the shared contract.
|
||||
- The x86 worker itself (net48 console, named-pipe IPC, no HTTP) remains outside the shared health
|
||||
scheme. The `GrpcDependencyHealthCheck` observes the worker indirectly from the gateway side.
|
||||
- Per-gateway auth and TLS concerns on who may call health endpoints remain per-project.
|
||||
|
||||
**Adoption is a follow-on task** (tracked in `GAPS.md`), not part of the `ZB.MOM.WW.Health`
|
||||
library build. MxGateway is the **highest-priority adopter** (P1 gap — no probes/tiers today)
|
||||
and should be the first app wired up once the nupkg is available.
|
||||
@@ -0,0 +1,154 @@
|
||||
# Health — current state: OtOpcUa
|
||||
|
||||
Repo: `~/Desktop/OtOpcUa`. Stack: .NET 10, Akka.NET, OPC UA; solution `ZB.MOM.WW.OtOpcUa.slnx`.
|
||||
Health code lives in `src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/`. All paths relative to repo root.
|
||||
Verified 2026-06-01.
|
||||
|
||||
OtOpcUa implements the full three-tier pattern (`/health/ready`, `/health/active`, and `/healthz`) with three probes covering
|
||||
the database, the Akka cluster, and the admin-role leader. All endpoints are `AllowAnonymous` to
|
||||
permit Traefik and load-balancer probing without credentials.
|
||||
|
||||
## 1. Endpoint wiring
|
||||
|
||||
`src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs`:
|
||||
|
||||
- `:13` — XML comment explicitly names this as "ScadaLink's three-tier pattern: `ready` = boot ok;
|
||||
`active` = fully serving traffic; `healthz` = bare process liveness."
|
||||
- `:17` — `AddOtOpcUaHealth(IServiceCollection)` calls `services.AddHealthChecks()` and registers
|
||||
all three probes (lines 20–22):
|
||||
- `DatabaseHealthCheck` name `"configdb"`, tags `["ready","active"]`
|
||||
- `AkkaClusterHealthCheck` name `"akka"`, tags `["ready","active"]`
|
||||
- `AdminRoleLeaderHealthCheck` name `"admin-leader"`, tags `["active"]` only
|
||||
- `:28` — `MapOtOpcUaHealth(IEndpointRouteBuilder)` maps three endpoints (lines 33–44):
|
||||
- `/health/ready` — predicate `c => c.Tags.Contains("ready")`, `.AllowAnonymous()` (lines 33–36)
|
||||
- `/health/active` — predicate `c => c.Tags.Contains("active")`, `.AllowAnonymous()` (lines 37–40)
|
||||
- `/healthz` — predicate `_ => false` (no probes run; bare process liveness only), `.AllowAnonymous()` (lines 41–44)
|
||||
|
||||
`Program.cs`:
|
||||
- `:137` — `builder.Services.AddOtOpcUaHealth()`
|
||||
- `:159` — `app.MapOtOpcUaHealth()`
|
||||
|
||||
Response writer: default ASP.NET Core plain-text/JSON (no `HealthChecks.UI.Client`).
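
The wiring described in this section follows the standard ASP.NET Core tag-predicate pattern, sketched below as a minimal standalone program. The `PlaceholderCheck` type is a hypothetical stand-in for the real probes, and the probe names and tags are copied from the list above; this is not the actual content of `HealthEndpoints.cs`.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// One registration per probe; the tags decide which tier runs it.
builder.Services.AddHealthChecks()
    .AddCheck<PlaceholderCheck>("configdb", tags: new[] { "ready", "active" })
    .AddCheck<PlaceholderCheck>("akka", tags: new[] { "ready", "active" })
    .AddCheck<PlaceholderCheck>("admin-leader", tags: new[] { "active" });

var app = builder.Build();

// Readiness tier: only probes tagged "ready".
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = c => c.Tags.Contains("ready"),
}).AllowAnonymous();

// Active tier: only probes tagged "active".
app.MapHealthChecks("/health/active", new HealthCheckOptions
{
    Predicate = c => c.Tags.Contains("active"),
}).AllowAnonymous();

// Bare liveness: the predicate rejects every probe, so none run.
app.MapHealthChecks("/healthz", new HealthCheckOptions
{
    Predicate = _ => false,
}).AllowAnonymous();

app.Run();

// Hypothetical stand-in for the real probes; always reports Healthy.
public sealed class PlaceholderCheck : IHealthCheck
{
    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default) =>
        Task.FromResult(HealthCheckResult.Healthy("placeholder"));
}
```

With no probes selected, `/healthz` reports Healthy whenever the process can answer the request at all, which is what makes it a pure liveness signal.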
|
||||
|
||||
## 2. Probes
|
||||
|
||||
### DatabaseHealthCheck
|
||||
`src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/DatabaseHealthCheck.cs`:
|
||||
|
||||
- `:9` — injects `IDbContextFactory<OtOpcUaConfigDbContext>`
|
||||
- `:25–37` — opens a pooled context via `CreateDbContextAsync`, runs
|
||||
`db.Deployments.AsNoTracking().Take(1).ToListAsync()`. If the query succeeds →
|
||||
`HealthCheckResult.Healthy("ConfigDb reachable")` (`:31`). If it throws →
|
||||
`HealthCheckResult.Unhealthy("ConfigDb unreachable", ex)` (`:35`). No `Degraded` path.
|
||||
|
||||
The probe exercises a real query (not just `CanConnectAsync`) — it confirms the `Deployments` table
|
||||
is readable, which implies the schema migration has run. This is **stricter** than ScadaBridge's
|
||||
`CanConnectAsync` but more opaque about the failure reason.
|
||||
|
||||
Tags on registration: `["ready","active"]` — the database must be reachable for both readiness and
|
||||
active-node determination.
|
||||
|
||||
### AkkaClusterHealthCheck
|
||||
`src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AkkaClusterHealthCheck.cs`:
|
||||
|
||||
- `:9` — injects `ActorSystem` directly
|
||||
- `:27–33` — calls `Cluster.Get(_system)`, scans `cluster.State.Members` for the member whose
|
||||
`Address == cluster.SelfAddress` and `Status == MemberStatus.Up`:
|
||||
- Found Up → `HealthCheckResult.Healthy($"Self Up; {cluster.State.Members.Count} member(s)")` (`:32`)
|
||||
- Not found → `HealthCheckResult.Degraded("Self not yet Up in cluster")` (`:33`)
|
||||
|
||||
No `Unhealthy` path — joining/leaving/removed nodes are all reported as `Degraded`. This differs from
|
||||
ScadaBridge's more granular three-way policy (see GAPS).
|
||||
|
||||
Tags on registration: `["ready","active"]`.
|
||||
|
||||
### AdminRoleLeaderHealthCheck
|
||||
`src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AdminRoleLeaderHealthCheck.cs`:
|
||||
|
||||
- `:14` — injects `IClusterRoleInfo`
|
||||
- `:27–38` — three-path logic:
|
||||
- Node does not carry the `"admin"` role → `Healthy("Node does not carry admin role")` (`:30`) —
|
||||
non-admin nodes are immediately healthy, so this probe never gates a non-admin node.
|
||||
- Admin role + node is the role leader → `Healthy($"Admin leader ({...})")` (`:36`)
|
||||
- Admin role + not the leader → `Degraded($"Admin member but not leader (leader=...)")` (`:37`)
|
||||
|
||||
Tags on registration: `["active"]` only — does not participate in `/health/ready`. The intent is
|
||||
Traefik routing: the active node (admin-role leader) gets sticky admin-UI traffic; standby nodes
|
||||
are reachable for data-plane OPC UA but report `Degraded` on `/health/active` so the load balancer
|
||||
does not route control-plane traffic to them.
|
||||
|
||||
Note: no `Unhealthy` path for the role-filter case. If the ActorSystem is not running, `IClusterRoleInfo`
|
||||
presumably returns safe defaults (no role); this is not separately health-checked.
|
||||
|
||||
## 3. Tag / tier summary
|
||||
|
||||
| Probe | `/health/ready` | `/health/active` | `/healthz` |
|
||||
|---|---|---|---|
|
||||
| `DatabaseHealthCheck` | ✅ | ✅ | — |
|
||||
| `AkkaClusterHealthCheck` | ✅ | ✅ | — |
|
||||
| `AdminRoleLeaderHealthCheck` | — | ✅ | — |
|
||||
| (no probes) | — | — | ✅ (bare liveness) |
|
||||
|
||||
`/healthz` runs zero probes — it is a pure process liveness sentinel (process reachable = healthy;
|
||||
a crashed process = no response). Kubernetes liveness probes, Traefik TCP checks, and uptime
|
||||
monitors use this tier.
|
||||
|
||||
## 4. Downstream dependency coverage
|
||||
|
||||
There is no probe for the downstream MxAccessGateway gRPC channel. If the gateway is unreachable, OtOpcUa
|
||||
reports healthy here (the GalaxyDriver will surface errors in OPC UA diagnostics, but `/health/ready`
|
||||
and `/health/active` will not reflect it). This is a gap that the shared `GrpcDependencyHealthCheck`
|
||||
probe in `ZB.MOM.WW.Health` would close.
|
||||
|
||||
## 5. Notable design choices
|
||||
|
||||
- **AllowAnonymous on all tiers** — see `HealthEndpoints.cs:30–32` comment: "Without it the
|
||||
`AddOtOpcUaAuth` fallback policy 401s every probe and Traefik marks every backend unhealthy."
|
||||
- **Query probe, not `CanConnectAsync`** — the `Deployments` query validates that the schema has
|
||||
been applied. ScadaBridge uses `CanConnectAsync`; neither is wrong but they diverge.
|
||||
- **`Degraded` semantics** — the Akka check uses `Degraded` (not `Unhealthy`) for a joining/pre-Up
|
||||
node. ASP.NET Core maps `Degraded` to HTTP 200 by default; Traefik sees 200 and considers the
|
||||
node ready. If `Unhealthy` (HTTP 503) is required to gate traffic, the `Degraded` path is
|
||||
insufficient.
|
||||
- **`IClusterRoleInfo` abstraction** — the admin-leader check depends on `IClusterRoleInfo`, an OtOpcUa
|
||||
interface, not the raw `Akka.Cluster.Cluster` API. This is a testability-friendly layer absent in
|
||||
ScadaBridge's direct Akka usage.
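
If a tier ever needs `Degraded` to gate traffic, ASP.NET Core lets the status-to-HTTP mapping be overridden per endpoint through `HealthCheckOptions.ResultStatusCodes`. The fragment below is a sketch of that option only; whether any of the three projects should adopt it is not decided in this document, and applying it to `/health/active` here is purely illustrative.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();
var app = builder.Build();

app.MapHealthChecks("/health/active", new HealthCheckOptions
{
    Predicate = c => c.Tags.Contains("active"),
    // Framework default: Healthy and Degraded return 200, Unhealthy returns 503.
    // Illustrative override: make Degraded return 503 as well.
    ResultStatusCodes =
    {
        [HealthStatus.Healthy] = StatusCodes.Status200OK,
        [HealthStatus.Degraded] = StatusCodes.Status503ServiceUnavailable,
        [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable,
    },
}).AllowAnonymous();

app.Run();
```

The alternative is to leave the mapping alone and have the probe itself return `Unhealthy` for the states that must gate traffic, which is the route ScadaBridge's three-way policy takes.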
|
||||
|
||||
---
|
||||
|
||||
## Adoption plan → `ZB.MOM.WW.Health`
|
||||
|
||||
**Replace with shared probes:**
|
||||
|
||||
- `AkkaClusterHealthCheck` → `ZB.MOM.WW.Health.Akka.AkkaClusterHealthCheck` using the
|
||||
**`OtOpcUaCompat` preset** (self-Up-among-members scan → Healthy/Degraded). The preset keeps
|
||||
OtOpcUa's existing two-way policy without forcing ScadaBridge's three-way policy onto it.
|
||||
- `AdminRoleLeaderHealthCheck` → `ZB.MOM.WW.Health.Akka.ActiveNodeHealthCheck` with
|
||||
`RoleFilter = "admin"`. The role-filter parameter produces identical behavior: non-admin nodes
|
||||
immediately healthy, admin leader healthy, admin non-leader degraded.
|
||||
- `DatabaseHealthCheck` → `ZB.MOM.WW.Health.EntityFrameworkCore.DatabaseHealthCheck<OtOpcUaConfigDbContext>`
|
||||
with a `ProbeQuery` delegate of `db => db.Deployments.AsNoTracking().Take(1).ToListAsync()`.
|
||||
The delegate preserves the stricter query probe rather than falling back to `CanConnectAsync`.
|
||||
- Add `GrpcDependencyHealthCheck` targeting the MxAccessGateway channel (closes the downstream
|
||||
dependency gap noted in §4). Tag `["ready","active"]`.
|
||||
- Replace `AddOtOpcUaHealth` / `MapOtOpcUaHealth` with
|
||||
`services.AddHealthChecks().AddCheck<...>()` (one call per probe, per spec §5) +
|
||||
`app.MapZbHealth()`. The `/healthz` bare-liveness tier is part of `MapZbHealth` by default —
|
||||
no separate wiring needed.
|
||||
|
||||
**Keep bespoke:**
|
||||
|
||||
- `IClusterRoleInfo` and its Akka implementation — on adoption this testability seam is given up
|
||||
for the health-check path. The shared `ActiveNodeHealthCheck` reads cluster role state from the
|
||||
ActorSystem directly (resolving it lazily via `IServiceProvider`); it does not accept
|
||||
`IClusterRoleInfo` as an injection point. This is an accepted trade-off: the shared implementation
|
||||
is simpler and consistent across projects, while `IClusterRoleInfo` remains available elsewhere
|
||||
in the OtOpcUa codebase where it is used outside health checks.
|
||||
- The `AllowAnonymous` policy: the need for it is an OtOpcUa auth concern (the fallback policy
  would otherwise 401 every probe), but the spec has `MapZbHealth` apply `AllowAnonymous` by
  default, so OtOpcUa does not have to add it per endpoint.
|
||||
- Which probes are registered and their tag assignments — the shared library supplies the check
|
||||
implementations; the wiring (which names, which tags, which options) remains per-project.
|
||||
|
||||
**Adoption is a follow-on task** (tracked in `GAPS.md`), not part of the `ZB.MOM.WW.Health`
|
||||
library build. The library build delivers the shared implementations; adoption lands in the
|
||||
OtOpcUa repo as a separate commit once the nupkg is available.
|
||||
Some files were not shown because too many files have changed in this diff.