Compare commits

...

6 Commits

Author SHA1 Message Date
Joseph Doherty 7ae25f8510 Re-stamp Telemetry-002/003 resolutions: nested redaction implemented in 05cc62a
Telemetry-002 was first resolved by documenting the scalar-only limitation; it is now
implemented (recursive nested redaction). Updated the two resolution notes to record
05cc62a and the replaced limitation test, preserving the audit trail. README unchanged
(still 0 pending / 35 total).
2026-06-01 12:13:05 -04:00
Joseph Doherty 05cc62aab3 Implement nested log redaction (Telemetry-002)
RedactionEnricher now projects each property into a mutable view the ILogRedactor
can edit: scalars stay as their CLR value, while StructureValue/SequenceValue/
DictionaryValue become nested IDictionary<string,object?>/IList<object?> the
redactor descends into recursively. A field nested inside a destructured {@Object}
can now be masked or removed — closing the gap documented as a limitation.

- Project/Rebuild round-trip preserves StructureValue.TypeTag and original
  dictionary keys; redactor-synthesised plain dicts/lists are rebuilt too.
- Untouched properties are not reallocated: structural ValueEquals skips write-back
  unless a property actually changed. Scalar fast path and no-redactor/no-property
  short-circuits retained.
- +5 nested-reach tests (mask/remove a field, sequence element, dictionary value,
  two-levels-deep); the old 'cannot reach' limitation test replaced. Serilog 34, 0 warnings.
- ILogRedactor XML doc + library README updated to document the recursive reach.
2026-06-01 12:12:26 -04:00
Joseph Doherty ae0ccc9a3a Mark all baseline code-review findings resolved
All 35 findings fixed in 544a6dd and marked Status: Resolved with resolution
notes. README regenerated: 0 pending / 35 total across 6 libraries.
2026-06-01 11:22:37 -04:00
Joseph Doherty 544a6ddb77 Fix all baseline code-review findings across the six shared libraries
Resolves the 35 findings from the 2026-06-01 baseline (commit 26ba1c7),
test-first for every behavioral change. +51 tests (331 -> 382 passing, 0 failed).

- Telemetry-001 (HIGH): RedactionEnricher now honours property removal, so a
  redactor that drops a key actually scrubs the secret from the event.
- Auth: LDAP validator ValidateOnStart; API-key verify no longer fails on a
  best-effort MarkUsed write or a corrupt scopes column (fail-closed); LDAP cert
  validation hook; KeyPrefix persistence aligned; README algorithm corrected.
- Health: Akka checks return Degraded (not throw) when the cluster isn't up yet;
  GrpcDependencyHealthCheck catch-all; null 'description' rendered; composite
  endpoint builder; XML docs shipped.
- Audit: CompositeAuditWriter no longer re-throws OperationCanceledException;
  TruncatingAuditRedactor over-redact scrubs Target + safe negative max; options
  record; XML docs shipped.
- Configuration: TryAddEnumerable idempotent registration; consistent port
  quoting; strict invariant port parsing; XML docs + README packaged.
- Theme: mobile toggle is now CSS-only (no Bootstrap JS); token/CSS hygiene;
  XML docs on the public parameter surface.

Shared-contract/spec docs updated where the code was the source of truth
(observability service.instance.id, MapZbMetrics, redactor reach). All changes
additive/back-compatible at v0.1.0. code-reviews bookkeeping follows separately.
2026-06-01 11:22:14 -04:00
Joseph Doherty 26ba1c7215 Baseline code review of the six ZB.MOM.WW.* shared libraries
All six libraries reviewed at commit 5f75cd4 against their components/ specs,
following code-reviews/REVIEW-PROCESS.md. 35 findings (0 Critical, 1 High,
9 Medium, 25 Low); none block adoption.

- Auth      0/0/3/3  (security core sound; startup-validation + key-verify contract gaps)
- Telemetry 0/1/2/5  (HIGH Telemetry-001: redactor 'remove' is a no-op -> secrets reach sinks)
- Health    0/0/2/4  (Akka checks throw instead of Degraded when cluster not yet up)
- Theme     0/0/1/5  (undocumented Bootstrap-collapse JS dep; token/CSS hygiene)
- Audit     0/0/1/4  (composite re-throws OCE vs never-throw writer contract)
- Configuration 0/0/0/4 (DI idempotency, port-parse strictness, packaging)

Cross-cutting: XML docs authored but GenerateDocumentationFile unset -> docs
not shipped in any nupkg (Auth/Health/Telemetry/Configuration/Audit).

README.md regenerated from the per-library findings; regen-readme.py --check passes.
2026-06-01 11:08:12 -04:00
Joseph Doherty 5f75cd4dab Add per-library code-review scaffolding for the ZB.MOM.WW.* shared libs
Adapts the code-reviews convention (process, README generator, template) from
the ScadaBridge app model (per-src/-module, Akka conventions) to scadaproj's
reality: six shared libraries reviewed against their components/ specs.

- REVIEW-PROCESS.md: review unit is a library; library->component-spec mapping;
  checklist re-targeted for reusable .NET libs (public API/semver, packaging &
  dependency hygiene, spec/shared-contract adherence) instead of actor/supervision.
- _template/findings.md: library/packages/component-spec/shared-contract header.
- regen-readme.py: per-library prose, data-driven Summary, '-' for unreviewed.
- Seed Auth/Theme/Health/Telemetry/Configuration/Audit findings stubs (0 findings).
- README.md generated; --check passes.
2026-06-01 10:46:16 -04:00
82 changed files with 3958 additions and 208 deletions
@@ -1,8 +1,9 @@
namespace ZB.MOM.WW.Audit;
/// <summary>Fans an event out to several writers. Best-effort: a failing writer does not stop the others.</summary>
/// <remarks>A failing writer's exception is swallowed so the fan-out drains and the caller is never
/// aborted — but <see cref="OperationCanceledException"/> is re-thrown so cancellation is honored.</remarks>
/// <remarks>Every inner-writer failure is swallowed — including <see cref="OperationCanceledException"/>
/// — so the fan-out drains and the caller is never aborted, honoring the <see cref="IAuditWriter"/>
/// "must not throw to the caller" contract even when a request-scoped cancellation token is passed.</remarks>
public sealed class CompositeAuditWriter : IAuditWriter
{
private readonly IReadOnlyList<IAuditWriter> _inner;
@@ -21,8 +22,7 @@ public sealed class CompositeAuditWriter : IAuditWriter
foreach (var writer in _inner)
{
try { await writer.WriteAsync(evt, ct).ConfigureAwait(false); }
catch (OperationCanceledException) { throw; } // honor cancellation; do not swallow
catch { /* best-effort seam: a failing writer must not stop the others or the caller */ }
catch { /* best-effort seam: a failing writer (incl. cancellation) must not stop the others or the caller */ }
}
}
}
@@ -2,8 +2,9 @@ namespace ZB.MOM.WW.Audit;
/// <summary>
/// Redactor that caps oversized <see cref="AuditEvent.DetailsJson"/> and <see cref="AuditEvent.Target"/>.
/// Never throws — over-redacts (drops DetailsJson) on internal failure. The secret-field policy
/// (which fields are sensitive) stays per-project; compose this with a project redactor as needed.
/// Never throws — over-redacts (drops both DetailsJson and Target) on internal failure. The
/// secret-field policy (which fields are sensitive) stays per-project; compose this with a project
/// redactor as needed.
/// </summary>
public sealed class TruncatingAuditRedactor : IAuditRedactor
{
@@ -26,13 +27,15 @@ public sealed class TruncatingAuditRedactor : IAuditRedactor
}
catch
{
// Hard contract: never throw. Over-redact on internal failure.
return rawEvent with { DetailsJson = null };
// Hard contract: never throw, and over-redact to a STRICTLY safer event on internal
// failure — scrub every field this redactor owns (both DetailsJson and Target).
return rawEvent with { DetailsJson = null, Target = null };
}
}
private string? Truncate(string? value, int max)
{
if (max < 0) max = 0; // clamp nonsensical negative caps so a config bug fails safe, not throws
if (value is null || value.Length <= max) return value;
var marker = _options.TruncationMarker;
if (marker.Length >= max) return marker[..max];
@@ -1,12 +1,12 @@
namespace ZB.MOM.WW.Audit;
/// <summary>Caps for <see cref="TruncatingAuditRedactor"/>.</summary>
public sealed class TruncatingAuditRedactorOptions
/// <summary>Immutable caps for <see cref="TruncatingAuditRedactor"/>.</summary>
public sealed record TruncatingAuditRedactorOptions
{
/// <summary>Max length of <see cref="AuditEvent.DetailsJson"/> before truncation. Default 4096.</summary>
public int MaxDetailsJsonLength { get; set; } = 4096;
public int MaxDetailsJsonLength { get; init; } = 4096;
/// <summary>Max length of <see cref="AuditEvent.Target"/> before truncation. Default 512.</summary>
public int MaxTargetLength { get; set; } = 512;
public int MaxTargetLength { get; init; } = 512;
/// <summary>Marker appended to a truncated value. Default "…[truncated]".</summary>
public string TruncationMarker { get; set; } = "…[truncated]";
public string TruncationMarker { get; init; } = "…[truncated]";
}
@@ -3,6 +3,8 @@
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<!-- Emit and pack XML docs so consumers get IntelliSense/tooltip documentation. -->
<GenerateDocumentationFile>true</GenerateDocumentationFile>
</PropertyGroup>
<PropertyGroup>
<IsPackable>true</IsPackable>
@@ -38,11 +38,32 @@ public class CompositeAuditWriterTests
}
[Fact]
public async Task Cancellation_is_propagated_not_swallowed()
public async Task Cancellation_does_not_surface_to_the_caller()
{
// OperationCanceledException is re-thrown (unlike ordinary writer failures, which are swallowed).
// Per the IAuditWriter hard contract ("must not throw to the caller"), an
// OperationCanceledException from an inner writer is swallowed like any other failure —
// it must NOT abort the user-facing action that produced the event.
var after = new RecordingWriter();
var sut = new CompositeAuditWriter(new IAuditWriter[] { new CancellingWriter(), after });
await Assert.ThrowsAsync<OperationCanceledException>(() => sut.WriteAsync(Evt()));
await sut.WriteAsync(Evt()); // must not throw
Assert.Equal(1, after.Count); // drain continues past the cancelled writer
}
[Fact]
public async Task Empty_writer_list_is_a_no_op()
{
var sut = new CompositeAuditWriter(Array.Empty<IAuditWriter>());
await sut.WriteAsync(Evt()); // must not throw
}
[Fact]
public async Task Null_writer_entry_is_swallowed_and_does_not_stop_the_others()
{
// A null inner writer faults the await; the best-effort seam swallows it (like any
// other writer failure) and continues draining the remaining writers.
var after = new RecordingWriter();
var sut = new CompositeAuditWriter(new IAuditWriter?[] { null, after }!);
await sut.WriteAsync(Evt()); // must not throw
Assert.Equal(1, after.Count);
}
}
@@ -53,4 +53,30 @@ public class TruncatingAuditRedactorTests
var result = r.Apply(Evt(new string('x', 20)));
Assert.Equal(3, result.DetailsJson!.Length);
}
[Fact]
public void Negative_max_is_treated_as_zero_and_does_not_throw()
{
// A negative cap is nonsensical misconfiguration. Truncate must clamp to 0 rather than
// throw, capping the value to the empty string (plus marker handling).
var opts = new TruncatingAuditRedactorOptions { MaxDetailsJsonLength = -5, MaxTargetLength = -1, TruncationMarker = "" };
var r = new TruncatingAuditRedactor(opts);
var result = r.Apply(Evt(new string('x', 20), target: new string('y', 20)));
Assert.Equal(string.Empty, result.DetailsJson);
Assert.Equal(string.Empty, result.Target);
}
[Fact]
public void Over_redact_fallback_scrubs_both_details_and_target_without_throwing()
{
// Drive the REAL TruncatingAuditRedactor.Apply into its catch branch via a reachable
// misconfiguration (a null TruncationMarker faults inside Truncate). The over-redact
// fallback must be strictly safer: BOTH DetailsJson AND Target scrubbed to null, no throw.
var opts = new TruncatingAuditRedactorOptions { MaxDetailsJsonLength = 5, TruncationMarker = null! };
var r = new TruncatingAuditRedactor(opts);
var raw = Evt(new string('x', 50), target: "sensitive target");
var result = r.Apply(raw);
Assert.Null(result.DetailsJson);
Assert.Null(result.Target);
}
}
+2 -2
View File
@@ -10,8 +10,8 @@ Authentication and authorisation libraries for the **ZB.MOM.WW SCADA family** (O
|---|---|---|
| `ZB.MOM.WW.Auth.Abstractions` | Auth contracts, canonical role constants, and shared types (`LdapOptions`, `LdapAuthResult`, `ILdapAuthService`, `IApiKeyStore`). No runtime dependencies beyond the BCL. | — |
| `ZB.MOM.WW.Auth.Ldap` | LDAP authentication service: bind-then-search-then-bind against GLAuth or Active Directory; RFC 4514-aware group extraction; fail-closed. | `Abstractions`, `Novell.Directory.Ldap.NETStandard` |
| `ZB.MOM.WW.Auth.ApiKeys` | SQLite-backed API-key store with pepper-based PBKDF2 hashing, rotation, and audit log. Includes a `MigrationHostedService` that runs schema migrations on startup. | `Abstractions`, `Microsoft.Data.Sqlite` |
| `ZB.MOM.WW.Auth.AspNetCore` | ASP.NET Core DI helpers (`AddZbAuth`), cookie defaults, claim-type constants, and `LdapOptionsValidator` registration. Wires together Ldap + ApiKeys + cookie middleware. | `Abstractions`, `Ldap`, `ApiKeys`, `Microsoft.AspNetCore.App` |
| `ZB.MOM.WW.Auth.ApiKeys` | SQLite-backed API-key store with **pepper-keyed HMAC-SHA256** secret hashing, rotation, and audit log. DI wiring is `AddZbApiKeyAuth`; an opt-in `MigrationHostedService` runs schema migrations on startup. | `Abstractions`, `Microsoft.Data.Sqlite` |
| `ZB.MOM.WW.Auth.AspNetCore` | ASP.NET Core wiring for the **LDAP** provider only: `AddZbLdapAuth` (binds + start-time-validates `LdapOptions`, registers `ILdapAuthService`), plus `ZbCookieDefaults.Apply` (hardened cookie helper the consumer calls itself) and `ZbClaimTypes` constants. It does **not** wire API keys or cookie middleware — API-key DI is `AddZbApiKeyAuth` in the `ApiKeys` package. | `Abstractions`, `Ldap`, `Microsoft.AspNetCore.App` |
---
@@ -1,3 +1,5 @@
using System.Net.Security;
namespace ZB.MOM.WW.Auth.Abstractions.Ldap;
public enum LdapTransport { Ldaps, StartTls, None }
@@ -16,6 +18,16 @@ public sealed record LdapOptions
public string DisplayNameAttribute { get; init; } = "cn";
public string GroupAttribute { get; init; } = "memberOf";
public int ConnectionTimeoutMs { get; init; } = 10_000;
/// <summary>
/// Optional hook to harden (or, in dev, relax) TLS server-certificate validation for the
/// <see cref="LdapTransport.Ldaps"/> / <see cref="LdapTransport.StartTls"/> transports. When
/// <see langword="null"/> (the default) the LDAP client validates the server certificate against
/// the OS trust store — it does <em>not</em> blind-accept. Supply a callback to pin a CA, validate
/// the SAN against <see cref="Server"/>, or otherwise tighten validation. This is a code-only seam
/// (not bound from configuration) and takes precedence over <see cref="AllowInsecure"/>.
/// </summary>
public RemoteCertificateValidationCallback? ServerCertificateValidationCallback { get; init; }
}
public enum LdapAuthFailure { BadCredentials, UserNotFound, AmbiguousUser, GroupLookupFailed, ServiceAccountBindFailed, Disabled }
@@ -101,7 +101,10 @@ public sealed class ApiKeyAdminCommands
var record = new ApiKeyRecord(
KeyId: keyId,
KeyPrefix: $"{_options.TokenPrefix}_{keyId}",
// KeyPrefix is the bare token prefix (e.g. "mxgw"), NOT prefix_keyId — the key id is
// already its own column. Embedding it here produced a self-referential value that
// confused admin tooling and disagreed with the read/test paths (see Auth-005).
KeyPrefix: _options.TokenPrefix,
SecretHash: secretHash,
DisplayName: displayName,
Scopes: scopes,
@@ -62,8 +62,24 @@ public sealed class ApiKeyVerifier(
return Fail(ApiKeyFailure.SecretMismatch);
}
// 6. Record successful use, then return the identity (no secret/hash/pepper included).
await store.MarkUsedAsync(record.KeyId, _timeProvider.GetUtcNow(), ct).ConfigureAwait(false);
// 6. The authentication decision is already made (line 60). Recording last-used is
// best-effort bookkeeping: a transient storage hiccup (SQLITE_BUSY past the busy-timeout,
// disk full, DB locked by a migration) must NOT turn an otherwise-valid credential into a
// failed auth. Swallow any non-cancellation failure so the only exception path remains
// cancellation, as the class contract promises. Cancellation is honoured (re-thrown).
try
{
await store.MarkUsedAsync(record.KeyId, _timeProvider.GetUtcNow(), ct).ConfigureAwait(false);
}
catch (OperationCanceledException)
{
throw;
}
catch
{
// Best-effort: the last-used write failed, but the credential is valid. Fail open on the
// bookkeeping (not the auth decision) rather than denying a legitimate caller.
}
return new ApiKeyVerification(
Succeeded: true,
@@ -20,7 +20,12 @@ public static class ScopeSerializer
/// <summary>Deserializes scopes from a JSON array string.</summary>
/// <param name="value">The JSON string to deserialize; may be null or empty.</param>
/// <returns>An ordinal-compared set of scopes; empty when the input is null/blank.</returns>
/// <returns>
/// An ordinal-compared set of scopes; empty when the input is null/blank. A malformed or
/// non-array column (operator tampering, a partial write, a format change, or a buggy writer)
/// fails closed to an EMPTY set rather than throwing, so a single poisoned row degrades to a
/// zero-scope identity on the auth path instead of an unhandled <see cref="JsonException"/>.
/// </returns>
public static IReadOnlySet<string> Deserialize(string? value)
{
if (string.IsNullOrWhiteSpace(value))
@@ -28,7 +33,18 @@ public static class ScopeSerializer
return new HashSet<string>(StringComparer.Ordinal);
}
string[]? scopes = JsonSerializer.Deserialize<string[]>(value);
string[]? scopes;
try
{
scopes = JsonSerializer.Deserialize<string[]>(value);
}
catch (JsonException)
{
// Fail closed: a corrupt scopes column yields no scopes rather than an exception on the
// verification hot path. The verifier's "only exception path is cancellation" contract
// is preserved, and a key with an unreadable scope set is left with zero authority.
return new HashSet<string>(StringComparer.Ordinal);
}
return new HashSet<string>(scopes ?? [], StringComparer.Ordinal);
}
@@ -37,7 +37,14 @@ public static class ServiceCollectionExtensions
ArgumentNullException.ThrowIfNull(config);
ArgumentException.ThrowIfNullOrWhiteSpace(sectionPath);
services.Configure<LdapOptions>(config.GetSection(sectionPath));
// Bind via the options builder and opt into start-time validation. An IValidateOptions<T>
// otherwise only runs when the options are first materialized (IOptions<T>.Value) — which
// here is the first login (ILdapAuthService factory below), not boot. ValidateOnStart hooks
// the host's start-time options validation so a misconfigured directory (e.g. insecure
// transport without AllowInsecure) fails fast at startup rather than on first login.
services.AddOptions<LdapOptions>()
.Bind(config.GetSection(sectionPath))
.ValidateOnStart();
// Fail fast at startup on a misconfigured directory rather than on first login.
services.AddSingleton<IValidateOptions<LdapOptions>, LdapOptionsValidator>();
@@ -1,5 +1,6 @@
namespace ZB.MOM.WW.Auth.Ldap.Internal;
using System.Net.Security;
using ZB.MOM.WW.Auth.Abstractions.Ldap;
/// <summary>
@@ -15,8 +16,29 @@ internal sealed record LdapSearchEntry(
/// </summary>
internal interface ILdapConnection : IDisposable
{
/// <summary>Opens (and optionally upgrades to TLS) a connection to the given host.</summary>
void Connect(string host, int port, LdapTransport transport, bool allowInsecure, int timeoutMs);
/// <summary>
/// Opens (and optionally upgrades to TLS) a connection to the given host.
/// </summary>
/// <param name="host">The LDAP server hostname or IP.</param>
/// <param name="port">The LDAP server port.</param>
/// <param name="transport">The transport security mode.</param>
/// <param name="allowInsecure">
/// When <see langword="true"/> AND no <paramref name="serverCertificateValidationCallback"/> is
/// supplied, TLS server-certificate validation is bypassed (dev/test only). Ignored when a
/// validation callback is supplied (the callback wins) or for plaintext transport.
/// </param>
/// <param name="timeoutMs">The connection/operation timeout in milliseconds.</param>
/// <param name="serverCertificateValidationCallback">
/// Optional TLS server-certificate validation callback. When <see langword="null"/>, the OS trust
/// store is used (the client does not blind-accept).
/// </param>
void Connect(
string host,
int port,
LdapTransport transport,
bool allowInsecure,
int timeoutMs,
RemoteCertificateValidationCallback? serverCertificateValidationCallback);
/// <summary>Binds with the supplied DN and password. Throws <c>LdapException</c> on bad credentials.</summary>
void Bind(string dn, string password);
@@ -2,19 +2,67 @@ namespace ZB.MOM.WW.Auth.Ldap.Internal;
using Novell.Directory.Ldap;
using ZB.MOM.WW.Auth.Abstractions.Ldap;
// Disambiguate: Novell also declares a RemoteCertificateValidationCallback delegate; the seam and
// LdapConnectionOptions.ConfigureRemoteCertificateValidationCallback both use the BCL one.
using RemoteCertificateValidationCallback = System.Net.Security.RemoteCertificateValidationCallback;
/// <summary>
/// Production <see cref="ILdapConnection"/> backed by <c>Novell.Directory.Ldap.LdapConnection</c>.
/// Mirrors the connection/search idioms from ZB.MOM.WW.ScadaBridge.Security.LdapAuthService.
/// </summary>
/// <remarks>
/// TLS server-certificate validation: by default the underlying
/// <c>Novell.Directory.Ldap.NETStandard</c> client validates the server certificate against the OS
/// trust store (it does NOT blind-accept). A caller-supplied
/// <c>RemoteCertificateValidationCallback</c> overrides that default (CA pinning / SAN checks); when
/// none is supplied and <c>allowInsecure</c> is set, validation is bypassed for dev/test only.
/// </remarks>
internal sealed class NovellLdapConnection : ILdapConnection
{
private readonly LdapConnection _conn = new();
private readonly LdapConnection _conn;
private bool _disposed;
/// <inheritdoc/>
public void Connect(string host, int port, LdapTransport transport, bool allowInsecure, int timeoutMs)
/// <summary>
/// Builds the connection, wiring a TLS server-certificate validation policy: a supplied
/// <paramref name="serverCertificateValidationCallback"/> wins; otherwise <paramref name="allowInsecure"/>
/// bypasses validation (dev/test only); otherwise the OS-trust-store default applies.
/// </summary>
public NovellLdapConnection(
bool allowInsecure = false,
RemoteCertificateValidationCallback? serverCertificateValidationCallback = null)
{
if (serverCertificateValidationCallback is not null)
{
var options = new LdapConnectionOptions()
.ConfigureRemoteCertificateValidationCallback(serverCertificateValidationCallback);
_conn = new LdapConnection(options);
}
else if (allowInsecure)
{
// Dev/test only: accept any server certificate. Reachable solely when an operator has set
// AllowInsecure (rejected for plaintext-without-AllowInsecure by LdapOptionsValidator).
var options = new LdapConnectionOptions()
.ConfigureRemoteCertificateValidationCallback((_, _, _, _) => true);
_conn = new LdapConnection(options);
}
else
{
// Default: validate against the OS trust store (no blind-accept).
_conn = new LdapConnection();
}
}
/// <inheritdoc/>
public void Connect(
string host,
int port,
LdapTransport transport,
bool allowInsecure,
int timeoutMs,
RemoteCertificateValidationCallback? serverCertificateValidationCallback)
{
// The TLS-validation policy (allowInsecure / callback) is wired at construction time on the
// LdapConnectionOptions; the per-call arguments here are accepted for seam symmetry.
ApplyTimeout(timeoutMs);
// LDAPS: TLS is negotiated at the TCP-connection level.
@@ -98,8 +146,16 @@ internal sealed class NovellLdapConnection : ILdapConnection
}
}
/// <summary>Factory that produces fresh <see cref="NovellLdapConnection"/> instances.</summary>
internal sealed class NovellLdapConnectionFactory : ILdapConnectionFactory
/// <summary>
/// Factory that produces fresh <see cref="NovellLdapConnection"/> instances, carrying the TLS
/// server-certificate validation policy (a supplied callback, or an <c>allowInsecure</c> bypass) so
/// it is wired onto each connection at construction time.
/// </summary>
internal sealed class NovellLdapConnectionFactory(
bool allowInsecure = false,
RemoteCertificateValidationCallback? serverCertificateValidationCallback = null)
: ILdapConnectionFactory
{
public ILdapConnection Create() => new NovellLdapConnection();
public ILdapConnection Create() =>
new NovellLdapConnection(allowInsecure, serverCertificateValidationCallback);
}
@@ -26,10 +26,14 @@ public sealed class LdapAuthService : ILdapAuthService
/// <summary>
/// Production constructor: binds against a live directory via the real
/// Novell-backed connection factory.
/// Novell-backed connection factory. The TLS server-certificate validation policy
/// (<see cref="LdapOptions.ServerCertificateValidationCallback"/> or the
/// <see cref="LdapOptions.AllowInsecure"/> bypass) is carried into the factory so each
/// connection is built with it.
/// </summary>
public LdapAuthService(LdapOptions options)
: this(options, new NovellLdapConnectionFactory())
: this(options, new NovellLdapConnectionFactory(
options.AllowInsecure, options.ServerCertificateValidationCallback))
{
}
@@ -92,7 +96,13 @@ public sealed class LdapAuthService : ILdapAuthService
// Abstractions change could add DirectoryUnavailable to disambiguate.
try
{
conn.Connect(_options.Server, _options.Port, _options.Transport, _options.AllowInsecure, _options.ConnectionTimeoutMs);
conn.Connect(
_options.Server,
_options.Port,
_options.Transport,
_options.AllowInsecure,
_options.ConnectionTimeoutMs,
_options.ServerCertificateValidationCallback);
}
catch (LdapException)
{
@@ -87,6 +87,33 @@ public sealed class ApiKeyAdminCommandsTests : IAsyncLifetime
Assert.Single(recent, e => e.EventType == "create-key");
}
[Fact]
public async Task CreateKey_PersistsBareTokenPrefix_NotPrefixUnderscoreKeyId()
{
// Auth-005: KeyPrefix is the bare token prefix ("mxgw"), NOT "mxgw_key-1". The key id is
// already its own column; embedding it produced a self-referential value that disagreed with
// the read/test paths and confused admin tooling.
ApiKeyAdminCommands commands = BuildCommands();
await commands.InitDbAsync(null, CancellationToken.None);
await commands.CreateKeyAsync(
"key-1",
"Service A",
new HashSet<string>(["read"], StringComparer.Ordinal),
constraintsJson: null,
remoteAddress: null,
CancellationToken.None);
ApiKeyRecord? found = await _read.FindByKeyIdAsync("key-1", CancellationToken.None);
Assert.NotNull(found);
Assert.Equal("mxgw", found!.KeyPrefix);
// The same bare prefix is surfaced by the admin list projection.
IReadOnlyList<ApiKeyListItem> listed = await commands.ListKeysAsync(CancellationToken.None);
ApiKeyListItem item = Assert.Single(listed, k => k.KeyId == "key-1");
Assert.Equal("mxgw", item.KeyPrefix);
}
[Fact]
public async Task CreateKey_PepperUnavailable_ReturnsNoTokenAndAppendsNoAudit()
{
@@ -212,6 +212,51 @@ public class ApiKeyVerifierTests
Assert.DoesNotContain(Convert.ToBase64String(hash), identityText, StringComparison.Ordinal);
}
// --- Auth-002: a failed best-effort MarkUsedAsync must NOT fail a valid key ---
[Fact]
public async Task VerifyAsync_ValidKey_MarkUsedThrows_StillSucceeds()
{
// MarkUsedAsync is best-effort "last used" bookkeeping. A transient storage failure
// (SQLITE_BUSY, disk full, locked DB) must not turn an otherwise-valid credential into a
// failed auth: the decision is already made before the usage write. The verifier's contract
// is "the only exception path is cancellation", so a non-cancellation MarkUsedAsync failure
// is swallowed and the result is still Succeeded == true.
byte[] hash = ApiKeySecretHasher.Hash(Secret, Pepper);
var store = new FakeApiKeyStore
{
Record = BuildRecord(hash),
MarkUsedException = new InvalidOperationException("SQLITE_BUSY"),
};
var verifier = BuildVerifier(store, new FakePepperProvider(Pepper));
ApiKeyVerification result =
await verifier.VerifyAsync(Header(KeyId, Secret), CancellationToken.None);
Assert.True(result.Succeeded);
Assert.Null(result.Failure);
Assert.NotNull(result.Identity);
Assert.Equal(KeyId, result.Identity!.KeyId);
Assert.True(store.MarkUsedCalled);
}
[Fact]
public async Task VerifyAsync_MarkUsedThrowsOperationCanceled_Propagates()
{
// The ONLY exception path is cancellation: an OperationCanceledException from the usage
// write (e.g. the request was cancelled mid-write) is honoured and re-thrown, not swallowed.
byte[] hash = ApiKeySecretHasher.Hash(Secret, Pepper);
var store = new FakeApiKeyStore
{
Record = BuildRecord(hash),
MarkUsedException = new OperationCanceledException(),
};
var verifier = BuildVerifier(store, new FakePepperProvider(Pepper));
await Assert.ThrowsAnyAsync<OperationCanceledException>(
() => verifier.VerifyAsync(Header(KeyId, Secret), CancellationToken.None));
}
// --- Cancellation ---
[Fact]
@@ -253,6 +298,9 @@ public class ApiKeyVerifierTests
public string? MarkUsedKeyId { get; private set; }
public DateTimeOffset? MarkUsedWhenUtc { get; private set; }
/// <summary>When set, <see cref="MarkUsedAsync"/> throws this exception (after recording the call).</summary>
public Exception? MarkUsedException { get; set; }
public Task<ApiKeyRecord?> FindByKeyIdAsync(string keyId, CancellationToken ct)
{
FindByKeyIdCalled = true;
@@ -267,6 +315,11 @@ public class ApiKeyVerifierTests
MarkUsedCalled = true;
MarkUsedKeyId = keyId;
MarkUsedWhenUtc = whenUtc;
if (MarkUsedException is not null)
{
return Task.FromException(MarkUsedException);
}
return Task.CompletedTask;
}
}
@@ -164,6 +164,36 @@ public sealed class SqliteApiKeyStoreTests : IAsyncLifetime
Assert.Empty(ScopeSerializer.Deserialize(""));
}
// --- Auth-003: corrupt scopes JSON must fail closed (empty set), never throw JsonException ---
[Theory]
[InlineData("not json at all")]
[InlineData("{")]
[InlineData("{\"a\":1}")] // valid JSON, but an object, not a string[]
[InlineData("42")] // valid JSON, but a number
[InlineData("[\"read\",")] // truncated/partial write
public void ScopeSerializer_DeserializeMalformed_ReturnsEmptySet_DoesNotThrow(string value)
{
// A poisoned scopes column (tampering, partial write, format change, buggy writer) must
// degrade to a zero-scope set rather than throwing on the verification hot path.
IReadOnlySet<string> scopes = ScopeSerializer.Deserialize(value);
Assert.Empty(scopes);
}
[Fact]
public async Task FindByKeyId_CorruptScopesColumn_ReturnsRecordWithEmptyScopes_DoesNotThrow()
{
// Insert a row whose scopes column holds malformed (non-array) JSON, then read it through
// the store. The store must NOT propagate a JsonException out of FindByKeyIdAsync (which the
// verifier relies on for its "only exception path is cancellation" contract).
await InsertWithRawScopesAsync("key-corrupt", scopesJson: "{ this is not valid json");
ApiKeyRecord? found = await _store.FindByKeyIdAsync("key-corrupt", CancellationToken.None);
Assert.NotNull(found);
Assert.Empty(found!.Scopes);
}
private static ApiKeyRecord SampleRecord(string keyId) => new(
KeyId: keyId,
KeyPrefix: "mxgw_ab12",
@@ -213,6 +243,33 @@ public sealed class SqliteApiKeyStoreTests : IAsyncLifetime
await command.ExecuteNonQueryAsync(CancellationToken.None);
}
private async Task InsertWithRawScopesAsync(string keyId, string scopesJson)
{
// Writes the scopes column verbatim (NOT via ScopeSerializer.Serialize) so a malformed
// value can be persisted to simulate tampering / a partial or buggy write.
await using SqliteConnection connection =
await _factory.OpenConnectionAsync(CancellationToken.None);
await using SqliteCommand command = connection.CreateCommand();
command.CommandText = """
INSERT INTO api_keys (
key_id, key_prefix, secret_hash, display_name, scopes,
constraints, created_utc, last_used_utc, revoked_utc)
VALUES (
$key_id, $key_prefix, $secret_hash, $display_name, $scopes,
$constraints, $created_utc, $last_used_utc, $revoked_utc);
""";
command.Parameters.AddWithValue("$key_id", keyId);
command.Parameters.AddWithValue("$key_prefix", "mxgw");
command.Parameters.Add("$secret_hash", SqliteType.Blob).Value = new byte[] { 1, 2, 3 };
command.Parameters.AddWithValue("$display_name", "Corrupt Key");
command.Parameters.AddWithValue("$scopes", scopesJson);
command.Parameters.AddWithValue("$constraints", DBNull.Value);
command.Parameters.AddWithValue("$created_utc", DateTimeOffset.UnixEpoch.ToString("O"));
command.Parameters.AddWithValue("$last_used_utc", DBNull.Value);
command.Parameters.AddWithValue("$revoked_utc", DBNull.Value);
await command.ExecuteNonQueryAsync(CancellationToken.None);
}
public Task DisposeAsync()
{
SqliteConnection.ClearAllPools();
@@ -1,5 +1,6 @@
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.Auth.Abstractions.Ldap;
using ZB.MOM.WW.Auth.AspNetCore;
@@ -85,4 +86,52 @@ public class ServiceCollectionExtensionsTests
Assert.Contains(validators, v => v is LdapOptionsValidator);
}
// --- Auth-001: ValidateOnStart must run options validation at host startup, not first login ---
private static IConfiguration BuildInsecureConfiguration() =>
new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>
{
[$"{LdapSection}:Server"] = LdapServer,
[$"{LdapSection}:SearchBase"] = "dc=example,dc=com",
[$"{LdapSection}:ServiceAccountDn"] = "cn=svc,dc=example,dc=com",
// Plaintext transport without AllowInsecure: the validator must reject this.
[$"{LdapSection}:Transport"] = nameof(LdapTransport.None),
[$"{LdapSection}:AllowInsecure"] = "false",
})
.Build();
[Fact]
public async Task AddZbLdapAuth_StartingHost_FailsForInsecureConfig()
{
// The misconfiguration must surface at host start, not deferred until the first login
// (i.e. the first ILdapAuthService resolution). ValidateOnStart wires the host's
// start-time options validation, so StartAsync must throw OptionsValidationException.
IConfiguration config = BuildInsecureConfiguration();
using IHost host = new HostBuilder()
.ConfigureServices(services => services.AddZbLdapAuth(config, LdapSection))
.Build();
OptionsValidationException ex =
await Assert.ThrowsAsync<OptionsValidationException>(() => host.StartAsync());
Assert.Contains(nameof(LdapOptions.Transport), string.Join(" ", ex.Failures));
}
[Fact]
public async Task AddZbLdapAuth_StartingHost_SucceedsForSecureConfig()
{
// A valid (secure) config must start cleanly — proving ValidateOnStart does not reject
// well-formed options.
IConfiguration config = BuildConfiguration();
using IHost host = new HostBuilder()
.ConfigureServices(services => services.AddZbLdapAuth(config, LdapSection))
.Build();
await host.StartAsync();
await host.StopAsync();
}
}
@@ -1,3 +1,4 @@
using System.Net.Security;
using ZB.MOM.WW.Auth.Abstractions.Ldap;
using ZB.MOM.WW.Auth.Ldap.Internal;
@@ -19,6 +20,10 @@ internal sealed class FakeLdapConnection : ILdapConnection
// ---- observation -----
public (string Host, int Port, LdapTransport Transport, bool AllowInsecure, int TimeoutMs)? ConnectArgs { get; private set; }
/// <summary>The server-certificate validation callback passed to the most recent <see cref="Connect"/> call.</summary>
public RemoteCertificateValidationCallback? ConnectCertCallback { get; private set; }
public List<string> BoundDns { get; } = new();
/// <summary>
@@ -107,9 +112,16 @@ internal sealed class FakeLdapConnection : ILdapConnection
// ---- ILdapConnection -----
public void Connect(string host, int port, LdapTransport transport, bool allowInsecure, int timeoutMs)
public void Connect(
string host,
int port,
LdapTransport transport,
bool allowInsecure,
int timeoutMs,
RemoteCertificateValidationCallback? serverCertificateValidationCallback = null)
{
ConnectArgs = (host, port, transport, allowInsecure, timeoutMs);
ConnectCertCallback = serverCertificateValidationCallback;
if (_throwOnConnect)
throw new Novell.Directory.Ldap.LdapException(
"Directory unreachable", Novell.Directory.Ldap.LdapException.ConnectError, host);
@@ -1,3 +1,4 @@
using System.Net.Security;
using ZB.MOM.WW.Auth.Abstractions.Ldap;
using ZB.MOM.WW.Auth.Ldap;
@@ -80,6 +81,56 @@ public class LdapAuthServiceTests
Assert.Equal(LdapAuthFailure.Disabled, (await svc.AuthenticateAsync("a", "b", default)).Failure);
}
// --- Auth-006: TLS validation seam — allowInsecure is honoured and a cert-validation
// callback is threaded into the connection rather than being silently ignored. ---
[Fact]
public async Task Connect_ReceivesAllowInsecureFlag_FromOptions()
{
// The allowInsecure flag must reach the connection (it used to be an unused parameter).
var fake = new FakeLdapConnection().WithUserEntry(
"cn=alice,dc=x", memberOf: new[] { "cn=Engineers,ou=g,dc=x" });
var svc = new LdapAuthService(
Opts() with { AllowInsecure = true }, new FakeLdapConnectionFactory(fake));
await svc.AuthenticateAsync("alice", "pw", default);
Assert.NotNull(fake.ConnectArgs);
Assert.True(fake.ConnectArgs!.Value.AllowInsecure);
}
[Fact]
public async Task Connect_ReceivesConfiguredCertValidationCallback()
{
// A consumer-supplied RemoteCertificateValidationCallback must be passed through to the
// connection so production callers can pin a CA / validate the SAN — the seam no longer
// discards it.
RemoteCertificateValidationCallback callback = (_, _, _, _) => true;
var fake = new FakeLdapConnection().WithUserEntry(
"cn=alice,dc=x", memberOf: new[] { "cn=Engineers,ou=g,dc=x" });
var svc = new LdapAuthService(
Opts() with { ServerCertificateValidationCallback = callback },
new FakeLdapConnectionFactory(fake));
await svc.AuthenticateAsync("alice", "pw", default);
Assert.Same(callback, fake.ConnectCertCallback);
}
[Fact]
public async Task Connect_NoCertCallbackConfigured_PassesNull()
{
// Default: no callback configured -> null reaches the connection, which means the
// production adapter falls back to OS-trust-store validation (documented behaviour).
var fake = new FakeLdapConnection().WithUserEntry(
"cn=alice,dc=x", memberOf: new[] { "cn=Engineers,ou=g,dc=x" });
var svc = new LdapAuthService(Opts(), new FakeLdapConnectionFactory(fake));
await svc.AuthenticateAsync("alice", "pw", default);
Assert.Null(fake.ConnectCertCallback);
}
[Fact]
public async Task PreservesEscapedCommaInGroupName_OnRfc4514Dn()
{
@@ -6,5 +6,6 @@
<LangVersion>latest</LangVersion>
<Version>0.1.0</Version>
<ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
</PropertyGroup>
</Project>
@@ -1,3 +1,5 @@
using System.Globalization;
namespace ZB.MOM.WW.Configuration;
/// <summary>
@@ -16,11 +18,14 @@ internal static class Checks
/// <summary>
/// Validates a raw string as a TCP port (parse + range), returning <c>null</c> when valid.
/// Centralizes the port wording for callers that hold the raw config value.
/// Centralizes the port wording for callers that hold the raw config value. Parsing is strict
/// and culture-invariant (<see cref="NumberStyles.None"/>): a leading sign or surrounding
/// whitespace is rejected. Both the parse-failure and range-failure messages quote the offending
/// raw value so they read consistently.
/// </summary>
internal static string? PortValue(string? raw, string field) =>
int.TryParse(raw, out var port)
? Port(port, field)
int.TryParse(raw, NumberStyles.None, CultureInfo.InvariantCulture, out var port) && port is >= 1 and <= 65535
? null
: $"{field} must be between 1 and 65535 (was '{raw ?? "null"}')";
/// <summary>
@@ -33,7 +38,7 @@ internal static class Checks
var idx = value.LastIndexOf(':');
if (idx <= 0 || idx == value.Length - 1
|| value.AsSpan(0, idx).Contains(':')
|| !int.TryParse(value[(idx + 1)..], out var port)
|| !int.TryParse(value[(idx + 1)..], NumberStyles.None, CultureInfo.InvariantCulture, out var port)
|| port is < 1 or > 65535)
return $"{field} must be 'host:port' with port 1-65535 (was '{value}')";
return null;
@@ -1,5 +1,6 @@
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Options;
namespace ZB.MOM.WW.Configuration;
@@ -33,7 +34,7 @@ public static class ServiceCollectionExtensions
ArgumentNullException.ThrowIfNull(configuration);
ArgumentException.ThrowIfNullOrWhiteSpace(sectionPath);
services.AddSingleton<IValidateOptions<TOptions>, TValidator>();
services.TryAddEnumerable(ServiceDescriptor.Singleton<IValidateOptions<TOptions>, TValidator>());
return services.AddOptions<TOptions>()
.Bind(configuration.GetSection(sectionPath))
.ValidateOnStart();
@@ -7,7 +7,11 @@
<PackageTags>configuration;options;validation;ivalidateoptions;validateonstart;startup;scada;wonderware;zb-mom-ww</PackageTags>
<PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-configuration</PackageProjectUrl>
<RepositoryUrl>https://gitea.dohertylan.com/dohertj2/zb-mom-ww-configuration</RepositoryUrl>
<PackageReadmeFile>README.md</PackageReadmeFile>
</PropertyGroup>
<ItemGroup>
<None Include="..\..\README.md" Pack="true" PackagePath="\" />
</ItemGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Options" />
<PackageReference Include="Microsoft.Extensions.Options.ConfigurationExtensions" />
@@ -44,4 +44,24 @@ public sealed class AddValidatedOptionsTests
Assert.Equal("central", opts.Name);
await host.StopAsync();
}
[Fact]
public void Calling_twice_registers_validator_once()
{
var config = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?> { ["Node:Port"] = "0", ["Node:Name"] = "" })
.Build();
var services = new ServiceCollection();
services.AddValidatedOptions<NodeOptions, NodeValidator>(config, "Node");
services.AddValidatedOptions<NodeOptions, NodeValidator>(config, "Node");
using var provider = services.BuildServiceProvider();
var validators = provider.GetServices<IValidateOptions<NodeOptions>>().ToArray();
Assert.Single(validators);
// Resolving the options surfaces each accumulated failure exactly once, not doubled.
var ex = Assert.Throws<OptionsValidationException>(
() => provider.GetRequiredService<IOptions<NodeOptions>>().Value);
Assert.Equal(2, ex.Failures.Count());
}
}
@@ -0,0 +1,92 @@
using Microsoft.Extensions.Configuration;
using ZB.MOM.WW.Configuration;
namespace ZB.MOM.WW.Configuration.Tests;
/// <summary>
/// Pins the exact failure-message wording produced by the shared <c>Checks</c> seam through its
/// public front-ends (<see cref="ConfigPreflight"/> for raw port values, <see cref="ValidationBuilder"/>
/// for host:port endpoints). Covers Configuration-002 (consistent quoting) and Configuration-003
/// (strict, culture-invariant port parsing).
/// </summary>
public sealed class ChecksWordingTests
{
private static IConfiguration Config(string key, string? value) =>
new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?> { [key] = value })
.Build();
private static string PortFailure(string? rawValue)
{
var pf = ConfigPreflight.For(Config("X:Port", rawValue)).RequirePort("X:Port");
return Assert.Single(pf.Failures);
}
// Configuration-002: range failure and parse failure must quote the offending value the same way.
[Fact]
public void PortValue_range_failure_quotes_the_value()
{
Assert.Equal("X:Port must be between 1 and 65535 (was '0')", PortFailure("0"));
}
[Fact]
public void PortValue_high_range_failure_quotes_the_value()
{
Assert.Equal("X:Port must be between 1 and 65535 (was '70000')", PortFailure("70000"));
}
[Fact]
public void PortValue_parse_failure_quotes_the_value()
{
Assert.Equal("X:Port must be between 1 and 65535 (was 'notaport')", PortFailure("notaport"));
}
[Fact]
public void PortValue_null_failure_renders_null()
{
Assert.Equal("X:Port must be between 1 and 65535 (was 'null')", PortFailure(null));
}
// Configuration-003: strict, culture-invariant parsing rejects sign and surrounding whitespace.
[Theory]
[InlineData("+5000")]
[InlineData(" 5000")]
[InlineData("5000 ")]
[InlineData(" 5000 ")]
[InlineData("-1")]
public void PortValue_rejects_loose_inputs(string raw)
{
var pf = ConfigPreflight.For(Config("X:Port", raw)).RequirePort("X:Port");
Assert.False(pf.IsValid);
Assert.Equal($"X:Port must be between 1 and 65535 (was '{raw}')", Assert.Single(pf.Failures));
}
[Fact]
public void PortValue_accepts_plain_in_range_port()
{
var pf = ConfigPreflight.For(Config("X:Port", "5000")).RequirePort("X:Port");
Assert.True(pf.IsValid);
}
[Theory]
[InlineData("host:+5000")]
[InlineData("host: 5000")]
[InlineData("host:5000 ")]
public void HostPort_rejects_loose_port_inputs(string value)
{
var b = new ValidationBuilder();
b.HostPort(value, "X:Endpoint");
Assert.False(b.IsValid);
Assert.Equal($"X:Endpoint must be 'host:port' with port 1-65535 (was '{value}')", Assert.Single(b.Failures));
}
[Fact]
public void HostPort_accepts_plain_endpoint()
{
var b = new ValidationBuilder();
b.HostPort("host:5000", "X:Endpoint");
Assert.True(b.IsValid);
}
}
@@ -1,6 +1,8 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<IsPackable>false</IsPackable>
<!-- Test project does not ship; no XML docs required (overrides Directory.Build.props). -->
<GenerateDocumentationFile>false</GenerateDocumentationFile>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="coverlet.collector" />
+5
View File
@@ -7,6 +7,11 @@
<LangVersion>latest</LangVersion>
<Version>0.1.0</Version>
<ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
<!-- Emit XML docs so the public API summaries ship inside the packed nupkgs (IntelliSense for
consumers). CS1591 (missing doc on a public member) is suppressed so undocumented test /
non-packed members do not break the build; the src public surface is fully documented. -->
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<NoWarn>$(NoWarn);CS1591</NoWarn>
</PropertyGroup>
</Project>
@@ -103,27 +103,41 @@ public sealed class ActiveNodeHealthCheck : IHealthCheck
if (system is null)
return Task.FromResult(HealthCheckResult.Degraded("ActorSystem not yet available."));
var cluster = Cluster.Get(system);
var self = cluster.SelfMember;
var selfUp = self.Status == MemberStatus.Up;
MemberStatus selfStatus;
bool selfUp;
bool hasRole;
bool isLeader;
if (_role is null)
try
{
hasRole = false;
var leader = cluster.State.Leader;
isLeader = leader is not null && leader == self.Address;
// Reading cluster membership can throw while the ActorSystem exists but the cluster has
// not finished initialising (e.g. Akka.Cluster not yet configured →
// ConfigurationException). The spec's startup-safety rule maps this to Degraded rather
// than letting the exception escape (which the host would record as Unhealthy).
var cluster = Cluster.Get(system);
var self = cluster.SelfMember;
selfStatus = self.Status;
selfUp = selfStatus == MemberStatus.Up;
if (_role is null)
{
hasRole = false;
var leader = cluster.State.Leader;
isLeader = leader is not null && leader == self.Address;
}
else
{
hasRole = self.HasRole(_role);
var roleLeader = cluster.State.RoleLeader(_role);
isLeader = roleLeader is not null && roleLeader == self.Address;
}
}
else
catch (Exception ex) when (ex is not OperationCanceledException)
{
hasRole = self.HasRole(_role);
var roleLeader = cluster.State.RoleLeader(_role);
isLeader = roleLeader is not null && roleLeader == self.Address;
return Task.FromResult(HealthCheckResult.Degraded("Akka cluster state not yet accessible.", ex));
}
var health = ActiveNodeDecision.Evaluate(selfUp, isLeader, hasRole, _role);
var description = DescribeResult(health, self.Status, selfUp, isLeader);
var description = DescribeResult(health, selfStatus, selfUp, isLeader);
var result = health switch
{
HealthStatus.Healthy => HealthCheckResult.Healthy(description),
@@ -8,7 +8,8 @@ namespace ZB.MOM.WW.Health.Akka;
/// <summary>
/// Health check that maps the local node's Akka cluster membership status to a
/// <see cref="HealthStatus"/> through a configurable <see cref="AkkaClusterStatusPolicy"/>.
/// Register to the <see cref="ZbHealthTags.Ready"/> tag (recommended <c>[ready, active]</c>).
/// Register to the <see cref="ZbHealthTags.Ready"/> tag only — cluster membership is a readiness
/// concern; the <see cref="ZbHealthTags.Active"/> tier is reserved for the leader / active-node probe.
/// </summary>
/// <remarks>
/// The <see cref="ActorSystem"/> is resolved lazily from the service provider. If it is not yet
@@ -42,7 +43,21 @@ public sealed class AkkaClusterHealthCheck : IHealthCheck
if (system is null)
return Task.FromResult(HealthCheckResult.Degraded("ActorSystem not yet available."));
var status = Cluster.Get(system).SelfMember.Status;
MemberStatus status;
try
{
// Cluster.Get(system).SelfMember can throw while the ActorSystem exists but the cluster
// has not finished initialising (e.g. Akka.Cluster not yet configured →
// ConfigurationException). The spec's startup-safety rule maps this to Degraded, not an
// escaping exception (which the host would record as Unhealthy and pull the node from
// rotation).
status = Cluster.Get(system).SelfMember.Status;
}
catch (Exception ex) when (ex is not OperationCanceledException)
{
return Task.FromResult(HealthCheckResult.Degraded("Akka cluster state not yet accessible.", ex));
}
var health = _policy.Evaluate(status);
var description = $"Akka cluster member status: {status}";
var result = health switch
@@ -13,14 +13,15 @@ namespace ZB.MOM.WW.Health;
/// The probe is injectable via <see cref="GrpcDependencyOptions.Probe"/>; the default drives the
/// channel to a connected state with <see cref="GrpcChannel.ConnectAsync"/>. The result is
/// <see cref="HealthStatus.Healthy"/> when the probe returns <c>true</c>, and
/// <see cref="HealthStatus.Unhealthy"/> when it returns <c>false</c>, throws an
/// <see cref="RpcException"/>, or times out / is cancelled within
/// <see cref="GrpcDependencyOptions.Timeout"/>.
/// <see cref="HealthStatus.Unhealthy"/> when it returns <c>false</c>, throws any exception
/// (<see cref="RpcException"/> or otherwise), or times out within
/// <see cref="GrpcDependencyOptions.Timeout"/>. External cancellation of the supplied
/// <see cref="CancellationToken"/> propagates as an <see cref="OperationCanceledException"/>.
/// </para>
/// <para>
/// Recommended registration tags: <see cref="ZbHealthTags.Ready"/> and
/// <see cref="ZbHealthTags.Active"/> — a missing downstream gRPC dependency makes the node both
/// not-ready and not-able-to-act. The registrant applies the tags.
/// Recommended registration tag: <see cref="ZbHealthTags.Ready"/> only — downstream gRPC
/// reachability is a readiness concern; the <see cref="ZbHealthTags.Active"/> tier is reserved for
/// the leader / active-node probe. The registrant applies the tag.
/// </para>
/// </remarks>
public sealed class GrpcDependencyHealthCheck : IHealthCheck
@@ -74,6 +75,15 @@ public sealed class GrpcDependencyHealthCheck : IHealthCheck
{
return HealthCheckResult.Unhealthy($"{name} probe timed out after {_options.Timeout}.", ex);
}
catch (Exception ex)
{
// Catch-all to match the sibling DatabaseHealthCheck: any other probe error
// (e.g. InvalidOperationException / HttpRequestException / SocketException from the
// transport, or anything a custom probe throws) maps to Unhealthy rather than escaping
// the IHealthCheck boundary. The OCE/Rpc external-cancellation handlers above run first,
// so caller cancellation still propagates.
return HealthCheckResult.Unhealthy($"{name} probe failed: {ex.Message}", ex);
}
}
/// <summary>
@@ -28,9 +28,9 @@ public static class ZbHealthEndpointExtensions
/// emits a minimal <c>200 OK</c> body.
/// </remarks>
/// <returns>
/// The <see cref="IEndpointConventionBuilder"/> for the readiness (<c>/health/ready</c>) endpoint.
/// A single tier is returned (rather than a composite) to keep the API simple; conventions
/// applied to the result affect only the readiness endpoint.
/// A composite <see cref="IEndpointConventionBuilder"/> that fans every chained convention out to
/// <em>all three</em> health endpoints (readiness, active, and liveness). For example,
/// <c>endpoints.MapZbHealth().RequireHost("…")</c> gates all three endpoints, as a caller expects.
/// </returns>
public static IEndpointConventionBuilder MapZbHealth(
this IEndpointRouteBuilder endpoints,
@@ -47,7 +47,7 @@ public static class ZbHealthEndpointExtensions
ResponseWriter = responseWriter,
}).AllowAnonymous();
endpoints.MapHealthChecks(options.ActivePath, new HealthCheckOptions
var active = endpoints.MapHealthChecks(options.ActivePath, new HealthCheckOptions
{
Predicate = static c => c.Tags.Contains(ZbHealthTags.Active),
ResponseWriter = responseWriter,
@@ -56,12 +56,38 @@ public static class ZbHealthEndpointExtensions
// Liveness: run no checks. The endpoint returns 200 as long as the process can respond.
// No JSON writer — the empty report would carry no useful data, so the framework default
// (a minimal plain-text body) is sufficient.
endpoints.MapHealthChecks(options.LivePath, new HealthCheckOptions
var live = endpoints.MapHealthChecks(options.LivePath, new HealthCheckOptions
{
Predicate = static _ => false,
}).AllowAnonymous();
return ready;
return new CompositeEndpointConventionBuilder(ready, active, live);
}
/// <summary>
/// An <see cref="IEndpointConventionBuilder"/> that forwards each convention to several
/// underlying builders, so conventions chained onto the result of
/// <see cref="MapZbHealth(IEndpointRouteBuilder, ZbHealthEndpointOptions?)"/> apply to all three
/// health endpoints rather than just one.
/// </summary>
private sealed class CompositeEndpointConventionBuilder : IEndpointConventionBuilder
{
private readonly IEndpointConventionBuilder[] _builders;
public CompositeEndpointConventionBuilder(params IEndpointConventionBuilder[] builders) =>
_builders = builders;
public void Add(Action<EndpointBuilder> convention)
{
foreach (var builder in _builders)
builder.Add(convention);
}
public void Finally(Action<EndpointBuilder> finalConvention)
{
foreach (var builder in _builders)
builder.Finally(finalConvention);
}
}
/// <summary>
@@ -70,7 +96,10 @@ public static class ZbHealthEndpointExtensions
/// </summary>
/// <param name="endpoints">The endpoint route builder to map onto.</param>
/// <param name="configure">Callback that mutates a fresh <see cref="ZbHealthEndpointOptions"/>.</param>
/// <returns>The <see cref="IEndpointConventionBuilder"/> for the readiness endpoint.</returns>
/// <returns>
/// A composite <see cref="IEndpointConventionBuilder"/> that fans chained conventions out to all
/// three health endpoints.
/// </returns>
public static IEndpointConventionBuilder MapZbHealth(
this IEndpointRouteBuilder endpoints,
Action<ZbHealthEndpointOptions> configure)
@@ -1,5 +1,4 @@
using System.Text.Json;
using System.Text.Json.Serialization;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;
@@ -21,15 +20,21 @@ namespace ZB.MOM.WW.Health;
/// }
/// }
/// </code>
/// The HTTP status code is left to the ASP.NET Core health-checks middleware (Healthy/Degraded → 200,
/// Unhealthy → 503); this writer only renders the body and sets <c>Content-Type: application/json</c>.
/// The <c>description</c> key is always present; when a check supplies no description it is emitted
/// as JSON <c>null</c> (not omitted), matching the spec example and the <c>HealthChecks.UI.Client</c>
/// shape. The HTTP status code is left to the ASP.NET Core health-checks middleware (Healthy/Degraded
/// → 200, Unhealthy → 503); this writer only renders the body and sets
/// <c>Content-Type: application/json</c>.
/// </remarks>
public static class ZbHealthWriter
{
// Null properties are emitted (not omitted) so a null `description` renders as
// "description": null — matching the SPEC §3 example and the HealthChecks.UI.Client shape this
// writer mirrors. Consumers can then read entries.<name>.description without handling a missing
// property. (Do not set DefaultIgnoreCondition = WhenWritingNull here.)
private static readonly JsonSerializerOptions SerializerOptions = new()
{
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
};
/// <summary>
@@ -86,6 +86,50 @@ public sealed class ActiveNodeDecisionTests
Assert.False(gate.IsActiveNode);
}
[Fact]
public async Task HealthCheck_RoleLess_ClusterInaccessible_ReturnsDegraded()
{
// ActorSystem present but Akka.Cluster not configured → Cluster.Get throws. The check must
// return Degraded (startup-safety rule), not let the exception escape (→ Unhealthy).
using var system = ActorSystem.Create("plain-no-cluster-roleless");
try
{
var provider = new ServiceCollection()
.AddSingleton(system)
.BuildServiceProvider();
var check = new ActiveNodeHealthCheck(provider);
var result = await check.CheckHealthAsync(NewContext(check));
Assert.Equal(HealthStatus.Degraded, result.Status);
}
finally
{
await system.Terminate();
}
}
[Fact]
public async Task HealthCheck_RoleFiltered_ClusterInaccessible_ReturnsDegraded()
{
using var system = ActorSystem.Create("plain-no-cluster-rolefiltered");
try
{
var provider = new ServiceCollection()
.AddSingleton(system)
.BuildServiceProvider();
var check = new ActiveNodeHealthCheck(provider, "admin");
var result = await check.CheckHealthAsync(NewContext(check));
Assert.Equal(HealthStatus.Degraded, result.Status);
}
finally
{
await system.Terminate();
}
}
private static HealthCheckContext NewContext(IHealthCheck check) => new()
{
Registration = new HealthCheckRegistration("active-node", check, HealthStatus.Unhealthy, tags: null),
@@ -1,3 +1,4 @@
using Akka.Actor;
using Akka.Cluster;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;
@@ -70,6 +71,30 @@ public sealed class AkkaClusterStatusPolicyTests
Assert.Equal(HealthStatus.Degraded, result.Status);
}
[Fact]
public async Task HealthCheck_ActorSystemPresentButClusterInaccessible_ReturnsDegraded()
{
// A plain (non-clustered) ActorSystem exists in DI, but Akka.Cluster is not configured,
// so Cluster.Get(system) throws a ConfigurationException — the startup race the spec calls
// out. The check must return Degraded, not let the exception escape (→ Unhealthy via the host).
using var system = ActorSystem.Create("plain-no-cluster");
try
{
var provider = new ServiceCollection()
.AddSingleton(system)
.BuildServiceProvider();
var check = new AkkaClusterHealthCheck(provider, AkkaClusterStatusPolicy.Default);
var result = await check.CheckHealthAsync(NewContext(check));
Assert.Equal(HealthStatus.Degraded, result.Status);
}
finally
{
await system.Terminate();
}
}
private static HealthCheckContext NewContext(IHealthCheck check) => new()
{
Registration = new HealthCheckRegistration("akka-cluster", check, HealthStatus.Unhealthy, tags: null),
@@ -71,6 +71,21 @@ public sealed class GrpcDependencyHealthCheckTests
Assert.Contains("mxaccessgw worker", result.Description);
}
[Fact]
public async Task ProbeThrowsArbitraryException_Unhealthy()
{
// A non-RpcException / non-OperationCanceledException (e.g. the transport surfacing an
// InvalidOperationException) must be caught and mapped to Unhealthy, not allowed to escape.
var result = await RunAsync(new GrpcDependencyOptions
{
DependencyName = "mxaccessgw worker",
Probe = static (_, _) => throw new InvalidOperationException("channel disposed"),
});
Assert.Equal(HealthStatus.Unhealthy, result.Status);
Assert.Contains("mxaccessgw worker", result.Description);
}
[Fact]
public async Task ProbeExceedsTimeout_Unhealthy()
{
@@ -31,7 +31,7 @@ public sealed class ResponseWriterTests
}
private static async Task<HttpResponseMessage> GetReadyAsync(
HealthStatus status, string description = "db reachable")
HealthStatus status, string? description = "db reachable")
{
var builder = WebApplication.CreateBuilder();
builder.WebHost.UseTestServer();
@@ -66,6 +66,24 @@ public sealed class ResponseWriterTests
Assert.Equal("db reachable", db.GetProperty("description").GetString());
}
[Fact]
public async Task ReadyEndpoint_NullDescription_EmitsDescriptionKeyAsNull()
{
// A check that produces no description must still emit the "description" key with a JSON null
// value (matching the spec §3 example and the HealthChecks.UI.Client shape) rather than
// dropping the key — so consumers can read entries.<name>.description without handling a
// missing property.
var response = await GetReadyAsync(HealthStatus.Healthy, description: null);
Assert.Equal(HttpStatusCode.OK, response.StatusCode);
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
var db = doc.RootElement.GetProperty("entries").GetProperty("db");
Assert.True(db.TryGetProperty("description", out var description), "description key must be present");
Assert.Equal(JsonValueKind.Null, description.ValueKind);
}
[Fact]
public async Task ReadyEndpoint_Degraded_Returns200_WithDegradedStatus()
{
@@ -127,6 +127,31 @@ public sealed class TierMappingTests
Assert.Equal(0, active.Invocations);
}
[Fact]
public async Task ChainedConvention_AppliesToAllThreeEndpoints()
{
// MapZbHealth returns a composite builder, so a convention chained onto its result
// (.RequireHost) must gate all three endpoints — not just readiness. With a host filter
// that does not match the default test-client host, every tier returns 404.
var builder = WebApplication.CreateBuilder();
builder.WebHost.UseTestServer();
builder.Services.AddHealthChecks()
.AddCheck("ready-check", new RecordingHealthCheck(HealthStatus.Healthy), tags: new[] { ZbHealthTags.Ready })
.AddCheck("active-check", new RecordingHealthCheck(HealthStatus.Healthy), tags: new[] { ZbHealthTags.Active });
await using var app = builder.Build();
app.MapZbHealth().RequireHost("health.internal");
await app.StartAsync();
var client = app.GetTestClient();
// The default test host does not match "health.internal", so the convention removed every
// endpoint from this host — confirming it fanned out to all three, not just readiness.
Assert.Equal(HttpStatusCode.NotFound, (await client.GetAsync("/health/ready")).StatusCode);
Assert.Equal(HttpStatusCode.NotFound, (await client.GetAsync("/health/active")).StatusCode);
Assert.Equal(HttpStatusCode.NotFound, (await client.GetAsync("/healthz")).StatusCode);
}
[Fact]
public async Task Options_OverrideRoutePaths()
{
+17 -3
View File
@@ -55,6 +55,20 @@ Trace↔log correlation is automatic: `TraceContextEnricher` reads `Activity.Cur
log event and attaches `trace_id` and `span_id`, so log events produced inside a traced request
carry the same span identity as the trace backend.
**Redaction reach.** A registered `ILogRedactor` may **remove** or **replace** any value, and
`RedactionEnricher` honours both (a removed key is dropped from the event). Scalar properties appear
as their unwrapped CLR value; **destructured** properties are projected into mutable views the
redactor can descend into — a `{@Object}` is an `IDictionary<string, object?>` of its fields, a
logged collection an `IList<object?>`, a logged dictionary an `IDictionary<string, object?>` — all
recursively, so a field **nested inside** a destructured object can be masked or removed:
```csharp
if (properties["command"] is IDictionary<string, object?> command) command["apiKey"] = "***";
```
Structure type tags and dictionary keys are preserved on rebuild, and untouched properties are left
intact (not reallocated). See the `ILogRedactor` XML doc for the full contract.
---
## Exporter options
@@ -113,9 +127,9 @@ backend):
| Assembly | Tests |
|---|---|
| `ZB.MOM.WW.Telemetry.Tests` | 7 |
| `ZB.MOM.WW.Telemetry.Serilog.Tests` | 12 |
| **Total** | **19** |
| `ZB.MOM.WW.Telemetry.Tests` | 12 |
| `ZB.MOM.WW.Telemetry.Serilog.Tests` | 17 |
| **Total** | **29** |
---
@@ -11,6 +11,20 @@ public interface ILogRedactor
/// <summary>
/// Inspects and mutates the supplied log-event <paramref name="properties"/> in place — remove
/// or replace any sensitive values. Called on every log event before it reaches any sink.
/// Both removing a key (the property is dropped from the event) and replacing its value are
/// honoured by <see cref="RedactionEnricher"/>.
/// <para>
/// <b>Reach — top-level and nested.</b> A scalar property (e.g. <c>{apiKey}</c>) appears as its
/// unwrapped CLR value, which you can read and replace directly. A <b>destructured</b> property is
/// projected into a mutable view you can descend into: a <c>{@Object}</c> arrives as an
/// <c>IDictionary&lt;string, object?&gt;</c> of its fields, a logged collection as an
/// <c>IList&lt;object?&gt;</c>, and a logged dictionary as an <c>IDictionary&lt;string, object?&gt;</c>
/// keyed by the string form of each key — all recursively. You can therefore mask or remove a field
/// nested inside a destructured object, for example:
/// <code>if (properties["command"] is IDictionary&lt;string, object?&gt; command) command["apiKey"] = "***";</code>
/// The structure's type tag and a dictionary's original keys are preserved when the event is
/// rebuilt, and properties you do not touch are left intact.
/// </para>
/// </summary>
/// <param name="properties">The mutable property dictionary for the current log event.</param>
void Redact(IDictionary<string, object?> properties);
@@ -30,53 +30,309 @@ public sealed class RedactionEnricher : ILogEventEnricher
}
/// <summary>
/// Hands the log event's scalar properties to the registered <see cref="ILogRedactor"/> and
/// writes back any values the redactor changed. No-op when no redactor is registered.
/// Hands the log event's properties to the registered <see cref="ILogRedactor"/> and reconciles
/// the result back onto the event. No-op when no redactor is registered or the event carries no
/// properties.
/// <para>
/// Each property is projected into a mutable view the redactor can edit: a scalar is its
/// unwrapped value, and a structured value (<see cref="StructureValue"/> from <c>{@Object}</c>,
/// <see cref="SequenceValue"/>, <see cref="DictionaryValue"/>) becomes a nested
/// <see cref="IDictionary{TKey,TValue}"/> / <see cref="IList{T}"/> the redactor can descend into —
/// recursively — so a field nested inside a destructured object can be masked or removed. After
/// redaction each property is rebuilt and written back only when it actually changed; the
/// structure's type tag and a dictionary's original keys are preserved on rebuild, and keys the
/// redactor removed are deleted via <c>RemovePropertyIfPresent</c>. Properties the redactor does
/// not touch are left intact and are not reallocated.
/// </para>
/// </summary>
/// <param name="logEvent">The log event to redact.</param>
/// <param name="propertyFactory">Factory used to materialize replacement properties.</param>
/// <param name="propertyFactory">Unused; structured values are rebuilt directly into Serilog values.</param>
public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
{
ArgumentNullException.ThrowIfNull(logEvent);
ArgumentNullException.ThrowIfNull(propertyFactory);
var redactor = ResolveRedactor();
var redactor = _redactor.Value;
if (redactor is null)
{
return;
}
var snapshot = new Dictionary<string, object?>(logEvent.Properties.Count);
// Hot path: an event with no properties has nothing to redact — skip the snapshot copy.
if (logEvent.Properties.Count == 0)
{
return;
}
// Project every property into a mutable view. Scalars stay as their CLR value (zero extra
// allocation); structured values become nested dictionaries/lists carrying enough metadata
// (node kind, type tag, original dictionary keys) to be rebuilt faithfully.
var snapshot = new Dictionary<string, object?>(logEvent.Properties.Count, StringComparer.Ordinal);
foreach (var property in logEvent.Properties)
{
snapshot[property.Key] = property.Value is ScalarValue scalar
? scalar.Value
: property.Value;
snapshot[property.Key] = Project(property.Value);
}
// Capture the original key set so we can honour deletions: any key the redactor drops from
// the snapshot must be removed from the event (not silently retained).
var originalKeys = new HashSet<string>(snapshot.Keys, StringComparer.Ordinal);
redactor.Redact(snapshot);
foreach (var entry in snapshot)
{
if (HasChanged(logEvent, entry.Key, entry.Value))
// Rebuild the (possibly redacted) value and write it back only when it differs from what
// the event already holds, so an untouched property is never needlessly reallocated.
var rebuilt = Rebuild(entry.Value);
if (!logEvent.Properties.TryGetValue(entry.Key, out var existing) || !ValueEquals(existing, rebuilt))
{
logEvent.AddOrUpdateProperty(
propertyFactory.CreateProperty(entry.Key, entry.Value));
logEvent.AddOrUpdateProperty(new LogEventProperty(entry.Key, rebuilt));
}
}
// Reconcile removals: a redactor that deleted a key from the snapshot (e.g.
// properties.Remove("apiKey")) means that property must not reach any sink.
foreach (var key in originalKeys)
{
if (!snapshot.ContainsKey(key))
{
logEvent.RemovePropertyIfPresent(key);
}
}
}
private ILogRedactor? ResolveRedactor() => _redactor.Value;
private static bool HasChanged(LogEvent logEvent, string key, object? newValue)
/// <summary>
/// Projects an immutable Serilog value into a mutable view the redactor can edit. Scalars unwrap
/// to their CLR value; structures/sequences/dictionaries become nested mutable wrappers that
/// remember their kind (and a structure's type tag / a dictionary's original keys) for rebuild.
/// </summary>
private static object? Project(LogEventPropertyValue value)
{
if (!logEvent.Properties.TryGetValue(key, out var existing))
switch (value)
{
// Redactor added a brand-new property.
return true;
case ScalarValue scalar:
return scalar.Value;
case StructureValue structure:
var projectedStructure = new ProjectedStructure(structure.TypeTag, structure.Properties.Count);
foreach (var property in structure.Properties)
{
projectedStructure[property.Name] = Project(property.Value);
}
return projectedStructure;
case SequenceValue sequence:
var projectedSequence = new ProjectedSequence(sequence.Elements.Count);
foreach (var element in sequence.Elements)
{
projectedSequence.Add(Project(element));
}
return projectedSequence;
case DictionaryValue dictionary:
var projectedDictionary = new ProjectedDictionary(dictionary.Elements.Count);
foreach (var pair in dictionary.Elements)
{
var key = pair.Key.Value?.ToString() ?? NullKey;
projectedDictionary[key] = Project(pair.Value);
projectedDictionary.OriginalKeys[key] = pair.Key;
}
return projectedDictionary;
default:
// Unknown future LogEventPropertyValue subtype — pass the wrapper through untouched.
return value;
}
}
/// <summary>
/// Rebuilds a (possibly redacted) projected value back into an immutable Serilog value. The
/// inverse of <see cref="Project"/>; also accepts plain dictionaries/lists a redactor synthesised
/// and leaf CLR values it substituted.
/// </summary>
private static LogEventPropertyValue Rebuild(object? projected)
{
switch (projected)
{
case ProjectedStructure structure:
return new StructureValue(RebuildProperties(structure), structure.TypeTag);
case ProjectedDictionary dictionary:
var pairs = new List<KeyValuePair<ScalarValue, LogEventPropertyValue>>(dictionary.Count);
foreach (var entry in dictionary)
{
var key = dictionary.OriginalKeys.TryGetValue(entry.Key, out var original)
? original
: new ScalarValue(entry.Key);
pairs.Add(new KeyValuePair<ScalarValue, LogEventPropertyValue>(key, Rebuild(entry.Value)));
}
return new DictionaryValue(pairs);
case ProjectedSequence sequence:
return new SequenceValue(RebuildElements(sequence));
case IDictionary<string, object?> injected:
// A redactor synthesised a new structure (plain dictionary) — rebuild as a StructureValue.
return new StructureValue(RebuildProperties(injected));
case IList<object?> injectedList:
return new SequenceValue(RebuildElements(injectedList));
case LogEventPropertyValue raw:
// An unknown subtype passed through by Project, or a value the redactor set directly.
return raw;
default:
return new ScalarValue(projected);
}
}
private static List<LogEventProperty> RebuildProperties(IDictionary<string, object?> source)
{
var properties = new List<LogEventProperty>(source.Count);
foreach (var entry in source)
{
properties.Add(new LogEventProperty(entry.Key, Rebuild(entry.Value)));
}
var existingValue = existing is ScalarValue scalar ? scalar.Value : existing;
return !Equals(existingValue, newValue);
return properties;
}
private static List<LogEventPropertyValue> RebuildElements(IList<object?> source)
{
var elements = new List<LogEventPropertyValue>(source.Count);
foreach (var element in source)
{
elements.Add(Rebuild(element));
}
return elements;
}
/// <summary>
/// Structural equality between two Serilog values, used to skip writing back a property the
/// redactor left unchanged. Compares scalars by value and structures/sequences/dictionaries by
/// their contents (recursively); unknown kinds fall back to reference equality.
/// </summary>
private static bool ValueEquals(LogEventPropertyValue left, LogEventPropertyValue right)
{
switch (left)
{
case ScalarValue scalar when right is ScalarValue otherScalar:
return Equals(scalar.Value, otherScalar.Value);
case StructureValue structure when right is StructureValue otherStructure:
if (structure.TypeTag != otherStructure.TypeTag
|| structure.Properties.Count != otherStructure.Properties.Count)
{
return false;
}
foreach (var property in structure.Properties)
{
var match = FindProperty(otherStructure, property.Name);
if (match is null || !ValueEquals(property.Value, match.Value))
{
return false;
}
}
return true;
case SequenceValue sequence when right is SequenceValue otherSequence:
if (sequence.Elements.Count != otherSequence.Elements.Count)
{
return false;
}
for (var i = 0; i < sequence.Elements.Count; i++)
{
if (!ValueEquals(sequence.Elements[i], otherSequence.Elements[i]))
{
return false;
}
}
return true;
case DictionaryValue dictionary when right is DictionaryValue otherDictionary:
if (dictionary.Elements.Count != otherDictionary.Elements.Count)
{
return false;
}
foreach (var pair in dictionary.Elements)
{
var match = FindDictionaryValue(otherDictionary, pair.Key);
if (match is null || !ValueEquals(pair.Value, match))
{
return false;
}
}
return true;
default:
return ReferenceEquals(left, right);
}
}
private static LogEventProperty? FindProperty(StructureValue structure, string name)
{
foreach (var property in structure.Properties)
{
if (property.Name == name)
{
return property;
}
}
return null;
}
private static LogEventPropertyValue? FindDictionaryValue(DictionaryValue dictionary, ScalarValue key)
{
foreach (var pair in dictionary.Elements)
{
if (Equals(pair.Key.Value, key.Value))
{
return pair.Value;
}
}
return null;
}
private const string NullKey = "";
/// <summary>A destructured object projected to a mutable dictionary; preserves its type tag.</summary>
private sealed class ProjectedStructure : Dictionary<string, object?>
{
public ProjectedStructure(string? typeTag, int capacity)
: base(capacity, StringComparer.Ordinal) => TypeTag = typeTag;
public string? TypeTag { get; }
}
/// <summary>A logged dictionary projected to a mutable dictionary; preserves the original keys.</summary>
private sealed class ProjectedDictionary : Dictionary<string, object?>
{
public ProjectedDictionary(int capacity)
: base(capacity, StringComparer.Ordinal) =>
OriginalKeys = new Dictionary<string, ScalarValue>(capacity, StringComparer.Ordinal);
public Dictionary<string, ScalarValue> OriginalKeys { get; }
}
/// <summary>A logged collection projected to a mutable list.</summary>
private sealed class ProjectedSequence : List<object?>
{
public ProjectedSequence(int capacity) : base(capacity)
{
}
}
}
@@ -115,38 +115,13 @@ internal static class ZbSerilogConfig
}
/// <summary>
/// Builds the OTLP Resource-attribute map mirroring <c>ZbResource</c>. Null/empty optional
/// attributes are omitted, matching the shared Resource's omission rules. The
/// <c>service.instance.id</c> is sourced from <see cref="ZbResource.InstanceId"/> — the
/// same deterministic <c>MachineName:ProcessId</c> value used by the OTel SDK path — so
/// all three signals carry an identical instance identifier. Internal so it can be asserted
/// by the test assembly without being part of the public NuGet API.
/// Builds the OTLP log-sink Resource-attribute map. This is <em>not</em> a parallel
/// implementation: it is derived directly from <see cref="ZbResource.BuildAttributes"/> — the
/// single source of truth shared with the OTel SDK metrics/traces pipeline — so the log sink can
/// never drift from metrics and traces. Returned as a fresh mutable copy because the
/// Serilog OpenTelemetry sink takes ownership of the dictionary it is handed. Internal so it can
/// be asserted by the test assembly without being part of the public NuGet API.
/// </summary>
internal static IDictionary<string, object> BuildResourceAttributes(ZbTelemetryOptions options)
{
var attributes = new Dictionary<string, object>
{
["service.name"] = options.ServiceName,
["service.namespace"] = options.ServiceNamespace,
["service.instance.id"] = ZbResource.InstanceId,
["host.name"] = Environment.MachineName,
};
if (!string.IsNullOrEmpty(options.ServiceVersion))
{
attributes["service.version"] = options.ServiceVersion;
}
if (!string.IsNullOrEmpty(options.SiteId))
{
attributes["site.id"] = options.SiteId;
}
if (!string.IsNullOrEmpty(options.NodeRole))
{
attributes["node.role"] = options.NodeRole;
}
return attributes;
}
internal static IDictionary<string, object> BuildResourceAttributes(ZbTelemetryOptions options) =>
new Dictionary<string, object>(ZbResource.BuildAttributes(options), StringComparer.Ordinal);
}
@@ -65,6 +65,10 @@ public static class ZbSerilogExtensions
var options = new ZbTelemetryOptions();
configure(options);
// Fail fast on a malformed OTLP endpoint with a clear, named message — same validation the
// core AddZbTelemetry path uses — instead of a late error when the OTel log sink builds.
ZbTelemetryOptionsValidator.Validate(options, nameof(configure));
// Register the application logger in DI only. preserveStaticLogger: true ensures
// AddSerilog does NOT freeze or replace Log.Logger — critical for multi-host
// processes (integration tests etc.) where AddZbSerilog may be called more than once.
@@ -20,6 +20,17 @@
<FrameworkReference Include="Microsoft.AspNetCore.App" />
</ItemGroup>
<ItemGroup>
<!-- The Serilog companion package reuses the internal options validator (single fail-fast
path); its tests assert it too. -->
<AssemblyAttribute Include="System.Runtime.CompilerServices.InternalsVisibleTo">
<_Parameter1>ZB.MOM.WW.Telemetry.Serilog</_Parameter1>
</AssemblyAttribute>
<AssemblyAttribute Include="System.Runtime.CompilerServices.InternalsVisibleTo">
<_Parameter1>ZB.MOM.WW.Telemetry.Serilog.Tests</_Parameter1>
</AssemblyAttribute>
</ItemGroup>
<ItemGroup>
<PackageReference Include="OpenTelemetry.Extensions.Hosting" />
<PackageReference Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" />
@@ -9,8 +9,10 @@ namespace ZB.MOM.WW.Telemetry;
public static class ZbMetricsEndpointExtensions
{
/// <summary>
/// Mounts the Prometheus <c>/metrics</c> endpoint. Only valid when
/// <see cref="ZbTelemetryOptions.Exporter"/> = <see cref="ZbExporter.Prometheus"/>.
/// Mounts the Prometheus <c>/metrics</c> endpoint. Valid under <em>any</em>
/// <see cref="ZbTelemetryOptions.Exporter"/> value: the Prometheus exporter is always wired by
/// <c>AddZbTelemetry</c>, and OTLP (<see cref="ZbExporter.Otlp"/>) is only an additive overlay —
/// so <c>/metrics</c> serves scrape data even when <c>Exporter = ZbExporter.Otlp</c>.
/// Call after <c>app.UseRouting()</c>.
/// </summary>
/// <param name="endpoints">The endpoint route builder.</param>
@@ -31,34 +31,55 @@ public static class ZbResource
Configure(ResourceBuilder.CreateDefault(), options);
/// <summary>
/// Applies the shared ZB.MOM.WW Resource attributes to an existing <see cref="ResourceBuilder"/>.
/// Internal seam so the <c>AddZbTelemetry</c> pipeline produces a Resource identical to
/// <see cref="Build"/>.
/// The single source of truth for the shared ZB.MOM.WW Resource attribute set. Every consumer
/// of the Resource — the OTel SDK metrics/traces pipeline (<see cref="Configure"/>) and the
/// Serilog OTLP log sink — derives its attributes from this one map, so logs can never drift
/// from metrics/traces. Required attributes (<c>service.name</c>, <c>service.namespace</c>,
/// <c>service.instance.id</c>, <c>host.name</c>) are always present; optional ones
/// (<c>service.version</c>, <c>site.id</c>, <c>node.role</c>) are included only when the
/// corresponding option is non-null/non-empty, matching the Resource's omission rules.
/// </summary>
internal static ResourceBuilder Configure(ResourceBuilder builder, ZbTelemetryOptions options)
/// <param name="options">The telemetry options describing the service identity.</param>
/// <returns>The canonical attribute map carried by all three signals.</returns>
public static IReadOnlyDictionary<string, object> BuildAttributes(ZbTelemetryOptions options)
{
builder.AddService(
serviceName: options.ServiceName,
serviceNamespace: options.ServiceNamespace,
serviceVersion: options.ServiceVersion,
autoGenerateServiceInstanceId: false,
serviceInstanceId: InstanceId);
ArgumentNullException.ThrowIfNull(options);
var attributes = new List<KeyValuePair<string, object>>
var attributes = new Dictionary<string, object>(StringComparer.Ordinal)
{
new("host.name", System.Environment.MachineName),
["service.name"] = options.ServiceName,
["service.namespace"] = options.ServiceNamespace,
["service.instance.id"] = InstanceId,
["host.name"] = System.Environment.MachineName,
};
if (!string.IsNullOrEmpty(options.ServiceVersion))
{
attributes["service.version"] = options.ServiceVersion;
}
if (!string.IsNullOrEmpty(options.SiteId))
{
attributes.Add(new("site.id", options.SiteId));
attributes["site.id"] = options.SiteId;
}
if (!string.IsNullOrEmpty(options.NodeRole))
{
attributes.Add(new("node.role", options.NodeRole));
attributes["node.role"] = options.NodeRole;
}
return attributes;
}
/// <summary>
/// Applies the shared ZB.MOM.WW Resource attributes to an existing <see cref="ResourceBuilder"/>.
/// Internal seam so the <c>AddZbTelemetry</c> pipeline produces a Resource identical to
/// <see cref="Build"/>. Derives every attribute from <see cref="BuildAttributes"/> — the same
/// canonical map the Serilog OTLP log sink uses — so all three signals agree.
/// </summary>
internal static ResourceBuilder Configure(ResourceBuilder builder, ZbTelemetryOptions options)
{
var attributes = BuildAttributes(options);
builder.AddAttributes(attributes);
return builder;
}
@@ -101,6 +101,7 @@ public static class ZbTelemetryExtensions
"ZbTelemetryOptions.ServiceName is required (e.g. \"otopcua\").",
nameof(configure));
}
ZbTelemetryOptionsValidator.Validate(options, nameof(configure));
return options;
}
@@ -0,0 +1,44 @@
namespace ZB.MOM.WW.Telemetry;
/// <summary>
/// Eager, fail-fast validation of <see cref="ZbTelemetryOptions"/> shared by the core
/// <c>AddZbTelemetry</c> path and the Serilog <c>AddZbSerilog</c> path, so a malformed value is
/// reported once, clearly, and with the offending option named — rather than surfacing late as a
/// bare <see cref="UriFormatException"/> deep inside exporter construction at host-build time.
/// </summary>
internal static class ZbTelemetryOptionsValidator
{
/// <summary>
/// Validates the OTLP configuration. When <see cref="ZbTelemetryOptions.Exporter"/> is
/// <see cref="ZbExporter.Otlp"/>, <see cref="ZbTelemetryOptions.OtlpEndpoint"/> must be a
/// non-empty, well-formed absolute URI. Throws an <see cref="ArgumentException"/> (naming the
/// option) otherwise. No-op for the Prometheus exporter — a stray endpoint is ignored there.
/// </summary>
/// <param name="options">The populated telemetry options to validate.</param>
/// <param name="paramName">The originating parameter name for the thrown exception.</param>
public static void Validate(ZbTelemetryOptions options, string paramName)
{
ArgumentNullException.ThrowIfNull(options);
if (options.Exporter != ZbExporter.Otlp)
{
return;
}
if (string.IsNullOrWhiteSpace(options.OtlpEndpoint))
{
throw new ArgumentException(
"ZbTelemetryOptions.OtlpEndpoint is required when Exporter = ZbExporter.Otlp " +
"(e.g. \"http://collector:4317\").",
paramName);
}
if (!Uri.TryCreate(options.OtlpEndpoint, UriKind.Absolute, out _))
{
throw new ArgumentException(
$"ZbTelemetryOptions.OtlpEndpoint is not a well-formed absolute URI: " +
$"'{options.OtlpEndpoint}' (expected e.g. \"http://collector:4317\").",
paramName);
}
}
}
@@ -25,11 +25,124 @@ public sealed class RedactionTests
}
}
private sealed class RemovingRedactor : ILogRedactor
{
private readonly string _key;
public RemovingRedactor(string key) => _key = key;
public void Redact(IDictionary<string, object?> properties) => properties.Remove(_key);
}
/// <summary>Masks a field nested inside a destructured <c>{@command}</c> object.</summary>
private sealed class NestedFieldRedactor : ILogRedactor
{
public void Redact(IDictionary<string, object?> properties)
{
if (properties.TryGetValue("command", out var value)
&& value is IDictionary<string, object?> command
&& command.ContainsKey("ApiKey"))
{
command["ApiKey"] = Masked;
}
}
}
/// <summary>Removes a field nested inside a destructured <c>{@command}</c> object.</summary>
private sealed class NestedRemovingRedactor : ILogRedactor
{
public void Redact(IDictionary<string, object?> properties)
{
if (properties.TryGetValue("command", out var value)
&& value is IDictionary<string, object?> command)
{
command.Remove("ApiKey");
}
}
}
/// <summary>Masks an element inside a destructured <c>{@items}</c> sequence.</summary>
private sealed class SequenceElementRedactor : ILogRedactor
{
public void Redact(IDictionary<string, object?> properties)
{
if (properties.TryGetValue("items", out var value) && value is IList<object?> items)
{
for (var i = 0; i < items.Count; i++)
{
if (items[i] is string s && s.StartsWith("secret", StringComparison.Ordinal))
{
items[i] = Masked;
}
}
}
}
}
/// <summary>Masks a value inside a destructured <c>{@map}</c> dictionary.</summary>
private sealed class DictionaryValueRedactor : ILogRedactor
{
public void Redact(IDictionary<string, object?> properties)
{
if (properties.TryGetValue("map", out var value)
&& value is IDictionary<string, object?> map
&& map.ContainsKey("token"))
{
map["token"] = Masked;
}
}
}
/// <summary>Masks a field two levels deep inside a destructured object graph.</summary>
private sealed class DeepFieldRedactor : ILogRedactor
{
public void Redact(IDictionary<string, object?> properties)
{
if (properties.TryGetValue("payload", out var value)
&& value is IDictionary<string, object?> payload
&& payload.TryGetValue("Inner", out var innerValue)
&& innerValue is IDictionary<string, object?> inner
&& inner.ContainsKey("ApiKey"))
{
inner["ApiKey"] = Masked;
}
}
}
// Named types so a destructured StructureValue's TypeTag is deterministic in assertions.
private sealed record Command(string Name, string ApiKey);
private sealed record Envelope(Inner Inner);
private sealed record Inner(string ApiKey, string Note);
private static string? ScalarOrNull(LogEvent logEvent, string propertyName) =>
logEvent.Properties.TryGetValue(propertyName, out var value) && value is ScalarValue scalar
? scalar.Value?.ToString()
: null;
/// <summary>
/// Builds a logger wired with <paramref name="redactor"/>, logs one event, and returns it —
/// the shared harness for the redaction tests.
/// </summary>
private static LogEvent LogWith(ILogRedactor redactor, string template, params object?[] args)
{
var serviceProvider = new ServiceCollection()
.AddSingleton<ILogRedactor>(redactor)
.BuildServiceProvider();
var sink = new InMemorySink();
var options = new ZbTelemetryOptions { ServiceName = "mxgateway" };
var loggerConfig = new LoggerConfiguration();
ZbSerilogConfig.Apply(loggerConfig, options, serviceProvider);
using Logger logger = loggerConfig.WriteTo.Sink(sink).CreateLogger();
logger.Information(template, args);
return Assert.Single(sink.LogEvents);
}
[Fact]
public void Registered_redactor_masks_sensitive_property()
{
@@ -68,6 +181,140 @@ public sealed class RedactionTests
Assert.Equal("mxgw_secret", ScalarOrNull(logEvent, "apiKey"));
}
/// <summary>
/// Telemetry-001: a redactor that REMOVES a key (the most natural way to implement "must not
/// leave the process") must result in the property being absent from the emitted event, not
/// silently retained.
/// </summary>
[Fact]
public void Removing_redactor_scrubs_the_property_from_the_event()
{
var serviceProvider = new ServiceCollection()
.AddSingleton<ILogRedactor>(new RemovingRedactor("apiKey"))
.BuildServiceProvider();
var sink = new InMemorySink();
var options = new ZbTelemetryOptions { ServiceName = "mxgateway" };
var loggerConfig = new LoggerConfiguration();
ZbSerilogConfig.Apply(loggerConfig, options, serviceProvider);
using Logger logger = loggerConfig.WriteTo.Sink(sink).CreateLogger();
logger.Information("authenticating {apiKey} for {user}", "mxgw_secret", "alice");
var logEvent = Assert.Single(sink.LogEvents);
Assert.False(
logEvent.Properties.ContainsKey("apiKey"),
"apiKey must be removed from the event when the redactor removes the key");
// A non-sensitive property the redactor left alone must survive.
Assert.Equal("alice", ScalarOrNull(logEvent, "user"));
}
/// <summary>
/// Telemetry-002: a redactor can reach a field nested inside a destructured ({@Object})
/// property and mask it, leaving sibling fields and the structure's type tag intact.
/// </summary>
[Fact]
public void Redactor_masks_a_field_inside_a_destructured_object()
{
var logEvent = LogWith(
new NestedFieldRedactor(),
"dispatching {@command}",
new Command("Write", "mxgw_secret"));
var structure = Assert.IsType<StructureValue>(logEvent.Properties["command"]);
var fields = structure.Properties.ToDictionary(p => p.Name, p => p.Value);
Assert.Equal(Masked, (fields["ApiKey"] as ScalarValue)?.Value?.ToString());
Assert.Equal("Write", (fields["Name"] as ScalarValue)?.Value?.ToString());
Assert.Equal(nameof(Command), structure.TypeTag);
}
/// <summary>
/// Telemetry-002: a redactor can remove a field nested inside a destructured object; the
/// field is dropped from the rebuilt structure while siblings survive.
/// </summary>
[Fact]
public void Redactor_removes_a_field_inside_a_destructured_object()
{
var logEvent = LogWith(
new NestedRemovingRedactor(),
"dispatching {@command}",
new Command("Write", "mxgw_secret"));
var structure = Assert.IsType<StructureValue>(logEvent.Properties["command"]);
var names = structure.Properties.Select(p => p.Name).ToList();
Assert.DoesNotContain("ApiKey", names);
Assert.Contains("Name", names);
}
/// <summary>Telemetry-002: a redactor can mask an element nested inside a destructured sequence.</summary>
[Fact]
public void Redactor_masks_an_element_inside_a_destructured_sequence()
{
var logEvent = LogWith(
new SequenceElementRedactor(),
"items {@items}",
(object)new[] { "keep", "secret-token" });
var sequence = Assert.IsType<SequenceValue>(logEvent.Properties["items"]);
var elements = sequence.Elements.Select(e => (e as ScalarValue)?.Value?.ToString()).ToList();
Assert.Equal(new[] { "keep", Masked }, elements);
}
/// <summary>Telemetry-002: a redactor can mask a value nested inside a destructured dictionary.</summary>
[Fact]
public void Redactor_masks_a_value_inside_a_destructured_dictionary()
{
var map = new Dictionary<string, string> { ["user"] = "alice", ["token"] = "secret" };
var logEvent = LogWith(new DictionaryValueRedactor(), "map {@map}", map);
var dictionary = Assert.IsType<DictionaryValue>(logEvent.Properties["map"]);
var pairs = dictionary.Elements.ToDictionary(
e => (string)e.Key.Value!,
e => (e.Value as ScalarValue)?.Value?.ToString());
Assert.Equal("alice", pairs["user"]);
Assert.Equal(Masked, pairs["token"]);
}
/// <summary>Telemetry-002: nested reach is recursive — a field two levels deep is reachable.</summary>
[Fact]
public void Redactor_masks_a_field_two_levels_deep()
{
var logEvent = LogWith(
new DeepFieldRedactor(),
"payload {@payload}",
new Envelope(new Inner("secret", "keep")));
var outer = Assert.IsType<StructureValue>(logEvent.Properties["payload"]);
var innerValue = outer.Properties.Single(p => p.Name == "Inner").Value;
var inner = Assert.IsType<StructureValue>(innerValue);
var fields = inner.Properties.ToDictionary(p => p.Name, p => p.Value);
Assert.Equal(Masked, (fields["ApiKey"] as ScalarValue)?.Value?.ToString());
Assert.Equal("keep", (fields["Note"] as ScalarValue)?.Value?.ToString());
}
/// <summary>
/// A structured property the redactor does not touch must survive redaction unchanged
/// (siblings, values, and type tag intact) even while a scalar property is masked.
/// </summary>
[Fact]
public void Untouched_structured_property_survives_redaction_intact()
{
var logEvent = LogWith(
new FakeRedactor(),
"dispatching {apiKey} {@command}",
"mxgw_secret",
new Command("Write", "open"));
Assert.Equal(Masked, ScalarOrNull(logEvent, "apiKey"));
var structure = Assert.IsType<StructureValue>(logEvent.Properties["command"]);
var fields = structure.Properties.ToDictionary(p => p.Name, p => p.Value);
Assert.Equal("Write", (fields["Name"] as ScalarValue)?.Value?.ToString());
Assert.Equal("open", (fields["ApiKey"] as ScalarValue)?.Value?.ToString());
Assert.Equal(nameof(Command), structure.TypeTag);
}
[Fact]
public void AddZbSerilog_with_otlp_options_builds_without_error()
{
@@ -122,6 +369,32 @@ public sealed class RedactionTests
Assert.Equal("central", attributes["node.role"]);
}
/// <summary>
/// Telemetry-005: the Serilog OTLP log-sink attribute map and the OTel SDK metrics/traces
/// attribute map must be key-for-key and value-for-value identical, because both now derive from
/// the single <see cref="ZbResource.BuildAttributes"/> source of truth. This pins that they
/// cannot silently drift apart.
/// </summary>
[Fact]
public void Serilog_and_OTel_resource_attribute_sets_are_identical()
{
var options = new ZbTelemetryOptions
{
ServiceName = "mxgateway",
ServiceNamespace = "ZB.MOM.WW",
ServiceVersion = "9.9.9",
SiteId = "site-z",
NodeRole = "hub",
};
var serilogAttributes = ZbSerilogConfig.BuildResourceAttributes(options);
var canonical = ZbResource.BuildAttributes(options);
Assert.Equal(
canonical.OrderBy(kvp => kvp.Key, StringComparer.Ordinal),
serilogAttributes.OrderBy(kvp => kvp.Key, StringComparer.Ordinal));
}
[Fact]
public void BuildResourceAttributes_omits_optional_keys_when_not_set()
{
@@ -37,6 +37,55 @@ public sealed class AddZbTelemetryTests
Assert.Equal("configure", ex.ParamName);
}
// Telemetry-006: malformed/missing OtlpEndpoint must fail fast with a clear, named error
// instead of a late UriFormatException deep inside exporter construction.
[Fact]
public void AddZbTelemetry_Throws_WhenOtlpExporterHasMalformedEndpoint()
{
var builder = WebApplication.CreateBuilder();
var ex = Assert.Throws<ArgumentException>(() =>
builder.AddZbTelemetry(o =>
{
o.ServiceName = "telemetry-test";
o.Exporter = ZbExporter.Otlp;
o.OtlpEndpoint = "not a uri"; // missing scheme — not an absolute URI
}));
Assert.Equal("configure", ex.ParamName);
Assert.Contains("OtlpEndpoint", ex.Message);
}
[Fact]
public void AddZbTelemetry_Throws_WhenOtlpExporterHasNoEndpoint()
{
var builder = WebApplication.CreateBuilder();
var ex = Assert.Throws<ArgumentException>(() =>
builder.AddZbTelemetry(o =>
{
o.ServiceName = "telemetry-test";
o.Exporter = ZbExporter.Otlp;
// OtlpEndpoint left null
}));
Assert.Equal("configure", ex.ParamName);
Assert.Contains("OtlpEndpoint", ex.Message);
}
[Fact]
public void AddZbTelemetry_DoesNotValidateEndpoint_WhenExporterIsPrometheus()
{
// A stray (even malformed) endpoint is harmless under the Prometheus exporter and must not
// be validated — it is ignored.
var builder = WebApplication.CreateBuilder();
var ex = Record.Exception(() =>
builder.AddZbTelemetry(o =>
{
o.ServiceName = "telemetry-test";
o.Exporter = ZbExporter.Prometheus;
o.OtlpEndpoint = "not a uri";
}));
Assert.Null(ex);
}
// Fix #1: Prometheus coexists with OTLP — /metrics must still serve under Otlp exporter
[Fact]
@@ -48,8 +97,10 @@ public sealed class AddZbTelemetryTests
{
o.ServiceName = "telemetry-test";
o.Exporter = ZbExporter.Otlp;
// OtlpEndpoint intentionally left null — exporter will be registered but won't
// connect anywhere; we are only verifying Prometheus remains present.
// A well-formed endpoint is required under the Otlp exporter (Telemetry-006); the
// exporter is registered but won't connect anywhere in the test. We are only verifying
// Prometheus remains present.
o.OtlpEndpoint = "http://localhost:4317";
o.Meters = ["Test.OtlpCoexist.Meter"];
});
+11 -1
View File
@@ -7,6 +7,11 @@ woff2 fonts, and side-rail layout CSS — all served from `_content/ZB.MOM.WW.Th
plus a set of Blazor SSR components that carry no inline colours and reuse the token
classes. Bootstrap 5 is **not** vendored; each app keeps its own Bootstrap link.
The kit ships **no JavaScript**: every interactive affordance — the narrow-viewport
side-rail hamburger and the collapsible nav sections — is a native CSS-only
`<details>`/`<summary>` disclosure, so the chrome works unchanged in static Blazor SSR
with only Bootstrap **CSS** present (no Bootstrap collapse JS bundle required).
## Adopt
1. Reference the NuGet package in your app; keep your own Bootstrap 5 `<link>` in `App.razor`.
@@ -83,11 +88,16 @@ Served at `_content/ZB.MOM.WW.Theme/…` by the ASP.NET static-web-asset pipelin
`theme.css` declares `@font-face` with `url('../fonts/…')` — correct relative path from
`css/` to `fonts/`. (OtOpcUa's original `url('fonts/…')` was a latent 404; the kit fixes it.)
All colours are CSS custom-property tokens declared once in `theme.css`'s `:root` block —
no hardcoded hex appears outside it. Derived border/hover/wash shades (e.g. `--ok-border`,
`--warn-ink`, `--info-bg`, `--hover-bg`, `--active-bg`, `--zebra-bg`) are named tokens too,
so a token-block edit re-themes the whole kit.
## Build
```bash
# from ZB.MOM.WW.Theme/
dotnet build -c Release # TreatWarningsAsErrors — expect 0 warnings
dotnet test # 32 bUnit tests
dotnet test # 38 bUnit tests
./build/pack.sh # → ./artifacts/ZB.MOM.WW.Theme.0.1.0.nupkg
```
@@ -1,3 +1,19 @@
namespace ZB.MOM.WW.Theme;
public enum ButtonVariant { Primary, Secondary, Danger, Ghost }
/// <summary>
/// Visual variant for <c>TechButton</c>, mapped to a Bootstrap <c>.btn-*</c> class.
/// </summary>
public enum ButtonVariant
{
/// <summary>Primary action — emits <c>btn-primary</c> (filled accent button).</summary>
Primary,
/// <summary>Secondary action — emits <c>btn-outline-secondary</c> (outlined button).</summary>
Secondary,
/// <summary>Destructive action — emits <c>btn-danger</c> (filled red button).</summary>
Danger,
/// <summary>Low-emphasis action — emits <c>btn-link</c> (text-only button).</summary>
Ghost,
}
@@ -7,6 +7,11 @@
</div>
@code {
/// <summary>Product name displayed next to the brand glyph. Required.</summary>
[Parameter, EditorRequired] public string Product { get; set; } = string.Empty;
/// <summary>
/// Optional custom logo. When provided it replaces the default brand glyph (<c>▐</c>).
/// </summary>
[Parameter] public RenderFragment? Logo { get; set; }
}
@@ -36,7 +36,12 @@
</div>
@code {
/// <summary>Product name shown in the card heading. Required.</summary>
[Parameter, EditorRequired] public string Product { get; set; } = string.Empty;
/// <summary>
/// Form <c>action</c> URL the sign-in POST targets. Defaults to <c>/auth/login</c>.
/// </summary>
[Parameter] public string Action { get; set; } = "/auth/login";
/// <summary>
@@ -47,6 +52,9 @@
/// </summary>
[Parameter] public string? ReturnUrl { get; set; }
/// <summary>
/// Optional error message shown as a notice above the submit button (e.g. a failed login).
/// </summary>
[Parameter] public string? Error { get; set; }
/// <summary>
@@ -6,8 +6,21 @@
</NavLink>
@code {
/// <summary>Link target (the <c>href</c> of the underlying <c>NavLink</c>). Required.</summary>
[Parameter, EditorRequired] public string Href { get; set; } = string.Empty;
/// <summary>Visible label text for the rail link. Required.</summary>
[Parameter, EditorRequired] public string Text { get; set; } = string.Empty;
/// <summary>
/// Optional leading icon, rendered in a <c>.rail-ico</c> span before the label.
/// </summary>
[Parameter] public RenderFragment? Icon { get; set; }
/// <summary>
/// Active-class matching behaviour passed to the underlying <c>NavLink</c>.
/// Defaults to <see cref="NavLinkMatch.Prefix"/>; use <see cref="NavLinkMatch.All"/>
/// for exact matches such as the home/overview link.
/// </summary>
[Parameter] public NavLinkMatch Match { get; set; } = NavLinkMatch.Prefix;
}
@@ -2,12 +2,24 @@
Apps that want cookie-persisted expand state keep their own interactive NavSection. *@
@namespace ZB.MOM.WW.Theme
<details class="rail-section" open="@Expanded">
<summary class="rail-eyebrow-toggle"><span class="rail-eyebrow-label">@Title</span></summary>
<summary class="rail-eyebrow-toggle">@Title</summary>
<div class="rail-section-body">@ChildContent</div>
</details>
@code {
/// <summary>
/// Eyebrow label shown in the section's <c>&lt;summary&gt;</c> header. Required.
/// </summary>
[Parameter, EditorRequired] public string Title { get; set; } = string.Empty;
/// <summary>
/// Initial open state of the collapsible section. Defaults to <c>true</c> (expanded).
/// Backed by the native <c>&lt;details open&gt;</c> attribute — no JavaScript required.
/// </summary>
[Parameter] public bool Expanded { get; set; } = true;
/// <summary>
/// Section content — typically <see cref="NavRailItem"/> children.
/// </summary>
[Parameter] public RenderFragment? ChildContent { get; set; }
}
@@ -3,7 +3,13 @@
<span class="chip @ChipClass">@ChildContent</span>
@code {
/// <summary>
/// Status the chip represents. Mapped to a token-based <c>.chip-*</c> class
/// (e.g. <see cref="StatusState.Ok"/> → <c>chip-ok</c>). Required.
/// </summary>
[Parameter, EditorRequired] public StatusState State { get; set; }
/// <summary>Chip label content (e.g. the status text).</summary>
[Parameter] public RenderFragment? ChildContent { get; set; }
private string ChipClass => State switch
@@ -6,10 +6,28 @@
</button>
@code {
/// <summary>
/// Visual variant, mapped to a Bootstrap <c>.btn-*</c> class.
/// Defaults to <see cref="ButtonVariant.Primary"/>.
/// </summary>
[Parameter] public ButtonVariant Variant { get; set; } = ButtonVariant.Primary;
/// <summary>HTML <c>type</c> attribute (<c>button</c>, <c>submit</c>, <c>reset</c>). Defaults to <c>button</c>.</summary>
[Parameter] public string Type { get; set; } = "button";
/// <summary>
/// When <c>true</c>, disables the button and shows an inline spinner — use while an
/// async action is in flight. Defaults to <c>false</c>.
/// </summary>
[Parameter] public bool Busy { get; set; }
/// <summary>Button label content.</summary>
[Parameter] public RenderFragment? ChildContent { get; set; }
/// <summary>
/// Arbitrary additional HTML attributes splatted onto the underlying
/// <c>&lt;button&gt;</c> (e.g. <c>id</c>, <c>@onclick</c>, <c>name</c>, <c>value</c>).
/// </summary>
[Parameter(CaptureUnmatchedValues = true)] public IDictionary<string, object>? AdditionalAttributes { get; set; }
private string VariantClass => Variant switch
@@ -8,9 +8,23 @@
</section>
@code {
/// <summary>
/// Optional string title rendered in the panel header. Ignored when <see cref="Header"/>
/// is supplied (<see cref="Header"/> takes precedence).
/// </summary>
[Parameter] public string? Title { get; set; }
/// <summary>
/// Optional custom panel header content. Takes precedence over <see cref="Title"/>.
/// </summary>
[Parameter] public RenderFragment? Header { get; set; }
/// <summary>Panel body content.</summary>
[Parameter] public RenderFragment? ChildContent { get; set; }
/// <summary>Optional panel footer content (padded, top-bordered).</summary>
[Parameter] public RenderFragment? Footer { get; set; }
/// <summary>Additional CSS classes appended to the root <c>&lt;section class="panel"&gt;</c>.</summary>
[Parameter] public string? Class { get; set; }
}
@@ -8,8 +8,17 @@
</div>
@code {
/// <summary>Field label text. Required.</summary>
[Parameter, EditorRequired] public string Label { get; set; } = string.Empty;
/// <summary>Optional hint text rendered as <c>.form-text</c> below the input.</summary>
[Parameter] public string? Hint { get; set; }
/// <summary>
/// Optional inline error message rendered as <c>.field-error.s-bad</c> below the input.
/// </summary>
[Parameter] public string? Error { get; set; }
/// <summary>The control itself — an <c>&lt;input&gt;</c>, <c>&lt;select&gt;</c>, etc.</summary>
[Parameter] public RenderFragment? ChildContent { get; set; }
}
@@ -1,13 +1,15 @@
@* Components/ThemeShell.razor — the one canonical side-rail chassis.
Not a LayoutComponentBase: the app's thin MainLayout delegates to this. *@
Not a LayoutComponentBase: the app's thin MainLayout delegates to this.
The narrow-viewport hamburger uses a CSS-only <details> disclosure (no
Bootstrap JS): the rail is force-shown on lg+ and toggles via the native
<details> open state below lg, so it works in pure static SSR. *@
@namespace ZB.MOM.WW.Theme
<div class="app-shell d-flex flex-column flex-lg-row" style="@AccentStyle">
<button class="btn btn-outline-secondary btn-sm d-lg-none m-2 align-self-start"
type="button" data-bs-toggle="collapse" data-bs-target="#theme-rail"
aria-controls="theme-rail" aria-expanded="false" aria-label="Toggle navigation">
<details class="app-shell d-flex flex-column flex-lg-row" style="@AccentStyle">
<summary class="rail-toggle btn btn-outline-secondary btn-sm d-lg-none m-2 align-self-start"
aria-label="Toggle navigation">
&#9776;
</button>
<div class="collapse d-lg-block" id="theme-rail">
</summary>
<div class="theme-rail" id="theme-rail">
<nav class="side-rail">
<BrandBar Product="@Product" Logo="@Logo" />
@Nav
@@ -18,14 +20,41 @@
</nav>
</div>
<main class="page">@ChildContent</main>
</div>
</details>
@code {
/// <summary>
/// Product name shown in the rail's <see cref="BrandBar"/> header. Required.
/// </summary>
[Parameter, EditorRequired] public string Product { get; set; } = string.Empty;
/// <summary>
/// Optional per-app accent colour. When set, emitted as <c>style="--accent: &lt;value&gt;"</c>
/// on the shell root, overriding the <c>--accent</c> design token for this subtree only.
/// This is the single per-app token override the kit allows (e.g. <c>"#2f855a"</c>).
/// </summary>
[Parameter] public string? Accent { get; set; }
/// <summary>
/// Optional custom logo rendered in the rail header. When omitted, the default
/// brand glyph is shown.
/// </summary>
[Parameter] public RenderFragment? Logo { get; set; }
/// <summary>
/// Rail navigation content — typically <see cref="NavRailSection"/> /
/// <see cref="NavRailItem"/> elements.
/// </summary>
[Parameter] public RenderFragment? Nav { get; set; }
/// <summary>
/// Optional content pinned to the bottom of the rail (e.g. session info / sign-out).
/// </summary>
[Parameter] public RenderFragment? RailFooter { get; set; }
/// <summary>
/// The page body — normally <c>@Body</c> forwarded from the app's <c>MainLayout</c>.
/// </summary>
[Parameter] public RenderFragment? ChildContent { get; set; }
private string? AccentStyle => Accent is null ? null : $"--accent: {Accent}";
@@ -1,3 +1,23 @@
namespace ZB.MOM.WW.Theme;
public enum StatusState { Ok, Warn, Bad, Idle, Info }
/// <summary>
/// Status conveyed by a <c>StatusPill</c> chip, mapped to a token-based
/// <c>.chip-*</c> class (status is communicated by colour, not iconography).
/// </summary>
public enum StatusState
{
/// <summary>Success / healthy / connected — emits <c>chip-ok</c> (green).</summary>
Ok,
/// <summary>Warning / degraded — emits <c>chip-warn</c> (amber).</summary>
Warn,
/// <summary>Error / faulted / disconnected — emits <c>chip-bad</c> (red).</summary>
Bad,
/// <summary>Unknown / offline / neutral — emits <c>chip-idle</c> (grey). The default.</summary>
Idle,
/// <summary>Informational — emits <c>chip-info</c> (on-palette blue).</summary>
Info,
}
@@ -2,6 +2,13 @@
<PropertyGroup>
<RootNamespace>ZB.MOM.WW.Theme</RootNamespace>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<!-- Emit the XML doc file so missing-doc (CS1591) warnings on the public API
surface under TreatWarningsAsErrors. Razor compiles each component to a
public class whose generated members carry no docs, so CS1591 is left
out of the error set (it would flag generated members, not authored ones)
while still producing the doc XML consumers see at the call site. -->
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<NoWarn>$(NoWarn);CS1591</NoWarn>
<IsPackable>true</IsPackable>
<PackageId>ZB.MOM.WW.Theme</PackageId>
<Description>Shared Technical-Light UI kit (tokens, fonts, side-rail shell, widgets) for the ZB.MOM.WW SCADA family.</Description>
@@ -2,9 +2,13 @@
Tokens live in theme.css; this sheet carries only layout + the side rail. */
/* ── App shell: side rail + page ─────────────────────────────────────────── */
/* The outer flex direction is supplied by Bootstrap utilities on the wrapper
(`d-flex flex-column flex-lg-row`) so the mobile hamburger row stacks above
the rail on <lg viewports and the rail sits beside the page on lg+. */
/* The shell is a native <details> disclosure so the narrow-viewport hamburger
toggle works with NO JavaScript (no Bootstrap collapse bundle): below lg the
<summary> hamburger opens/closes the rail; on lg+ the rail is force-shown
regardless of the open state (see the lg media query below). The outer flex
direction is supplied by Bootstrap utilities on the wrapper
(`d-flex flex-column flex-lg-row`) so the hamburger row stacks above the rail
on <lg viewports and the rail sits beside the page on lg+. */
.app-shell {
align-items: stretch;
min-height: calc(100vh - 3.3rem);
@@ -15,6 +19,18 @@
min-width: 0;
}
/* Hamburger <summary>: hide the native disclosure triangle so it reads as a
plain button (the styling comes from the Bootstrap .btn utilities). */
.app-shell > summary.rail-toggle {
list-style: none;
cursor: pointer;
}
.app-shell > summary.rail-toggle::-webkit-details-marker { display: none; }
/* Below lg the native <details> shows/hides #theme-rail via [open]; nothing
extra is needed there. On lg+ we force the rail visible (see media query)
and keep the toggle hidden via Bootstrap's d-lg-none. */
/* ── Side rail ───────────────────────────────────────────────────────────── */
.side-rail {
width: 220px;
@@ -27,9 +43,12 @@
border-right: 1px solid var(--rule-strong);
}
/* On lg+ keep the side rail pinned so it stays visible when content scrolls. */
/* On lg+ keep the side rail pinned so it stays visible when content scrolls,
and force it shown regardless of the <details> open state (the hamburger
toggle is hidden at this width). */
@media (min-width: 992px) {
#theme-rail {
display: block;
position: sticky;
top: 0;
height: 100vh;
@@ -38,8 +57,9 @@
}
}
/* When the side rail is collapsed under <lg viewports the Bootstrap collapse
container removes the fixed width; restore full width on mobile. */
/* When the side rail is collapsed under <lg viewports (the <details> is closed)
the native disclosure hides #theme-rail; when open, restore full width on
mobile so the rail spans the viewport. */
@media (max-width: 991.98px) {
.side-rail {
width: 100%;
@@ -72,17 +92,9 @@
}
.side-rail .brand .mark { color: var(--accent); }
.rail-eyebrow {
font-size: 0.68rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.07em;
color: var(--ink-faint);
padding: 0.3rem 0.6rem;
}
/* Collapsible variant — rendered by NavRailSection. Looks like .rail-eyebrow
plus a leading chevron; clicking flips chevron + expanded state. */
/* Section eyebrow — the <summary> label of a NavRailSection. The collapse
chevron is supplied by the `.rail-section > summary::before` rule below; this
rule styles the uppercase eyebrow text itself. */
.rail-eyebrow-toggle {
display: flex;
align-items: center;
@@ -100,12 +112,6 @@
cursor: pointer;
}
.rail-eyebrow-toggle:hover { color: var(--ink); }
.rail-eyebrow-chevron {
display: inline-block;
width: 0.7rem;
font-size: 0.55rem;
color: var(--ink-faint);
}
.rail-section-body {
display: flex;
flex-direction: column;
@@ -120,17 +126,28 @@
color: var(--ink-soft);
}
.rail-link:hover {
background: #f3f6fd;
background: var(--hover-bg);
color: var(--ink);
text-decoration: none;
}
.rail-link.active {
background: #eef2fc;
background: var(--active-bg);
border-left-color: var(--accent);
color: var(--accent-deep);
font-weight: 600;
}
/* Optional leading icon inside a rail link (NavRailItem's Icon slot). Sizes and
aligns the icon and gives it a gap from the label text. */
.rail-ico {
display: inline-flex;
align-items: center;
justify-content: center;
width: 1rem;
margin-right: 0.4rem;
color: var(--ink-faint);
}
/* ── Session block, pinned to the rail foot ──────────────────────────────── */
.rail-foot {
margin-top: auto;
@@ -180,8 +197,8 @@
.rail-section > summary::before { content: '\25B6'; font-size: 0.55rem; color: var(--ink-faint); margin-right: 0.4rem; }
.rail-section[open] > summary::before { content: '\25BC'; }
/* StatusPill: info variant (on-palette, reuses dir-read colours) */
.chip-info { color: var(--accent-deep); background: #e7ecfb; border-color: #cdd9f7; }
/* StatusPill: info variant (on-palette, reuses the info blue wash) */
.chip-info { color: var(--accent-deep); background: var(--info-bg); border-color: var(--info-border); }
/* TechCard body/footer padding; TechField error; LoginCard body */
.panel-body { padding: 0.85rem 0.9rem; }
@@ -62,6 +62,29 @@
--bad-bg: #fceaea;
--idle-bg: #eef0f2;
/* Status — derived border tints (lighter than the foreground; for chips,
pills, and tinted cards). Pair with the matching -bg above. */
--ok-border: #c6e6cd;
--warn-border: #efd6a6;
--bad-border: #eec3c3;
--ok-border-soft: #bfe3c6; /* conn pill (slightly lighter) */
--warn-border-soft: #f0d9ab; /* conn pill (slightly lighter) */
--bad-border-soft: #f0c0c0; /* conn pill (slightly lighter) */
/* Warning ink — a deeper amber than --warn, for chip/notice/value text that
needs more contrast on the warm tint than the pure --warn would give. */
--warn-ink: #b56a00;
--warn-ink-deep: #8a5a00;
/* Info / accent tint — on-palette blue wash (StatusPill Info, dir-read tag) */
--info-bg: #e7ecfb;
--info-border: #cdd9f7;
/* Neutral surface washes — derived from --paper/--accent, no own foreground */
--zebra-bg: #fbfbf9; /* even-row / sticky-head fill — barely-there */
--hover-bg: #f3f6fd; /* row / rail-link hover — faint cool wash */
--active-bg: #eef2fc; /* active rail-link fill */
/* Type stacks — Plex first, graceful system fallback */
--mono: 'IBM Plex Mono', ui-monospace, 'Cascadia Mono', Consolas, monospace;
--sans: 'IBM Plex Sans', system-ui, -apple-system, 'Segoe UI', sans-serif;
@@ -81,7 +104,7 @@
not a gradient wash. Keep it subtle. */
body {
background:
radial-gradient(1200px 480px at 88% -8%, #ffffff 0%, rgba(255,255,255,0) 70%),
radial-gradient(1200px 480px at 88% -8%, var(--card) 0%, rgba(255,255,255,0) 70%),
var(--paper);
color: var(--ink);
font-family: var(--sans);
@@ -142,11 +165,11 @@ a:hover { color: var(--accent-deep); text-decoration: underline; }
width: 7px; height: 7px; border-radius: 50%;
background: var(--idle);
}
.conn-pill[data-state="connected"] { color: var(--ok); border-color: #bfe3c6; background: var(--ok-bg); }
.conn-pill[data-state="connected"] { color: var(--ok); border-color: var(--ok-border-soft); background: var(--ok-bg); }
.conn-pill[data-state="connected"] .dot { background: var(--ok); }
.conn-pill[data-state="connecting"] { color: var(--warn); border-color: #f0d9ab; background: var(--warn-bg); }
.conn-pill[data-state="connecting"] { color: var(--warn); border-color: var(--warn-border-soft); background: var(--warn-bg); }
.conn-pill[data-state="connecting"] .dot { background: var(--warn); animation: pulse 1.1s ease-in-out infinite; }
.conn-pill[data-state="disconnected"] { color: var(--bad); border-color: #f0c0c0; background: var(--bad-bg); }
.conn-pill[data-state="disconnected"] { color: var(--bad); border-color: var(--bad-border-soft); background: var(--bad-bg); }
.conn-pill[data-state="disconnected"] .dot { background: var(--bad); }
@keyframes pulse { 0%,100% { opacity: 1; } 50% { opacity: 0.25; } }
@@ -171,10 +194,10 @@ a:hover { color: var(--accent-deep); text-decoration: underline; }
border-radius: 4px;
border: 1px solid transparent;
}
.chip-ok { color: var(--ok); background: var(--ok-bg); border-color: #c6e6cd; }
.chip-warn { color: #b56a00; background: var(--warn-bg); border-color: #efd6a6; }
.chip-bad { color: var(--bad); background: var(--bad-bg); border-color: #eec3c3; }
.chip-idle { color: var(--ink-soft); background: var(--idle-bg); border-color: var(--rule-strong); }
.chip-ok { color: var(--ok); background: var(--ok-bg); border-color: var(--ok-border); }
.chip-warn { color: var(--warn-ink); background: var(--warn-bg); border-color: var(--warn-border); }
.chip-bad { color: var(--bad); background: var(--bad-bg); border-color: var(--bad-border); }
.chip-idle { color: var(--idle); background: var(--idle-bg); border-color: var(--rule-strong); }
/* ── Panel — the base raised surface ─────────────────────────────────────────
A white card with a hairline border and 8px radius. .panel-head is the
@@ -248,10 +271,10 @@ a:hover { color: var(--accent-deep); text-decoration: underline; }
font-weight: 400;
color: var(--ink-faint);
}
.agg-card.alert { border-color: #eec3c3; background: var(--bad-bg); }
.agg-card.alert { border-color: var(--bad-border); background: var(--bad-bg); }
.agg-card.alert .agg-value { color: var(--bad); }
.agg-card.caution { border-color: #efd6a6; background: var(--warn-bg); }
.agg-card.caution .agg-value { color: #b56a00; }
.agg-card.caution { border-color: var(--warn-border); background: var(--warn-bg); }
.agg-card.caution .agg-value { color: var(--warn-ink); }
/* ── Metric card + key/value rows ────────────────────────────────────────────
A .panel-head over a stack of .kv rows: label left, monospace value right.
@@ -278,7 +301,7 @@ a:hover { color: var(--accent-deep); text-decoration: underline; }
padding: 0.32rem 0.9rem;
font-size: 0.85rem;
}
.kv:nth-child(even) { background: #fbfbf9; }
.kv:nth-child(even) { background: var(--zebra-bg); }
.kv .k { color: var(--ink-soft); }
.kv .v {
font-family: var(--mono);
@@ -332,7 +355,7 @@ a:hover { color: var(--accent-deep); text-decoration: underline; }
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--ink-faint);
background: #fbfbf9;
background: var(--zebra-bg);
position: sticky;
top: 0;
}
@@ -345,7 +368,7 @@ a:hover { color: var(--accent-deep); text-decoration: underline; }
.data-table th.sorted-desc::after { content: ' \2193'; color: var(--accent); }
.data-table tbody tr { cursor: pointer; transition: background 0.08s; }
.data-table tbody tr:hover { background: #f3f6fd; }
.data-table tbody tr:hover { background: var(--hover-bg); }
.data-table tbody tr:last-child td { border-bottom: none; }
.empty-row {
@@ -365,15 +388,15 @@ a:hover { color: var(--accent-deep); text-decoration: underline; }
padding: 0.1rem 0.4rem;
border-radius: 3px;
}
.dir-read { color: var(--accent-deep); background: #e7ecfb; }
.dir-write { color: #8a5a00; background: var(--warn-bg); }
.dir-read { color: var(--accent-deep); background: var(--info-bg); }
.dir-write { color: var(--warn-ink-deep); background: var(--warn-bg); }
/* ── Inline notice ───────────────────────────────────────────────────────────
A .panel with a warning tint — for "this thing is gone / degraded" banners. */
.notice {
padding: 0.85rem 1.1rem;
margin-bottom: 1rem;
color: #b56a00;
color: var(--warn-ink);
background: var(--warn-bg);
border-color: #efd6a6;
border-color: var(--warn-border);
}
@@ -13,6 +13,28 @@ public class NavRailTests : TestContext
Assert.Contains("Clusters", a.TextContent);
}
// Theme-004: when Icon is supplied it is wrapped in a .rail-ico span (now styled).
[Fact]
public void NavRailItem_wraps_icon_in_rail_ico_span_when_supplied()
{
var cut = RenderComponent<NavRailItem>(p => p
.Add(x => x.Href, "/clusters")
.Add(x => x.Text, "Clusters")
.Add(x => x.Icon, (RenderFragment)(b => b.AddMarkupContent(0, "<svg class='ico'/>"))));
var ico = cut.Find("a.rail-link .rail-ico");
Assert.NotNull(ico);
Assert.NotNull(cut.Find("a.rail-link .rail-ico .ico"));
}
[Fact]
public void NavRailItem_omits_rail_ico_span_when_no_icon()
{
var cut = RenderComponent<NavRailItem>(p => p
.Add(x => x.Href, "/clusters")
.Add(x => x.Text, "Clusters"));
Assert.Empty(cut.FindAll(".rail-ico"));
}
[Fact]
public void NavRailSection_renders_title_and_children_open_by_default()
{
@@ -1,4 +1,6 @@
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
namespace ZB.MOM.WW.Theme.Tests;
@@ -32,4 +34,36 @@ public class StaticAssetsTests
[InlineData("ibm-plex-mono-500.woff2")]
public void Fonts_are_vendored(string file) =>
Assert.True(File.Exists(Path.Combine(Wwwroot, "fonts", file)));
// Theme-002: .chip-idle pairs the idle background with the matching --idle
// foreground token (per DESIGN-TOKENS.md), not --ink-soft.
[Fact]
public void ChipIdle_pairs_idle_foreground_with_idle_background()
{
var css = File.ReadAllText(Path.Combine(Wwwroot, "css", "theme.css"));
var rule = Regex.Match(css, @"\.chip-idle\s*\{[^}]*\}").Value;
Assert.Contains("color: var(--idle)", rule);
Assert.Contains("background: var(--idle-bg)", rule);
Assert.DoesNotContain("--ink-soft", rule);
}
// Theme-003: no hardcoded hex colours appear outside the :root token block in
// either stylesheet — every shade is a named token.
[Theory]
[InlineData("theme.css")]
[InlineData("layout.css")]
public void No_hardcoded_hex_outside_root_token_block(string file)
{
var css = File.ReadAllText(Path.Combine(Wwwroot, "css", file));
// Strip the :root { ... } declaration block(s) — the one place hex literals live.
var withoutRoot = Regex.Replace(css, @":root\s*\{[^}]*\}", string.Empty);
var hexLiterals = Regex.Matches(withoutRoot, @"#[0-9a-fA-F]{3,8}\b")
.Select(m => m.Value)
.ToList();
Assert.True(hexLiterals.Count == 0,
$"{file} has hardcoded hex outside :root: {string.Join(", ", hexLiterals)}");
}
}
@@ -40,4 +40,26 @@ public class ThemeShellTests : TestContext
.Add(x => x.RailFooter, (RenderFragment)(b => b.AddMarkupContent(0, "<span class='sess'>S</span>"))));
Assert.NotNull(cut.Find(".rail-foot .sess"));
}
// Theme-001: the mobile hamburger must work without Bootstrap collapse JS.
// The shell is a native <details> disclosure whose <summary> is the toggle —
// no data-bs-toggle / data-bs-target attributes, so no Bootstrap JS dependency.
[Fact]
public void Mobile_toggle_is_a_css_only_details_disclosure_not_bootstrap_collapse()
{
var cut = RenderComponent<ThemeShell>(p => p.Add(x => x.Product, "OtOpcUa"));
// Root is a <details> whose direct <summary> is the hamburger toggle.
var shell = cut.Find("details.app-shell");
var summary = cut.Find("details.app-shell > summary.rail-toggle");
Assert.NotNull(summary);
// The rail is plain markup inside the <details> — no Bootstrap .collapse class.
var rail = cut.Find("#theme-rail");
Assert.DoesNotContain("collapse", rail.ClassList);
// No Bootstrap collapse JS hooks anywhere in the shell markup.
Assert.DoesNotContain("data-bs-toggle", shell.OuterHtml);
Assert.DoesNotContain("data-bs-target", shell.OuterHtml);
}
}
+227
View File
@@ -0,0 +1,227 @@
# Code Review — Audit
| Field | Value |
|-------|-------|
| Library | `ZB.MOM.WW.Audit/` |
| Packages | `ZB.MOM.WW.Audit` |
| Component spec | `components/audit/spec/SPEC.md` |
| Shared contract | `components/audit/shared-contract/ZB.MOM.WW.Audit.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-01 |
| Reviewer | Claude (automated baseline) |
| Commit reviewed | `5f75cd4` |
| Open findings | 0 |
## Summary
`ZB.MOM.WW.Audit` is a small, clean, well-scoped library: one canonical `AuditEvent`
sealed record, a 3-value `AuditOutcome` enum, two narrow seams (`IAuditWriter`,
`IAuditRedactor`), five shipped helpers, and a single `AddZbAudit` registration. The record
matches `EVENT-MODEL.md` field-for-field (names, types, required/optional split, and the
`OccurredAtUtc` UTC-normalization contract — including value-equality on the normalized
instant). The dependency footprint is exactly as specified: the only non-BCL reference is
`Microsoft.Extensions.DependencyInjection.Abstractions` (verified against the built
`deps.json`), with no Akka/EF/SQLite/Serilog/Telemetry leakage. Guard clauses are present on
every public entry point, `RedactingAuditWriter` guarantees redaction runs before the inner
writer (no unredacted-PII leak), and `CompositeAuditWriter` isolates per-writer failures so a
throwing writer does not lose downstream events. 19 tests cover the happy paths and the most
important contract guards. No Critical or High findings. The five findings are all Medium/Low
and concern: a cancellation-propagation tension with the "never throw to the caller" writer
contract (Audit-001), a partial over-redaction path in `TruncatingAuditRedactor` (Audit-002),
mutable options on a singleton redactor (Audit-003), un-emitted XML docs (Audit-004, a
family-wide baseline pattern), and a couple of missing edge-case tests (Audit-005).
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | `Truncate` boundary logic correct for sane configs; marker-longer-than-max clips safely. Partial over-redaction edge under negative max → Audit-002. |
| 2 | Public API surface & compatibility | ☑ | Minimal, intentional surface; `sealed` records/classes; nullable annotations correct; no internal-type leakage. Options type is a mutable class, not the record the contract describes → Audit-003. |
| 3 | Concurrency & thread safety | ☑ | `AuditEvent` immutable (readonly backing field); singletons (`NoOp`/`Null`) stateless; composite snapshots its writers via `ToArray()`. Mutable singleton options noted in Audit-003. |
| 4 | Error handling & resilience | ☑ | Guard clauses on all entry points; composite swallows writer faults. OCE re-throw vs writer contract → Audit-001; partial over-redact → Audit-002. |
| 5 | Security & secret handling | ☑ | `RedactingAuditWriter` applies the redactor strictly before the inner writer — no unredacted-event leak. `Actor` kept a plain string (no Auth coupling), as specified. No secrets in code/messages. |
| 6 | Performance & resource management | ☑ | No `IDisposable` to manage; no hot-path allocations beyond a single `with`-copy per redact; DI registration is trivial. No issues found. |
| 7 | Spec & shared-contract adherence | ☑ | `AuditEvent` + `AuditOutcome` match `EVENT-MODEL.md` exactly; `AddZbAudit` uses `TryAdd` as specified. OCE re-throw deviates from the §1 writer hard rule → Audit-001. |
| 8 | Packaging, dependencies & project layout | ☑ | Sole non-BCL dep confirmed via `deps.json`; net10.0; correct PackageId/version (0.1.0). XML docs not packed → Audit-004. |
| 9 | Testing coverage | ☑ | 19 tests; public contract exercised (round-trip, UTC normalization, fan-out, cancellation, redaction ordering). Missing never-throw/over-redact and null-writer cases → Audit-005. |
| 10 | Documentation & XML docs | ☑ | XML docs present and accurate on the whole public surface; README accurate. Docs not emitted to a file → Audit-004. |
## Findings
### Audit-001 — `CompositeAuditWriter` re-throws `OperationCanceledException` to the caller, contradicting the "must not throw" writer contract
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/CompositeAuditWriter.cs:24` |
**Description**
`IAuditWriter`'s hard contract is unambiguous: "Best-effort; must not throw to the caller"
(`IAuditWriter.cs:11`), and SPEC §1 states "`WriteAsync` MUST NOT throw to the caller … a
failed write must never abort the user-facing action it is recording." The shared contract
further says a cancellation "does not constitute a contract violation" and the implementation
"may choose to complete a partially-written event anyway" — it permits ignoring the token, not
throwing on it.
`CompositeAuditWriter.WriteAsync` re-throws `OperationCanceledException`
(`catch (OperationCanceledException) { throw; }`, line 24). The very common call shape is to
pass the ambient request token, e.g. `writer.WriteAsync(evt, httpContext.RequestAborted)`. When
a client disconnects mid-request, the inner writer observes the cancelled token, throws OCE, and
the composite propagates it straight into the user-facing action — exactly the abort the seam
promises never to cause. This also directly contradicts the helper's own XML doc, which claims
"the fan-out drains and the caller is never aborted" (`CompositeAuditWriter.cs:4`). Because the
composite is the recommended way to wire multiple sinks, every consumer that uses it inherits
this behaviour; the only workaround is to not pass a request-scoped token, which is non-obvious.
Note this is a deliberate, tested choice (`CompositeAuditWriterTests.cs:41`), so it is a
design/contract conflict rather than an oversight — but the conflict is real: the writer contract
says "never throw," the composite says "drains, never aborts," and the code throws on
cancellation.
**Recommendation**
Reconcile the contract and the implementation. Either (a) honour the "never throw" rule
literally — swallow OCE like any other writer failure (drop the remaining writers or continue,
but do not propagate), and update the XML doc accordingly; or (b) if cancellation propagation is
genuinely wanted, amend the `IAuditWriter` SPEC/shared-contract to carve out OCE explicitly and
add a doc warning that consumers must not pass a request-scoped token they cannot afford to see
surface. Option (a) is the lower-blast-radius choice and matches the seam's stated intent.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `CompositeAuditWriter.WriteAsync` now swallows `OperationCanceledException` like any other writer failure (single bare `catch`), so cancellation never surfaces to the caller; XML doc and the cancellation test updated to assert non-propagation.
### Audit-002 — `TruncatingAuditRedactor` over-redaction is partial: the catch path scrubs only `DetailsJson`, leaving `Target` unredacted
| | |
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/TruncatingAuditRedactor.cs:27-31`, `:34-40` |
**Description**
The redactor's over-redact fallback returns `rawEvent with { DetailsJson = null }`
(line 30). That scrubs only `DetailsJson``Target`, which this redactor is also responsible
for capping, is carried through untouched. The spec's "over-redact" rule is "returns a strictly
safer event" (SPEC §2); leaving an oversized/sensitive `Target` in place is not strictly safer.
This becomes reachable under misconfiguration. `Truncate(value, max)` does `marker[..max]` when
`marker.Length >= max` (line 38). For a negative `MaxTargetLength` or `MaxDetailsJsonLength`,
that slice throws `ArgumentOutOfRangeException`, the `Apply` catch fires, and the result keeps
the **original, untruncated `Target`** while only nulling `DetailsJson`. So a config bug that
should fail safe instead leaves the `Target` field unredacted. (`max == 0` is handled correctly —
`marker[..0]` is `""`.) The redactor never throws, so the "must not throw" half of the contract
holds; it is the "strictly safer" half that is violated.
**Recommendation**
Make the fallback strictly safer across the fields this redactor owns, e.g.
`return rawEvent with { DetailsJson = null, Target = null };` in the catch. Additionally, guard
against nonsensical caps at the boundary — clamp `max` to `>= 0` in `Truncate` (treat negative
as 0), or validate the options in the constructor — so the catch path is only ever hit by truly
unexpected failures rather than a predictable negative-length misconfiguration.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — the over-redact catch now returns `rawEvent with { DetailsJson = null, Target = null }` (strictly safer), and `Truncate` clamps a negative `max` to 0 so a negative-length misconfiguration fails safe instead of throwing; new tests pin both behaviours.
### Audit-003 — `TruncatingAuditRedactorOptions` is a mutable class, not the immutable "options record" the contract describes
| | |
|--|--|
| Severity | Low |
| Category | Public API surface & compatibility |
| Status | Resolved |
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/TruncatingAuditRedactorOptions.cs:4-12` |
**Description**
The shared contract lists `TruncatingAuditRedactorOptions` as an "Options record"
(`shared-contract/ZB.MOM.WW.Audit.md:93`), but it is implemented as a `sealed class` with three
mutable `get; set;` auto-properties. `TruncatingAuditRedactor` captures the instance by reference
(`_options = options ?? new(...)`, `TruncatingAuditRedactor.cs:14`) and the redactor is typically
registered as a singleton. Because the options object stays mutable, a caller (or any other
holder of the same reference) can change `MaxDetailsJsonLength` / `MaxTargetLength` /
`TruncationMarker` after the redactor is built, changing redaction behaviour at runtime under a
concurrent host with no synchronization. This is a (minor) thread-safety footgun and a drift from
both the documented "record" shape and the library's otherwise consistent immutability posture
(`AuditEvent` is a record with init-only members).
**Recommendation**
Make the options immutable: convert to a `sealed record` with `init`-only properties (matching
the contract wording), or have the redactor defensively snapshot the three values into private
readonly fields at construction so post-construction mutation cannot affect it.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `TruncatingAuditRedactorOptions` is now a `sealed record` with `init`-only properties, matching the contract's "options record" and removing the post-construction mutation footgun on the singleton redactor.
### Audit-004 — XML documentation is authored but not emitted, so IntelliSense docs do not ship to consumers
| | |
|--|--|
| Severity | Low |
| Category | Documentation & XML docs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/ZB.MOM.WW.Audit.csproj:1-18`, `ZB.MOM.WW.Audit/Directory.Build.props:1-10` |
**Description**
Every public type and member carries accurate XML doc comments, but neither the project nor
`Directory.Build.props` sets `GenerateDocumentationFile`. As a result no `ZB.MOM.WW.Audit.xml`
is produced or packed alongside the DLL (confirmed: no XML doc file in `bin/`), so consuming
apps get no IntelliSense/tooltip documentation from the package. For a shared library where, per
the review process, "public docs matter," the authored docs are effectively invisible to
consumers. (This is a family-wide baseline gap — none of the sibling `ZB.MOM.WW.*` libraries set
the flag either — so it is a baseline issue rather than an Audit-specific regression, hence Low.)
**Recommendation**
Set `<GenerateDocumentationFile>true</GenerateDocumentationFile>` (ideally once in the shared
`Directory.Build.props` so it applies across the family). Because the public surface is fully
documented, no CS1591 "missing XML comment" warnings should appear.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — set `<GenerateDocumentationFile>true</GenerateDocumentationFile>` in the packable `ZB.MOM.WW.Audit.csproj`; `ZB.MOM.WW.Audit.xml` now builds with zero CS1591 warnings and packs into the nupkg under `lib/net10.0/`.
### Audit-005 — Missing edge-case tests for the redactor never-throw/over-redact contract and composite null/empty handling
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Resolved |
| Location | `ZB.MOM.WW.Audit/tests/ZB.MOM.WW.Audit.Tests/TruncatingAuditRedactorTests.cs`, `.../CompositeAuditWriterTests.cs` |
**Description**
The 19 tests cover the happy paths and the key guards (UTC normalization + equality, fan-out,
cancellation propagation, redaction ordering, marker-longer-than-max). Two contract-critical
behaviours are untested:
1. The `IAuditRedactor` "never throws / over-redacts on failure" contract — no test drives
`TruncatingAuditRedactor.Apply` into its `catch` branch, so the over-redaction fallback
(and the asymmetry flagged in Audit-002) is unverified.
2. `CompositeAuditWriter` boundary inputs — no test covers an empty writer list (should be a
no-op) or a list containing a `null` writer (currently swallowed via the bare `catch`,
silently dropping that sink — behaviour that should be pinned by a test).
Pinning these would have surfaced Audit-002 and would lock the documented never-throw guarantee.
**Recommendation**
Add: (a) a redactor test that forces an internal failure (e.g. a misconfigured cap) and asserts
`Apply` returns a strictly-safer event without throwing; (b) a composite test for the empty-list
no-op; and (c) a composite test documenting the null-entry behaviour (whether swallowed or
guarded — pair with the chosen fix for the null case).
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — added four tests: redactor over-redact catch branch scrubs both fields, negative-max clamp, and `CompositeAuditWriter` empty-list no-op + null-writer-entry swallow (19 → 23 tests).
+254
View File
@@ -0,0 +1,254 @@
# Code Review — Auth
| Field | Value |
|-------|-------|
| Library | `ZB.MOM.WW.Auth/` |
| Packages | `ZB.MOM.WW.Auth.Abstractions`, `ZB.MOM.WW.Auth.Ldap`, `ZB.MOM.WW.Auth.ApiKeys`, `ZB.MOM.WW.Auth.AspNetCore` |
| Component spec | `components/auth/spec/SPEC.md` |
| Shared contract | `components/auth/shared-contract/ZB.MOM.WW.Auth.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-01 |
| Reviewer | Claude (automated baseline) |
| Commit reviewed | `5f75cd4` |
| Open findings | 0 |
## Summary
The security-critical core of this library is sound: API-key secrets are 32-byte CSPRNG values,
hashing is HMAC-SHA256 keyed by an external pepper, verification uses
`CryptographicOperations.FixedTimeEquals`, the SQLite stores use parameterized SQL throughout,
LDAP filter values are RFC 4515-escaped before being interpolated into the search filter, group-DN
parsing is RFC 4514 escape-aware, and the LDAP service is genuinely fail-closed (never throws,
requires exactly one match plus a verified password plus >=1 group for success). Test coverage is
strong and contract-focused (~172 tests; every failure mode, the parser, the hasher, the escaper,
the stores, the migrator and the DI wiring are exercised). The findings below are about
robustness-around-the-edges and documentation, not the cryptographic core. The two themes are
(1) two narrow paths that can turn a recoverable condition into a thrown exception on the
authentication hot path, contradicting the library's own "only exception path is cancellation" /
"fail fast at startup" promises, and (2) public-facing documentation (README) that misstates the
hashing algorithm and the AspNetCore surface. The library is **not yet adopted** by the three apps
(tracked in `components/auth/GAPS.md`).
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | Core logic correct; see Auth-003 (JSON parse can throw on hot path), Auth-005 (KeyPrefix value inconsistency) |
| 2 | Public API surface & compatibility | ☑ | Surface is minimal, sealed records, correct nullable annotations, internal seams hidden via `InternalsVisibleTo`. No issues found |
| 3 | Concurrency & thread safety | ☑ | Options/records immutable; services stateless singletons; fresh SQLite connection per op with WAL + busy_timeout. No issues found |
| 4 | Error handling & resilience | ☑ | LDAP path fully fail-closed; but see Auth-001 (no `ValidateOnStart`), Auth-002 (`MarkUsedAsync` failure fails a valid auth), Auth-003 |
| 5 | Security & secret handling | ☑ | Peppered HMAC-SHA256 + FixedTimeEquals, CSPRNG secrets, parameterized SQL, LDAP escaping, hash-free list projection, secrets never logged. See Auth-006 (TLS cert-validation hook) |
| 6 | Performance & resource management | ☑ | Connections disposed via `await using`; `LdapConnection` disposed; no hot-path allocs of note. No issues found |
| 7 | Spec & shared-contract adherence | ☑ | Matches SPEC §1–§5 closely (HMAC-SHA256 pepper, constant-time compare, discriminated failures, cookie hardening). See Auth-001 (insecure-refusal deferred to first use) |
| 8 | Packaging, dependencies & project layout | ☑ | Clean 4-package split, central versions, minimal deps, Abstractions has no runtime deps. No issues found |
| 9 | Testing coverage | ☑ | Broad and contract-level; gaps: no test for corrupt-`scopes` deserialization (Auth-003) or `MarkUsedAsync`-throws (Auth-002); `ValidateOnStart` behavior untested (Auth-001) |
| 10 | Documentation & XML docs | ☑ | XML docs thorough and accurate; README is stale — see Auth-004 |
## Findings
### Auth-001 — LDAP options validator is registered but never runs at startup
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.AspNetCore/ServiceCollectionExtensions.cs:40` |
**Description**
`AddZbLdapAuth` binds `LdapOptions` with `services.Configure<LdapOptions>(...)` and registers the
validator with a plain `services.AddSingleton<IValidateOptions<LdapOptions>, LdapOptionsValidator>()`.
It never calls `OptionsBuilder.ValidateOnStart()`. With `Microsoft.Extensions.Options`, an
`IValidateOptions<T>` runs only when the options instance is first materialized (`IOptions<T>.Value`),
not at host startup. Here the only consumer that materializes the options is the `ILdapAuthService`
singleton factory (line 48), whose lambda runs on first resolution of `ILdapAuthService` — i.e. on
the first login attempt, not at boot.
The result contradicts the explicit promise in both the method comment ("Fail fast at startup on a
misconfigured directory rather than on first login") and the validator's own XML doc ("Validates
`LdapOptions` at startup so a misconfiguration fails fast at boot"). It also weakens SPEC §2 step 1,
which says an insecure transport without `AllowInsecure` must be *refused* — that refusal is silently
deferred until a user tries to log in. A process can therefore boot green with `Transport=None` and
no `AllowInsecure`, and only surface the misconfiguration as a failed login later. Because this is a
shared library consumed by three hosts, the missing guard affects all of them.
**Recommendation**
Register options via the builder and add `ValidateOnStart`:
```csharp
services.AddOptions<LdapOptions>()
.Bind(config.GetSection(sectionPath))
.ValidateOnStart();
services.AddSingleton<IValidateOptions<LdapOptions>, LdapOptionsValidator>();
```
(`ValidateOnStart` relies on the host's start-time options validation, which the ASP.NET Core hosts
already run.) Add a test asserting that building + starting the host fails for an insecure config.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `AddZbLdapAuth` now binds via `AddOptions<LdapOptions>().Bind(...).ValidateOnStart()` so the validator runs at host start, not on first login (test: `ServiceCollectionExtensionsTests.AddZbLdapAuth_StartingHost_FailsForInsecureConfig`, plus a `...SucceedsForSecureConfig` companion).
### Auth-002 — A failed `MarkUsedAsync` write turns a valid API key into a thrown exception
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/ApiKeyVerifier.cs:66` |
**Description**
After a successful constant-time secret match, `VerifyAsync` does
`await store.MarkUsedAsync(record.KeyId, _timeProvider.GetUtcNow(), ct)` before returning the
success result. `MarkUsedAsync` is a best-effort "last used" bookkeeping write. If that write fails
(SQLite still `SQLITE_BUSY` after the 5 s busy-timeout, disk full, DB locked by a long migration,
etc.), the exception propagates out of `VerifyAsync` and the otherwise-valid credential fails
authentication. The verifier's own class remark states "The only exception path is cancellation",
so callers (gRPC interceptors / middleware) will not be wrapping this in a try/catch and a transient
storage hiccup becomes an auth outage for a legitimate caller.
**Recommendation**
Treat the usage write as best-effort: either swap the order (return success, then fire-and-record),
or wrap the `MarkUsedAsync` call so a failure is logged/swallowed rather than failing the
verification — the authentication decision has already been made by line 60. If `MarkUsedAsync` is
allowed to throw, update the XML remark to say so. Add a test where the store's `MarkUsedAsync`
throws and assert the result is still `Succeeded == true`.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `VerifyAsync` now wraps the best-effort `MarkUsedAsync` write in try/catch, re-throwing only `OperationCanceledException` and swallowing all other failures so a valid key still authenticates (tests: `ApiKeyVerifierTests.VerifyAsync_ValidKey_MarkUsedThrows_StillSucceeds` and `..._MarkUsedThrowsOperationCanceled_Propagates`).
### Auth-003 — Corrupt `scopes`/`constraints` column throws `JsonException` through the verifier
| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/Sqlite/ScopeSerializer.cs:31`, `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/Sqlite/SqliteApiKeyStore.cs:71` |
**Description**
`SqliteApiKeyStore.ReadRecord` calls `ScopeSerializer.Deserialize(reader.GetString(4))`, which does
`JsonSerializer.Deserialize<string[]>(value)`. If the `scopes` column ever holds non-array or
malformed JSON (operator tampering, a partial write, a future schema/format change, or a bug in
another writer), `Deserialize` throws `JsonException`. That exception propagates out of
`FindByKeyIdAsync`, and therefore out of `ApiKeyVerifier.VerifyAsync`, as an unhandled exception —
again contradicting the "only exception path is cancellation" contract and converting a single
poisoned row into a crash on the auth path rather than a structured fail-closed result. The same
applies to the admin `ListAsync` projection.
**Recommendation**
Decide the policy explicitly. Either (a) catch `JsonException` in `ScopeSerializer.Deserialize` and
treat a malformed column as an empty/invalid scope set so verification fails closed structurally, or
(b) catch it at the store boundary and surface it as a controlled error. Add a test that inserts a
row with a malformed `scopes` value and asserts the verifier returns a structured failure (or a
documented exception) rather than an unhandled `JsonException`.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `ScopeSerializer.Deserialize` now catches `JsonException` and fails closed to an empty scope set (policy (a)), so a corrupt `scopes`/`constraints` column degrades to a zero-scope identity instead of throwing through the store and verifier (tests: `SqliteApiKeyStoreTests.ScopeSerializer_DeserializeMalformed_ReturnsEmptySet_DoesNotThrow` (Theory) and `..._FindByKeyId_CorruptScopesColumn_ReturnsRecordWithEmptyScopes_DoesNotThrow`).
### Auth-004 — README misstates the hashing algorithm and the AspNetCore public surface
| | |
|--|--|
| Severity | Low |
| Category | Documentation & XML docs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Auth/README.md:13`, `ZB.MOM.WW.Auth/README.md:14` |
**Description**
The README's package table contains two inaccuracies that matter for a security library:
1. It describes `ZB.MOM.WW.Auth.ApiKeys` as a store "with pepper-based **PBKDF2** hashing". The code
uses **HMAC-SHA256** keyed by the pepper (`ApiKeySecretHasher.Hash``new HMACSHA256(pepperBytes)`),
not PBKDF2. This is the correct algorithm per SPEC §4 ("HMAC-SHA256 with an external pepper"), so
the *code* is right and the *README* is wrong — but a reader auditing the crypto will be misled
about iteration count / KDF properties that do not exist here.
2. It says `ZB.MOM.WW.Auth.AspNetCore` provides "ASP.NET Core DI helpers (`AddZbAuth`) ... and
`LdapOptionsValidator` registration. **Wires together Ldap + ApiKeys + cookie middleware**." The
actual public surface is a single extension method `AddZbLdapAuth` (LDAP only). There is no
`AddZbAuth`, no API-key re-export from this package, and no cookie-middleware wiring —
`ZbCookieDefaults.Apply` is a helper the consumer must call itself, and API-key DI lives in the
separate `ZB.MOM.WW.Auth.ApiKeys` package (`AddZbApiKeyAuth`). The shared contract
(`ZB.MOM.WW.Auth.md`) likewise lists a combined DI shape that the code splits differently.
**Recommendation**
Fix the README table: change "PBKDF2" to "HMAC-SHA256", and correct the AspNetCore row to describe
the real surface (`AddZbLdapAuth`, `ZbCookieDefaults.Apply`, `ZbClaimTypes`) and note that API-key
wiring is `AddZbApiKeyAuth` in the ApiKeys package. Optionally reconcile the shared-contract doc's
DI-method names with the shipped names.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — README package table corrected: "PBKDF2" → "pepper-keyed HMAC-SHA256" and the AspNetCore row rewritten to the real surface (`AddZbLdapAuth`, `ZbCookieDefaults.Apply`, `ZbClaimTypes`, LDAP-only, no API-key/cookie-middleware wiring), with the stale `ApiKeys` dependency dropped from that row to match the `.csproj` (test: doc-only).
### Auth-005 — `CreateKeyAsync` persists `KeyPrefix` as `prefix_keyId`, inconsistent with the read path
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/Admin/ApiKeyAdminCommands.cs:104` |
**Description**
When creating a key, `KeyPrefix` is set to `$"{_options.TokenPrefix}_{keyId}"` (e.g. `mxgw_key-1`),
embedding the key id into the column. Elsewhere `KeyPrefix` is treated as the bare token prefix:
the verifier test (`ApiKeyVerifierTests.BuildRecord`) and the store test
(`SqliteApiKeyStoreTests.SampleRecord`) seed it as `"mxgw"` / `"mxgw_ab12"`, and the `ApiKeyRecord`
contract gives no indication it should include the key id. `KeyPrefix` is not consulted during
verification (the token is re-parsed from the header), so there is no security impact — but the
persisted/displayed value is inconsistent and self-referential (the key id is already its own
column), which will confuse admin tooling that surfaces `KeyPrefix`.
**Recommendation**
Persist `KeyPrefix = _options.TokenPrefix` (the bare prefix) and let the key id remain the `KeyId`
column, or document precisely what `KeyPrefix` is meant to contain on the `ApiKeyRecord` /
`ApiKeyListItem` records and align all call sites to it.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `CreateKeyAsync` now persists `KeyPrefix = _options.TokenPrefix` (the bare prefix) instead of `prefix_keyId`, aligning the write path with the read/test paths (test: `ApiKeyAdminCommandsTests.CreateKey_PersistsBareTokenPrefix_NotPrefixUnderscoreKeyId`).
### Auth-006 — LDAP connection ignores `allowInsecure` and offers no TLS certificate-validation hook
| | |
|--|--|
| Severity | Low |
| Category | Security & secret handling |
| Status | Resolved |
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.Ldap/Internal/NovellLdapConnection.cs:16` |
**Description**
`NovellLdapConnection.Connect` receives `bool allowInsecure` but never uses it; transport-security
correctness rests entirely on `LdapOptionsValidator` having run beforehand (and that registration is
itself lazy — see Auth-001). The method also sets `SecureSocketLayer`/`StartTls` but never configures
a server-certificate validation delegate or any pinning/CA hook. The underlying
`Novell.Directory.Ldap.NETStandard` 3.6.0 validates against the OS trust store by default (it does
*not* blind-accept), so this is not an unauthenticated-TLS hole today — hence Low — but the library
gives a consumer no seam to tighten (pin a CA, validate the SAN against `Server`, or deliberately
accept a self-signed dev cert), and the unused parameter reads as a guard that does nothing.
**Recommendation**
Either remove the unused `allowInsecure` parameter from the `ILdapConnection.Connect` seam or use it
to gate an explicit "accept invalid cert" path for dev only. Add an optional
`RemoteCertificateValidationCallback` (or CA-pinning option) so production consumers can harden LDAPS
beyond the OS default, and document the default validation behavior.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — added an optional `LdapOptions.ServerCertificateValidationCallback` (BCL `RemoteCertificateValidationCallback`, no new dependency) threaded through the `ILdapConnection.Connect` seam; `NovellLdapConnection` now builds `LdapConnectionOptions.ConfigureRemoteCertificateValidationCallback(...)` (a supplied callback wins; otherwise the previously-dead `allowInsecure` gates a dev-only accept-any path; otherwise OS-trust-store default), and the default validation behaviour is documented in XML/README (tests: `LdapAuthServiceTests.Connect_ReceivesConfiguredCertValidationCallback`, `..._Connect_ReceivesAllowInsecureFlag_FromOptions`, `..._Connect_NoCertCallbackConfigured_PassesNull`).
+186
View File
@@ -0,0 +1,186 @@
# Code Review — Configuration
| Field | Value |
|-------|-------|
| Library | `ZB.MOM.WW.Configuration/` |
| Packages | `ZB.MOM.WW.Configuration` |
| Component spec | `components/configuration/spec/SPEC.md` |
| Shared contract | `components/configuration/shared-contract/ZB.MOM.WW.Configuration.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-01 |
| Reviewer | Claude (automated baseline) |
| Commit reviewed | `5f75cd4` |
| Open findings | 0 |
## Summary
The library is small (five source files, ~230 LOC) and in good health. Its core promise —
**accumulate every failure, never short-circuit** — holds across all three entry points
(`ValidationBuilder`, `OptionsValidatorBase`, `ConfigPreflight`): the override never returns
early, each primitive appends to a `List<string>` and returns `this`, and the `Success`/`Fail`
decision is taken only after every rule has run. The boundary cases that matter are correct:
`Port` rejects 0 and 65536 and accepts 1/65535; `HostPort` correctly rejects bare hosts, empty
ports, out-of-range ports, and (deliberately) unbracketed IPv6; `Required` treats null / empty /
whitespace alike; `OneOf` is case-insensitive and treats null as a failure (documented);
`MinCount` treats null as zero. Null-guarding is consistent (`ArgumentNullException`/
`ArgumentException.ThrowIfNullOrWhiteSpace` on every public entry point). The
`ConfigPreflight.ThrowIfInvalid()` envelope is **byte-compatible** with the ScadaBridge
`StartupValidator` format the SPEC pins (`"Configuration validation failed:\n - <field> <reason>"`,
`\n`-joined), and the validator created by `OptionsValidatorBase` is stateless (fresh
`ValidationBuilder` per call), so the singleton-registration requirement in SPEC §3 is honoured.
Dependency footprint is minimal and exactly as documented (four `Microsoft.Extensions.*`
abstractions, no ASP.NET Core). The four findings are all **Low**: a non-idempotent DI
registration (`AddSingleton` vs `TryAddEnumerable`), a small wording inconsistency in the raw-port
check, over-permissive integer port parsing, and the XML-doc-not-shipped packaging gap (which is
fleet-wide, not specific to this library). Test coverage (27 tests across the four public surfaces)
is solid for the happy and primary-failure paths; a few edge cases noted below are untested.
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | Accumulation never short-circuits; port/host:port/duration/min-count boundaries correct. Port parsing over-permissive (003). |
| 2 | Public API surface & compatibility | ☑ | Surface matches the shared contract exactly; nullability annotations correct; `sealed`/`abstract` deliberate; internal `Checks` does not leak. No issues. |
| 3 | Concurrency & thread safety | ☑ | Validator is stateless (fresh builder per `Validate`); `ValidationBuilder`/`ConfigPreflight` are per-call locals. Safe as a singleton. No issues. |
| 4 | Error handling & resilience | ☑ | Guard clauses on every public entry; correct exception types (`OptionsValidationException` via `ValidateOnStart`, `InvalidOperationException` from `ConfigPreflight`); no swallowed exceptions. No issues. |
| 5 | Security & secret handling | ☑ | No secrets handled; failure messages echo config *values* (port/host/role), which are non-sensitive by design. No issues. |
| 6 | Performance & resource management | ☑ | Startup-only, no hot path, nothing disposable, no async. No issues. |
| 7 | Spec & shared-contract adherence | ☑ | `ConfigPreflight` envelope byte-compatible with `StartupValidator`; primitives match §2 table. Minor wording inconsistency in `PortValue` (002). |
| 8 | Packaging, dependencies & project layout | ☑ | Single package, minimal closure, central versions, `.gitignore` present, no tracked build output. XML docs / README not packaged (004). |
| 9 | Testing coverage | ☑ | 27 tests cover the four public surfaces + accumulation + byte-format. Untested: `AddValidatedOptions` null guards, double-registration, exact wording strings. |
| 10 | Documentation & XML docs | ☑ | Public XML docs present and accurate on every type/member. Not emitted into the nupkg (004). |
## Findings
### Configuration-001 — `AddValidatedOptions` uses `AddSingleton`, so a double call registers (and runs) the validator twice
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/ServiceCollectionExtensions.cs:36` |
**Description**
`AddValidatedOptions` registers the validator with
`services.AddSingleton<IValidateOptions<TOptions>, TValidator>()`. `IValidateOptions<TOptions>`
is a multi-registration (enumerable) service — the options pipeline resolves *all* registered
`IValidateOptions<TOptions>` and runs each. Because `AddSingleton` appends unconditionally, calling
`AddValidatedOptions<TOptions, TValidator>` twice for the same `TOptions`/`TValidator` pair (e.g.
two modules both guarding the same section, or an accidental duplicate during refactoring)
registers the validator twice, so it executes twice and every accumulated failure is reported
twice in the resulting `OptionsValidationException`. The framework-idiomatic registration for
`IValidateOptions<T>` is `TryAddEnumerable(ServiceDescriptor.Singleton<IValidateOptions<TOptions>,
TValidator>())`, which is idempotent. Impact is limited (consumers normally call once per section,
and duplicate messages are cosmetic, not a correctness break of the accumulate-all contract), hence
Low — but the deviation from the .NET idiom is real and easy to fix. The SPEC §3 wording ("registers
the validator as a singleton") is satisfied either way.
**Recommendation**
Replace `services.AddSingleton<IValidateOptions<TOptions>, TValidator>()` with
`services.TryAddEnumerable(ServiceDescriptor.Singleton<IValidateOptions<TOptions>, TValidator>())`
(via `Microsoft.Extensions.DependencyInjection.Extensions`). Add a test asserting that calling
`AddValidatedOptions` twice yields a single set of failure messages.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `AddValidatedOptions` now registers the validator via `TryAddEnumerable(ServiceDescriptor.Singleton<...>())`, so a double call is idempotent (test: `AddValidatedOptionsTests.Calling_twice_registers_validator_once`).
### Configuration-002 — `Checks.PortValue` quotes the raw value on a parse failure but not on a range failure
| | |
|--|--|
| Severity | Low |
| Category | Spec & shared-contract adherence |
| Status | Resolved |
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/Checks.cs:21` |
**Description**
`PortValue` produces two different message shapes for the same field depending on *why* the port
is invalid. When the raw string parses but is out of range (e.g. `"0"` or `"70000"`), it delegates
to `Checks.Port(int, field)`, which renders `"<field> must be between 1 and 65535 (was 0)"`
**unquoted**. When the raw string does not parse (e.g. `"notaport"`, `null`), it renders
`"<field> must be between 1 and 65535 (was 'notaport')"`**quoted**. So two failures of the same
rule, from the same raw-config caller, read with inconsistent quoting. The shared contract
(`shared-contract/ZB.MOM.WW.Configuration.md`, "Internal `Checks` seam") states `Checks` is "the
single source of failure wording" so "a port failure reads the same whether it came from a bound
options object or a raw config key" — this inconsistency is in mild tension with that claim. It is
cosmetic (no consumer parses the text) and the overlapping integer case *does* match the bound-options
wording, so it is Low.
**Recommendation**
Make the two branches use a consistent quoting convention for the offending value — e.g. have the
range-failure branch also quote (`(was '0')`), or have the parse-failure branch route through a
shared formatter. Lock the exact strings down with a wording assertion test.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `Checks.PortValue` now quotes the offending raw value on both the parse-failure and range-failure branches (`(was '0')`), wording pinned by test (test: `ChecksWordingTests.PortValue_range_failure_quotes_the_value`).
### Configuration-003 — Port parsing accepts leading sign and surrounding whitespace and is culture-sensitive
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/Checks.cs:22`, `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/Checks.cs:36` |
**Description**
Both `PortValue` (`int.TryParse(raw, out var port)`) and `HostPort`
(`int.TryParse(value[(idx + 1)..], out var port)`) use the default `int.TryParse` overload, which
applies `NumberStyles.Integer` (`AllowLeadingWhite | AllowTrailingWhite | AllowLeadingSign`) and the
current culture. As a result strings the documentation describes as "integer TCP port" are accepted
more loosely than intended: `"+5000"`, `" 5000 "`, and `"host: 5000"` (space after the colon) all
parse and pass, and parsing is locale-dependent. A leading `-` parses to a negative number that the
subsequent range check rejects (so `"-1"` is correctly rejected), but the whitespace/leading-`+`
cases silently pass. This is a robustness nuance rather than a security or correctness break — a port
that survives the loose parse is still in `1..65535` — so it is Low.
**Recommendation**
Parse with an explicit, culture-invariant, strict style, e.g.
`int.TryParse(raw, NumberStyles.None, CultureInfo.InvariantCulture, out var port)` (rejects sign and
whitespace), in both `PortValue` and `HostPort`. Add `Theory` cases for `"+5000"`, `" 5000 "`, and a
space-after-colon endpoint to pin the behaviour.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — both `PortValue` and `HostPort` now parse with `int.TryParse(s, NumberStyles.None, CultureInfo.InvariantCulture, ...)`, rejecting leading sign/whitespace and culture-dependent formats (test: `ChecksWordingTests.PortValue_rejects_loose_inputs`, `HostPort_rejects_loose_port_inputs`).
### Configuration-004 — XML documentation and README are not packaged into the nupkg
| | |
|--|--|
| Severity | Low |
| Category | Documentation & XML docs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/ZB.MOM.WW.Configuration.csproj:1`, `ZB.MOM.WW.Configuration/Directory.Build.props:1` |
**Description**
Every public type and member carries accurate XML doc comments, but neither the project nor
`Directory.Build.props` sets `<GenerateDocumentationFile>true</GenerateDocumentationFile>`, so no
`.xml` doc file is produced or included in the `dotnet pack` output. Consumers of the package
therefore get **no IntelliSense documentation** for `OptionsValidatorBase`, `ValidationBuilder`,
`AddValidatedOptions`, or `ConfigPreflight`, despite the docs existing in source. The project also
does not set `PackageReadmeFile`, so the README is not embedded in the package for display on the
NuGet feed. Category 10 of the review process explicitly notes "these are libraries — public docs
matter," so the gap is worth recording. It is fleet-wide (the sibling `ZB.MOM.WW.Audit` library has
the same omission), not a Configuration-specific regression, hence Low.
**Recommendation**
Add `<GenerateDocumentationFile>true</GenerateDocumentationFile>` (ideally in
`Directory.Build.props` so the whole family inherits it) and consider `<PackageReadmeFile>README.md</PackageReadmeFile>`
plus packing the README. Treat as a shared-infra change across the six libraries rather than a
one-off.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `GenerateDocumentationFile=true` added to `Directory.Build.props` (test project opts out) and `PackageReadmeFile`/README pack item added to the csproj, so `dotnet pack` now ships `ZB.MOM.WW.Configuration.xml` and `README.md` in the nupkg.
+255
View File
@@ -0,0 +1,255 @@
# Code Review — Health
| Field | Value |
|-------|-------|
| Library | `ZB.MOM.WW.Health/` |
| Packages | `ZB.MOM.WW.Health`, `ZB.MOM.WW.Health.Akka`, `ZB.MOM.WW.Health.EntityFrameworkCore` |
| Component spec | `components/health/spec/SPEC.md` |
| Shared contract | `components/health/shared-contract/ZB.MOM.WW.Health.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-01 |
| Reviewer | Claude (automated baseline) |
| Commit reviewed | `5f75cd4` |
| Open findings | 0 |
## Summary
The library is small, cohesive, and well-documented, and it tracks the normalized spec closely:
the three-tier endpoint convention, the canonical JSON writer, the `IActiveNodeGate` seam, the
configurable `AkkaClusterStatusPolicy` presets, and the role-filtered `ActiveNodeHealthCheck` all
match the contract. The package split is clean — Akka and EF Core stay out of the core package, so
MxGateway can take core only (category 8: no leakage). The decision logic is factored into pure,
exhaustively table-tested functions (`AkkaClusterStatusPolicy`, `ActiveNodeDecision`), and the test
suite (58 tests) exercises the public surface through a real TestServer / real SQLite / real probe
delegates rather than mocks.
The findings cluster around **error-handling completeness in the health-check probes** (category 4),
the highest-risk area for this library: a probe that throws past the `IHealthCheck` boundary, or that
returns a status the spec says it should not, degrades the value of every consuming app's `/health/*`
endpoints. The two Akka checks and the gRPC check do not guard the cluster-state / probe call the way
the spec requires (return Degraded on inaccessible cluster) or the way the sibling `DatabaseHealthCheck`
does (catch-all → Unhealthy). The framework's `HealthCheckService` catches escaping exceptions, so none
of these crash an endpoint — that caps their severity at Medium — but they produce a result that
disagrees with the documented contract. The remainder are Low: a JSON shape nuance (`description` key
omitted vs. emitted-null), recommended-tag drift in XML docs, a return-value footgun in `MapZbHealth`,
and missing `GenerateDocumentationFile` so the (otherwise excellent) XML docs do not ship in the nupkgs.
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | Tier predicates, policy presets, and `ActiveNodeDecision` matrices are correct and match the spec tables. No logic defects found. |
| 2 | Public API surface & compatibility | ☑ | Health-005. Surface is minimal, sealed, well-named; nullable annotations correct; no internal types leak (`ActiveNodeDecision` is `internal`). |
| 3 | Concurrency & thread safety | ☑ | Options are mutable POCOs but per-registration; checks hold no shared mutable state; cluster-state reads are snapshot reads; static writer options are immutable. No issues found. |
| 4 | Error handling & resilience | ☑ | Health-001, Health-002. Akka checks don't guard cluster-state access; gRPC check lets non-Rpc/non-OCE exceptions escape. |
| 5 | Security & secret handling | ☑ | No secrets/PII. Exception objects attached to `HealthCheckResult` (standard); endpoints `AllowAnonymous` by design. No issues found. |
| 6 | Performance & resource management | ☑ | `DatabaseHealthCheck` is pool-safe and eagerly releases the timeout timer; scopes/contexts disposed via `await using`; channel not owned (correctly not disposed). No issues found. |
| 7 | Spec & shared-contract adherence | ☑ | Health-001, Health-003, Health-004. Endpoint convention + status strings otherwise match §3. |
| 8 | Packaging, dependencies & project layout | ☑ | Health-006. Dependency split correct; central versions; net10.0; correct package ids. |
| 9 | Testing coverage | ☑ | Strong: TestServer tier tests, real SQLite EF tests, table-driven policy/decision tests, gRPC probe + timeout + cancellation. Gaps noted inline in findings, not raised separately. |
| 10 | Documentation & XML docs | ☑ | Health-004, Health-006. Public API XML docs are thorough and accurate; the issues are recommended-tag drift and docs not being packed. |
## Findings
### Health-001 — Akka health checks throw (instead of returning Degraded) when cluster state is inaccessible
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health.Akka/AkkaClusterHealthCheck.cs:45`, `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health.Akka/ActiveNodeHealthCheck.cs:106` |
**Description**
Both Akka checks guard only the *null* ActorSystem case: `_serviceProvider.GetService<ActorSystem>()`
returns `null` → Degraded. But once a non-null `ActorSystem` is resolved they call `Cluster.Get(system)`
and read `.SelfMember` / `.State.Leader` with no try/catch. The spec is explicit that this path must be
Degraded, not an exception:
> "in both modes, if the ActorSystem is not yet ready **or cluster state is inaccessible** (e.g.
> during startup), the check returns Degraded (startup-safety rule)." — `SPEC.md` §2.2 note.
`Cluster.Get(system)` throws (`ConfigurationException`) when the `Akka.Cluster` extension is not
configured on the resolved ActorSystem, and `SelfMember` can throw during the window where the
ActorSystem object exists but the cluster has not finished initialising — exactly the startup race the
spec calls out. The ActorSystem being present-but-not-yet-clustered is a very real ordering in a
`Microsoft.Extensions.Hosting` app (the ActorSystem is registered in DI before `Cluster` has joined).
The escaping exception is caught by ASP.NET Core's `HealthCheckService`, which records the entry as
Unhealthy — so the endpoint does not crash. But Unhealthy on the `ready` tier returns **503** and
removes the node from load-balancing, whereas the spec's intended Degraded returns **200** (still in
rotation). A transient startup window therefore yanks a healthy-but-starting node out of rotation,
the opposite of the "startup-safe" guarantee the XML docs advertise.
**Recommendation**
Wrap the cluster-state read in a try/catch and return `HealthCheckResult.Degraded(...)` on failure, so
both the null-ActorSystem and inaccessible-cluster paths converge on Degraded as the spec requires:
resolve `Cluster.Get(system)` and read membership inside a `try`, catching the cluster extension's
startup exceptions and mapping them to Degraded with a "cluster not yet ready" description.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — wrapped the `Cluster.Get(system)`/`SelfMember`/leader reads in `AkkaClusterHealthCheck` and `ActiveNodeHealthCheck` in `try/catch (Exception when not OCE)` returning `Degraded("Akka cluster state not yet accessible.")`; tests add a plain non-clustered ActorSystem to assert Degraded instead of the escaping `ConfigurationException`.
### Health-002 — `GrpcDependencyHealthCheck` lets non-`RpcException`/non-`OperationCanceledException` errors escape
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/GrpcDependencyHealthCheck.cs:54` |
**Description**
`CheckHealthAsync` catches only `RpcException` and `OperationCanceledException`. Any other exception
thrown by the probe escapes the method. This is reachable on both paths:
- Default probe: `GrpcChannel.ConnectAsync` can throw exceptions other than `RpcException`
e.g. `InvalidOperationException` (channel/socket misuse, shutdown) or `HttpRequestException` /
`SocketException` surfaced from the transport before a gRPC status is produced.
- Custom probe (`GrpcDependencyOptions.Probe`): a caller-supplied delegate (e.g. a
`grpc.health.v1.Health/Check` RPC built on a non-gRPC client) can throw anything.
The XML docs and shared contract describe the result as Unhealthy "when it returns `false`, throws an
`RpcException`, or times out" — they do not promise to handle arbitrary exceptions, and the code does
not. By contrast the sibling `DatabaseHealthCheck` has a catch-all `catch (Exception ex)`
`Unhealthy("Database connection failed.", ex)` (`DatabaseHealthCheck.cs:78`), so the two probes are
asymmetric. As with Health-001 the framework's `HealthCheckService` records the escaping exception as
Unhealthy, so no endpoint crashes — but the dependency-named, controlled description
(`"{name} probe failed…"`) is lost, and the library violates its own "must not throw past the
`IHealthCheck` boundary" intent.
**Recommendation**
Add a trailing `catch (Exception ex)` returning `HealthCheckResult.Unhealthy($"{name} probe failed:
{ex.Message}", ex)`, after the existing OCE/Rpc handlers (so external cancellation still propagates).
This brings the gRPC probe in line with `DatabaseHealthCheck` and the documented contract.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — added a trailing `catch (Exception ex)` to `GrpcDependencyHealthCheck.CheckHealthAsync` returning `Unhealthy("{name} probe failed: {ex.Message}", ex)` after the OCE/Rpc external-cancellation handlers; new test asserts a probe throwing `InvalidOperationException` maps to Unhealthy.
### Health-003 — Null `description` is omitted from the JSON body instead of emitted as `null`
| | |
|--|--|
| Severity | Low |
| Category | Spec & shared-contract adherence |
| Status | Resolved |
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/ZbHealthWriter.cs:32` |
**Description**
`SerializerOptions` sets `DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull`, so a check that
produces no description renders an entry with **no** `description` key at all:
`{ "status": "Healthy", "durationMs": 0.1 }`. The spec models `description` as a `string?` that "may be
null" and both the canonical example (`SPEC.md` §3) and the `HealthChecks.UI.Client` shape the writer
claims to mirror emit the key with a `null` value rather than dropping it. Consumers/dashboards parsing
the body and reading `entries.<name>.description` must therefore handle a *missing* property, not just a
null one. The deviation is undocumented and untested — `ResponseWriterTests` only asserts the
present-description case.
Low because the aggregate `status`/HTTP-code contract that orchestrators key on is unaffected; only the
descriptive sub-field shape drifts.
**Recommendation**
Either remove `WhenWritingNull` (so `description: null` is emitted, matching the spec example and the
UI-client shape), or document explicitly in the writer's XML docs and `SPEC.md` §3 that absent
descriptions are omitted, and add a test pinning the chosen behaviour.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — removed `DefaultIgnoreCondition = WhenWritingNull` from `ZbHealthWriter.SerializerOptions` so a null `description` renders as `"description": null` (matching the spec example / UI-client shape); writer XML doc now states the key is always present, and a new `ResponseWriterTests` case pins present-and-null.
### Health-004 — XML docs recommend the `active` tag for ready-tier probes, contradicting the spec
| | |
|--|--|
| Severity | Low |
| Category | Documentation & XML docs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health.Akka/AkkaClusterHealthCheck.cs:11`, `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/GrpcDependencyHealthCheck.cs:21` |
**Description**
`AkkaClusterHealthCheck`'s summary says *"Register to the `ZbHealthTags.Ready` tag (recommended
`[ready, active]`)"*, and `GrpcDependencyHealthCheck` likewise recommends tagging `[ready, active]`.
The spec is unambiguous that both probes belong to the `ready` tier only: §2.2 "Registered to the
`ready` tag" (Akka cluster) and §2.4 "Registered to the `ready` tag" (gRPC dependency). The `active`
tier is reserved by §1 / §2.3 for the leader/active-singleton probe (`ActiveNodeHealthCheck`); putting
cluster-membership or gRPC-reachability checks on the `active` tier pollutes the active-node tier with
non-leadership concerns and contradicts the convergence convention the library exists to enforce.
**Recommendation**
Bring the XML-doc recommendations in line with the spec — recommend `ZbHealthTags.Ready` only for both
probes (drop the `active` suggestion); or, if the dual-tag recommendation is intentional, reconcile it
back into `SPEC.md` §2.2/§2.4 so code and spec agree.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — updated the XML summaries on `AkkaClusterHealthCheck` and `GrpcDependencyHealthCheck` to recommend `ZbHealthTags.Ready` only (dropped the `[ready, active]` suggestion), aligning the docs with SPEC §2.2/§2.4.
### Health-005 — `MapZbHealth` returns only the readiness builder, silently dropping conventions for the active/live tiers
| | |
|--|--|
| Severity | Low |
| Category | Public API surface & compatibility |
| Status | Resolved |
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/ZbHealthEndpointExtensions.cs:64` |
**Description**
`MapZbHealth` maps three endpoints but returns the `IEndpointConventionBuilder` for `/health/ready`
only. Any convention a caller chains onto the result — `.RequireHost(...)`, `.RequireAuthorization()`,
`.WithMetadata(...)`, `.CacheOutput()` — applies to the readiness endpoint **alone**; the active and
liveness endpoints silently do not receive it. The behaviour is documented in the XML `<returns>`, so
this is a deliberate, disclosed trade-off rather than a bug, but it is a real footgun: the most natural
reading of `endpoints.MapZbHealth().RequireHost("…")` is "gate all three health endpoints", and the
result type gives the caller no signal that two of the three are excluded.
**Recommendation**
Consider returning a composite builder that fans conventions out to all three endpoints (collect the
three `IEndpointConventionBuilder`s and wrap them) so chained conventions behave as a caller expects.
If the single-builder return is kept for simplicity, strengthen the `<returns>` remark to a warning and
note it in the README's `MapZbHealth` example.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `MapZbHealth` now returns a private `CompositeEndpointConventionBuilder` that fans `Add`/`Finally` to all three endpoint builders (readiness, active, liveness); `<returns>` docs updated, and a new `TierMappingTests` case proves `.RequireHost(...)` gates all three tiers.
### Health-006 — XML documentation is not emitted into the packed nupkgs
| | |
|--|--|
| Severity | Low |
| Category | Packaging, dependencies & project layout |
| Status | Resolved |
| Location | `ZB.MOM.WW.Health/Directory.Build.props:3`, `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/ZB.MOM.WW.Health.csproj:3` |
**Description**
Every public type and member in all three packages carries thorough, accurate XML documentation, but no
project (and neither `Directory.Build.props` nor the three `.csproj` files) sets
`<GenerateDocumentationFile>true</GenerateDocumentationFile>`. As a result `dotnet pack` produces nupkgs
that contain the DLLs but **no `.xml` doc files**, so consumers of `ZB.MOM.WW.Health` / `.Akka` /
`.EntityFrameworkCore` get no IntelliSense summaries or parameter docs for the API. For a shared library
whose value proposition is a documented common surface across three apps, this discards the
documentation effort at the package boundary. (It also means the compiler does not enforce the
"public members are documented" invariant via CS1591.)
**Recommendation**
Set `<GenerateDocumentationFile>true</GenerateDocumentationFile>` once in `Directory.Build.props` so all
three packable projects emit and pack their XML docs. Optionally pair with targeted `<NoWarn>` if any
CS1591 warnings surface on currently-undocumented members.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — set `<GenerateDocumentationFile>true</GenerateDocumentationFile>` (plus `NoWarn=CS1591` for test/non-packed members) in `Directory.Build.props`; verified `dotnet pack -c Release` now ships `lib/net10.0/*.xml` in all three nupkgs.
+80
View File
@@ -0,0 +1,80 @@
# Code Reviews
Comprehensive, per-library code reviews of the `ZB.MOM.WW.*` shared libraries hosted
in this repo. Each library (one self-contained `.slnx` at the repo root) has its own
folder containing a `findings.md`. This README is the aggregated index — the single
place to see all outstanding work.
> Generated by `regen-readme.py` from the per-library `findings.md` files. Do not
> edit by hand — edit the findings files and re-run the script.
## How it works
- Reviews are performed one library at a time against a fixed checklist.
- Each library is reviewed against its normalized component spec under `components/`.
- Every finding is recorded in the library's `findings.md` with a severity and status.
- Findings are **never deleted** — they are closed by changing their status, keeping
a full audit trail.
- This README aggregates every **pending** finding (`Open` / `In Progress`) across all
libraries.
See **[REVIEW-PROCESS.md](REVIEW-PROCESS.md)** for the full procedure: the review
checklist, severity definitions, finding format, the library → component-spec mapping,
and how to mark items resolved.
## Layout
```
code-reviews/
├── README.md # this file — process overview + pending findings
├── REVIEW-PROCESS.md # how to perform a review and track findings
├── regen-readme.py # regenerates this README from the findings files
├── _template/findings.md # copy-this template for a library review
└── <Library>/findings.md # one folder per ZB.MOM.WW.* shared library
```
## Summary
6 of 6 libraries reviewed. 0 pending findings across all libraries.
| Severity | Open findings |
|----------|---------------|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 0 |
| **Total** | **0** |
## Library Status
| Library | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |
|---------|---------------|--------|----------------|------|-------|
| [Audit](Audit/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/0/0 | 0 | 5 |
| [Auth](Auth/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/0/0 | 0 | 6 |
| [Configuration](Configuration/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/0/0 | 0 | 4 |
| [Health](Health/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/0/0 | 0 | 6 |
| [Telemetry](Telemetry/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/0/0 | 0 | 8 |
| [Theme](Theme/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/0/0 | 0 | 6 |
## Pending Findings
Every `Open` / `In Progress` finding across all libraries, highest severity first.
Resolved findings drop off this list but remain recorded in their library's
`findings.md` (see [REVIEW-PROCESS.md](REVIEW-PROCESS.md) §4–§5). Full detail —
description, location, recommendation — lives in the library's `findings.md`.
### Critical (0)
_None open._
### High (0)
_None open._
### Medium (0)
_None open._
### Low (0)
_None open._
+152
View File
@@ -0,0 +1,152 @@
# Code Review Process
This document describes how to perform a comprehensive, per-library code review of the
shared libraries hosted in **scadaproj** and how to track findings to resolution.
scadaproj is an umbrella workspace, but it hosts first-party source: the six
`ZB.MOM.WW.*` shared libraries (the realized output of the
[component normalizations](../components/README.md)). Those libraries are the code that
gets reviewed here. A **library** is one self-contained solution at the repo root (its own
`.slnx`, with `src/` and `tests/`) — e.g. `ZB.MOM.WW.Auth/`. Each library has its own
folder under `code-reviews/` containing a single `findings.md`.
The review unit is the **library**, not the individual NuGet package. A library that ships
several packages (Auth → 4, Health → 3, Telemetry → 2) is reviewed as a whole; findings
point at the specific package/file in their **Location**.
### The six libraries and their design context
Each library is reviewed against its normalized component spec. The library folder name is
the `ZB.MOM.WW.` prefix stripped; note the spec directory is **not** always the same word.
| Library folder | Library root | Packages | Component spec |
|----------------|--------------|----------|----------------|
| `Auth` | `ZB.MOM.WW.Auth/` | `Abstractions`, `Ldap`, `ApiKeys`, `AspNetCore` | [`components/auth/`](../components/auth/) |
| `Theme` | `ZB.MOM.WW.Theme/` | single RCL | [`components/ui-theme/`](../components/ui-theme/) |
| `Health` | `ZB.MOM.WW.Health/` | `Health`, `.Akka`, `.EntityFrameworkCore` | [`components/health/`](../components/health/) |
| `Telemetry` | `ZB.MOM.WW.Telemetry/` | `Telemetry`, `.Serilog` | [`components/observability/`](../components/observability/) |
| `Configuration` | `ZB.MOM.WW.Configuration/` | single | [`components/configuration/`](../components/configuration/) |
| `Audit` | `ZB.MOM.WW.Audit/` | single | [`components/audit/`](../components/audit/) |
## 1. Before you start
1. Pick the library to review. Its review folder is `code-reviews/<Library>/` where
`<Library>` is the table above (the `ZB.MOM.WW.` prefix stripped).
2. Identify the design context for the library:
- Its component spec: `components/<component>/spec/SPEC.md`.
- Its proposed shared contract: `components/<component>/shared-contract/ZB.MOM.WW.<Library>.md`.
- The library's own `README.md` and the relevant row in the **Component normalization**
table in the root [`CLAUDE.md`](../CLAUDE.md).
- The adoption backlog `components/<component>/GAPS.md` (what is deliberately deferred).
3. Record the exact commit being reviewed: `git rev-parse --short HEAD`. Every review is a
snapshot — a finding only means something relative to a known commit.
4. Build and test the library so findings are grounded in a green (or known-red) baseline:
`cd ZB.MOM.WW.<Library> && dotnet test`.
5. Open `code-reviews/<Library>/findings.md` and fill in the header table (reviewer, date,
commit SHA).
## 2. Review checklist
Work through **every** category below for the library. A comprehensive review means the
checklist is completed even where it produces no findings — record "No issues found" for a
category rather than leaving it ambiguous.
These libraries are reusable packages consumed by three separate apps (OtOpcUa,
MxAccessGateway, ScadaBridge), so the bar is **API stability and minimal blast radius**:
a defect here can break every consumer at once.
1. **Correctness & logic bugs** — off-by-one, null handling, incorrect conditionals,
misuse of BCL/framework APIs, broken edge cases.
2. **Public API surface & compatibility** — the public surface is intentional and minimal;
naming follows .NET conventions; nullable reference annotations are correct; `sealed` /
`abstract` / `virtual` choices are deliberate; no internal types leak through public
signatures; evolution is additive-only with semver discipline (libraries are at `0.1.0`).
3. **Concurrency & thread safety** — these libraries run inside concurrent hosts; options
and value types are immutable, singletons/statics are thread-safe, no shared mutable
state, correct `async`/`await`, no sync-over-async.
4. **Error handling & resilience** — exceptions thrown to consumers are appropriate and
documented; argument validation / guard clauses (`ArgumentNullException` etc.); no
swallowed exceptions; fail-fast vs. graceful degradation is deliberate.
5. **Security & secret handling** — Auth peppered-HMAC API keys and LDAP-injection surface;
redaction seams (`ILogRedactor` in Telemetry, `IAuditRedactor` in Audit); input
validation; no secrets or PII in messages, logs, or exception text.
6. **Performance & resource management**`IDisposable` correctness and disposal,
allocations on hot paths, buffering/back-pressure, no unnecessary work in DI
registration, async all the way down.
7. **Spec & shared-contract adherence** — does the code match the library's
`components/<component>/spec/SPEC.md` and `shared-contract/ZB.MOM.WW.<Library>.md`?
Flag both code that drifts from the normalized contract **and** spec/contract docs that
are now stale relative to the code.
8. **Packaging, dependencies & project layout** — minimal dependency footprint (e.g. Audit's
only non-BCL dependency is `Microsoft.Extensions.DependencyInjection.Abstractions`);
correct target framework (.NET 10); package id/version metadata; `dotnet pack` produces
the expected nupkg set; central versions in `Directory.Packages.props`; no public
dependency leakage; Abstractions-vs-implementation split; Options pattern owned by the
library; `AddZb*` DI registration ergonomics.
9. **Testing coverage** — are the library's behaviours covered by tests in its `tests/`
project? Note untested critical paths, missing edge-case tests, and whether the public
contract (not just internals) is exercised.
10. **Documentation & XML docs** — XML doc accuracy and presence on the **public** API
(these are libraries — public docs matter), library `README.md` correctness, misleading
or stale comments, undocumented non-obvious behaviour.
## 3. Recording findings
Add one entry per finding to the `## Findings` section of the library's `findings.md`,
using the entry format in [`_template/findings.md`](_template/findings.md).
- **Finding ID**`<Library>-NNN`, numbered sequentially within the library and never
reused (e.g. `Auth-001`, `Telemetry-003`). IDs are permanent even after resolution.
- **Severity:**
- **Critical** — security breach, data loss, or a defect that crashes/corrupts or
deadlocks every consuming app (e.g. a broken public contract, secret leakage).
- **High** — incorrect behaviour with significant impact across consumers, or a
public-API / back-compat break, with no safe workaround.
- **Medium** — incorrect or risky behaviour with limited impact or a workaround.
- **Low** — minor issues, style, maintainability, documentation.
- **Category** — one of the 10 checklist categories above.
- **Location**`file:line` (clickable), or a list of locations.
- **Description** — what is wrong and why it matters.
- **Recommendation** — concrete suggested fix.
After recording findings, update the library header table (status, open-finding count) and
refresh the base README (step 5).
## 4. Marking an item resolved
Findings are **never deleted** — they are an audit trail. To close one, change its
**Status** and complete the **Resolution** field:
- `Open` — newly recorded, not yet addressed.
- `In Progress` — a fix is actively being worked on.
- `Resolved` — fixed. The Resolution field must state the fixing commit SHA, the date, and
a one-line description of the fix.
- `Won't Fix` — intentionally not fixed. The Resolution field must justify why.
- `Deferred` — valid but postponed. The Resolution field must say what it is waiting on
(e.g. a tracked issue, a `GAPS.md` item, or a later milestone).
`Resolved`, `Won't Fix`, and `Deferred` findings are all considered **closed** and drop off
the base README's pending list. `Open` and `In Progress` are **pending**.
## 5. Updating the base README
`code-reviews/README.md` holds the single cross-library view (process overview, the Pending
Findings tables, and the Library Status table). It is **generated** from the per-library
`findings.md` files — do not edit it by hand.
After any review or status change, regenerate it:
```
python3 code-reviews/regen-readme.py
```
`regen-readme.py --check` exits non-zero if `README.md` is stale, for use in CI.
The per-library `findings.md` files are the source of truth; `README.md` is the aggregated
index and must always agree with them — which the script guarantees.
## 6. Re-reviewing a library
Re-reviews append to the same `findings.md`. Update the header to the new commit and date,
continue the finding numbering from the last used ID, and leave prior findings (including
closed ones) in place as history.
+344
View File
@@ -0,0 +1,344 @@
# Code Review — Telemetry
| Field | Value |
|-------|-------|
| Library | `ZB.MOM.WW.Telemetry/` |
| Packages | `ZB.MOM.WW.Telemetry`, `ZB.MOM.WW.Telemetry.Serilog` |
| Component spec | `components/observability/spec/SPEC.md` |
| Shared contract | `components/observability/shared-contract/ZB.MOM.WW.Telemetry.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-01 |
| Reviewer | Claude (automated baseline) |
| Commit reviewed | `5f75cd4` |
| Open findings | 0 |
## Summary
The library is small, focused, and well-structured: two packages with a clean Serilog/OTel
boundary (the `Serilog.*` stack appears only in the `.Serilog` package; the core package is
pure OTel + ASP.NET Core framework reference), correct argument validation, deliberate
`sealed` types, thorough XML docs, and a deliberate no-process-global-state design for
`AddZbSerilog` that is well covered by `MultiHostTests`. The identity triple, Resource
omission rules, exporter wiring (Prometheus always-on, OTLP additive), and trace/log
correlation all match the spec's intent and are exercised by the 19 tests.
The most material problems are in the redaction seam — the one component the review brief
flags as security-critical. `RedactionEnricher` honours only *replacement* of scalar
properties: it silently ignores the redactor **removing** a key (a documented capability of
`ILogRedactor`), and it cannot see inside destructured/structured property values, so a
secret logged as a field of `{@Object}` is never scrubbed. Both let secrets reach sinks
despite a conforming redactor. Secondary themes: a spec drift around an undocumented
`service.instance.id` Resource attribute, two hand-maintained Resource-attribute builders
that can drift apart, and a stale doc-comment on `MapZbMetrics`. Tests are solid for the
happy paths but have no coverage for redactor removal or structured-value redaction.
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | Redactor "remove" path is a no-op (Telemetry-001); structured values opaque to redactor (Telemetry-002). |
| 2 | Public API surface & compatibility | ☑ | Surface minimal, `sealed`, nullable-correct. `ZbResource.InstanceId` is an added public member not in the contract (Telemetry-004). |
| 3 | Concurrency & thread safety | ☑ | No issues found. Enrichers stateless; `Lazy` uses `ExecutionAndPublication`; `Activity.Current` is async-local. |
| 4 | Error handling & resilience | ☑ | Guard clauses present. `new Uri(OtlpEndpoint)` can throw late on malformed input (Telemetry-006). |
| 5 | Security & secret handling | ☑ | Redaction gaps (Telemetry-001/002) are security-relevant — secrets can survive a conforming redactor. |
| 6 | Performance & resource management | ☑ | Per-event dictionary snapshot when a redactor is registered (Telemetry-007); acceptable but noted. |
| 7 | Spec & shared-contract adherence | ☑ | Undocumented `service.instance.id` attribute (Telemetry-004); two Resource builders that can drift (Telemetry-005). |
| 8 | Packaging, dependencies & project layout | ☑ | No issues found. Serilog stack confined to `.Serilog`; central versions; correct net10.0; framework ref justified. |
| 9 | Testing coverage | ☑ | No tests for redactor removal or structured-value redaction (Telemetry-003). |
| 10 | Documentation & XML docs | ☑ | `MapZbMetrics` doc-comment is stale: claims "only valid when Exporter = Prometheus" (Telemetry-008). |
## Findings
### Telemetry-001 — `RedactionEnricher` ignores property removal, leaving secrets in the event
| | |
|--|--|
| Severity | High |
| Category | Security & secret handling |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/RedactionEnricher.cs:49-67` |
**Description**
`ILogRedactor.Redact` is documented to let a project "remove or replace any sensitive
values" (`ILogRedactor.cs:13` and the XML doc on the interface method: *"remove or replace"*;
the shared contract repeats *"Remove or replace any sensitive values"*). `RedactionEnricher`
builds a `snapshot` dictionary, hands it to the redactor, then writes back only via
`AddOrUpdateProperty` for entries that remain in the snapshot and `HasChanged`:
```csharp
foreach (var entry in snapshot)
{
if (HasChanged(logEvent, entry.Key, entry.Value))
logEvent.AddOrUpdateProperty(propertyFactory.CreateProperty(entry.Key, entry.Value));
}
```
If a redactor *removes* a key from the dictionary (`properties.Remove("apiKey")`) — the most
natural way to implement "must not leave the process" — that key simply no longer appears in
the write-back loop, so the original property is **never removed from `logEvent`**. The
secret reaches every sink unredacted, even though the redactor did exactly what its contract
permits. This defeats the seam's stated operational guarantee ("secrets never leave the
process in log events") for any removal-style redactor.
**Recommendation**
After calling the redactor, reconcile deletions: for each property key present on the
original `logEvent` but absent from the returned `snapshot`, call
`logEvent.RemovePropertyIfPresent(key)`. (Capture the original key set before mutation, then
diff.) Add a test asserting a removing redactor scrubs the property (see Telemetry-003).
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `RedactionEnricher` now captures the original property
key set and calls `RemovePropertyIfPresent` for any key the redactor dropped from the snapshot,
so a removing redactor scrubs the property; covered by a new removing-redactor test.
### Telemetry-002 — Redactor cannot inspect or scrub destructured/structured property values
| | |
|--|--|
| Severity | Medium |
| Category | Security & secret handling |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/RedactionEnricher.cs:49-55` |
**Description**
The snapshot only unwraps `ScalarValue`; every other `LogEventPropertyValue`
(`StructureValue` from `{@Object}`, `SequenceValue`, `DictionaryValue`) is passed to the
redactor as the raw Serilog wrapper object:
```csharp
snapshot[property.Key] = property.Value is ScalarValue scalar ? scalar.Value : property.Value;
```
A project redactor written against the seam (`IDictionary<string, object?>` of "values")
therefore sees an opaque `StructureValue` for a destructured payload — it cannot read or
mask a secret *field inside* a logged object (e.g. `logger.Information("{@Command}", cmd)`
where `cmd.ApiKey` is sensitive). MxGateway's reference redactor specifically guards
"which command payloads must not leave the process" (per `ILogRedactor` XML doc and the
contract), which is precisely the destructured-object case. The seam silently cannot meet
that requirement; the redactor only works for top-level scalar properties.
**Recommendation**
Document the seam's actual reach (scalar top-level properties only) on `ILogRedactor` and in
the shared contract, and/or recursively project `StructureValue`/`SequenceValue`/
`DictionaryValue` into the snapshot and rebuild them on write-back so nested fields are
redactable. At minimum, make the limitation explicit so consumers do not assume nested
payloads are scrubbed when they are not.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 (documented the scalar-only limitation), then **superseded by
`05cc62a`, 2026-06-01 — nested redaction implemented.** `RedactionEnricher` now projects each
structured value into a mutable nested view the redactor descends into recursively
(`StructureValue``IDictionary<string,object?>`, `SequenceValue``IList<object?>`,
`DictionaryValue``IDictionary<string,object?>`), so a field nested inside a `{@Object}` can be
masked or removed. The Project/Rebuild round-trip preserves `StructureValue.TypeTag` and original
dictionary keys, and a structural `ValueEquals` skips write-back for properties the redactor left
untouched (no reallocation; scalar fast path retained). The earlier documented-limitation wording on
the `ILogRedactor` XML doc, shared contract, and README was replaced to document the recursive reach.
### Telemetry-003 — No tests for redactor removal or structured-value redaction
| | |
|--|--|
| Severity | Medium |
| Category | Testing coverage |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/tests/ZB.MOM.WW.Telemetry.Serilog.Tests/RedactionTests.cs:33-69` |
**Description**
`RedactionTests` covers exactly two redaction behaviours: a registered redactor replacing a
scalar value, and a no-op when none is registered. The `FakeRedactor` only ever *reassigns*
`properties["apiKey"]`. There is no test that a redactor which **removes** a key actually
scrubs it (the Telemetry-001 defect would have been caught), and no test that a redactor can
mask a field of a destructured/structured property (Telemetry-002). For a seam whose entire
purpose is secret containment, the most security-relevant behaviours are untested.
**Recommendation**
Add tests: (a) a redactor calling `properties.Remove(key)` results in the property being
absent from the emitted `LogEvent`; (b) a redactor attempting to mask a nested field of a
`{@Object}` payload, asserting the documented behaviour (whichever resolution Telemetry-002
takes). These should fail today and pin the fixes.
**Resolution**
Resolved in `544a6dd`, 2026-06-01, then extended in `05cc62a` — added
`Removing_redactor_scrubs_the_property_from_the_event` (red→green for Telemetry-001), a
Resource-attribute parity test, and (for the Telemetry-002 implementation) a nested-reach suite:
mask and remove a field inside a destructured `{@Object}`, mask a sequence element, mask a
dictionary value, mask a field two levels deep, and an untouched-structure-survives check. The
earlier `Redactor_cannot_reach_a_field_inside_a_destructured_object` limitation test was replaced.
### Telemetry-004 — `service.instance.id` Resource attribute is undocumented in spec and contract
| | |
|--|--|
| Severity | Low |
| Category | Spec & shared-contract adherence |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbResource.cs:19-45` |
**Description**
`ZbResource` adds a `service.instance.id` attribute (deterministic `MachineName:ProcessId`)
to the Resource, and exposes it as a new public member `ZbResource.InstanceId`. The
normalized Resource attribute set is enumerated exhaustively in two authoritative docs —
`SPEC.md` §2 and `METRIC-CONVENTIONS.md` §4 — and **neither lists `service.instance.id`**;
the shared contract (`ZB.MOM.WW.Telemetry.md`) likewise documents `ZbResource.Build` as
populating only `service.name/namespace/version/site.id/node.role/host.name` and does not
mention an `InstanceId` member. The attribute itself is a reasonable, standards-aligned
improvement (and disabling the OTel SDK's random-GUID default is sensible for cross-signal
correlation), but it is a silent divergence: the spec/contract are now stale relative to the
code. Per REVIEW-PROCESS §2.7, both directions of drift must be flagged.
**Recommendation**
Add `service.instance.id` (with the `MachineName:ProcessId` rationale) to the Resource table
in `SPEC.md` §2 and `METRIC-CONVENTIONS.md` §4, and document the public `ZbResource.InstanceId`
member in the shared contract, so the normalized spec and the code agree.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — kept the attribute (documented the
`MachineName:ProcessId` rationale) and added `service.instance.id` to the Resource tables in
`SPEC.md` §2 and `METRIC-CONVENTIONS.md` §4, plus the `ZbResource.InstanceId` member to the shared
contract; spec and code now agree.
### Telemetry-005 — Two hand-maintained Resource-attribute builders can silently drift
| | |
|--|--|
| Severity | Low |
| Category | Spec & shared-contract adherence |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbResource.cs:38-64`, `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/ZbSerilogConfig.cs:125-151` |
**Description**
The Resource attached to metrics/traces is built by `ZbResource.Configure` (via the OTel
`AddService` + `AddAttributes` API), while the Resource attached to the OTLP *log* sink is
built independently by `ZbSerilogConfig.BuildResourceAttributes` (a hand-rolled
`Dictionary<string, object>`). The two currently agree, but they enumerate the same six/seven
attributes in two places with two different mechanisms, so a future change to one (a new
attribute, a renamed key, a changed omission rule) will silently desynchronize logs from
metrics/traces and break the cross-signal correlation the library's whole "unifying hinge"
depends on. There is no test asserting parity between the two attribute sets.
**Recommendation**
Derive both from a single source of truth — e.g. have `ZbResource` expose the canonical
attribute map (already mostly the shape `BuildResourceAttributes` returns) and have the
Serilog sink consume it — or add a parity test that asserts the two attribute sets are
key-for-key identical for a representative options object.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — introduced `ZbResource.BuildAttributes` as the single
source of truth; `ZbResource.Configure` (OTel SDK) and `ZbSerilogConfig.BuildResourceAttributes`
(OTLP log sink) now both derive from it, and a parity test asserts the two sets are identical.
### Telemetry-006 — Malformed `OtlpEndpoint` throws `UriFormatException` late, with no context
| | |
|--|--|
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbTelemetryExtensions.cs:127-135` |
**Description**
`ConfigureOtlp` does `otlp.Endpoint = new Uri(options.OtlpEndpoint)` with no validation. A
malformed endpoint string (typo, missing scheme) throws a raw `UriFormatException` deep
inside OTel exporter construction at host-build time, with no mention of which option was at
fault. `BuildOptions` already fails fast and clearly for a missing `ServiceName`, but does
not validate that `OtlpEndpoint` is a well-formed absolute URI when `Exporter == Otlp` (nor
that it is non-empty — an Otlp exporter with a null endpoint is silently registered and
points nowhere). The Serilog path (`ZbSerilogConfig`) has the same untyped string→endpoint
handoff.
**Recommendation**
In `BuildOptions`, when `Exporter == ZbExporter.Otlp`, validate `OtlpEndpoint` with
`Uri.TryCreate(..., UriKind.Absolute, out _)` and throw an `ArgumentException` naming the
option (consistent with the existing `ServiceName` guard) rather than letting a bare
`UriFormatException` escape later.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — added `ZbTelemetryOptionsValidator.Validate`, called from
both `BuildOptions` and `AddZbSerilog`: when `Exporter == Otlp` it requires a non-empty,
well-formed absolute `OtlpEndpoint` and throws a named `ArgumentException` (no-op for Prometheus);
covered by three new tests.
### Telemetry-007 — Redaction snapshot allocates a dictionary on every log event
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/RedactionEnricher.cs:49-67` |
**Description**
When an `ILogRedactor` is registered, `Enrich` allocates a new
`Dictionary<string, object?>(logEvent.Properties.Count)`, copies every property into it, and
then iterates again to diff/write-back — on every single log event, across every logging
thread. Enrichers are on the hottest path in the library (they run for each event the level
filter admits). The early-return when no redactor is registered keeps the common case free,
so the cost is borne only by redaction-enabled consumers (MxGateway), but for a high-volume
gateway this is non-trivial steady-state allocation/GC pressure.
**Recommendation**
Consider redacting in place against `logEvent.Properties` without a full snapshot copy (e.g.
only materialize replacements for keys the redactor touches), or short-circuit when the event
has no properties. At minimum, document the per-event cost so consumers can weigh enabling
redaction on very hot loggers. Acceptable as-is given redaction is opt-in and security-first.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — `Enrich` now short-circuits before any snapshot allocation
when the event has no properties (and still early-returns when no redactor is registered), so the
per-event dictionary copy is only paid when there is actually something to redact.
### Telemetry-008 — `MapZbMetrics` XML doc claims it is "only valid when Exporter = Prometheus" — stale
| | |
|--|--|
| Severity | Low |
| Category | Documentation & XML docs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbMetricsEndpointExtensions.cs:11-14` |
**Description**
The XML doc on `MapZbMetrics` states it is "Only valid when
`ZbTelemetryOptions.Exporter = ZbExporter.Prometheus`." That contradicts the actual wiring:
`ApplyMetricsExporter` (`ZbTelemetryExtensions.cs:107-116`) **always** calls
`AddPrometheusExporter()` regardless of the `Exporter` setting — OTLP is purely additive.
The library's own README ("Prometheus is **always wired** for metrics regardless of the
`Exporter` setting") and the test `AddZbTelemetry_OtlpExporter_StillServesPrometheusEndpoint`
both confirm `/metrics` works under `Exporter = Otlp`. The doc-comment therefore tells
consumers the opposite of the real (and intended) behaviour and could lead them to wrongly
believe `MapZbMetrics` is a no-op under OTLP. The same stale wording is mirrored in the
shared contract (`ZB.MOM.WW.Telemetry.md`, `MapZbMetrics` summary).
**Recommendation**
Update the doc-comment to state that the Prometheus exporter is always registered and
`MapZbMetrics` is valid under any `Exporter` value (Prometheus is always-on; OTLP is an
overlay). Align the shared-contract summary for `MapZbMetrics` to match.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — rewrote the `MapZbMetrics` XML doc to state it is valid
under any `Exporter` value (Prometheus always-on; OTLP additive overlay) and aligned the matching
shared-contract summary.
+242
View File
@@ -0,0 +1,242 @@
# Code Review — Theme
| Field | Value |
|-------|-------|
| Library | `ZB.MOM.WW.Theme/` |
| Packages | `ZB.MOM.WW.Theme` (Razor Class Library) |
| Component spec | `components/ui-theme/spec/SPEC.md` |
| Shared contract | `components/ui-theme/shared-contract/ZB.MOM.WW.Theme.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-01 |
| Reviewer | Claude (automated baseline) |
| Commit reviewed | `5f75cd4` |
| Open findings | 0 |
## Summary
`ZB.MOM.WW.Theme` is a small, clean, well-scoped Blazor Razor Class Library: ten flat-namespace
SSR components, two stylesheets (`theme.css` 379 lines, `layout.css` 191 lines), three vendored
IBM Plex `.woff2` fonts, and two public enums. The component C# is correct and minimal — stateless
render logic, no concurrency surface, no secret handling, sensible `[Parameter]`/`EditorRequired`
choices, and `TreatWarningsAsErrors`. The 32 bUnit tests (20 `[Fact]` + 12 `[Theory]` cases) exercise
the public render contract of every component plus a static-asset check on the corrected font path.
The standout security-sensitive area — `LoginCard` (open-redirect + antiforgery) — is handled
correctly by delegating both concerns to the consumer with explicit XML docs and in-markup security
notes, matching the spec. No findings are Critical or High. The findings are concentrated in two
themes: (1) one functional gap — the mobile hamburger relies on Bootstrap's collapse JS, an
undocumented runtime dependency that contradicts the "no JavaScript" framing; and (2) CSS / token
hygiene — a token-fidelity drift on `.chip-idle`, hardcoded hex values that violate the spec's
"no hardcoded hex" rule, an undefined `.rail-ico` class, orphan/dead nav CSS, and a near-total
absence of XML docs on the public parameter surface.
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | Component render logic correct; `open="@Expanded"` boolean-attribute binding behaves (tests confirm); chip/variant `switch` mappings complete. No issues found. |
| 2 | Public API surface & compatibility | ☑ | Flat `ZB.MOM.WW.Theme` namespace; minimal, intentional; `EditorRequired` applied; nullable annotations correct; no internal types leak. Surface matches contract. No issues found. |
| 3 | Concurrency & thread safety | ☑ | Stateless components, no shared mutable state, no `async`/statics/singletons. No issues found (UI kit). |
| 4 | Error handling & resilience | ☑ | No throwing paths; required params default to `string.Empty`; null `RenderFragment` slots are guarded (`is not null` / `IsNullOrEmpty`). No issues found. |
| 5 | Security & secret handling | ☑ | No secrets/PII. `LoginCard` open-redirect + CSRF correctly delegated to consumer with docs. `Accent` interpolates into a `style` attr but is developer-supplied (not untrusted). No issues found. |
| 6 | Performance & resource management | ☑ | No `IDisposable`, no allocations of note, fonts vendored once. No issues found. |
| 7 | Spec & shared-contract adherence | ☑ | Component/parameter API matches SPEC §4 and the shared contract. Drifts found: `.chip-idle` foreground (Theme-002) and "no hardcoded hex" rule (Theme-003). |
| 8 | Packaging, dependencies & project layout | ☑ | `net10.0` RCL, single package id `ZB.MOM.WW.Theme`, `FrameworkReference` only, version `0.1.0` (inherited from `Directory.Build.props`), static assets under `_content/`. No issues found. |
| 9 | Testing coverage | ☑ | 32 tests cover every component's render contract + corrected font path. CSS-class wiring (e.g. `.rail-ico`, orphan nav CSS) is not asserted, which is how Theme-004/005 slipped through. |
| 10 | Documentation & XML docs | ☑ | README/contract accurate. XML docs present only on `LoginCard`; absent on the other 9 components, all their parameters, and both enums (Theme-006). |
## Findings
### Theme-001 — Mobile hamburger toggle silently depends on Bootstrap collapse JS
| | |
|--|--|
| Severity | Medium |
| Category | Spec & shared-contract adherence |
| Status | Resolved |
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/ThemeShell.razor:5` |
**Description**
`ThemeShell` renders the narrow-viewport navigation toggle as a Bootstrap collapse trigger:
`<button ... data-bs-toggle="collapse" data-bs-target="#theme-rail" aria-expanded="false">`, with the
rail wrapped in `<div class="collapse d-lg-block" id="theme-rail">`. The expand/collapse behaviour
and the `aria-expanded` state transition are driven entirely by **Bootstrap's collapse JavaScript
bundle** at runtime. The shared contract states "No global JavaScript … `NavRailSection` is CSS-only"
and the spec emphasizes static-SSR/no-JS operation, while §5 only requires each app to keep its own
Bootstrap **`<link>`** (CSS) — Bootstrap *JS* is never mentioned. A consumer that includes only
Bootstrap CSS (a reasonable reading of the docs) gets a hamburger button that does nothing below the
`lg` breakpoint: the rail stays `display:none` (`.collapse`) and is unreachable on mobile/narrow
windows. There is no compile-time signal, and the hardcoded `aria-expanded="false"` never updates
without the JS, so assistive tech is also misinformed. This affects every consumer's small-viewport
navigation.
**Recommendation**
Either (a) document the Bootstrap-JS runtime requirement explicitly in the README/spec adoption steps
("the side-rail mobile toggle requires Bootstrap's collapse JS bundle"), or (b) make the toggle
JS-independent to match the "no JavaScript" promise — e.g. wrap the rail in a `<details>`/checkbox
CSS-only disclosure (the same pattern `NavRailSection` already uses), so collapse works in pure static
SSR. If keeping the Bootstrap-collapse approach, document that `aria-expanded` requires the JS to stay
accurate.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — removed the Bootstrap-collapse JS dependency: the shell is now a native CSS-only `<details>`/`<summary>` disclosure (no `data-bs-*`), force-shown on lg+; bUnit test asserts no Bootstrap collapse hooks.
### Theme-002 — `.chip-idle` foreground diverges from the documented token pairing
| | |
|--|--|
| Severity | Low |
| Category | Spec & shared-contract adherence |
| Status | Resolved |
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/wwwroot/css/theme.css:177` |
**Description**
`DESIGN-TOKENS.md` ("Status tokens — tinted backgrounds") states the chip classes pair each tinted
background with its matching foreground token — "`--ok-bg` background with `--ok` foreground … Used by
`.chip-ok`, `.chip-warn`, `.chip-bad`, `.chip-idle`". The `--idle` token (`#868e96`) is correctly used
for `.s-idle` and the `.conn-pill .dot`, but `.chip-idle` uses `color: var(--ink-soft)` (`#5a6066`)
instead of `color: var(--idle)`:
```css
.chip-idle { color: var(--ink-soft); background: var(--idle-bg); border-color: var(--rule-strong); }
```
So the one idle chip variant does not follow the pairing rule the design-tokens doc spells out. It is
not a contrast or readability regression (`--ink-soft` is darker than `--idle`), but code and the
normalized token reference disagree, which is exactly the spec-drift category §7 asks to flag. Note the
`.chip-warn` foreground is likewise a hardcoded `#b56a00` rather than `var(--warn)` — see Theme-003.
**Recommendation**
Either change `.chip-idle` to `color: var(--idle)` to honor the documented pairing, or, if the darker
`--ink-soft` ink is the deliberate choice, amend `DESIGN-TOKENS.md` to record that `.chip-idle` uses
`--ink-soft` foreground (as it already does for the `Info` variant footnote).
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — changed `.chip-idle` foreground from `var(--ink-soft)` to `var(--idle)`, honoring the DESIGN-TOKENS pairing rule; static-asset test asserts the pairing.
### Theme-003 — Hardcoded hex values in CSS contradict the "no hardcoded hex" rule
| | |
|--|--|
| Severity | Low |
| Category | Spec & shared-contract adherence |
| Status | Resolved |
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/wwwroot/css/theme.css:174` (and theme.css:96-98,145-150,175-176,251-254,281,290-291,335,348,368-369,377; layout.css:123,128,184) |
**Description**
`SPEC.md` §1 and `DESIGN-TOKENS.md` both assert the rule: "Components carry **no hardcoded hex values**
— everything references these tokens" / "no hardcoded hex values appear in component markup or CSS".
In practice both stylesheets contain numerous literal hex colors outside the `:root` token block —
chip/agg border tints (`#c6e6cd`, `#eec3c3`, `#efd6a6`, `#bfe3c6`, `#f0d9ab`, `#f0c0c0`), chip/notice
foregrounds (`#b56a00`, `#8a5a00`), zebra/hover fills (`#fbfbf9`, `#f3f6fd`, `#eef2fc`), and the
info/dir tints (`#e7ecfb`, `#cdd9f7`). These are derived shades (lighter borders, hover washes) with no
corresponding token, so they are not re-themable via the token block and any per-app accent override
will not reach them. The rule as written is therefore aspirational, and the `DESIGN-TOKENS.md` Info
footnote itself already references a raw `#e7ecfb`, so the doc partially contradicts its own rule.
**Recommendation**
Reconcile code and doc: either promote the recurring derived shades to named tokens (e.g.
`--ok-border`, `--hover-bg`, `--info-bg`/`--info-border`) and reference them, or soften the spec wording
to "no hardcoded hex in **component markup**; CSS derives border/hover shades from tokens where
practical" so the rule matches reality. This keeps the "token changes are breaking" SemVer guarantee
honest.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — promoted every derived border/ink/wash shade to named `:root` tokens (`--ok-border`, `--warn-ink`, `--info-bg`, `--hover-bg`, `--active-bg`, `--zebra-bg`, …) and referenced them; no hex literal remains outside `:root` in either stylesheet, asserted by a new test.
### Theme-004 — `NavRailItem` emits a `.rail-ico` span that no stylesheet defines
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/NavRailItem.razor:4` |
**Description**
When the optional `Icon` slot is supplied, `NavRailItem` wraps it: `<span class="rail-ico">@Icon</span>`.
Neither `theme.css` nor `layout.css` defines a `.rail-ico` rule (confirmed by grep). The icon therefore
renders as an unstyled inline span — no sizing, alignment, or gap relative to the label text — so an
icon + text rail link will not line up the way the other rail styling implies. It is not a crash, but
the component ships a class hook with no backing style, which is a latent presentation bug for any
consumer that uses the `Icon` parameter (and none of the 32 tests cover the icon path).
**Recommendation**
Add a `.rail-ico` rule to `layout.css` (e.g. `display:inline-flex; width:1rem; margin-right:0.4rem;
color:var(--ink-faint);`) alongside the existing `.rail-link` rules, and add a bUnit test asserting the
`.rail-ico` span is emitted when `Icon` is supplied.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — added a `.rail-ico` rule to `layout.css` (inline-flex, sized, gapped, `--ink-faint`); bUnit tests assert the span is emitted when `Icon` is supplied and omitted otherwise.
### Theme-005 — Orphan and unstyled nav CSS classes in `layout.css`
| | |
|--|--|
| Severity | Low |
| Category | Documentation & XML docs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/wwwroot/css/layout.css:75` (`.rail-eyebrow`), `:103` (`.rail-eyebrow-chevron`); `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/NavRailSection.razor:5` (`.rail-eyebrow-label`) |
**Description**
`NavRailSection` renders `<summary class="rail-eyebrow-toggle"><span class="rail-eyebrow-label">@Title</span></summary>`
and `layout.css` provides the actual collapse chevron via `.rail-section > summary::before` (lines
178-181). That leaves several mismatched class hooks: `.rail-eyebrow` (line 75, non-toggle) and
`.rail-eyebrow-chevron` (lines 103-108) are defined in CSS but emitted by **no** component (dead CSS),
while `.rail-eyebrow-label` is emitted by the component but has **no** CSS rule (it merely inherits from
`.rail-eyebrow-toggle`). The result is two parallel chevron mechanisms (the styled-but-unused
`.rail-eyebrow-chevron` and the actually-used `::before`) and stylesheet cruft that misleads future
maintainers about how the section header is themed.
**Recommendation**
Remove the dead `.rail-eyebrow` and `.rail-eyebrow-chevron` rules (or wire `.rail-eyebrow-chevron` into
`NavRailSection` and drop the `::before` approach — pick one), and either add a `.rail-eyebrow-label`
rule or drop the now-redundant wrapper span. Keeping one chevron mechanism removes the ambiguity.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — removed the dead `.rail-eyebrow` and `.rail-eyebrow-chevron` rules and dropped the redundant `.rail-eyebrow-label` wrapper span in `NavRailSection`, leaving one chevron mechanism (`summary::before`).
### Theme-006 — Public component/parameter surface lacks XML documentation
| | |
|--|--|
| Severity | Low |
| Category | Documentation & XML docs |
| Status | Resolved |
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/ThemeShell.razor:23` (and `BrandBar.razor`, `NavRailItem.razor`, `NavRailSection.razor`, `StatusPill.razor`, `TechButton.razor`, `TechCard.razor`, `TechField.razor`, `ThemeHead.razor`, `ButtonVariant.cs`, `StatusState.cs`) |
**Description**
This is a packaged library (`IsPackable=true`, `0.1.0`) consumed by three separate apps, and
`REVIEW-PROCESS.md` §2.10 calls for "XML doc accuracy and presence on the **public** API (these are
libraries — public docs matter)." Only `LoginCard` carries XML docs (on `ReturnUrl` and `ChildContent`).
The other nine components, all of their `[Parameter]` properties, and both public enums
(`ButtonVariant`, `StatusState`) have no `///` documentation, so consumers get no IntelliSense/tooltip
help on the parameter contract (e.g. that `ThemeShell.Accent` overrides the `--accent` token, that
`NavRailItem.Match` defaults to `Prefix`, or what each `StatusState`/`ButtonVariant` member maps to).
The README and shared contract document all of this, but that prose is not surfaced at the call site.
**Recommendation**
Add brief `///` summaries to the public components, their parameters, and the two enum types — at
minimum the non-obvious ones (`ThemeShell.Accent`, `NavRailItem.Match`, `TechButton.Busy`,
`TechCard.Title` vs `Header` precedence, and each enum member). Consider enabling
`GenerateDocumentationFile` so missing-doc warnings surface under the existing `TreatWarningsAsErrors`.
**Resolution**
Resolved in `544a6dd`, 2026-06-01 — added `///` summaries to all nine remaining components, every public `[Parameter]`, and both enums (`ButtonVariant`, `StatusState`); enabled `GenerateDocumentationFile` (CS1591 left off the error set for Razor-generated members) so docs build under `TreatWarningsAsErrors` with 0 warnings.
+71
View File
@@ -0,0 +1,71 @@
# Code Review — <Library>
<!--
Template for a shared-library review. Copy the structure below into
code-reviews/<Library>/findings.md and fill it in.
<Library> is the ZB.MOM.WW.* library with the prefix stripped (Auth, Theme,
Health, Telemetry, Configuration, Audit). See ../REVIEW-PROCESS.md for the
full process and the library → component-spec mapping.
-->
| Field | Value |
|-------|-------|
| Library | `ZB.MOM.WW.<Library>/` |
| Packages | `<package ids, e.g. ZB.MOM.WW.<Library>>` |
| Component spec | `components/<component>/spec/SPEC.md` |
| Shared contract | `components/<component>/shared-contract/ZB.MOM.WW.<Library>.md` |
| Status | Not yet reviewed \| In progress \| Reviewed |
| Last reviewed | YYYY-MM-DD |
| Reviewer | <name> |
| Commit reviewed | `<short SHA>` |
| Open findings | 0 |
## Summary
One short paragraph: overall health of the library, themes across findings, and anything
notable that is not a finding (e.g. test count, dependency footprint, adoption status).
## Checklist coverage
Confirm every category was examined. Record "No issues found" where applicable.
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☐ | |
| 2 | Public API surface & compatibility | ☐ | |
| 3 | Concurrency & thread safety | ☐ | |
| 4 | Error handling & resilience | ☐ | |
| 5 | Security & secret handling | ☐ | |
| 6 | Performance & resource management | ☐ | |
| 7 | Spec & shared-contract adherence | ☐ | |
| 8 | Packaging, dependencies & project layout | ☐ | |
| 9 | Testing coverage | ☐ | |
| 10 | Documentation & XML docs | ☐ | |
## Findings
<!-- One entry per finding. Copy the block below. Never delete a finding; close it
by changing Status and completing Resolution. -->
### <Library>-001 — <Short title>
| | |
|--|--|
| Severity | Critical \| High \| Medium \| Low |
| Category | <one of the 10 checklist categories> |
| Status | Open \| In Progress \| Resolved \| Won't Fix \| Deferred |
| Location | `ZB.MOM.WW.<Library>/src/<Package>/<File>.cs:<line>` |
**Description**
What is wrong and why it matters.
**Recommendation**
Concrete suggested fix.
**Resolution**
_Unresolved._
<!-- When closed: fixing commit `<SHA>`, date YYYY-MM-DD, one-line description.
For Won't Fix / Deferred, justify the decision here. -->
+198
View File
@@ -0,0 +1,198 @@
#!/usr/bin/env python3
"""Regenerate code-reviews/README.md from the per-library findings.md files.
The findings files are the source of truth; README.md is a generated index.
Run this after recording, resolving, or re-triaging a finding so the aggregated
tables stay in sync (see REVIEW-PROCESS.md section 5).
Usage:
python3 regen-readme.py # rewrite README.md
python3 regen-readme.py --check # exit 1 if README.md is stale (for CI)
Works from any directory paths are resolved relative to this script.
"""
import os
import re
import sys
BASE = os.path.dirname(os.path.abspath(__file__))
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}
PENDING_STATUSES = {"Open", "In Progress"}
NOT_REVIEWED = "" # rendered for libraries with no recorded review date/commit yet
def discover_libraries():
"""Library folders are every subdirectory of code-reviews/ holding a findings.md,
excluding the _template folder. Returned sorted for a stable README order."""
libraries = []
for name in sorted(os.listdir(BASE)):
if name.startswith(("_", ".")):
continue
if os.path.isfile(os.path.join(BASE, name, "findings.md")):
libraries.append(name)
return libraries
def parse_header(library, text):
"""Extract (last_reviewed, commit) from the library's header table.
Returns None for either field when it is absent or still templated, so the
README can render a 'not yet reviewed' dash instead of a fake baseline."""
last = re.search(r"\|\s*Last reviewed\s*\|\s*([0-9]{4}-[0-9]{2}-[0-9]{2})", text)
commit = re.search(r"\|\s*Commit reviewed\s*\|\s*`([^`<]+)`", text)
return (
last.group(1) if last else None,
commit.group(1) if commit else None,
)
def parse_findings(library):
"""Parse one library's findings.md into ((last_reviewed, commit), [(library, id, severity, title, status), ...])."""
text = open(os.path.join(BASE, library, "findings.md")).read()
header = parse_header(library, text)
findings = []
for block in re.split(r"^### ", text, flags=re.M)[1:]:
head = block.splitlines()[0].strip()
m = re.match(r"([A-Za-z][A-Za-z0-9]*-\d+)\b(.*)", head)
if not m:
raise SystemExit(f"{library}/findings.md: unparseable finding heading: {head!r}")
fid = m.group(1).strip()
title = m.group(2).strip().lstrip("—–-").strip().replace("|", "\\|")
sev = re.search(r"\|\s*Severity\s*\|\s*([A-Za-z]+)", block)
status = re.search(r"\|\s*Status\s*\|\s*([A-Za-z' ]+?)\s*\|", block)
if not sev or not status:
raise SystemExit(f"{library}/findings.md: {fid} is missing a Severity or Status field")
findings.append((library, fid, sev.group(1), title, status.group(1).strip()))
return header, findings
def finding_number(finding):
return int(re.search(r"-(\d+)$", finding[1]).group(1))
def build_readme(libraries, per_library):
pending = sorted(
(f for fs in per_library.values() for f in fs[1] if f[4] in PENDING_STATUSES),
key=lambda f: (SEVERITY_ORDER.get(f[2], 9), f[0], finding_number(f)),
)
reviewed = sum(1 for lib in libraries if per_library[lib][0][0] is not None)
def severity_total(sev):
return sum(1 for f in pending if f[2] == sev)
def open_count(library, sev):
return sum(1 for f in per_library[library][1]
if f[2] == sev and f[4] in PENDING_STATUSES)
lines = []
add = lines.append
add("# Code Reviews")
add("")
add("Comprehensive, per-library code reviews of the `ZB.MOM.WW.*` shared libraries hosted")
add("in this repo. Each library (one self-contained `.slnx` at the repo root) has its own")
add("folder containing a `findings.md`. This README is the aggregated index — the single")
add("place to see all outstanding work.")
add("")
add("> Generated by `regen-readme.py` from the per-library `findings.md` files. Do not")
add("> edit by hand — edit the findings files and re-run the script.")
add("")
add("## How it works")
add("")
add("- Reviews are performed one library at a time against a fixed checklist.")
add("- Each library is reviewed against its normalized component spec under `components/`.")
add("- Every finding is recorded in the library's `findings.md` with a severity and status.")
add("- Findings are **never deleted** — they are closed by changing their status, keeping")
add(" a full audit trail.")
add("- This README aggregates every **pending** finding (`Open` / `In Progress`) across all")
add(" libraries.")
add("")
add("See **[REVIEW-PROCESS.md](REVIEW-PROCESS.md)** for the full procedure: the review")
add("checklist, severity definitions, finding format, the library → component-spec mapping,")
add("and how to mark items resolved.")
add("")
add("## Layout")
add("")
add("```")
add("code-reviews/")
add("├── README.md # this file — process overview + pending findings")
add("├── REVIEW-PROCESS.md # how to perform a review and track findings")
add("├── regen-readme.py # regenerates this README from the findings files")
add("├── _template/findings.md # copy-this template for a library review")
add("└── <Library>/findings.md # one folder per ZB.MOM.WW.* shared library")
add("```")
add("")
add("## Summary")
add("")
add(f"{reviewed} of {len(libraries)} libraries reviewed. "
f"{len(pending)} pending finding{'s' if len(pending) != 1 else ''} across all libraries.")
add("")
add("| Severity | Open findings |")
add("|----------|---------------|")
for sev in ("Critical", "High", "Medium", "Low"):
add(f"| {sev} | {severity_total(sev)} |")
add(f"| **Total** | **{len(pending)}** |")
add("")
add("## Library Status")
add("")
add("| Library | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |")
add("|---------|---------------|--------|----------------|------|-------|")
for library in libraries:
counts = [open_count(library, s) for s in ("Critical", "High", "Medium", "Low")]
last_reviewed, commit = per_library[library][0]
last_cell = last_reviewed if last_reviewed else NOT_REVIEWED
commit_cell = f"`{commit}`" if commit else NOT_REVIEWED
add(f"| [{library}]({library}/findings.md) | {last_cell} | {commit_cell} "
f"| {counts[0]}/{counts[1]}/{counts[2]}/{counts[3]} "
f"| {sum(counts)} | {len(per_library[library][1])} |")
add("")
add("## Pending Findings")
add("")
add("Every `Open` / `In Progress` finding across all libraries, highest severity first.")
add("Resolved findings drop off this list but remain recorded in their library's")
add("`findings.md` (see [REVIEW-PROCESS.md](REVIEW-PROCESS.md) §4–§5). Full detail —")
add("description, location, recommendation — lives in the library's `findings.md`.")
add("")
for sev in ("Critical", "High", "Medium", "Low"):
rows = [f for f in pending if f[2] == sev]
add(f"### {sev} ({len(rows)})")
add("")
if not rows:
add("_None open._")
add("")
continue
add("| ID | Library | Title |")
add("|----|---------|-------|")
for library, fid, _, title, _ in rows:
add(f"| {fid} | [{library}]({library}/findings.md) | {title} |")
add("")
return "\n".join(lines)
def main():
check = "--check" in sys.argv[1:]
libraries = discover_libraries()
per_library = {m: parse_findings(m) for m in libraries}
content = build_readme(libraries, per_library)
readme_path = os.path.join(BASE, "README.md")
pending = sum(1 for fs in per_library.values()
for f in fs[1] if f[4] in PENDING_STATUSES)
total = sum(len(fs[1]) for fs in per_library.values())
if check:
current = open(readme_path).read() if os.path.exists(readme_path) else ""
if current != content:
print("README.md is stale — run: python3 code-reviews/regen-readme.py")
sys.exit(1)
print(f"README.md is up to date ({pending} pending / {total} total).")
return
open(readme_path, "w").write(content)
print(f"README.md regenerated — {pending} pending, {total} total findings "
f"across {len(libraries)} libraries.")
if __name__ == "__main__":
main()
@@ -124,17 +124,32 @@ public static class ZbTelemetryExtensions
/// Used internally by AddZbTelemetry. Exposed for tests and custom pipelines.
public static class ZbResource
{
/// Deterministic, process-stable service instance identifier, formatted as
/// "MachineName:ProcessId". Populates OTel Resource service.instance.id on every signal
/// (metrics, traces, logs). The OTel SDK's random-GUID default is disabled in favour of this
/// value so all signals from one process share one restart-stable instance id, enabling
/// cross-signal correlation. Always present (not optional).
public static string InstanceId { get; }
/// Returns a ResourceBuilder pre-populated with service.name, service.namespace,
/// service.version, site.id, node.role, and host.name (always Environment.MachineName).
/// Attributes with null values are omitted from the Resource.
/// service.version, service.instance.id (always — see InstanceId), site.id, node.role, and
/// host.name (always Environment.MachineName). Attributes with null/empty values are omitted.
public static ResourceBuilder Build(ZbTelemetryOptions options);
/// The single source of truth for the shared Resource attribute set. Both the OTel SDK
/// metrics/traces pipeline and the Serilog OTLP log sink derive their attributes from this one
/// map, so logs cannot drift from metrics/traces. service.name/namespace/instance.id/host.name
/// are always present; service.version/site.id/node.role are included only when the option is
/// non-null/non-empty.
public static IReadOnlyDictionary<string, object> BuildAttributes(ZbTelemetryOptions options);
}
/// Endpoint extension for mounting the Prometheus /metrics scrape endpoint.
public static class ZbMetricsEndpointExtensions
{
/// Mounts the Prometheus /metrics endpoint.
/// Only valid when ZbTelemetryOptions.Exporter = ZbExporter.Prometheus (or both).
/// Mounts the Prometheus /metrics endpoint. Valid under ANY ZbTelemetryOptions.Exporter value:
/// AddZbTelemetry always wires the Prometheus exporter, and OTLP (ZbExporter.Otlp) is only an
/// additive overlay — so /metrics serves scrape data even when Exporter = ZbExporter.Otlp.
/// Call after app.UseRouting().
public static IEndpointConventionBuilder MapZbMetrics(
this IEndpointRouteBuilder endpoints);
@@ -104,6 +104,7 @@ joinable in any OTel-compatible backend.
| `service.name` | string | Yes | Short lower-case app id: `otopcua`, `mxgateway`, `scadabridge` |
| `service.namespace` | string | Yes | Always `"ZB.MOM.WW"` — do not override |
| `service.version` | string | Recommended | Populate from `AssemblyInformationalVersion`; absent is better than wrong |
| `service.instance.id` | string | Auto | Always `ZbResource.InstanceId` = deterministic `MachineName:ProcessId`. The OTel SDK random-GUID default is disabled so every signal from one process shares one restart-stable instance id (cross-signal correlation); never override |
| `site.id` | string | Recommended | Physical or logical site identifier; omit for single-site deployments |
| `node.role` | string | Recommended | Node function: `"central"`, `"site"`, `"hub"`, `"standalone"` |
| `host.name` | string | Auto | Always `Environment.MachineName`; never override |
+1
View File
@@ -68,6 +68,7 @@ The OTel `Resource` attached to all three signals is built from `ZbTelemetryOpti
| `service.name` | `ServiceName` | Required. Lower-case short identifier (`otopcua`, `mxgateway`, `scadabridge`) |
| `service.namespace` | `ServiceNamespace` | Default `"ZB.MOM.WW"` — constant across the fleet |
| `service.version` | `ServiceVersion` | Optional; recommend populating from `AssemblyInformationalVersion` |
| `service.instance.id` | _(auto)_ | Always populated from `ZbResource.InstanceId` — a deterministic `MachineName:ProcessId`. The OTel SDK's random-GUID default is disabled so every signal from the same process carries an identical, restart-stable instance id for cross-signal correlation |
| `site.id` | `SiteId` | Optional; identifies the physical/logical site |
| `node.role` | `NodeRole` | Optional; e.g. `"central"`, `"site"`, `"hub"` |
| `host.name` | _(auto)_ | Always populated from `Environment.MachineName` |