Baseline code review of the six ZB.MOM.WW.* shared libraries
All six libraries reviewed at commit 5f75cd4 against their components/ specs,
following code-reviews/REVIEW-PROCESS.md. 35 findings (0 Critical, 1 High,
9 Medium, 25 Low); none block adoption.
- Auth 0/0/3/3 (security core sound; startup-validation + key-verify contract gaps)
- Telemetry 0/1/2/5 (HIGH Telemetry-001: redactor 'remove' is a no-op -> secrets reach sinks)
- Health 0/0/2/4 (Akka checks throw instead of Degraded when cluster not yet up)
- Theme 0/0/1/5 (undocumented Bootstrap-collapse JS dep; token/CSS hygiene)
- Audit 0/0/1/4 (composite re-throws OCE vs never-throw writer contract)
- Configuration 0/0/0/4 (DI idempotency, port-parse strictness, packaging)
Cross-cutting: XML docs authored but GenerateDocumentationFile unset -> docs
not shipped in any nupkg (Auth/Health/Telemetry/Configuration/Audit).
README.md regenerated from the per-library findings; regen-readme.py --check passes.
This commit is contained in:
+208
-17
@@ -6,31 +6,222 @@
|
||||
| Packages | `ZB.MOM.WW.Audit` |
|
||||
| Component spec | `components/audit/spec/SPEC.md` |
|
||||
| Shared contract | `components/audit/shared-contract/ZB.MOM.WW.Audit.md` |
|
||||
| Status | Not yet reviewed |
|
||||
| Last reviewed | — |
|
||||
| Reviewer | — |
|
||||
| Commit reviewed | — |
|
||||
| Open findings | 0 |
|
||||
| Status | Reviewed |
|
||||
| Last reviewed | 2026-06-01 |
|
||||
| Reviewer | Claude (automated baseline) |
|
||||
| Commit reviewed | `5f75cd4` |
|
||||
| Open findings | 5 |
|
||||
|
||||
## Summary
|
||||
|
||||
_Not yet reviewed._
|
||||
`ZB.MOM.WW.Audit` is a small, clean, well-scoped library: one canonical `AuditEvent`
|
||||
sealed record, a 3-value `AuditOutcome` enum, two narrow seams (`IAuditWriter`,
|
||||
`IAuditRedactor`), five shipped helpers, and a single `AddZbAudit` registration. The record
|
||||
matches `EVENT-MODEL.md` field-for-field (names, types, required/optional split, and the
|
||||
`OccurredAtUtc` UTC-normalization contract — including value-equality on the normalized
|
||||
instant). The dependency footprint is exactly as specified: the only non-BCL reference is
|
||||
`Microsoft.Extensions.DependencyInjection.Abstractions` (verified against the built
|
||||
`deps.json`), with no Akka/EF/SQLite/Serilog/Telemetry leakage. Guard clauses are present on
|
||||
every public entry point, `RedactingAuditWriter` guarantees redaction runs before the inner
|
||||
writer (no unredacted-PII leak), and `CompositeAuditWriter` isolates per-writer failures so a
|
||||
throwing writer does not lose downstream events. 19 tests cover the happy paths and the most
|
||||
important contract guards. No Critical or High findings. The five findings are all Medium/Low
|
||||
and concern: a cancellation-propagation tension with the "never throw to the caller" writer
|
||||
contract (Audit-001), a partial over-redaction path in `TruncatingAuditRedactor` (Audit-002),
|
||||
mutable options on a singleton redactor (Audit-003), un-emitted XML docs (Audit-004, a
|
||||
family-wide baseline pattern), and a couple of missing edge-case tests (Audit-005).
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|
||||
|---|----------|----------|-------|
|
||||
| 1 | Correctness & logic bugs | ☐ | |
|
||||
| 2 | Public API surface & compatibility | ☐ | |
|
||||
| 3 | Concurrency & thread safety | ☐ | |
|
||||
| 4 | Error handling & resilience | ☐ | |
|
||||
| 5 | Security & secret handling | ☐ | |
|
||||
| 6 | Performance & resource management | ☐ | |
|
||||
| 7 | Spec & shared-contract adherence | ☐ | |
|
||||
| 8 | Packaging, dependencies & project layout | ☐ | |
|
||||
| 9 | Testing coverage | ☐ | |
|
||||
| 10 | Documentation & XML docs | ☐ | |
|
||||
| 1 | Correctness & logic bugs | ☑ | `Truncate` boundary logic correct for sane configs; marker-longer-than-max clips safely. Partial over-redaction edge under negative max → Audit-002. |
|
||||
| 2 | Public API surface & compatibility | ☑ | Minimal, intentional surface; `sealed` records/classes; nullable annotations correct; no internal-type leakage. Options type is a mutable class, not the record the contract describes → Audit-003. |
|
||||
| 3 | Concurrency & thread safety | ☑ | `AuditEvent` immutable (readonly backing field); singletons (`NoOp`/`Null`) stateless; composite snapshots its writers via `ToArray()`. Mutable singleton options noted in Audit-003. |
|
||||
| 4 | Error handling & resilience | ☑ | Guard clauses on all entry points; composite swallows writer faults. OCE re-throw vs writer contract → Audit-001; partial over-redact → Audit-002. |
|
||||
| 5 | Security & secret handling | ☑ | `RedactingAuditWriter` applies the redactor strictly before the inner writer — no unredacted-event leak. `Actor` kept a plain string (no Auth coupling), as specified. No secrets in code/messages. |
|
||||
| 6 | Performance & resource management | ☑ | No `IDisposable` to manage; no hot-path allocations beyond a single `with`-copy per redact; DI registration is trivial. No issues found. |
|
||||
| 7 | Spec & shared-contract adherence | ☑ | `AuditEvent` + `AuditOutcome` match `EVENT-MODEL.md` exactly; `AddZbAudit` uses `TryAdd` as specified. OCE re-throw deviates from the §1 writer hard rule → Audit-001. |
|
||||
| 8 | Packaging, dependencies & project layout | ☑ | Sole non-BCL dep confirmed via `deps.json`; net10.0; correct PackageId/version (0.1.0). XML docs not packed → Audit-004. |
|
||||
| 9 | Testing coverage | ☑ | 19 tests; public contract exercised (round-trip, UTC normalization, fan-out, cancellation, redaction ordering). Missing never-throw/over-redact and null-writer cases → Audit-005. |
|
||||
| 10 | Documentation & XML docs | ☑ | XML docs present and accurate on the whole public surface; README accurate. Docs not emitted to a file → Audit-004. |
|
||||
|
||||
## Findings
|
||||
|
||||
_No findings recorded yet._
|
||||
### Audit-001 — `CompositeAuditWriter` re-throws `OperationCanceledException` to the caller, contradicting the "must not throw" writer contract
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/CompositeAuditWriter.cs:24` |
|
||||
|
||||
**Description**
|
||||
|
||||
`IAuditWriter`'s hard contract is unambiguous: "Best-effort; must not throw to the caller"
|
||||
(`IAuditWriter.cs:11`), and SPEC §1 states "`WriteAsync` MUST NOT throw to the caller … a
|
||||
failed write must never abort the user-facing action it is recording." The shared contract
|
||||
further says a cancellation "does not constitute a contract violation" and the implementation
|
||||
"may choose to complete a partially-written event anyway" — it permits ignoring the token, not
|
||||
throwing on it.
|
||||
|
||||
`CompositeAuditWriter.WriteAsync` re-throws `OperationCanceledException`
|
||||
(`catch (OperationCanceledException) { throw; }`, line 24). The very common call shape is to
|
||||
pass the ambient request token, e.g. `writer.WriteAsync(evt, httpContext.RequestAborted)`. When
|
||||
a client disconnects mid-request, the inner writer observes the cancelled token, throws OCE, and
|
||||
the composite propagates it straight into the user-facing action — exactly the abort the seam
|
||||
promises never to cause. This also directly contradicts the helper's own XML doc, which claims
|
||||
"the fan-out drains and the caller is never aborted" (`CompositeAuditWriter.cs:4`). Because the
|
||||
composite is the recommended way to wire multiple sinks, every consumer that uses it inherits
|
||||
this behaviour; the only workaround is to not pass a request-scoped token, which is non-obvious.
|
||||
|
||||
Note this is a deliberate, tested choice (`CompositeAuditWriterTests.cs:41`), so it is a
|
||||
design/contract conflict rather than an oversight — but the conflict is real: the writer contract
|
||||
says "never throw," the composite says "drains, never aborts," and the code throws on
|
||||
cancellation.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Reconcile the contract and the implementation. Either (a) honour the "never throw" rule
|
||||
literally — swallow OCE like any other writer failure (drop the remaining writers or continue,
|
||||
but do not propagate), and update the XML doc accordingly; or (b) if cancellation propagation is
|
||||
genuinely wanted, amend the `IAuditWriter` SPEC/shared-contract to carve out OCE explicitly and
|
||||
add a doc warning that consumers must not pass a request-scoped token they cannot afford to see
|
||||
surface. Option (a) is the lower-blast-radius choice and matches the seam's stated intent.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Audit-002 — `TruncatingAuditRedactor` over-redaction is partial: the catch path scrubs only `DetailsJson`, leaving `Target` unredacted
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/TruncatingAuditRedactor.cs:27-31`, `:34-40` |
|
||||
|
||||
**Description**
|
||||
|
||||
The redactor's over-redact fallback returns `rawEvent with { DetailsJson = null }`
|
||||
(line 30). That scrubs only `DetailsJson` — `Target`, which this redactor is also responsible
|
||||
for capping, is carried through untouched. The spec's "over-redact" rule is "returns a strictly
|
||||
safer event" (SPEC §2); leaving an oversized/sensitive `Target` in place is not strictly safer.
|
||||
|
||||
This becomes reachable under misconfiguration. `Truncate(value, max)` does `marker[..max]` when
|
||||
`marker.Length >= max` (line 38). For a negative `MaxTargetLength` or `MaxDetailsJsonLength`,
|
||||
that slice throws `ArgumentOutOfRangeException`, the `Apply` catch fires, and the result keeps
|
||||
the **original, untruncated `Target`** while only nulling `DetailsJson`. So a config bug that
|
||||
should fail safe instead leaves the `Target` field unredacted. (`max == 0` is handled correctly —
|
||||
`marker[..0]` is `""`.) The redactor never throws, so the "must not throw" half of the contract
|
||||
holds; it is the "strictly safer" half that is violated.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Make the fallback strictly safer across the fields this redactor owns, e.g.
|
||||
`return rawEvent with { DetailsJson = null, Target = null };` in the catch. Additionally, guard
|
||||
against nonsensical caps at the boundary — clamp `max` to `>= 0` in `Truncate` (treat negative
|
||||
as 0), or validate the options in the constructor — so the catch path is only ever hit by truly
|
||||
unexpected failures rather than a predictable negative-length misconfiguration.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Audit-003 — `TruncatingAuditRedactorOptions` is a mutable class, not the immutable "options record" the contract describes
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Public API surface & compatibility |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/TruncatingAuditRedactorOptions.cs:4-12` |
|
||||
|
||||
**Description**
|
||||
|
||||
The shared contract lists `TruncatingAuditRedactorOptions` as an "Options record"
|
||||
(`shared-contract/ZB.MOM.WW.Audit.md:93`), but it is implemented as a `sealed class` with three
|
||||
mutable `get; set;` auto-properties. `TruncatingAuditRedactor` captures the instance by reference
|
||||
(`_options = options ?? new(...)`, `TruncatingAuditRedactor.cs:14`) and the redactor is typically
|
||||
registered as a singleton. Because the options object stays mutable, a caller (or any other
|
||||
holder of the same reference) can change `MaxDetailsJsonLength` / `MaxTargetLength` /
|
||||
`TruncationMarker` after the redactor is built, changing redaction behaviour at runtime under a
|
||||
concurrent host with no synchronization. This is a (minor) thread-safety footgun and a drift from
|
||||
both the documented "record" shape and the library's otherwise consistent immutability posture
|
||||
(`AuditEvent` is a record with init-only members).
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Make the options immutable: convert to a `sealed record` with `init`-only properties (matching
|
||||
the contract wording), or have the redactor defensively snapshot the three values into private
|
||||
readonly fields at construction so post-construction mutation cannot affect it.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Audit-004 — XML documentation is authored but not emitted, so IntelliSense docs do not ship to consumers
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & XML docs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Audit/src/ZB.MOM.WW.Audit/ZB.MOM.WW.Audit.csproj:1-18`, `ZB.MOM.WW.Audit/Directory.Build.props:1-10` |
|
||||
|
||||
**Description**
|
||||
|
||||
Every public type and member carries accurate XML doc comments, but neither the project nor
|
||||
`Directory.Build.props` sets `GenerateDocumentationFile`. As a result no `ZB.MOM.WW.Audit.xml`
|
||||
is produced or packed alongside the DLL (confirmed: no XML doc file in `bin/`), so consuming
|
||||
apps get no IntelliSense/tooltip documentation from the package. For a shared library where, per
|
||||
the review process, "public docs matter," the authored docs are effectively invisible to
|
||||
consumers. (This is a family-wide baseline gap — none of the sibling `ZB.MOM.WW.*` libraries set
|
||||
the flag either — so it is a baseline issue rather than an Audit-specific regression, hence Low.)
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Set `<GenerateDocumentationFile>true</GenerateDocumentationFile>` (ideally once in the shared
|
||||
`Directory.Build.props` so it applies across the family). Because the public surface is fully
|
||||
documented, no CS1591 "missing XML comment" warnings should appear.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Audit-005 — Missing edge-case tests for the redactor never-throw/over-redact contract and composite null/empty handling
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Testing coverage |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Audit/tests/ZB.MOM.WW.Audit.Tests/TruncatingAuditRedactorTests.cs`, `.../CompositeAuditWriterTests.cs` |
|
||||
|
||||
**Description**
|
||||
|
||||
The 19 tests cover the happy paths and the key guards (UTC normalization + equality, fan-out,
|
||||
cancellation propagation, redaction ordering, marker-longer-than-max). Two contract-critical
|
||||
behaviours are untested:
|
||||
|
||||
1. The `IAuditRedactor` "never throws / over-redacts on failure" contract — no test drives
|
||||
`TruncatingAuditRedactor.Apply` into its `catch` branch, so the over-redaction fallback
|
||||
(and the asymmetry flagged in Audit-002) is unverified.
|
||||
2. `CompositeAuditWriter` boundary inputs — no test covers an empty writer list (should be a
|
||||
no-op) or a list containing a `null` writer (currently swallowed via the bare `catch`,
|
||||
silently dropping that sink — behaviour that should be pinned by a test).
|
||||
|
||||
Pinning these would have surfaced Audit-002 and would lock the documented never-throw guarantee.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add: (a) a redactor test that forces an internal failure (e.g. a misconfigured cap) and asserts
|
||||
`Apply` returns a strictly-safer event without throwing; (b) a composite test for the empty-list
|
||||
no-op; and (c) a composite test documenting the null-entry behaviour (whether swallowed or
|
||||
guarded — pair with the chosen fix for the null case).
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
+235
-17
@@ -6,31 +6,249 @@
|
||||
| Packages | `ZB.MOM.WW.Auth.Abstractions`, `ZB.MOM.WW.Auth.Ldap`, `ZB.MOM.WW.Auth.ApiKeys`, `ZB.MOM.WW.Auth.AspNetCore` |
|
||||
| Component spec | `components/auth/spec/SPEC.md` |
|
||||
| Shared contract | `components/auth/shared-contract/ZB.MOM.WW.Auth.md` |
|
||||
| Status | Not yet reviewed |
|
||||
| Last reviewed | — |
|
||||
| Reviewer | — |
|
||||
| Commit reviewed | — |
|
||||
| Open findings | 0 |
|
||||
| Status | Reviewed |
|
||||
| Last reviewed | 2026-06-01 |
|
||||
| Reviewer | Claude (automated baseline) |
|
||||
| Commit reviewed | `5f75cd4` |
|
||||
| Open findings | 6 |
|
||||
|
||||
## Summary
|
||||
|
||||
_Not yet reviewed._
|
||||
The security-critical core of this library is sound: API-key secrets are 32-byte CSPRNG values,
|
||||
hashing is HMAC-SHA256 keyed by an external pepper, verification uses
|
||||
`CryptographicOperations.FixedTimeEquals`, the SQLite stores use parameterized SQL throughout,
|
||||
LDAP filter values are RFC 4515-escaped before being interpolated into the search filter, group-DN
|
||||
parsing is RFC 4514 escape-aware, and the LDAP service is genuinely fail-closed (never throws,
|
||||
requires exactly one match plus a verified password plus >=1 group for success). Test coverage is
|
||||
strong and contract-focused (~172 tests; every failure mode, the parser, the hasher, the escaper,
|
||||
the stores, the migrator and the DI wiring are exercised). The findings below are about
|
||||
robustness-around-the-edges and documentation, not the cryptographic core. The two themes are
|
||||
(1) two narrow paths that can turn a recoverable condition into a thrown exception on the
|
||||
authentication hot path, contradicting the library's own "only exception path is cancellation" /
|
||||
"fail fast at startup" promises, and (2) public-facing documentation (README) that misstates the
|
||||
hashing algorithm and the AspNetCore surface. The library is **not yet adopted** by the three apps
|
||||
(tracked in `components/auth/GAPS.md`).
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|
||||
|---|----------|----------|-------|
|
||||
| 1 | Correctness & logic bugs | ☐ | |
|
||||
| 2 | Public API surface & compatibility | ☐ | |
|
||||
| 3 | Concurrency & thread safety | ☐ | |
|
||||
| 4 | Error handling & resilience | ☐ | |
|
||||
| 5 | Security & secret handling | ☐ | |
|
||||
| 6 | Performance & resource management | ☐ | |
|
||||
| 7 | Spec & shared-contract adherence | ☐ | |
|
||||
| 8 | Packaging, dependencies & project layout | ☐ | |
|
||||
| 9 | Testing coverage | ☐ | |
|
||||
| 10 | Documentation & XML docs | ☐ | |
|
||||
| 1 | Correctness & logic bugs | ☑ | Core logic correct; see Auth-003 (JSON parse can throw on hot path), Auth-005 (KeyPrefix value inconsistency) |
|
||||
| 2 | Public API surface & compatibility | ☑ | Surface is minimal, sealed records, correct nullable annotations, internal seams hidden via `InternalsVisibleTo`. No issues found |
|
||||
| 3 | Concurrency & thread safety | ☑ | Options/records immutable; services stateless singletons; fresh SQLite connection per op with WAL + busy_timeout. No issues found |
|
||||
| 4 | Error handling & resilience | ☑ | LDAP path fully fail-closed; but see Auth-001 (no `ValidateOnStart`), Auth-002 (`MarkUsedAsync` failure fails a valid auth), Auth-003 |
|
||||
| 5 | Security & secret handling | ☑ | Peppered HMAC-SHA256 + FixedTimeEquals, CSPRNG secrets, parameterized SQL, LDAP escaping, hash-free list projection, secrets never logged. See Auth-006 (TLS cert-validation hook) |
|
||||
| 6 | Performance & resource management | ☑ | Connections disposed via `await using`; `LdapConnection` disposed; no hot-path allocs of note. No issues found |
|
||||
| 7 | Spec & shared-contract adherence | ☑ | Matches SPEC §1–§5 closely (HMAC-SHA256 pepper, constant-time compare, discriminated failures, cookie hardening). See Auth-001 (insecure-refusal deferred to first use) |
|
||||
| 8 | Packaging, dependencies & project layout | ☑ | Clean 4-package split, central versions, minimal deps, Abstractions has no runtime deps. No issues found |
|
||||
| 9 | Testing coverage | ☑ | Broad and contract-level; gaps: no test for corrupt-`scopes` deserialization (Auth-003) or `MarkUsedAsync`-throws (Auth-002); `ValidateOnStart` behavior untested (Auth-001) |
|
||||
| 10 | Documentation & XML docs | ☑ | XML docs thorough and accurate; README is stale — see Auth-004 |
|
||||
|
||||
## Findings
|
||||
|
||||
_No findings recorded yet._
|
||||
### Auth-001 — LDAP options validator is registered but never runs at startup
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.AspNetCore/ServiceCollectionExtensions.cs:40` |
|
||||
|
||||
**Description**
|
||||
|
||||
`AddZbLdapAuth` binds `LdapOptions` with `services.Configure<LdapOptions>(...)` and registers the
|
||||
validator with a plain `services.AddSingleton<IValidateOptions<LdapOptions>, LdapOptionsValidator>()`.
|
||||
It never calls `OptionsBuilder.ValidateOnStart()`. With `Microsoft.Extensions.Options`, an
|
||||
`IValidateOptions<T>` runs only when the options instance is first materialized (`IOptions<T>.Value`),
|
||||
not at host startup. Here the only consumer that materializes the options is the `ILdapAuthService`
|
||||
singleton factory (line 48), whose lambda runs on first resolution of `ILdapAuthService` — i.e. on
|
||||
the first login attempt, not at boot.
|
||||
|
||||
The result contradicts the explicit promise in both the method comment ("Fail fast at startup on a
|
||||
misconfigured directory rather than on first login") and the validator's own XML doc ("Validates
|
||||
`LdapOptions` at startup so a misconfiguration fails fast at boot"). It also weakens SPEC §2 step 1,
|
||||
which says an insecure transport without `AllowInsecure` must be *refused* — that refusal is silently
|
||||
deferred until a user tries to log in. A process can therefore boot green with `Transport=None` and
|
||||
no `AllowInsecure`, and only surface the misconfiguration as a failed login later. Because this is a
|
||||
shared library consumed by three hosts, the missing guard affects all of them.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Register options via the builder and add `ValidateOnStart`:
|
||||
|
||||
```csharp
|
||||
services.AddOptions<LdapOptions>()
|
||||
.Bind(config.GetSection(sectionPath))
|
||||
.ValidateOnStart();
|
||||
services.AddSingleton<IValidateOptions<LdapOptions>, LdapOptionsValidator>();
|
||||
```
|
||||
|
||||
(`ValidateOnStart` relies on the host's start-time options validation, which the ASP.NET Core hosts
|
||||
already run.) Add a test asserting that building + starting the host fails for an insecure config.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Auth-002 — A failed `MarkUsedAsync` write turns a valid API key into a thrown exception
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/ApiKeyVerifier.cs:66` |
|
||||
|
||||
**Description**
|
||||
|
||||
After a successful constant-time secret match, `VerifyAsync` does
|
||||
`await store.MarkUsedAsync(record.KeyId, _timeProvider.GetUtcNow(), ct)` before returning the
|
||||
success result. `MarkUsedAsync` is a best-effort "last used" bookkeeping write. If that write fails
|
||||
(SQLite still `SQLITE_BUSY` after the 5 s busy-timeout, disk full, DB locked by a long migration,
|
||||
etc.), the exception propagates out of `VerifyAsync` and the otherwise-valid credential fails
|
||||
authentication. The verifier's own class remark states "The only exception path is cancellation",
|
||||
so callers (gRPC interceptors / middleware) will not be wrapping this in a try/catch and a transient
|
||||
storage hiccup becomes an auth outage for a legitimate caller.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Treat the usage write as best-effort: either swap the order (return success, then fire-and-record),
|
||||
or wrap the `MarkUsedAsync` call so a failure is logged/swallowed rather than failing the
|
||||
verification — the authentication decision has already been made by line 60. If `MarkUsedAsync` is
|
||||
allowed to throw, update the XML remark to say so. Add a test where the store's `MarkUsedAsync`
|
||||
throws and assert the result is still `Succeeded == true`.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Auth-003 — Corrupt `scopes`/`constraints` column throws `JsonException` through the verifier
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/Sqlite/ScopeSerializer.cs:31`, `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/Sqlite/SqliteApiKeyStore.cs:71` |
|
||||
|
||||
**Description**
|
||||
|
||||
`SqliteApiKeyStore.ReadRecord` calls `ScopeSerializer.Deserialize(reader.GetString(4))`, which does
|
||||
`JsonSerializer.Deserialize<string[]>(value)`. If the `scopes` column ever holds non-array or
|
||||
malformed JSON (operator tampering, a partial write, a future schema/format change, or a bug in
|
||||
another writer), `Deserialize` throws `JsonException`. That exception propagates out of
|
||||
`FindByKeyIdAsync`, and therefore out of `ApiKeyVerifier.VerifyAsync`, as an unhandled exception —
|
||||
again contradicting the "only exception path is cancellation" contract and converting a single
|
||||
poisoned row into a crash on the auth path rather than a structured fail-closed result. The same
|
||||
applies to the admin `ListAsync` projection.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Decide the policy explicitly. Either (a) catch `JsonException` in `ScopeSerializer.Deserialize` and
|
||||
treat a malformed column as an empty/invalid scope set so verification fails closed structurally, or
|
||||
(b) catch it at the store boundary and surface it as a controlled error. Add a test that inserts a
|
||||
row with a malformed `scopes` value and asserts the verifier returns a structured failure (or a
|
||||
documented exception) rather than an unhandled `JsonException`.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Auth-004 — README misstates the hashing algorithm and the AspNetCore public surface
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & XML docs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Auth/README.md:13`, `ZB.MOM.WW.Auth/README.md:14` |
|
||||
|
||||
**Description**
|
||||
|
||||
The README's package table contains two inaccuracies that matter for a security library:
|
||||
|
||||
1. It describes `ZB.MOM.WW.Auth.ApiKeys` as a store "with pepper-based **PBKDF2** hashing". The code
|
||||
uses **HMAC-SHA256** keyed by the pepper (`ApiKeySecretHasher.Hash` → `new HMACSHA256(pepperBytes)`),
|
||||
not PBKDF2. This is the correct algorithm per SPEC §4 ("HMAC-SHA256 with an external pepper"), so
|
||||
the *code* is right and the *README* is wrong — but a reader auditing the crypto will be misled
|
||||
about iteration count / KDF properties that do not exist here.
|
||||
2. It says `ZB.MOM.WW.Auth.AspNetCore` provides "ASP.NET Core DI helpers (`AddZbAuth`) ... and
|
||||
`LdapOptionsValidator` registration. **Wires together Ldap + ApiKeys + cookie middleware**." The
|
||||
actual public surface is a single extension method `AddZbLdapAuth` (LDAP only). There is no
|
||||
`AddZbAuth`, no API-key re-export from this package, and no cookie-middleware wiring —
|
||||
`ZbCookieDefaults.Apply` is a helper the consumer must call itself, and API-key DI lives in the
|
||||
separate `ZB.MOM.WW.Auth.ApiKeys` package (`AddZbApiKeyAuth`). The shared contract
|
||||
(`ZB.MOM.WW.Auth.md`) likewise lists a combined DI shape that the code splits differently.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Fix the README table: change "PBKDF2" to "HMAC-SHA256", and correct the AspNetCore row to describe
|
||||
the real surface (`AddZbLdapAuth`, `ZbCookieDefaults.Apply`, `ZbClaimTypes`) and note that API-key
|
||||
wiring is `AddZbApiKeyAuth` in the ApiKeys package. Optionally reconcile the shared-contract doc's
|
||||
DI-method names with the shipped names.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Auth-005 — `CreateKeyAsync` persists `KeyPrefix` as `prefix_keyId`, inconsistent with the read path
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.ApiKeys/Admin/ApiKeyAdminCommands.cs:104` |
|
||||
|
||||
**Description**
|
||||
|
||||
When creating a key, `KeyPrefix` is set to `$"{_options.TokenPrefix}_{keyId}"` (e.g. `mxgw_key-1`),
|
||||
embedding the key id into the column. Elsewhere `KeyPrefix` is treated as the bare token prefix:
|
||||
the verifier test (`ApiKeyVerifierTests.BuildRecord`) and the store test
|
||||
(`SqliteApiKeyStoreTests.SampleRecord`) seed it as `"mxgw"` / `"mxgw_ab12"`, and the `ApiKeyRecord`
|
||||
contract gives no indication it should include the key id. `KeyPrefix` is not consulted during
|
||||
verification (the token is re-parsed from the header), so there is no security impact — but the
|
||||
persisted/displayed value is inconsistent and self-referential (the key id is already its own
|
||||
column), which will confuse admin tooling that surfaces `KeyPrefix`.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Persist `KeyPrefix = _options.TokenPrefix` (the bare prefix) and let the key id remain the `KeyId`
|
||||
column, or document precisely what `KeyPrefix` is meant to contain on the `ApiKeyRecord` /
|
||||
`ApiKeyListItem` records and align all call sites to it.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Auth-006 — LDAP connection ignores `allowInsecure` and offers no TLS certificate-validation hook
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Security & secret handling |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Auth/src/ZB.MOM.WW.Auth.Ldap/Internal/NovellLdapConnection.cs:16` |
|
||||
|
||||
**Description**
|
||||
|
||||
`NovellLdapConnection.Connect` receives `bool allowInsecure` but never uses it; transport-security
|
||||
correctness rests entirely on `LdapOptionsValidator` having run beforehand (and that registration is
|
||||
itself lazy — see Auth-001). The method also sets `SecureSocketLayer`/`StartTls` but never configures
|
||||
a server-certificate validation delegate or any pinning/CA hook. The underlying
|
||||
`Novell.Directory.Ldap.NETStandard` 3.6.0 validates against the OS trust store by default (it does
|
||||
*not* blind-accept), so this is not an unauthenticated-TLS hole today — hence Low — but the library
|
||||
gives a consumer no seam to tighten (pin a CA, validate the SAN against `Server`, or deliberately
|
||||
accept a self-signed dev cert), and the unused parameter reads as a guard that does nothing.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Either remove the unused `allowInsecure` parameter from the `ILdapConnection.Connect` seam or use it
|
||||
to gate an explicit "accept invalid cert" path for dev only. Add an optional
|
||||
`RemoteCertificateValidationCallback` (or CA-pinning option) so production consumers can harden LDAPS
|
||||
beyond the OS default, and document the default validation behavior.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
@@ -6,31 +6,181 @@
|
||||
| Packages | `ZB.MOM.WW.Configuration` |
|
||||
| Component spec | `components/configuration/spec/SPEC.md` |
|
||||
| Shared contract | `components/configuration/shared-contract/ZB.MOM.WW.Configuration.md` |
|
||||
| Status | Not yet reviewed |
|
||||
| Last reviewed | — |
|
||||
| Reviewer | — |
|
||||
| Commit reviewed | — |
|
||||
| Open findings | 0 |
|
||||
| Status | Reviewed |
|
||||
| Last reviewed | 2026-06-01 |
|
||||
| Reviewer | Claude (automated baseline) |
|
||||
| Commit reviewed | `5f75cd4` |
|
||||
| Open findings | 4 |
|
||||
|
||||
## Summary
|
||||
|
||||
_Not yet reviewed._
|
||||
The library is small (five source files, ~230 LOC) and in good health. Its core promise —
|
||||
**accumulate every failure, never short-circuit** — holds across all three entry points
|
||||
(`ValidationBuilder`, `OptionsValidatorBase`, `ConfigPreflight`): the override never returns
|
||||
early, each primitive appends to a `List<string>` and returns `this`, and the `Success`/`Fail`
|
||||
decision is taken only after every rule has run. The boundary cases that matter are correct:
|
||||
`Port` rejects 0 and 65536 and accepts 1/65535; `HostPort` correctly rejects bare hosts, empty
|
||||
ports, out-of-range ports, and (deliberately) unbracketed IPv6; `Required` treats null / empty /
|
||||
whitespace alike; `OneOf` is case-insensitive and treats null as a failure (documented);
|
||||
`MinCount` treats null as zero. Null-guarding is consistent (`ArgumentNullException`/
|
||||
`ArgumentException.ThrowIfNullOrWhiteSpace` on every public entry point). The
|
||||
`ConfigPreflight.ThrowIfInvalid()` envelope is **byte-compatible** with the ScadaBridge
|
||||
`StartupValidator` format the SPEC pins (`"Configuration validation failed:\n - <field> <reason>"`,
|
||||
`\n`-joined), and the validator created by `OptionsValidatorBase` is stateless (fresh
|
||||
`ValidationBuilder` per call), so the singleton-registration requirement in SPEC §3 is honoured.
|
||||
Dependency footprint is minimal and exactly as documented (four `Microsoft.Extensions.*`
|
||||
abstractions, no ASP.NET Core). The four findings are all **Low**: a non-idempotent DI
|
||||
registration (`AddSingleton` vs `TryAddEnumerable`), a small wording inconsistency in the raw-port
|
||||
check, over-permissive integer port parsing, and the XML-doc-not-shipped packaging gap (which is
|
||||
fleet-wide, not specific to this library). Test coverage (27 tests across the four public surfaces)
|
||||
is solid for the happy and primary-failure paths; a few edge cases noted below are untested.
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|
||||
|---|----------|----------|-------|
|
||||
| 1 | Correctness & logic bugs | ☐ | |
|
||||
| 2 | Public API surface & compatibility | ☐ | |
|
||||
| 3 | Concurrency & thread safety | ☐ | |
|
||||
| 4 | Error handling & resilience | ☐ | |
|
||||
| 5 | Security & secret handling | ☐ | |
|
||||
| 6 | Performance & resource management | ☐ | |
|
||||
| 7 | Spec & shared-contract adherence | ☐ | |
|
||||
| 8 | Packaging, dependencies & project layout | ☐ | |
|
||||
| 9 | Testing coverage | ☐ | |
|
||||
| 10 | Documentation & XML docs | ☐ | |
|
||||
| 1 | Correctness & logic bugs | ☑ | Accumulation never short-circuits; port/host:port/duration/min-count boundaries correct. Port parsing over-permissive (003). |
|
||||
| 2 | Public API surface & compatibility | ☑ | Surface matches the shared contract exactly; nullability annotations correct; `sealed`/`abstract` deliberate; internal `Checks` does not leak. No issues. |
|
||||
| 3 | Concurrency & thread safety | ☑ | Validator is stateless (fresh builder per `Validate`); `ValidationBuilder`/`ConfigPreflight` are per-call locals. Safe as a singleton. No issues. |
|
||||
| 4 | Error handling & resilience | ☑ | Guard clauses on every public entry; correct exception types (`OptionsValidationException` via `ValidateOnStart`, `InvalidOperationException` from `ConfigPreflight`); no swallowed exceptions. No issues. |
|
||||
| 5 | Security & secret handling | ☑ | No secrets handled; failure messages echo config *values* (port/host/role), which are non-sensitive by design. No issues. |
|
||||
| 6 | Performance & resource management | ☑ | Startup-only, no hot path, nothing disposable, no async. No issues. |
|
||||
| 7 | Spec & shared-contract adherence | ☑ | `ConfigPreflight` envelope byte-compatible with `StartupValidator`; primitives match §2 table. Minor wording inconsistency in `PortValue` (002). |
|
||||
| 8 | Packaging, dependencies & project layout | ☑ | Single package, minimal closure, central versions, `.gitignore` present, no tracked build output. XML docs / README not packaged (004). |
|
||||
| 9 | Testing coverage | ☑ | 27 tests cover the four public surfaces + accumulation + byte-format. Untested: `AddValidatedOptions` null guards, double-registration, exact wording strings. |
|
||||
| 10 | Documentation & XML docs | ☑ | Public XML docs present and accurate on every type/member. Not emitted into the nupkg (004). |
|
||||
|
||||
## Findings
|
||||
|
||||
_No findings recorded yet._
|
||||
### Configuration-001 — `AddValidatedOptions` uses `AddSingleton`, so a double call registers (and runs) the validator twice
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/ServiceCollectionExtensions.cs:36` |
|
||||
|
||||
**Description**
|
||||
|
||||
`AddValidatedOptions` registers the validator with
|
||||
`services.AddSingleton<IValidateOptions<TOptions>, TValidator>()`. `IValidateOptions<TOptions>`
|
||||
is a multi-registration (enumerable) service — the options pipeline resolves *all* registered
|
||||
`IValidateOptions<TOptions>` and runs each. Because `AddSingleton` appends unconditionally, calling
|
||||
`AddValidatedOptions<TOptions, TValidator>` twice for the same `TOptions`/`TValidator` pair (e.g.
|
||||
two modules both guarding the same section, or an accidental duplicate during refactoring)
|
||||
registers the validator twice, so it executes twice and every accumulated failure is reported
|
||||
twice in the resulting `OptionsValidationException`. The framework-idiomatic registration for
|
||||
`IValidateOptions<T>` is `TryAddEnumerable(ServiceDescriptor.Singleton<IValidateOptions<TOptions>,
|
||||
TValidator>())`, which is idempotent. Impact is limited (consumers normally call once per section,
|
||||
and duplicate messages are cosmetic, not a correctness break of the accumulate-all contract), hence
|
||||
Low — but the deviation from the .NET idiom is real and easy to fix. The SPEC §3 wording ("registers
|
||||
the validator as a singleton") is satisfied either way.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Replace `services.AddSingleton<IValidateOptions<TOptions>, TValidator>()` with
|
||||
`services.TryAddEnumerable(ServiceDescriptor.Singleton<IValidateOptions<TOptions>, TValidator>())`
|
||||
(via `Microsoft.Extensions.DependencyInjection.Extensions`). Add a test asserting that calling
|
||||
`AddValidatedOptions` twice yields a single set of failure messages.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Configuration-002 — `Checks.PortValue` quotes the raw value on a parse failure but not on a range failure
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Spec & shared-contract adherence |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/Checks.cs:21` |
|
||||
|
||||
**Description**
|
||||
|
||||
`PortValue` produces two different message shapes for the same field depending on *why* the port
|
||||
is invalid. When the raw string parses but is out of range (e.g. `"0"` or `"70000"`), it delegates
|
||||
to `Checks.Port(int, field)`, which renders `"<field> must be between 1 and 65535 (was 0)"` —
|
||||
**unquoted**. When the raw string does not parse (e.g. `"notaport"`, `null`), it renders
|
||||
`"<field> must be between 1 and 65535 (was 'notaport')"` — **quoted**. So two failures of the same
|
||||
rule, from the same raw-config caller, read with inconsistent quoting. The shared contract
|
||||
(`shared-contract/ZB.MOM.WW.Configuration.md`, "Internal `Checks` seam") states `Checks` is "the
|
||||
single source of failure wording" so "a port failure reads the same whether it came from a bound
|
||||
options object or a raw config key" — this inconsistency is in mild tension with that claim. It is
|
||||
cosmetic (no consumer parses the text) and the overlapping integer case *does* match the bound-options
|
||||
wording, so it is Low.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Make the two branches use a consistent quoting convention for the offending value — e.g. have the
|
||||
range-failure branch also quote (`(was '0')`), or have the parse-failure branch route through a
|
||||
shared formatter. Lock the exact strings down with a wording assertion test.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Configuration-003 — Port parsing accepts leading sign and surrounding whitespace and is culture-sensitive
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/Checks.cs:22`, `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/Checks.cs:36` |
|
||||
|
||||
**Description**
|
||||
|
||||
Both `PortValue` (`int.TryParse(raw, out var port)`) and `HostPort`
|
||||
(`int.TryParse(value[(idx + 1)..], out var port)`) use the default `int.TryParse` overload, which
|
||||
applies `NumberStyles.Integer` (`AllowLeadingWhite | AllowTrailingWhite | AllowLeadingSign`) and the
|
||||
current culture. As a result strings the documentation describes as "integer TCP port" are accepted
|
||||
more loosely than intended: `"+5000"`, `" 5000 "`, and `"host: 5000"` (space after the colon) all
|
||||
parse and pass, and parsing is locale-dependent. A leading `-` parses to a negative number that the
|
||||
subsequent range check rejects (so `"-1"` is correctly rejected), but the whitespace/leading-`+`
|
||||
cases silently pass. This is a robustness nuance rather than a security or correctness break — a port
|
||||
that survives the loose parse is still in `1..65535` — so it is Low.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Parse with an explicit, culture-invariant, strict style, e.g.
|
||||
`int.TryParse(raw, NumberStyles.None, CultureInfo.InvariantCulture, out var port)` (rejects sign and
|
||||
whitespace), in both `PortValue` and `HostPort`. Add `Theory` cases for `"+5000"`, `" 5000 "`, and a
|
||||
space-after-colon endpoint to pin the behaviour.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Configuration-004 — XML documentation and README are not packaged into the nupkg
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & XML docs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Configuration/src/ZB.MOM.WW.Configuration/ZB.MOM.WW.Configuration.csproj:1`, `ZB.MOM.WW.Configuration/Directory.Build.props:1` |
|
||||
|
||||
**Description**
|
||||
|
||||
Every public type and member carries accurate XML doc comments, but neither the project nor
|
||||
`Directory.Build.props` sets `<GenerateDocumentationFile>true</GenerateDocumentationFile>`, so no
|
||||
`.xml` doc file is produced or included in the `dotnet pack` output. Consumers of the package
|
||||
therefore get **no IntelliSense documentation** for `OptionsValidatorBase`, `ValidationBuilder`,
|
||||
`AddValidatedOptions`, or `ConfigPreflight`, despite the docs existing in source. The project also
|
||||
does not set `PackageReadmeFile`, so the README is not embedded in the package for display on the
|
||||
NuGet feed. Category 10 of the review process explicitly notes "these are libraries — public docs
|
||||
matter," so the gap is worth recording. It is fleet-wide (the sibling `ZB.MOM.WW.Audit` library has
|
||||
the same omission), not a Configuration-specific regression, hence Low.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add `<GenerateDocumentationFile>true</GenerateDocumentationFile>` (ideally in
|
||||
`Directory.Build.props` so the whole family inherits it) and consider `<PackageReadmeFile>README.md</PackageReadmeFile>`
|
||||
plus packing the README. Treat as a shared-infra change across the six libraries rather than a
|
||||
one-off.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
+236
-17
@@ -6,31 +6,250 @@
|
||||
| Packages | `ZB.MOM.WW.Health`, `ZB.MOM.WW.Health.Akka`, `ZB.MOM.WW.Health.EntityFrameworkCore` |
|
||||
| Component spec | `components/health/spec/SPEC.md` |
|
||||
| Shared contract | `components/health/shared-contract/ZB.MOM.WW.Health.md` |
|
||||
| Status | Not yet reviewed |
|
||||
| Last reviewed | — |
|
||||
| Reviewer | — |
|
||||
| Commit reviewed | — |
|
||||
| Open findings | 0 |
|
||||
| Status | Reviewed |
|
||||
| Last reviewed | 2026-06-01 |
|
||||
| Reviewer | Claude (automated baseline) |
|
||||
| Commit reviewed | `5f75cd4` |
|
||||
| Open findings | 6 |
|
||||
|
||||
## Summary
|
||||
|
||||
_Not yet reviewed._
|
||||
The library is small, cohesive, and well-documented, and it tracks the normalized spec closely:
|
||||
the three-tier endpoint convention, the canonical JSON writer, the `IActiveNodeGate` seam, the
|
||||
configurable `AkkaClusterStatusPolicy` presets, and the role-filtered `ActiveNodeHealthCheck` all
|
||||
match the contract. The package split is clean — Akka and EF Core stay out of the core package, so
|
||||
MxGateway can take core only (category 8: no leakage). The decision logic is factored into pure,
|
||||
exhaustively table-tested functions (`AkkaClusterStatusPolicy`, `ActiveNodeDecision`), and the test
|
||||
suite (58 tests) exercises the public surface through a real TestServer / real SQLite / real probe
|
||||
delegates rather than mocks.
|
||||
|
||||
The findings cluster around **error-handling completeness in the health-check probes** (category 4),
|
||||
the highest-risk area for this library: a probe that throws past the `IHealthCheck` boundary, or that
|
||||
returns a status the spec says it should not, degrades the value of every consuming app's `/health/*`
|
||||
endpoints. The two Akka checks and the gRPC check do not guard the cluster-state / probe call the way
|
||||
the spec requires (return Degraded on inaccessible cluster) or the way the sibling `DatabaseHealthCheck`
|
||||
does (catch-all → Unhealthy). The framework's `HealthCheckService` catches escaping exceptions, so none
|
||||
of these crash an endpoint — that caps their severity at Medium — but they produce a result that
|
||||
disagrees with the documented contract. The remainder are Low: a JSON shape nuance (`description` key
|
||||
omitted vs. emitted-null), recommended-tag drift in XML docs, a return-value footgun in `MapZbHealth`,
|
||||
and missing `GenerateDocumentationFile` so the (otherwise excellent) XML docs do not ship in the nupkgs.
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|
||||
|---|----------|----------|-------|
|
||||
| 1 | Correctness & logic bugs | ☐ | |
|
||||
| 2 | Public API surface & compatibility | ☐ | |
|
||||
| 3 | Concurrency & thread safety | ☐ | |
|
||||
| 4 | Error handling & resilience | ☐ | |
|
||||
| 5 | Security & secret handling | ☐ | |
|
||||
| 6 | Performance & resource management | ☐ | |
|
||||
| 7 | Spec & shared-contract adherence | ☐ | |
|
||||
| 8 | Packaging, dependencies & project layout | ☐ | |
|
||||
| 9 | Testing coverage | ☐ | |
|
||||
| 10 | Documentation & XML docs | ☐ | |
|
||||
| 1 | Correctness & logic bugs | ☑ | Tier predicates, policy presets, and `ActiveNodeDecision` matrices are correct and match the spec tables. No logic defects found. |
|
||||
| 2 | Public API surface & compatibility | ☑ | Health-005. Surface is minimal, sealed, well-named; nullable annotations correct; no internal types leak (`ActiveNodeDecision` is `internal`). |
|
||||
| 3 | Concurrency & thread safety | ☑ | Options are mutable POCOs but per-registration; checks hold no shared mutable state; cluster-state reads are snapshot reads; static writer options are immutable. No issues found. |
|
||||
| 4 | Error handling & resilience | ☑ | Health-001, Health-002. Akka checks don't guard cluster-state access; gRPC check lets non-Rpc/non-OCE exceptions escape. |
|
||||
| 5 | Security & secret handling | ☑ | No secrets/PII. Exception objects attached to `HealthCheckResult` (standard); endpoints `AllowAnonymous` by design. No issues found. |
|
||||
| 6 | Performance & resource management | ☑ | `DatabaseHealthCheck` is pool-safe and eagerly releases the timeout timer; scopes/contexts disposed via `await using`; channel not owned (correctly not disposed). No issues found. |
|
||||
| 7 | Spec & shared-contract adherence | ☑ | Health-001, Health-003, Health-004. Endpoint convention + status strings otherwise match §3. |
|
||||
| 8 | Packaging, dependencies & project layout | ☑ | Health-006. Dependency split correct; central versions; net10.0; correct package ids. |
|
||||
| 9 | Testing coverage | ☑ | Strong: TestServer tier tests, real SQLite EF tests, table-driven policy/decision tests, gRPC probe + timeout + cancellation. Gaps noted inline in findings, not raised separately. |
|
||||
| 10 | Documentation & XML docs | ☑ | Health-004, Health-006. Public API XML docs are thorough and accurate; the issues are recommended-tag drift and docs not being packed. |
|
||||
|
||||
## Findings
|
||||
|
||||
_No findings recorded yet._
|
||||
### Health-001 — Akka health checks throw (instead of returning Degraded) when cluster state is inaccessible
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health.Akka/AkkaClusterHealthCheck.cs:45`, `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health.Akka/ActiveNodeHealthCheck.cs:106` |
|
||||
|
||||
**Description**
|
||||
|
||||
Both Akka checks guard only the *null* ActorSystem case: `_serviceProvider.GetService<ActorSystem>()`
|
||||
returns `null` → Degraded. But once a non-null `ActorSystem` is resolved they call `Cluster.Get(system)`
|
||||
and read `.SelfMember` / `.State.Leader` with no try/catch. The spec is explicit that this path must be
|
||||
Degraded, not an exception:
|
||||
|
||||
> "in both modes, if the ActorSystem is not yet ready **or cluster state is inaccessible** (e.g.
|
||||
> during startup), the check returns Degraded (startup-safety rule)." — `SPEC.md` §2.2 note.
|
||||
|
||||
`Cluster.Get(system)` throws (`ConfigurationException`) when the `Akka.Cluster` extension is not
|
||||
configured on the resolved ActorSystem, and `SelfMember` can throw during the window where the
|
||||
ActorSystem object exists but the cluster has not finished initialising — exactly the startup race the
|
||||
spec calls out. The ActorSystem being present-but-not-yet-clustered is a very real ordering in a
|
||||
`Microsoft.Extensions.Hosting` app (the ActorSystem is registered in DI before `Cluster` has joined).
|
||||
|
||||
The escaping exception is caught by ASP.NET Core's `HealthCheckService`, which records the entry as
|
||||
Unhealthy — so the endpoint does not crash. But Unhealthy on the `ready` tier returns **503** and
|
||||
removes the node from load-balancing, whereas the spec's intended Degraded returns **200** (still in
|
||||
rotation). A transient startup window therefore yanks a healthy-but-starting node out of rotation,
|
||||
the opposite of the "startup-safe" guarantee the XML docs advertise.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Wrap the cluster-state read in a try/catch and return `HealthCheckResult.Degraded(...)` on failure, so
|
||||
both the null-ActorSystem and inaccessible-cluster paths converge on Degraded as the spec requires:
|
||||
resolve `Cluster.Get(system)` and read membership inside a `try`, catching the cluster extension's
|
||||
startup exceptions and mapping them to Degraded with a "cluster not yet ready" description.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Health-002 — `GrpcDependencyHealthCheck` lets non-`RpcException`/non-`OperationCanceledException` errors escape
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/GrpcDependencyHealthCheck.cs:54` |
|
||||
|
||||
**Description**
|
||||
|
||||
`CheckHealthAsync` catches only `RpcException` and `OperationCanceledException`. Any other exception
|
||||
thrown by the probe escapes the method. This is reachable on both paths:
|
||||
|
||||
- Default probe: `GrpcChannel.ConnectAsync` can throw exceptions other than `RpcException` —
|
||||
e.g. `InvalidOperationException` (channel/socket misuse, shutdown) or `HttpRequestException` /
|
||||
`SocketException` surfaced from the transport before a gRPC status is produced.
|
||||
- Custom probe (`GrpcDependencyOptions.Probe`): a caller-supplied delegate (e.g. a
|
||||
`grpc.health.v1.Health/Check` RPC built on a non-gRPC client) can throw anything.
|
||||
|
||||
The XML docs and shared contract describe the result as Unhealthy "when it returns `false`, throws an
|
||||
`RpcException`, or times out" — they do not promise to handle arbitrary exceptions, and the code does
|
||||
not. By contrast the sibling `DatabaseHealthCheck` has a catch-all `catch (Exception ex)` →
|
||||
`Unhealthy("Database connection failed.", ex)` (`DatabaseHealthCheck.cs:78`), so the two probes are
|
||||
asymmetric. As with Health-001 the framework's `HealthCheckService` records the escaping exception as
|
||||
Unhealthy, so no endpoint crashes — but the dependency-named, controlled description
|
||||
(`"{name} probe failed…"`) is lost, and the library violates its own "must not throw past the
|
||||
`IHealthCheck` boundary" intent.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add a trailing `catch (Exception ex)` returning `HealthCheckResult.Unhealthy($"{name} probe failed:
|
||||
{ex.Message}", ex)`, after the existing OCE/Rpc handlers (so external cancellation still propagates).
|
||||
This brings the gRPC probe in line with `DatabaseHealthCheck` and the documented contract.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Health-003 — Null `description` is omitted from the JSON body instead of emitted as `null`
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Spec & shared-contract adherence |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/ZbHealthWriter.cs:32` |
|
||||
|
||||
**Description**
|
||||
|
||||
`SerializerOptions` sets `DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull`, so a check that
|
||||
produces no description renders an entry with **no** `description` key at all:
|
||||
`{ "status": "Healthy", "durationMs": 0.1 }`. The spec models `description` as a `string?` that "may be
|
||||
null" and both the canonical example (`SPEC.md` §3) and the `HealthChecks.UI.Client` shape the writer
|
||||
claims to mirror emit the key with a `null` value rather than dropping it. Consumers/dashboards parsing
|
||||
the body and reading `entries.<name>.description` must therefore handle a *missing* property, not just a
|
||||
null one. The deviation is undocumented and untested — `ResponseWriterTests` only asserts the
|
||||
present-description case.
|
||||
|
||||
Low because the aggregate `status`/HTTP-code contract that orchestrators key on is unaffected; only the
|
||||
descriptive sub-field shape drifts.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Either remove `WhenWritingNull` (so `description: null` is emitted, matching the spec example and the
|
||||
UI-client shape), or document explicitly in the writer's XML docs and `SPEC.md` §3 that absent
|
||||
descriptions are omitted, and add a test pinning the chosen behaviour.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Health-004 — XML docs recommend the `active` tag for ready-tier probes, contradicting the spec
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & XML docs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health.Akka/AkkaClusterHealthCheck.cs:11`, `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/GrpcDependencyHealthCheck.cs:21` |
|
||||
|
||||
**Description**
|
||||
|
||||
`AkkaClusterHealthCheck`'s summary says *"Register to the `ZbHealthTags.Ready` tag (recommended
|
||||
`[ready, active]`)"*, and `GrpcDependencyHealthCheck` likewise recommends tagging `[ready, active]`.
|
||||
The spec is unambiguous that both probes belong to the `ready` tier only: §2.2 "Registered to the
|
||||
`ready` tag" (Akka cluster) and §2.4 "Registered to the `ready` tag" (gRPC dependency). The `active`
|
||||
tier is reserved by §1 / §2.3 for the leader/active-singleton probe (`ActiveNodeHealthCheck`); putting
|
||||
cluster-membership or gRPC-reachability checks on the `active` tier pollutes the active-node tier with
|
||||
non-leadership concerns and contradicts the convergence convention the library exists to enforce.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Bring the XML-doc recommendations in line with the spec — recommend `ZbHealthTags.Ready` only for both
|
||||
probes (drop the `active` suggestion); or, if the dual-tag recommendation is intentional, reconcile it
|
||||
back into `SPEC.md` §2.2/§2.4 so code and spec agree.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Health-005 — `MapZbHealth` returns only the readiness builder, silently dropping conventions for the active/live tiers
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Public API surface & compatibility |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/ZbHealthEndpointExtensions.cs:64` |
|
||||
|
||||
**Description**
|
||||
|
||||
`MapZbHealth` maps three endpoints but returns the `IEndpointConventionBuilder` for `/health/ready`
|
||||
only. Any convention a caller chains onto the result — `.RequireHost(...)`, `.RequireAuthorization()`,
|
||||
`.WithMetadata(...)`, `.CacheOutput()` — applies to the readiness endpoint **alone**; the active and
|
||||
liveness endpoints silently do not receive it. The behaviour is documented in the XML `<returns>`, so
|
||||
this is a deliberate, disclosed trade-off rather than a bug, but it is a real footgun: the most natural
|
||||
reading of `endpoints.MapZbHealth().RequireHost("…")` is "gate all three health endpoints", and the
|
||||
result type gives the caller no signal that two of the three are excluded.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Consider returning a composite builder that fans conventions out to all three endpoints (collect the
|
||||
three `IEndpointConventionBuilder`s and wrap them) so chained conventions behave as a caller expects.
|
||||
If the single-builder return is kept for simplicity, strengthen the `<returns>` remark to a warning and
|
||||
note it in the README's `MapZbHealth` example.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Health-006 — XML documentation is not emitted into the packed nupkgs
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Packaging, dependencies & project layout |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Health/Directory.Build.props:3`, `ZB.MOM.WW.Health/src/ZB.MOM.WW.Health/ZB.MOM.WW.Health.csproj:3` |
|
||||
|
||||
**Description**
|
||||
|
||||
Every public type and member in all three packages carries thorough, accurate XML documentation, but no
|
||||
project (and neither `Directory.Build.props` nor the three `.csproj` files) sets
|
||||
`<GenerateDocumentationFile>true</GenerateDocumentationFile>`. As a result `dotnet pack` produces nupkgs
|
||||
that contain the DLLs but **no `.xml` doc files**, so consumers of `ZB.MOM.WW.Health` / `.Akka` /
|
||||
`.EntityFrameworkCore` get no IntelliSense summaries or parameter docs for the API. For a shared library
|
||||
whose value proposition is a documented common surface across three apps, this discards the
|
||||
documentation effort at the package boundary. (It also means the compiler does not enforce the
|
||||
"public members are documented" invariant via CS1591.)
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Set `<GenerateDocumentationFile>true</GenerateDocumentationFile>` once in `Directory.Build.props` so all
|
||||
three packable projects emit and pack their XML docs. Optionally pair with targeted `<NoWarn>` if any
|
||||
CS1591 warnings surface on currently-undocumented members.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
+55
-17
@@ -35,26 +35,26 @@ code-reviews/
|
||||
|
||||
## Summary
|
||||
|
||||
0 of 6 libraries reviewed. 0 pending findings across all libraries.
|
||||
6 of 6 libraries reviewed. 35 pending findings across all libraries.
|
||||
|
||||
| Severity | Open findings |
|
||||
|----------|---------------|
|
||||
| Critical | 0 |
|
||||
| High | 0 |
|
||||
| Medium | 0 |
|
||||
| Low | 0 |
|
||||
| **Total** | **0** |
|
||||
| High | 1 |
|
||||
| Medium | 9 |
|
||||
| Low | 25 |
|
||||
| **Total** | **35** |
|
||||
|
||||
## Library Status
|
||||
|
||||
| Library | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |
|
||||
|---------|---------------|--------|----------------|------|-------|
|
||||
| [Audit](Audit/findings.md) | — | — | 0/0/0/0 | 0 | 0 |
|
||||
| [Auth](Auth/findings.md) | — | — | 0/0/0/0 | 0 | 0 |
|
||||
| [Configuration](Configuration/findings.md) | — | — | 0/0/0/0 | 0 | 0 |
|
||||
| [Health](Health/findings.md) | — | — | 0/0/0/0 | 0 | 0 |
|
||||
| [Telemetry](Telemetry/findings.md) | — | — | 0/0/0/0 | 0 | 0 |
|
||||
| [Theme](Theme/findings.md) | — | — | 0/0/0/0 | 0 | 0 |
|
||||
| [Audit](Audit/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/1/4 | 5 | 5 |
|
||||
| [Auth](Auth/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/3/3 | 6 | 6 |
|
||||
| [Configuration](Configuration/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/0/4 | 4 | 4 |
|
||||
| [Health](Health/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/2/4 | 6 | 6 |
|
||||
| [Telemetry](Telemetry/findings.md) | 2026-06-01 | `5f75cd4` | 0/1/2/5 | 8 | 8 |
|
||||
| [Theme](Theme/findings.md) | 2026-06-01 | `5f75cd4` | 0/0/1/5 | 6 | 6 |
|
||||
|
||||
## Pending Findings
|
||||
|
||||
@@ -67,14 +67,52 @@ description, location, recommendation — lives in the library's `findings.md`.
|
||||
|
||||
_None open._
|
||||
|
||||
### High (0)
|
||||
### High (1)
|
||||
|
||||
_None open._
|
||||
| ID | Library | Title |
|
||||
|----|---------|-------|
|
||||
| Telemetry-001 | [Telemetry](Telemetry/findings.md) | `RedactionEnricher` ignores property removal, leaving secrets in the event |
|
||||
|
||||
### Medium (0)
|
||||
### Medium (9)
|
||||
|
||||
_None open._
|
||||
| ID | Library | Title |
|
||||
|----|---------|-------|
|
||||
| Audit-001 | [Audit](Audit/findings.md) | `CompositeAuditWriter` re-throws `OperationCanceledException` to the caller, contradicting the "must not throw" writer contract |
|
||||
| Auth-001 | [Auth](Auth/findings.md) | LDAP options validator is registered but never runs at startup |
|
||||
| Auth-002 | [Auth](Auth/findings.md) | A failed `MarkUsedAsync` write turns a valid API key into a thrown exception |
|
||||
| Auth-003 | [Auth](Auth/findings.md) | Corrupt `scopes`/`constraints` column throws `JsonException` through the verifier |
|
||||
| Health-001 | [Health](Health/findings.md) | Akka health checks throw (instead of returning Degraded) when cluster state is inaccessible |
|
||||
| Health-002 | [Health](Health/findings.md) | `GrpcDependencyHealthCheck` lets non-`RpcException`/non-`OperationCanceledException` errors escape |
|
||||
| Telemetry-002 | [Telemetry](Telemetry/findings.md) | Redactor cannot inspect or scrub destructured/structured property values |
|
||||
| Telemetry-003 | [Telemetry](Telemetry/findings.md) | No tests for redactor removal or structured-value redaction |
|
||||
| Theme-001 | [Theme](Theme/findings.md) | Mobile hamburger toggle silently depends on Bootstrap collapse JS |
|
||||
|
||||
### Low (0)
|
||||
### Low (25)
|
||||
|
||||
_None open._
|
||||
| ID | Library | Title |
|
||||
|----|---------|-------|
|
||||
| Audit-002 | [Audit](Audit/findings.md) | `TruncatingAuditRedactor` over-redaction is partial: the catch path scrubs only `DetailsJson`, leaving `Target` unredacted |
|
||||
| Audit-003 | [Audit](Audit/findings.md) | `TruncatingAuditRedactorOptions` is a mutable class, not the immutable "options record" the contract describes |
|
||||
| Audit-004 | [Audit](Audit/findings.md) | XML documentation is authored but not emitted, so IntelliSense docs do not ship to consumers |
|
||||
| Audit-005 | [Audit](Audit/findings.md) | Missing edge-case tests for the redactor never-throw/over-redact contract and composite null/empty handling |
|
||||
| Auth-004 | [Auth](Auth/findings.md) | README misstates the hashing algorithm and the AspNetCore public surface |
|
||||
| Auth-005 | [Auth](Auth/findings.md) | `CreateKeyAsync` persists `KeyPrefix` as `prefix_keyId`, inconsistent with the read path |
|
||||
| Auth-006 | [Auth](Auth/findings.md) | LDAP connection ignores `allowInsecure` and offers no TLS certificate-validation hook |
|
||||
| Configuration-001 | [Configuration](Configuration/findings.md) | `AddValidatedOptions` uses `AddSingleton`, so a double call registers (and runs) the validator twice |
|
||||
| Configuration-002 | [Configuration](Configuration/findings.md) | `Checks.PortValue` quotes the raw value on a parse failure but not on a range failure |
|
||||
| Configuration-003 | [Configuration](Configuration/findings.md) | Port parsing accepts leading sign and surrounding whitespace and is culture-sensitive |
|
||||
| Configuration-004 | [Configuration](Configuration/findings.md) | XML documentation and README are not packaged into the nupkg |
|
||||
| Health-003 | [Health](Health/findings.md) | Null `description` is omitted from the JSON body instead of emitted as `null` |
|
||||
| Health-004 | [Health](Health/findings.md) | XML docs recommend the `active` tag for ready-tier probes, contradicting the spec |
|
||||
| Health-005 | [Health](Health/findings.md) | `MapZbHealth` returns only the readiness builder, silently dropping conventions for the active/live tiers |
|
||||
| Health-006 | [Health](Health/findings.md) | XML documentation is not emitted into the packed nupkgs |
|
||||
| Telemetry-004 | [Telemetry](Telemetry/findings.md) | `service.instance.id` Resource attribute is undocumented in spec and contract |
|
||||
| Telemetry-005 | [Telemetry](Telemetry/findings.md) | Two hand-maintained Resource-attribute builders can silently drift |
|
||||
| Telemetry-006 | [Telemetry](Telemetry/findings.md) | Malformed `OtlpEndpoint` throws `UriFormatException` late, with no context |
|
||||
| Telemetry-007 | [Telemetry](Telemetry/findings.md) | Redaction snapshot allocates a dictionary on every log event |
|
||||
| Telemetry-008 | [Telemetry](Telemetry/findings.md) | `MapZbMetrics` XML doc claims it is "only valid when Exporter = Prometheus" — stale |
|
||||
| Theme-002 | [Theme](Theme/findings.md) | `.chip-idle` foreground diverges from the documented token pairing |
|
||||
| Theme-003 | [Theme](Theme/findings.md) | Hardcoded hex values in CSS contradict the "no hardcoded hex" rule |
|
||||
| Theme-004 | [Theme](Theme/findings.md) | `NavRailItem` emits a `.rail-ico` span that no stylesheet defines |
|
||||
| Theme-005 | [Theme](Theme/findings.md) | Orphan and unstyled nav CSS classes in `layout.css` |
|
||||
| Theme-006 | [Theme](Theme/findings.md) | Public component/parameter surface lacks XML documentation |
|
||||
|
||||
@@ -6,31 +6,312 @@
|
||||
| Packages | `ZB.MOM.WW.Telemetry`, `ZB.MOM.WW.Telemetry.Serilog` |
|
||||
| Component spec | `components/observability/spec/SPEC.md` |
|
||||
| Shared contract | `components/observability/shared-contract/ZB.MOM.WW.Telemetry.md` |
|
||||
| Status | Not yet reviewed |
|
||||
| Last reviewed | — |
|
||||
| Reviewer | — |
|
||||
| Commit reviewed | — |
|
||||
| Open findings | 0 |
|
||||
| Status | Reviewed |
|
||||
| Last reviewed | 2026-06-01 |
|
||||
| Reviewer | Claude (automated baseline) |
|
||||
| Commit reviewed | `5f75cd4` |
|
||||
| Open findings | 8 |
|
||||
|
||||
## Summary
|
||||
|
||||
_Not yet reviewed._
|
||||
The library is small, focused, and well-structured: two packages with a clean Serilog/OTel
|
||||
boundary (the `Serilog.*` stack appears only in the `.Serilog` package; the core package is
|
||||
pure OTel + ASP.NET Core framework reference), correct argument validation, deliberate
|
||||
`sealed` types, thorough XML docs, and a deliberate no-process-global-state design for
|
||||
`AddZbSerilog` that is well covered by `MultiHostTests`. The identity triple, Resource
|
||||
omission rules, exporter wiring (Prometheus always-on, OTLP additive), and trace/log
|
||||
correlation all match the spec's intent and are exercised by the 19 tests.
|
||||
|
||||
The most material problems are in the redaction seam — the one component the review brief
|
||||
flags as security-critical. `RedactionEnricher` honours only *replacement* of scalar
|
||||
properties: it silently ignores the redactor **removing** a key (a documented capability of
|
||||
`ILogRedactor`), and it cannot see inside destructured/structured property values, so a
|
||||
secret logged as a field of `{@Object}` is never scrubbed. Both let secrets reach sinks
|
||||
despite a conforming redactor. Secondary themes: a spec drift around an undocumented
|
||||
`service.instance.id` Resource attribute, two hand-maintained Resource-attribute builders
|
||||
that can drift apart, and a stale doc-comment on `MapZbMetrics`. Tests are solid for the
|
||||
happy paths but have no coverage for redactor removal or structured-value redaction.
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|
||||
|---|----------|----------|-------|
|
||||
| 1 | Correctness & logic bugs | ☐ | |
|
||||
| 2 | Public API surface & compatibility | ☐ | |
|
||||
| 3 | Concurrency & thread safety | ☐ | |
|
||||
| 4 | Error handling & resilience | ☐ | |
|
||||
| 5 | Security & secret handling | ☐ | |
|
||||
| 6 | Performance & resource management | ☐ | |
|
||||
| 7 | Spec & shared-contract adherence | ☐ | |
|
||||
| 8 | Packaging, dependencies & project layout | ☐ | |
|
||||
| 9 | Testing coverage | ☐ | |
|
||||
| 10 | Documentation & XML docs | ☐ | |
|
||||
| 1 | Correctness & logic bugs | ☑ | Redactor "remove" path is a no-op (Telemetry-001); structured values opaque to redactor (Telemetry-002). |
|
||||
| 2 | Public API surface & compatibility | ☑ | Surface minimal, `sealed`, nullable-correct. `ZbResource.InstanceId` is an added public member not in the contract (Telemetry-004). |
|
||||
| 3 | Concurrency & thread safety | ☑ | No issues found. Enrichers stateless; `Lazy` uses `ExecutionAndPublication`; `Activity.Current` is async-local. |
|
||||
| 4 | Error handling & resilience | ☑ | Guard clauses present. `new Uri(OtlpEndpoint)` can throw late on malformed input (Telemetry-006). |
|
||||
| 5 | Security & secret handling | ☑ | Redaction gaps (Telemetry-001/002) are security-relevant — secrets can survive a conforming redactor. |
|
||||
| 6 | Performance & resource management | ☑ | Per-event dictionary snapshot when a redactor is registered (Telemetry-007); acceptable but noted. |
|
||||
| 7 | Spec & shared-contract adherence | ☑ | Undocumented `service.instance.id` attribute (Telemetry-004); two Resource builders that can drift (Telemetry-005). |
|
||||
| 8 | Packaging, dependencies & project layout | ☑ | No issues found. Serilog stack confined to `.Serilog`; central versions; correct net10.0; framework ref justified. |
|
||||
| 9 | Testing coverage | ☑ | No tests for redactor removal or structured-value redaction (Telemetry-003). |
|
||||
| 10 | Documentation & XML docs | ☑ | `MapZbMetrics` doc-comment is stale: claims "only valid when Exporter = Prometheus" (Telemetry-008). |
|
||||
|
||||
## Findings
|
||||
|
||||
_No findings recorded yet._
|
||||
### Telemetry-001 — `RedactionEnricher` ignores property removal, leaving secrets in the event
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | High |
|
||||
| Category | Security & secret handling |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/RedactionEnricher.cs:49-67` |
|
||||
|
||||
**Description**
|
||||
|
||||
`ILogRedactor.Redact` is documented to let a project "remove or replace any sensitive
|
||||
values" (`ILogRedactor.cs:13` and the XML doc on the interface method: *"remove or replace"*;
|
||||
the shared contract repeats *"Remove or replace any sensitive values"*). `RedactionEnricher`
|
||||
builds a `snapshot` dictionary, hands it to the redactor, then writes back only via
|
||||
`AddOrUpdateProperty` for entries that remain in the snapshot and `HasChanged`:
|
||||
|
||||
```csharp
|
||||
foreach (var entry in snapshot)
|
||||
{
|
||||
if (HasChanged(logEvent, entry.Key, entry.Value))
|
||||
logEvent.AddOrUpdateProperty(propertyFactory.CreateProperty(entry.Key, entry.Value));
|
||||
}
|
||||
```
|
||||
|
||||
If a redactor *removes* a key from the dictionary (`properties.Remove("apiKey")`) — the most
|
||||
natural way to implement "must not leave the process" — that key simply no longer appears in
|
||||
the write-back loop, so the original property is **never removed from `logEvent`**. The
|
||||
secret reaches every sink unredacted, even though the redactor did exactly what its contract
|
||||
permits. This defeats the seam's stated operational guarantee ("secrets never leave the
|
||||
process in log events") for any removal-style redactor.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
After calling the redactor, reconcile deletions: for each property key present on the
|
||||
original `logEvent` but absent from the returned `snapshot`, call
|
||||
`logEvent.RemovePropertyIfPresent(key)`. (Capture the original key set before mutation, then
|
||||
diff.) Add a test asserting a removing redactor scrubs the property (see Telemetry-003).
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Telemetry-002 — Redactor cannot inspect or scrub destructured/structured property values
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Security & secret handling |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/RedactionEnricher.cs:49-55` |
|
||||
|
||||
**Description**
|
||||
|
||||
The snapshot only unwraps `ScalarValue`; every other `LogEventPropertyValue`
|
||||
(`StructureValue` from `{@Object}`, `SequenceValue`, `DictionaryValue`) is passed to the
|
||||
redactor as the raw Serilog wrapper object:
|
||||
|
||||
```csharp
|
||||
snapshot[property.Key] = property.Value is ScalarValue scalar ? scalar.Value : property.Value;
|
||||
```
|
||||
|
||||
A project redactor written against the seam (`IDictionary<string, object?>` of "values")
|
||||
therefore sees an opaque `StructureValue` for a destructured payload — it cannot read or
|
||||
mask a secret *field inside* a logged object (e.g. `logger.Information("{@Command}", cmd)`
|
||||
where `cmd.ApiKey` is sensitive). MxGateway's reference redactor specifically guards
|
||||
"which command payloads must not leave the process" (per `ILogRedactor` XML doc and the
|
||||
contract), which is precisely the destructured-object case. The seam silently cannot meet
|
||||
that requirement; the redactor only works for top-level scalar properties.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Document the seam's actual reach (scalar top-level properties only) on `ILogRedactor` and in
|
||||
the shared contract, and/or recursively project `StructureValue`/`SequenceValue`/
|
||||
`DictionaryValue` into the snapshot and rebuild them on write-back so nested fields are
|
||||
redactable. At minimum, make the limitation explicit so consumers do not assume nested
|
||||
payloads are scrubbed when they are not.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Telemetry-003 — No tests for redactor removal or structured-value redaction
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Testing coverage |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/tests/ZB.MOM.WW.Telemetry.Serilog.Tests/RedactionTests.cs:33-69` |
|
||||
|
||||
**Description**
|
||||
|
||||
`RedactionTests` covers exactly two redaction behaviours: a registered redactor replacing a
|
||||
scalar value, and a no-op when none is registered. The `FakeRedactor` only ever *reassigns*
|
||||
`properties["apiKey"]`. There is no test that a redactor which **removes** a key actually
|
||||
scrubs it (the Telemetry-001 defect would have been caught), and no test that a redactor can
|
||||
mask a field of a destructured/structured property (Telemetry-002). For a seam whose entire
|
||||
purpose is secret containment, the most security-relevant behaviours are untested.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add tests: (a) a redactor calling `properties.Remove(key)` results in the property being
|
||||
absent from the emitted `LogEvent`; (b) a redactor attempting to mask a nested field of a
|
||||
`{@Object}` payload, asserting the documented behaviour (whichever resolution Telemetry-002
|
||||
takes). These should fail today and pin the fixes.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Telemetry-004 — `service.instance.id` Resource attribute is undocumented in spec and contract
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Spec & shared-contract adherence |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbResource.cs:19-45` |
|
||||
|
||||
**Description**
|
||||
|
||||
`ZbResource` adds a `service.instance.id` attribute (deterministic `MachineName:ProcessId`)
|
||||
to the Resource, and exposes it as a new public member `ZbResource.InstanceId`. The
|
||||
normalized Resource attribute set is enumerated exhaustively in two authoritative docs —
|
||||
`SPEC.md` §2 and `METRIC-CONVENTIONS.md` §4 — and **neither lists `service.instance.id`**;
|
||||
the shared contract (`ZB.MOM.WW.Telemetry.md`) likewise documents `ZbResource.Build` as
|
||||
populating only `service.name/namespace/version/site.id/node.role/host.name` and does not
|
||||
mention an `InstanceId` member. The attribute itself is a reasonable, standards-aligned
|
||||
improvement (and disabling the OTel SDK's random-GUID default is sensible for cross-signal
|
||||
correlation), but it is a silent divergence: the spec/contract are now stale relative to the
|
||||
code. Per REVIEW-PROCESS §2.7, both directions of drift must be flagged.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add `service.instance.id` (with the `MachineName:ProcessId` rationale) to the Resource table
|
||||
in `SPEC.md` §2 and `METRIC-CONVENTIONS.md` §4, and document the public `ZbResource.InstanceId`
|
||||
member in the shared contract, so the normalized spec and the code agree.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Telemetry-005 — Two hand-maintained Resource-attribute builders can silently drift
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Spec & shared-contract adherence |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbResource.cs:38-64`, `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/ZbSerilogConfig.cs:125-151` |
|
||||
|
||||
**Description**
|
||||
|
||||
The Resource attached to metrics/traces is built by `ZbResource.Configure` (via the OTel
|
||||
`AddService` + `AddAttributes` API), while the Resource attached to the OTLP *log* sink is
|
||||
built independently by `ZbSerilogConfig.BuildResourceAttributes` (a hand-rolled
|
||||
`Dictionary<string, object>`). The two currently agree, but they enumerate the same six/seven
|
||||
attributes in two places with two different mechanisms, so a future change to one (a new
|
||||
attribute, a renamed key, a changed omission rule) will silently desynchronize logs from
|
||||
metrics/traces and break the cross-signal correlation the library's whole "unifying hinge"
|
||||
depends on. There is no test asserting parity between the two attribute sets.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Derive both from a single source of truth — e.g. have `ZbResource` expose the canonical
|
||||
attribute map (already mostly the shape `BuildResourceAttributes` returns) and have the
|
||||
Serilog sink consume it — or add a parity test that asserts the two attribute sets are
|
||||
key-for-key identical for a representative options object.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Telemetry-006 — Malformed `OtlpEndpoint` throws `UriFormatException` late, with no context
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbTelemetryExtensions.cs:127-135` |
|
||||
|
||||
**Description**
|
||||
|
||||
`ConfigureOtlp` does `otlp.Endpoint = new Uri(options.OtlpEndpoint)` with no validation. A
|
||||
malformed endpoint string (typo, missing scheme) throws a raw `UriFormatException` deep
|
||||
inside OTel exporter construction at host-build time, with no mention of which option was at
|
||||
fault. `BuildOptions` already fails fast and clearly for a missing `ServiceName`, but does
|
||||
not validate that `OtlpEndpoint` is a well-formed absolute URI when `Exporter == Otlp` (nor
|
||||
that it is non-empty — an Otlp exporter with a null endpoint is silently registered and
|
||||
points nowhere). The Serilog path (`ZbSerilogConfig`) has the same untyped string→endpoint
|
||||
handoff.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
In `BuildOptions`, when `Exporter == ZbExporter.Otlp`, validate `OtlpEndpoint` with
|
||||
`Uri.TryCreate(..., UriKind.Absolute, out _)` and throw an `ArgumentException` naming the
|
||||
option (consistent with the existing `ServiceName` guard) rather than letting a bare
|
||||
`UriFormatException` escape later.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Telemetry-007 — Redaction snapshot allocates a dictionary on every log event
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Performance & resource management |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry.Serilog/RedactionEnricher.cs:49-67` |
|
||||
|
||||
**Description**
|
||||
|
||||
When an `ILogRedactor` is registered, `Enrich` allocates a new
|
||||
`Dictionary<string, object?>(logEvent.Properties.Count)`, copies every property into it, and
|
||||
then iterates again to diff/write-back — on every single log event, across every logging
|
||||
thread. Enrichers are on the hottest path in the library (they run for each event the level
|
||||
filter admits). The early-return when no redactor is registered keeps the common case free,
|
||||
so the cost is borne only by redaction-enabled consumers (MxGateway), but for a high-volume
|
||||
gateway this is non-trivial steady-state allocation/GC pressure.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Consider redacting in place against `logEvent.Properties` without a full snapshot copy (e.g.
|
||||
only materialize replacements for keys the redactor touches), or short-circuit when the event
|
||||
has no properties. At minimum, document the per-event cost so consumers can weigh enabling
|
||||
redaction on very hot loggers. Acceptable as-is given redaction is opt-in and security-first.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Telemetry-008 — `MapZbMetrics` XML doc claims it is "only valid when Exporter = Prometheus" — stale
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & XML docs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Telemetry/src/ZB.MOM.WW.Telemetry/ZbMetricsEndpointExtensions.cs:11-14` |
|
||||
|
||||
**Description**
|
||||
|
||||
The XML doc on `MapZbMetrics` states it is "Only valid when
|
||||
`ZbTelemetryOptions.Exporter = ZbExporter.Prometheus`." That contradicts the actual wiring:
|
||||
`ApplyMetricsExporter` (`ZbTelemetryExtensions.cs:107-116`) **always** calls
|
||||
`AddPrometheusExporter()` regardless of the `Exporter` setting — OTLP is purely additive.
|
||||
The library's own README ("Prometheus is **always wired** for metrics regardless of the
|
||||
`Exporter` setting") and the test `AddZbTelemetry_OtlpExporter_StillServesPrometheusEndpoint`
|
||||
both confirm `/metrics` works under `Exporter = Otlp`. The doc-comment therefore tells
|
||||
consumers the opposite of the real (and intended) behaviour and could lead them to wrongly
|
||||
believe `MapZbMetrics` is a no-op under OTLP. The same stale wording is mirrored in the
|
||||
shared contract (`ZB.MOM.WW.Telemetry.md`, `MapZbMetrics` summary).
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Update the doc-comment to state that the Prometheus exporter is always registered and
|
||||
`MapZbMetrics` is valid under any `Exporter` value (Prometheus is always-on; OTLP is an
|
||||
overlay). Align the shared-contract summary for `MapZbMetrics` to match.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
+223
-17
@@ -6,31 +6,237 @@
|
||||
| Packages | `ZB.MOM.WW.Theme` (Razor Class Library) |
|
||||
| Component spec | `components/ui-theme/spec/SPEC.md` |
|
||||
| Shared contract | `components/ui-theme/shared-contract/ZB.MOM.WW.Theme.md` |
|
||||
| Status | Not yet reviewed |
|
||||
| Last reviewed | — |
|
||||
| Reviewer | — |
|
||||
| Commit reviewed | — |
|
||||
| Open findings | 0 |
|
||||
| Status | Reviewed |
|
||||
| Last reviewed | 2026-06-01 |
|
||||
| Reviewer | Claude (automated baseline) |
|
||||
| Commit reviewed | `5f75cd4` |
|
||||
| Open findings | 6 |
|
||||
|
||||
## Summary
|
||||
|
||||
_Not yet reviewed._
|
||||
`ZB.MOM.WW.Theme` is a small, clean, well-scoped Blazor Razor Class Library: ten flat-namespace
|
||||
SSR components, two stylesheets (`theme.css` 379 lines, `layout.css` 191 lines), three vendored
|
||||
IBM Plex `.woff2` fonts, and two public enums. The component C# is correct and minimal — stateless
|
||||
render logic, no concurrency surface, no secret handling, sensible `[Parameter]`/`EditorRequired`
|
||||
choices, and `TreatWarningsAsErrors`. The 32 bUnit tests (20 `[Fact]` + 12 `[Theory]` cases) exercise
|
||||
the public render contract of every component plus a static-asset check on the corrected font path.
|
||||
The standout security-sensitive area — `LoginCard` (open-redirect + antiforgery) — is handled
|
||||
correctly by delegating both concerns to the consumer with explicit XML docs and in-markup security
|
||||
notes, matching the spec. No findings are Critical or High. The findings are concentrated in two
|
||||
themes: (1) one functional gap — the mobile hamburger relies on Bootstrap's collapse JS, an
|
||||
undocumented runtime dependency that contradicts the "no JavaScript" framing; and (2) CSS / token
|
||||
hygiene — a token-fidelity drift on `.chip-idle`, hardcoded hex values that violate the spec's
|
||||
"no hardcoded hex" rule, an undefined `.rail-ico` class, orphan/dead nav CSS, and a near-total
|
||||
absence of XML docs on the public parameter surface.
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|
||||
|---|----------|----------|-------|
|
||||
| 1 | Correctness & logic bugs | ☐ | |
|
||||
| 2 | Public API surface & compatibility | ☐ | |
|
||||
| 3 | Concurrency & thread safety | ☐ | |
|
||||
| 4 | Error handling & resilience | ☐ | |
|
||||
| 5 | Security & secret handling | ☐ | |
|
||||
| 6 | Performance & resource management | ☐ | |
|
||||
| 7 | Spec & shared-contract adherence | ☐ | |
|
||||
| 8 | Packaging, dependencies & project layout | ☐ | |
|
||||
| 9 | Testing coverage | ☐ | |
|
||||
| 10 | Documentation & XML docs | ☐ | |
|
||||
| 1 | Correctness & logic bugs | ☑ | Component render logic correct; `open="@Expanded"` boolean-attribute binding behaves (tests confirm); chip/variant `switch` mappings complete. No issues found. |
|
||||
| 2 | Public API surface & compatibility | ☑ | Flat `ZB.MOM.WW.Theme` namespace; minimal, intentional; `EditorRequired` applied; nullable annotations correct; no internal types leak. Surface matches contract. No issues found. |
|
||||
| 3 | Concurrency & thread safety | ☑ | Stateless components, no shared mutable state, no `async`/statics/singletons. No issues found (UI kit). |
|
||||
| 4 | Error handling & resilience | ☑ | No throwing paths; required params default to `string.Empty`; null `RenderFragment` slots are guarded (`is not null` / `IsNullOrEmpty`). No issues found. |
|
||||
| 5 | Security & secret handling | ☑ | No secrets/PII. `LoginCard` open-redirect + CSRF correctly delegated to consumer with docs. `Accent` interpolates into a `style` attr but is developer-supplied (not untrusted). No issues found. |
|
||||
| 6 | Performance & resource management | ☑ | No `IDisposable`, no allocations of note, fonts vendored once. No issues found. |
|
||||
| 7 | Spec & shared-contract adherence | ☑ | Component/parameter API matches SPEC §4 and the shared contract. Drifts found: `.chip-idle` foreground (Theme-002) and "no hardcoded hex" rule (Theme-003). |
|
||||
| 8 | Packaging, dependencies & project layout | ☑ | `net10.0` RCL, single package id `ZB.MOM.WW.Theme`, `FrameworkReference` only, version `0.1.0` (inherited from `Directory.Build.props`), static assets under `_content/`. No issues found. |
|
||||
| 9 | Testing coverage | ☑ | 32 tests cover every component's render contract + corrected font path. CSS-class wiring (e.g. `.rail-ico`, orphan nav CSS) is not asserted, which is how Theme-004/005 slipped through. |
|
||||
| 10 | Documentation & XML docs | ☑ | README/contract accurate. XML docs present only on `LoginCard`; absent on the other 9 components, all their parameters, and both enums (Theme-006). |
|
||||
|
||||
## Findings
|
||||
|
||||
_No findings recorded yet._
|
||||
### Theme-001 — Mobile hamburger toggle silently depends on Bootstrap collapse JS
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Spec & shared-contract adherence |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/ThemeShell.razor:5` |
|
||||
|
||||
**Description**
|
||||
|
||||
`ThemeShell` renders the narrow-viewport navigation toggle as a Bootstrap collapse trigger:
|
||||
`<button ... data-bs-toggle="collapse" data-bs-target="#theme-rail" aria-expanded="false">`, with the
|
||||
rail wrapped in `<div class="collapse d-lg-block" id="theme-rail">`. The expand/collapse behaviour
|
||||
and the `aria-expanded` state transition are driven entirely by **Bootstrap's collapse JavaScript
|
||||
bundle** at runtime. The shared contract states "No global JavaScript … `NavRailSection` is CSS-only"
|
||||
and the spec emphasizes static-SSR/no-JS operation, while §5 only requires each app to keep its own
|
||||
Bootstrap **`<link>`** (CSS) — Bootstrap *JS* is never mentioned. A consumer that includes only
|
||||
Bootstrap CSS (a reasonable reading of the docs) gets a hamburger button that does nothing below the
|
||||
`lg` breakpoint: the rail stays `display:none` (`.collapse`) and is unreachable on mobile/narrow
|
||||
windows. There is no compile-time signal, and the hardcoded `aria-expanded="false"` never updates
|
||||
without the JS, so assistive tech is also misinformed. This affects every consumer's small-viewport
|
||||
navigation.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Either (a) document the Bootstrap-JS runtime requirement explicitly in the README/spec adoption steps
|
||||
("the side-rail mobile toggle requires Bootstrap's collapse JS bundle"), or (b) make the toggle
|
||||
JS-independent to match the "no JavaScript" promise — e.g. wrap the rail in a `<details>`/checkbox
|
||||
CSS-only disclosure (the same pattern `NavRailSection` already uses), so collapse works in pure static
|
||||
SSR. If keeping the Bootstrap-collapse approach, document that `aria-expanded` requires the JS to stay
|
||||
accurate.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Theme-002 — `.chip-idle` foreground diverges from the documented token pairing
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Spec & shared-contract adherence |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/wwwroot/css/theme.css:177` |
|
||||
|
||||
**Description**
|
||||
|
||||
`DESIGN-TOKENS.md` ("Status tokens — tinted backgrounds") states the chip classes pair each tinted
|
||||
background with its matching foreground token — "`--ok-bg` background with `--ok` foreground … Used by
|
||||
`.chip-ok`, `.chip-warn`, `.chip-bad`, `.chip-idle`". The `--idle` token (`#868e96`) is correctly used
|
||||
for `.s-idle` and the `.conn-pill .dot`, but `.chip-idle` uses `color: var(--ink-soft)` (`#5a6066`)
|
||||
instead of `color: var(--idle)`:
|
||||
|
||||
```css
|
||||
.chip-idle { color: var(--ink-soft); background: var(--idle-bg); border-color: var(--rule-strong); }
|
||||
```
|
||||
|
||||
So the one idle chip variant does not follow the pairing rule the design-tokens doc spells out. It is
|
||||
not a contrast or readability regression (`--ink-soft` is darker than `--idle`), but code and the
|
||||
normalized token reference disagree, which is exactly the spec-drift category §7 asks to flag. Note the
|
||||
`.chip-warn` foreground is likewise a hardcoded `#b56a00` rather than `var(--warn)` — see Theme-003.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Either change `.chip-idle` to `color: var(--idle)` to honor the documented pairing, or, if the darker
|
||||
`--ink-soft` ink is the deliberate choice, amend `DESIGN-TOKENS.md` to record that `.chip-idle` uses
|
||||
`--ink-soft` foreground (as it already does for the `Info` variant footnote).
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Theme-003 — Hardcoded hex values in CSS contradict the "no hardcoded hex" rule
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Spec & shared-contract adherence |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/wwwroot/css/theme.css:174` (and theme.css:96-98,145-150,175-176,251-254,281,290-291,335,348,368-369,377; layout.css:123,128,184) |
|
||||
|
||||
**Description**
|
||||
|
||||
`SPEC.md` §1 and `DESIGN-TOKENS.md` both assert the rule: "Components carry **no hardcoded hex values**
|
||||
— everything references these tokens" / "no hardcoded hex values appear in component markup or CSS".
|
||||
In practice both stylesheets contain numerous literal hex colors outside the `:root` token block —
|
||||
chip/agg border tints (`#c6e6cd`, `#eec3c3`, `#efd6a6`, `#bfe3c6`, `#f0d9ab`, `#f0c0c0`), chip/notice
|
||||
foregrounds (`#b56a00`, `#8a5a00`), zebra/hover fills (`#fbfbf9`, `#f3f6fd`, `#eef2fc`), and the
|
||||
info/dir tints (`#e7ecfb`, `#cdd9f7`). These are derived shades (lighter borders, hover washes) with no
|
||||
corresponding token, so they are not re-themable via the token block and any per-app accent override
|
||||
will not reach them. The rule as written is therefore aspirational, and the `DESIGN-TOKENS.md` Info
|
||||
footnote itself already references a raw `#e7ecfb`, so the doc partially contradicts its own rule.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Reconcile code and doc: either promote the recurring derived shades to named tokens (e.g.
|
||||
`--ok-border`, `--hover-bg`, `--info-bg`/`--info-border`) and reference them, or soften the spec wording
|
||||
to "no hardcoded hex in **component markup**; CSS derives border/hover shades from tokens where
|
||||
practical" so the rule matches reality. This keeps the "token changes are breaking" SemVer guarantee
|
||||
honest.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Theme-004 — `NavRailItem` emits a `.rail-ico` span that no stylesheet defines
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/NavRailItem.razor:4` |
|
||||
|
||||
**Description**
|
||||
|
||||
When the optional `Icon` slot is supplied, `NavRailItem` wraps it: `<span class="rail-ico">@Icon</span>`.
|
||||
Neither `theme.css` nor `layout.css` defines a `.rail-ico` rule (confirmed by grep). The icon therefore
|
||||
renders as an unstyled inline span — no sizing, alignment, or gap relative to the label text — so an
|
||||
icon + text rail link will not line up the way the other rail styling implies. It is not a crash, but
|
||||
the component ships a class hook with no backing style, which is a latent presentation bug for any
|
||||
consumer that uses the `Icon` parameter (and none of the 32 tests cover the icon path).
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add a `.rail-ico` rule to `layout.css` (e.g. `display:inline-flex; width:1rem; margin-right:0.4rem;
|
||||
color:var(--ink-faint);`) alongside the existing `.rail-link` rules, and add a bUnit test asserting the
|
||||
`.rail-ico` span is emitted when `Icon` is supplied.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Theme-005 — Orphan and unstyled nav CSS classes in `layout.css`
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & XML docs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/wwwroot/css/layout.css:75` (`.rail-eyebrow`), `:103` (`.rail-eyebrow-chevron`); `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/NavRailSection.razor:5` (`.rail-eyebrow-label`) |
|
||||
|
||||
**Description**
|
||||
|
||||
`NavRailSection` renders `<summary class="rail-eyebrow-toggle"><span class="rail-eyebrow-label">@Title</span></summary>`
|
||||
and `layout.css` provides the actual collapse chevron via `.rail-section > summary::before` (lines
|
||||
178-181). That leaves several mismatched class hooks: `.rail-eyebrow` (line 75, non-toggle) and
|
||||
`.rail-eyebrow-chevron` (lines 103-108) are defined in CSS but emitted by **no** component (dead CSS),
|
||||
while `.rail-eyebrow-label` is emitted by the component but has **no** CSS rule (it merely inherits from
|
||||
`.rail-eyebrow-toggle`). The result is two parallel chevron mechanisms (the styled-but-unused
|
||||
`.rail-eyebrow-chevron` and the actually-used `::before`) and stylesheet cruft that misleads future
|
||||
maintainers about how the section header is themed.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Remove the dead `.rail-eyebrow` and `.rail-eyebrow-chevron` rules (or wire `.rail-eyebrow-chevron` into
|
||||
`NavRailSection` and drop the `::before` approach — pick one), and either add a `.rail-eyebrow-label`
|
||||
rule or drop the now-redundant wrapper span. Keeping one chevron mechanism removes the ambiguity.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### Theme-006 — Public component/parameter surface lacks XML documentation
|
||||
|
||||
| | |
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & XML docs |
|
||||
| Status | Open |
|
||||
| Location | `ZB.MOM.WW.Theme/src/ZB.MOM.WW.Theme/Components/ThemeShell.razor:23` (and `BrandBar.razor`, `NavRailItem.razor`, `NavRailSection.razor`, `StatusPill.razor`, `TechButton.razor`, `TechCard.razor`, `TechField.razor`, `ThemeHead.razor`, `ButtonVariant.cs`, `StatusState.cs`) |
|
||||
|
||||
**Description**
|
||||
|
||||
This is a packaged library (`IsPackable=true`, `0.1.0`) consumed by three separate apps, and
|
||||
`REVIEW-PROCESS.md` §2.10 calls for "XML doc accuracy and presence on the **public** API (these are
|
||||
libraries — public docs matter)." Only `LoginCard` carries XML docs (on `ReturnUrl` and `ChildContent`).
|
||||
The other nine components, all of their `[Parameter]` properties, and both public enums
|
||||
(`ButtonVariant`, `StatusState`) have no `///` documentation, so consumers get no IntelliSense/tooltip
|
||||
help on the parameter contract (e.g. that `ThemeShell.Accent` overrides the `--accent` token, that
|
||||
`NavRailItem.Match` defaults to `Prefix`, or what each `StatusState`/`ButtonVariant` member maps to).
|
||||
The README and shared contract document all of this, but that prose is not surfaced at the call site.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add brief `///` summaries to the public components, their parameters, and the two enum types — at
|
||||
minimum the non-obvious ones (`ThemeShell.Accent`, `NavRailItem.Match`, `TechButton.Busy`,
|
||||
`TechCard.Title` vs `Header` precedence, and each enum member). Consider enabling
|
||||
`GenerateDocumentationFile` so missing-doc warnings surface under the existing `TreatWarningsAsErrors`.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
Reference in New Issue
Block a user