review(Host): allow-anonymous /metrics + unconditional LDAP validator
Code review at HEAD 7286d320. Host-001 (High): /metrics was auth-gated on admin
nodes (Prometheus 401) -> AllowAnonymous. Host-002: register LdapOptionsValidator
unconditionally for fail-fast startup validation on admin-only nodes. Host-004: fix
metrics XML doc. Host-003 (docs) left Open.
This commit is contained in:
@@ -0,0 +1,104 @@
|
||||
# Code Review — Host
|
||||
|
||||
<!-- Template for a per-module findings file. Copy to code-reviews/<Module>/findings.md.
|
||||
See ../../REVIEW-PROCESS.md for the full process. The base README.md is generated
|
||||
from these files by regen-readme.py — do not edit README.md by hand. -->
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Module | `src/Server/ZB.MOM.WW.OtOpcUa.Host` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-06-19 |
|
||||
| Commit reviewed | `7286d320` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 1 |
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
A comprehensive review completes every category, recording "No issues found" where
|
||||
a category produced nothing rather than leaving it blank.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | 1 finding (Host-001) |
|
||||
| 2 | OtOpcUa conventions | No issues found |
|
||||
| 3 | Concurrency & thread safety | No issues found |
|
||||
| 4 | Error handling & resilience | No issues found |
|
||||
| 5 | Security | 1 finding (Host-002) |
|
||||
| 6 | Performance & resource management | No issues found |
|
||||
| 7 | Design-document adherence | 1 finding (Host-003) |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | No issues found |
|
||||
| 10 | Documentation & comments | 1 finding (Host-004) |
|
||||
|
||||
## Findings
|
||||
|
||||
<!-- One ### entry per finding. IDs are <Module>-NNN, sequential within the module,
|
||||
never reused. Findings are never deleted — close them by changing Status and
|
||||
completing Resolution. -->
|
||||
|
||||
### Host-001
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | High |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs:47`, `src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs:245` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `MapOtOpcUaMetrics()` calls `MapZbMetrics()` which calls `MapPrometheusScrapingEndpoint()` without chaining `.AllowAnonymous()`. On admin-role nodes (any node with `hasAdmin=true`), `AddOtOpcUaAuth` registers a `FallbackPolicy = RequireAuthenticatedUser`, which gates every endpoint that does not explicitly opt out. `MapZbHealth` correctly calls `.AllowAnonymous()` on all three health tiers (verified in `ZbHealthEndpointExtensions.cs`), but the Prometheus `/metrics` endpoint does not. The result is that on any admin or admin+driver node the Prometheus scraper receives a `401 Unauthorized` response, breaking observability. Driver-only nodes are unaffected (no auth middleware is added when `hasAdmin=false`).
|
||||
|
||||
**Recommendation:** Chain `.AllowAnonymous()` on the `MapZbMetrics()` call site in `Program.cs`, or add it inside `MapOtOpcUaMetrics()` in `ObservabilityExtensions.cs`. The latter is consistent with how `MapOtOpcUaHealth` / `MapZbHealth` are structured.
|
||||
|
||||
**Resolution:** Fixed 2026-06-19. Added `.AllowAnonymous()` to `MapOtOpcUaMetrics()` in `ObservabilityExtensions.cs` (verified by build; no unit-test harness for Host). Aligns the metrics endpoint with the health endpoints, which have always called `AllowAnonymous()` via `ZbHealthEndpointExtensions`.
|
||||
|
||||
---
|
||||
|
||||
### Host-002
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Security |
|
||||
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs:163–180`, `src/Server/ZB.MOM.WW.OtOpcUa.Host/Configuration/LdapOptionsValidator.cs` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** `AddValidatedOptions<LdapOptions, LdapOptionsValidator>` is registered only inside the `if (hasDriver)` block. On admin-only nodes (`hasAdmin=true`, `hasDriver=false`) `AddOtOpcUaAuth` registers `IOptions<LdapOptions>` via a plain `.Bind()` call with no validator attached. This means the fail-fast startup guard — including the insecure-transport check (`Transport == None && !AllowInsecure → reject`) — is silently skipped on admin-only deployments. A production admin-only node with a misconfigured or plaintext LDAP section boots without error and the misconfiguration only surfaces as a login failure at runtime.
|
||||
|
||||
**Recommendation:** Move `AddValidatedOptions<LdapOptions, LdapOptionsValidator>` above both `if (hasAdmin)` and `if (hasDriver)` so it runs unconditionally. The validator's own guard (`if (!options.Enabled || options.DevStubMode) return;`) already makes it a safe no-op when LDAP is disabled or in dev-stub mode.
|
||||
|
||||
**Resolution:** Fixed 2026-06-19. Moved `AddValidatedOptions<LdapOptions, LdapOptionsValidator>` above both role blocks in `Program.cs`, making it unconditional. The validator's own early-return guard keeps it safe on disabled/DevStub configs. Verified by build (no unit-test harness for Host).
|
||||
|
||||
---
|
||||
|
||||
### Host-003
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Design-document adherence |
|
||||
| Location | `docs/ServiceHosting.md` (section "Per-role configuration overlays") |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** `docs/ServiceHosting.md` states the configuration loading order as "base `appsettings.json` → role overlay (`appsettings.{role}.json`) → environment overlay (`appsettings.{Environment}.json`) — later layers win." This is incorrect. The actual order established by `Program.cs:59–71` is: `appsettings.json` → `appsettings.{Environment}.json` (WebApplicationBuilder default) → `appsettings.{role}.json` (appended by Program.cs) → environment variables (re-appended) → command-line args (re-appended). The role overlay therefore **wins over** `appsettings.{Environment}.json`, not the other way around. The code behaviour is correct and intentional (explained by the comment at Program.cs:66–70); only the doc is wrong.
|
||||
|
||||
**Recommendation:** Update `docs/ServiceHosting.md` to reflect the actual precedence: `appsettings.json` < `appsettings.{Environment}.json` < `appsettings.{role}.json` < environment variables < command-line args. Note that the role overlay intentionally outranks the environment-specific JSON so role-level security defaults cannot be overridden by a developer's local `appsettings.Development.json`, while environment variables and command-line args still outrank everything. Docs-only change; no src change needed.
|
||||
|
||||
**Resolution:** _(open — docs/ edit outside this pass's src-only scope; no code change required)_
|
||||
|
||||
---
|
||||
|
||||
### Host-004
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & comments |
|
||||
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs:41–42` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The XML doc on `MapOtOpcUaMetrics` said "the default leaves it unauthenticated for local Prometheus scrapes." Before the Host-001 fix this was factually wrong (the endpoint was auth-gated on admin nodes). The phrasing also implied intent to restrict to local use, which is misleading for production deployments where a Prometheus instance scrapes across the network.
|
||||
|
||||
**Recommendation:** Update the comment to make the `AllowAnonymous` behaviour explicit and drop the "local" qualifier, since the endpoint is intentionally open to any scraper.
|
||||
|
||||
**Resolution:** Fixed 2026-06-19 as part of the Host-001 fix. Comment updated alongside the `AllowAnonymous()` addition. Verified by build (no unit-test harness for Host).
|
||||
Reference in New Issue
Block a user