From e4abe186a10bdc8078be6ef617c4451c629fd8bb Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Fri, 19 Jun 2026 10:22:59 -0400 Subject: [PATCH] review(Host): allow-anonymous /metrics + unconditional LDAP validator Code review at HEAD 7286d320. Host-001 (High): /metrics was auth-gated on admin nodes (Prometheus 401) -> AllowAnonymous. Host-002: register LdapOptionsValidator unconditionally for fail-fast startup validation on admin-only nodes. Host-004: fix metrics XML doc. Host-003 (docs) left Open. --- code-reviews/Host/findings.md | 104 ++++++++++++++++++ .../Observability/ObservabilityExtensions.cs | 10 +- src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs | 10 +- 3 files changed, 119 insertions(+), 5 deletions(-) create mode 100644 code-reviews/Host/findings.md diff --git a/code-reviews/Host/findings.md b/code-reviews/Host/findings.md new file mode 100644 index 00000000..f78818f6 --- /dev/null +++ b/code-reviews/Host/findings.md @@ -0,0 +1,104 @@ +# Code Review — Host + + + +| Field | Value | +|---|---| +| Module | `src/Server/ZB.MOM.WW.OtOpcUa.Host` | +| Reviewer | Claude Code | +| Review date | 2026-06-19 | +| Commit reviewed | `7286d320` | +| Status | Reviewed | +| Open findings | 1 | + +## Checklist coverage + +A comprehensive review completes every category, recording "No issues found" where +a category produced nothing rather than leaving it blank. + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | 1 finding (Host-001) | +| 2 | OtOpcUa conventions | No issues found | +| 3 | Concurrency & thread safety | No issues found | +| 4 | Error handling & resilience | No issues found | +| 5 | Security | 1 finding (Host-002) | +| 6 | Performance & resource management | No issues found | +| 7 | Design-document adherence | 1 finding (Host-003) | +| 8 | Code organization & conventions | No issues found | +| 9 | Testing coverage | No issues found | +| 10 | Documentation & comments | 1 finding (Host-004) | + +## Findings + + + +### Host-001 + +| Field | Value | +|---|---| +| Severity | High | +| Category | Correctness & logic bugs | +| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs:47`, `src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs:245` | +| Status | Resolved | + +**Description:** `MapOtOpcUaMetrics()` calls `MapZbMetrics()` which calls `MapPrometheusScrapingEndpoint()` without chaining `.AllowAnonymous()`. On admin-role nodes (any node with `hasAdmin=true`), `AddOtOpcUaAuth` registers a `FallbackPolicy = RequireAuthenticatedUser`, which gates every endpoint that does not explicitly opt out. `MapZbHealth` correctly calls `.AllowAnonymous()` on all three health tiers (verified in `ZbHealthEndpointExtensions.cs`), but the Prometheus `/metrics` endpoint does not. The result is that on any admin or admin+driver node the Prometheus scraper receives a `401 Unauthorized` response, breaking observability. Driver-only nodes are unaffected (no auth middleware is added when `hasAdmin=false`). + +**Recommendation:** Chain `.AllowAnonymous()` on the `MapZbMetrics()` call site in `Program.cs`, or add it inside `MapOtOpcUaMetrics()` in `ObservabilityExtensions.cs`. The latter is consistent with how `MapOtOpcUaHealth` / `MapZbHealth` are structured. + +**Resolution:** Fixed 2026-06-19. Added `.AllowAnonymous()` to `MapOtOpcUaMetrics()` in `ObservabilityExtensions.cs` (verified by build; no unit-test harness for Host). Aligns the metrics endpoint with the health endpoints, which have always called `AllowAnonymous()` via `ZbHealthEndpointExtensions`. + +--- + +### Host-002 + +| Field | Value | +|---|---| +| Severity | Medium | +| Category | Security | +| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs:163–180`, `src/Server/ZB.MOM.WW.OtOpcUa.Host/Configuration/LdapOptionsValidator.cs` | +| Status | Resolved | + +**Description:** `AddValidatedOptions` is registered only inside the `if (hasDriver)` block. On admin-only nodes (`hasAdmin=true`, `hasDriver=false`) `AddOtOpcUaAuth` registers `IOptions` via a plain `.Bind()` call with no validator attached. This means the fail-fast startup guard — including the insecure-transport check (`Transport == None && !AllowInsecure → reject`) — is silently skipped on admin-only deployments. A production admin-only node with a misconfigured or plaintext LDAP section boots without error and the misconfiguration only surfaces as a login failure at runtime. + +**Recommendation:** Move `AddValidatedOptions` above both `if (hasAdmin)` and `if (hasDriver)` so it runs unconditionally. The validator's own guard (`if (!options.Enabled || options.DevStubMode) return;`) already makes it a safe no-op when LDAP is disabled or in dev-stub mode. + +**Resolution:** Fixed 2026-06-19. Moved `AddValidatedOptions` above both role blocks in `Program.cs`, making it unconditional. The validator's own early-return guard keeps it safe on disabled/DevStub configs. Verified by build (no unit-test harness for Host). + +--- + +### Host-003 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Design-document adherence | +| Location | `docs/ServiceHosting.md` (section "Per-role configuration overlays") | +| Status | Open | + +**Description:** `docs/ServiceHosting.md` states the configuration loading order as "base `appsettings.json` → role overlay (`appsettings.{role}.json`) → environment overlay (`appsettings.{Environment}.json`) — later layers win." This is incorrect. The actual order established by `Program.cs:59–71` is: `appsettings.json` → `appsettings.{Environment}.json` (WebApplicationBuilder default) → `appsettings.{role}.json` (appended by Program.cs) → environment variables (re-appended) → command-line args (re-appended). The role overlay therefore **wins over** `appsettings.{Environment}.json`, not the other way around. The code behaviour is correct and intentional (explained by the comment at Program.cs:66–70); only the doc is wrong. + +**Recommendation:** Update `docs/ServiceHosting.md` to reflect the actual precedence: `appsettings.json` < `appsettings.{Environment}.json` < `appsettings.{role}.json` < environment variables < command-line args. Note that the role overlay intentionally outranks the environment-specific JSON so role-level security defaults cannot be overridden by a developer's local `appsettings.Development.json`, while environment variables and command-line args still outrank everything. Docs-only change; no src change needed. + +**Resolution:** _(open — docs/ edit outside this pass's src-only scope; no code change required)_ + +--- + +### Host-004 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Documentation & comments | +| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs:41–42` | +| Status | Resolved | + +**Description:** The XML doc on `MapOtOpcUaMetrics` said "the default leaves it unauthenticated for local Prometheus scrapes." Before the Host-001 fix this was factually wrong (the endpoint was auth-gated on admin nodes). The phrasing also implied intent to restrict to local use, which is misleading for production deployments where a Prometheus instance scrapes across the network. + +**Recommendation:** Update the comment to make the `AllowAnonymous` behaviour explicit and drop the "local" qualifier, since the endpoint is intentionally open to any scraper. + +**Resolution:** Fixed 2026-06-19 as part of the Host-001 fix. Comment updated alongside the `AllowAnonymous()` addition. Verified by build (no unit-test harness for Host). diff --git a/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs b/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs index 34f0d665..9fb7572e 100644 --- a/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs +++ b/src/Server/ZB.MOM.WW.OtOpcUa.Host/Observability/ObservabilityExtensions.cs @@ -37,14 +37,16 @@ public static class ObservabilityExtensions } /// - /// Mounts the Prometheus scrape endpoint on the existing ASP.NET pipeline. Call after - /// app.UseAuthentication/UseAuthorization if metrics access should require auth; - /// the default leaves it unauthenticated for local Prometheus scrapes. + /// Mounts the Prometheus /metrics scrape endpoint on the existing ASP.NET pipeline. + /// The endpoint is explicitly marked AllowAnonymous so unauthenticated Prometheus + /// scrapers can reach it regardless of the host's auth fallback policy (which on admin-role + /// nodes is RequireAuthenticatedUser). This mirrors the behaviour of + /// MapZbHealth, which also marks its endpoints anonymous. /// /// The endpoint route builder. public static IEndpointRouteBuilder MapOtOpcUaMetrics(this IEndpointRouteBuilder app) { - app.MapZbMetrics(); + app.MapZbMetrics().AllowAnonymous(); return app; } } diff --git a/src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs b/src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs index 92974dd1..256777e9 100644 --- a/src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs +++ b/src/Server/ZB.MOM.WW.OtOpcUa.Host/Program.cs @@ -82,6 +82,13 @@ builder.AddZbSerilog(o => o.ServiceName = "otopcua"); builder.Services.AddOtOpcUaConfigDb(builder.Configuration); builder.Services.AddOtOpcUaCluster(builder.Configuration); +// Validate LdapOptions unconditionally so ANY role node (admin-only, driver-only, or fused) +// fails fast at startup on a misconfigured or insecure-transport LDAP section. +// The validator's own guard (if !Enabled || DevStubMode) keeps it a safe no-op on disabled/dev +// configs. Previously this was inside the hasDriver block, which left admin-only nodes without +// the startup check (AddOtOpcUaAuth binds LdapOptions but does not attach a validator). +builder.Services.AddValidatedOptions(builder.Configuration, LdapOptions.SectionName); + if (hasDriver) { builder.Services.AddOtOpcUaRuntime(); @@ -160,7 +167,6 @@ if (hasDriver) ScriptRootLoggerFactory.Build( sp.GetRequiredService(), scriptLogFilePath, scriptLogTopicMinLevel, Serilog.Log.Logger))); - builder.Services.AddValidatedOptions(builder.Configuration, LdapOptions.SectionName); // TryAdd so a fused admin+driver node (where AddOtOpcUaAuth also registers these) ends up // with exactly one descriptor; on a driver-only node these are the sole registrations. // OtOpcUaLdapAuthService is the app ILdapAuthService (Enabled switch + DevStubMode over the @@ -168,6 +174,8 @@ if (hasDriver) // per call to turn the directory's groups into roles, so register it here for driver- // only nodes (AddOtOpcUaAuth registers it on admin nodes); ILdapGroupRoleMappingService it // depends on is already registered unconditionally by AddOtOpcUaConfigDb above. + // Note: AddValidatedOptions is now registered unconditionally + // above both role blocks so admin-only nodes also get fail-fast LDAP startup validation. builder.Services.TryAddSingleton(); builder.Services.TryAddScoped, OtOpcUaGroupRoleMapper>(); builder.Services.AddSingleton();