fix(server): resolve Medium code-review finding (Server-007)

Add a configDbHealthy parameter to OpcUaApplicationHost and wire a
DbHealthCache (CanConnectAsync result cached for 10 s) in Program.cs, so
/healthz reflects real config-DB reachability instead of the previous
always-true default. /healthz now returns 503 on a DB outage unless the
stale-config cache is warm.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Joseph Doherty
2026-05-22 10:59:08 -04:00
parent 8e8199752f
commit fccb529d5f
3 changed files with 59 additions and 4 deletions

@@ -116,13 +116,13 @@
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs:179-183` |
| Status | Open |
| Status | Resolved |
**Description:** `HealthEndpointsHost` is built without a `configDbHealthy` delegate, so the default `() => true` is used — `/healthz` always reports `configDbReachable = true` and never 503s on a DB outage. `_staleConfigFlag` is also never supplied by `Program.cs`, so the stale-config signal is inert too. `/healthz` degenerates to a pure liveness probe; operators get a false-healthy during a DB outage.
**Recommendation:** Wire a real config-DB probe (cheap cached `SELECT 1`) into `HealthEndpointsHost`, and register `StaleConfigFlag` in `Program.cs`. Or move DB health to `/readyz` and drop the misleading `configDbReachable` field.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-22: added a `Func<bool>? configDbHealthy` parameter to `OpcUaApplicationHost` (defaults to null, so existing callers are unaffected). `Program.cs` constructs a `DbHealthCache` that calls `CanConnectAsync` every 10 s and caches the result, then passes `() => dbHealthCache.IsHealthy`. `/healthz` now reflects real DB reachability and returns 503 on a DB outage (unless the stale-config cache is warm).
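
For illustration only, a cached probe of the kind described in the resolution might look roughly like the sketch below. This is not the committed `DbHealthCache`: the constructor shape, the background refresh loop, the `initiallyHealthy` default, and the use of `PeriodicTimer` (.NET 6+) are all assumptions. The probe delegate stands in for whatever `CanConnectAsync` call `Program.cs` actually makes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; the real DbHealthCache in this commit may differ.
public sealed class DbHealthCache : IDisposable
{
    private readonly Func<CancellationToken, Task<bool>> _probe;
    private readonly CancellationTokenSource _cts = new();
    private volatile bool _isHealthy;

    // 'initiallyHealthy' is an assumed parameter: the value reported
    // before the first probe has completed.
    public DbHealthCache(
        Func<CancellationToken, Task<bool>> probe,
        TimeSpan interval,
        bool initiallyHealthy = true)
    {
        _probe = probe;
        _isHealthy = initiallyHealthy;
        _ = Task.Run(() => RefreshLoopAsync(interval, _cts.Token));
    }

    // Cheap synchronous read, suitable for a Func<bool> delegate.
    public bool IsHealthy => _isHealthy;

    private async Task RefreshLoopAsync(TimeSpan interval, CancellationToken ct)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            do
            {
                try
                {
                    _isHealthy = await _probe(ct).ConfigureAwait(false);
                }
                catch (Exception) when (!ct.IsCancellationRequested)
                {
                    // A throwing probe is treated as "unreachable".
                    _isHealthy = false;
                }
            }
            while (await timer.WaitForNextTickAsync(ct).ConfigureAwait(false));
        }
        catch (OperationCanceledException)
        {
            // Disposed; stop refreshing.
        }
    }

    public void Dispose() => _cts.Cancel();
}
```

Under these assumptions, the wiring would pass a probe and a 10-second interval, for example `new DbHealthCache(ct => probeAsync(ct), TimeSpan.FromSeconds(10))` with `probeAsync` as a placeholder name, and then hand `() => dbHealthCache.IsHealthy` to the host. Reading a cached flag keeps `/healthz` from opening a DB connection on every request, at the cost of reporting state that can be up to one interval old.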
### Server-008
| Field | Value |