Phase 6.1 Stream C - Health endpoints on :4841 + structured logging + Serilog JSON sink #80

Merged
dohertj2 merged 1 commits from phase-6-1-stream-c-health-logging into v2 2026-04-19 08:17:50 -04:00

Closes Stream C per docs/v2/implementation/phase-6-1-resilience-and-observability.md.

Summary

  • C.1 — HealthEndpointsHost on http://localhost:4841/ (HttpListener; loopback avoids URL-ACL elevation). /healthz reports process uptime + config DB reachable-or-cached; /readyz aggregates per-driver state via new DriverHealthReport pure-function service. State matrix: Healthy/Degraded → 200, NotReady/Faulted → 503. Loopback prefix documented; remote probing expects a reverse proxy.
  • C.2 — LogContextEnricher in Core.Observability pushes DriverInstanceId + DriverType + CapabilityName + CorrelationId onto Serilog LogContext. CapabilityInvoker now wraps every ExecuteAsync / ExecuteWriteAsync call site in the enricher scope so inner logs emit the fields automatically.
  • C.3 — Serilog pipeline in Program.cs adds Enrich.FromLogContext + opt-in JSON file sink via Serilog:WriteJson = true appsetting using Serilog.Formatting.Compact.CompactJsonFormatter. One JSON object per line — SIEMs ingest without a regex parser.
  • C.4 — Integration tests assert structured fields land: CapabilityInvokerEnrichmentTests (2) + LogContextEnricherTests (8). End-to-end integration via the invoker confirms context doesn’t leak outside the call site.

Test plan

  • 27 new tests: 9 HealthEndpointsHost (empty / Healthy / Faulted / Degraded / Initializing / stale-config / 404), 8 DriverHealthReport (aggregation + HttpStatus), 8 LogContextEnricher, 2 CapabilityInvoker enrichment.
  • Full solution dotnet test: 1016 passing (baseline 906, +110 for Phase 6.1 so far).
  • Existing 3 OpcUaApplicationHost tests updated to HealthEndpointsEnabled=false to avoid :4841 collision under parallel test execution.

🤖 Generated with Claude Code

dohertj2 added 1 commit 2026-04-19 08:17:39 -04:00
Closes Stream C per docs/v2/implementation/phase-6-1-resilience-and-observability.md.

Core.Observability (new namespace):
- DriverHealthReport — pure-function aggregation over DriverHealthSnapshot list.
  Empty fleet = Healthy. Any Faulted = Faulted. Any Unknown/Initializing (no
  Faulted) = NotReady. Any Degraded or Reconnecting (no Faulted, no NotReady)
  = Degraded. Else Healthy. HttpStatus(verdict) maps to the Stream C.1 state
  matrix: Healthy/Degraded → 200, NotReady/Faulted → 503.
- LogContextEnricher — Serilog LogContext wrapper. Push(id, type, capability,
  correlationId) returns an IDisposable scope; inner log calls carry
  DriverInstanceId / DriverType / CapabilityName / CorrelationId structured
  properties automatically. NewCorrelationId = 12-hex-char GUID slice for
  cases where no OPC UA RequestHeader.RequestHandle is in flight.
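
The aggregation precedence above can be sketched as follows. This is an illustrative sketch, not the merged code: the enum members, the record shape, and the name DriverHealthReportSketch are assumptions inferred from this commit message.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: the real DriverHealthSnapshot and state enum are not shown
// in this PR, so these shapes are assumptions.
public enum DriverState { Unknown, Initializing, Healthy, Degraded, Reconnecting, Faulted }

public enum ReadinessVerdict { Healthy, Degraded, NotReady, Faulted }

public sealed record DriverHealthSnapshot(string DriverInstanceId, DriverState State);

public static class DriverHealthReportSketch
{
    // Precedence: Faulted > NotReady > Degraded > Healthy.
    // An empty fleet matches none of the checks and falls through to Healthy.
    public static ReadinessVerdict Aggregate(IReadOnlyList<DriverHealthSnapshot> drivers)
    {
        if (drivers.Any(d => d.State == DriverState.Faulted))
            return ReadinessVerdict.Faulted;
        if (drivers.Any(d => d.State is DriverState.Unknown or DriverState.Initializing))
            return ReadinessVerdict.NotReady;
        if (drivers.Any(d => d.State is DriverState.Degraded or DriverState.Reconnecting))
            return ReadinessVerdict.Degraded;
        return ReadinessVerdict.Healthy;
    }

    // Stream C.1 state matrix: Healthy/Degraded -> 200, NotReady/Faulted -> 503.
    public static int HttpStatus(ReadinessVerdict verdict) => verdict switch
    {
        ReadinessVerdict.Healthy or ReadinessVerdict.Degraded => 200,
        _ => 503,
    };
}
```

Keeping the aggregation free of I/O is what lets the tests cover the whole precedence table without starting a listener.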

CapabilityInvoker — now threads LogContextEnricher around every ExecuteAsync /
ExecuteWriteAsync call site. OtOpcUaServer passes driver.DriverType through
so logs correlate to the driver type too. Every capability call emits
structured fields per the Stream C.4 compliance check.

Server.Observability:
- HealthEndpointsHost — standalone HttpListener on http://localhost:4841/
  (loopback avoids Windows URL-ACL elevation; remote probing via reverse
  proxy or explicit netsh urlacl grant). Routes:
    /healthz → 200 when (configDbReachable OR usingStaleConfig); 503 otherwise.
      Body: status, uptimeSeconds, configDbReachable, usingStaleConfig.
    /readyz  → DriverHealthReport.Aggregate + HttpStatus mapping.
      Body: verdict, drivers[], degradedDrivers[], uptimeSeconds.
    anything else → 404.
  Disposal cooperates with the HttpListener shutdown.
- OpcUaApplicationHost starts the health host after the OPC UA server comes up
  and disposes it on shutdown. New OpcUaServerOptions knobs:
  HealthEndpointsEnabled (default true), HealthEndpointsPrefix (default
  http://localhost:4841/).
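
A minimal sketch of the route table above, assuming a synchronous HttpListener loop. HealthRoutingSketch, the "healthy"/"unhealthy" status strings, and the 404 body are hypothetical; only the routes, status rules, and /healthz body field names come from this commit message.

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.Json;

public static class HealthRoutingSketch
{
    // /healthz: 200 when the config DB is reachable OR a stale cached config is in use.
    public static (int StatusCode, string Body) Healthz(
        bool configDbReachable, bool usingStaleConfig, TimeSpan uptime)
    {
        var ok = configDbReachable || usingStaleConfig;
        var body = JsonSerializer.Serialize(new
        {
            status = ok ? "healthy" : "unhealthy", // assumed wording
            uptimeSeconds = (long)uptime.TotalSeconds,
            configDbReachable,
            usingStaleConfig,
        });
        return (ok ? 200 : 503, body);
    }

    // Path dispatch: two known routes, anything else is 404.
    public static (int StatusCode, string Body) Route(
        string path,
        Func<(int StatusCode, string Body)> healthz,
        Func<(int StatusCode, string Body)> readyz) => path switch
    {
        "/healthz" => healthz(),
        "/readyz" => readyz(),
        _ => (404, "{\"error\":\"not found\"}"), // assumed body
    };

    // Handles a single request on an already-started listener.
    public static void ServeOnce(
        HttpListener listener,
        Func<(int StatusCode, string Body)> healthz,
        Func<(int StatusCode, string Body)> readyz)
    {
        var ctx = listener.GetContext();
        var path = ctx.Request.Url?.AbsolutePath ?? "/";
        var (status, body) = Route(path, healthz, readyz);

        var bytes = Encoding.UTF8.GetBytes(body);
        ctx.Response.StatusCode = status;
        ctx.Response.ContentType = "application/json";
        ctx.Response.ContentLength64 = bytes.Length;
        ctx.Response.OutputStream.Write(bytes, 0, bytes.Length);
        ctx.Response.Close();
    }
}
```

A caller would add the prefix http://localhost:4841/ to listener.Prefixes, call Start(), and loop over ServeOnce until shutdown. The readyz delegate would typically wrap DriverHealthReport.Aggregate and its HttpStatus mapping.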

Program.cs:
- Serilog pipeline adds Enrich.FromLogContext + opt-in JSON file sink via
  `Serilog:WriteJson = true` appsetting. Uses Serilog.Formatting.Compact's
  CompactJsonFormatter (one JSON object per line — SIEMs like Splunk,
  Datadog, Graylog ingest without a regex parser).
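
For illustration, the pipeline plus enricher scope might look roughly like this. It is a sketch that assumes the Serilog, Serilog.Sinks.File, and Serilog.Formatting.Compact packages are referenced. LoggingPipelineSketch, the writeJson and jsonPath parameters, and the sample property values are hypothetical; the real Program.cs reads the flag from the Serilog:WriteJson appsetting.

```csharp
using System;
using Serilog;
using Serilog.Context;
using Serilog.Formatting.Compact;

public static class LoggingPipelineSketch
{
    // writeJson stands in for the Serilog:WriteJson appsetting;
    // jsonPath is a hypothetical output location.
    public static Serilog.Core.Logger Build(bool writeJson, string jsonPath)
    {
        var config = new LoggerConfiguration()
            .Enrich.FromLogContext(); // copies LogContext properties onto each event

        if (writeJson)
        {
            // CompactJsonFormatter emits one JSON object per line.
            config = config.WriteTo.File(new CompactJsonFormatter(), jsonPath);
        }

        return config.CreateLogger();
    }

    // Mimics the enricher scope: the properties exist only inside the using blocks.
    public static void LogOneCall(ILogger log)
    {
        // Assumed reading of "12-hex-char GUID slice".
        var correlationId = Guid.NewGuid().ToString("N")[..12];

        using (LogContext.PushProperty("DriverInstanceId", "drv-1"))      // sample value
        using (LogContext.PushProperty("DriverType", "SampleDriver"))     // sample value
        using (LogContext.PushProperty("CapabilityName", "Read"))         // sample value
        using (LogContext.PushProperty("CorrelationId", correlationId))
        {
            log.Information("Capability call started");
        }

        // Outside the scope the four properties are no longer attached.
        log.Information("Outside the call site");
    }
}
```

With the JSON sink enabled, the first event should carry the four fields as top-level JSON properties and the second should not, which mirrors the no-leak check in CapabilityInvokerEnrichmentTests.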

Server.Tests:
- Existing 3 OpcUaApplicationHost integration tests now set
  HealthEndpointsEnabled=false to avoid port :4841 collisions under parallel
  execution.
- New HealthEndpointsHostTests (9): /healthz healthy empty fleet; stale-config
  returns 200 with flag; unreachable+no-cache returns 503; /readyz empty/
  Healthy/Faulted/Degraded/Initializing drivers return correct status and
  bodies; unknown path → 404. Uses ephemeral ports via Interlocked counter.

Core.Tests:
- DriverHealthReportTests (8): empty fleet, all-healthy, any-Faulted trumps,
  any-NotReady without Faulted, Degraded without Faulted/NotReady, HttpStatus
  per-verdict theory.
- LogContextEnricherTests (8): all 4 properties attach; scope disposes cleanly;
  NewCorrelationId shape; null/whitespace driverInstanceId throws.
- CapabilityInvokerEnrichmentTests (2): inner logs carry structured
  properties; no context leak outside the call site.

Full solution dotnet test: 1016 passing (baseline 906, +110 for Phase 6.1 so
far across Streams A+B+C). Pre-existing Client.CLI Subscribe flake unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 merged commit ff4a74a81f into v2 2026-04-19 08:17:50 -04:00

Reference: dohertj2/lmxopcua#80