scadaproj/components/health/current-state/mxaccessgw/CURRENT-STATE.md
Joseph Doherty 3d25ee5090 docs(health): current-state x3 + GAPS + README
Code-verified current-state docs for OtOpcUa (three-tier full), ScadaBridge
(two-tier, no /healthz), and MxAccessGateway (bare liveness only / no probes).
GAPS backlog with P1 for MxGateway and convergence items for Akka status policy,
DB probe technique, and response writer. README with per-project status table.
2026-06-01 06:23:53 -04:00


Health — current state: MxAccessGateway

Repo: ~/Desktop/MxAccessGateway. Stack: .NET 10 gateway (x64) + .NET 4.8 worker (x86), gRPC; solution src/MxGateway.sln. Health code lives in src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs. All paths relative to repo root. Verified 2026-06-01.

Summary: bare liveness only. MxAccessGateway has a single /health/live endpoint that returns a hardcoded GatewayHealthReply JSON object. AddHealthChecks() is called at startup but is entirely unused — no IHealthCheck implementations are registered, MapHealthChecks is never called, and there is no readiness or active-node tier. The net48 x86 worker process has no HTTP server and therefore no health endpoint of any kind.

1. Endpoint wiring

src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs:

  • :61: builder.Services.AddHealthChecks() is called in the DI registration block. This call is dead: no .AddCheck<T>() call follows it, and MapHealthChecks is never called. The framework registers the health-check infrastructure but nothing is wired through it.

  • :139-145: MapGatewayEndpoints maps a raw endpoints.MapGet("/health/live", ...) (not MapHealthChecks). The handler is an inline lambda that returns Results.Ok(new GatewayHealthReply(...)) with a hardcoded Status: "Healthy":

    endpoints.MapGet(
        "/health/live",
        () => Results.Ok(new GatewayHealthReply(
            Status: "Healthy",
            DefaultBackend: GatewayContractInfo.DefaultBackendName,
            WorkerProtocolVersion: GatewayContractInfo.WorkerProtocolVersion)))
        .WithName("LiveHealth");
    

This endpoint always returns HTTP 200 {"Status":"Healthy",...} as long as the process is alive. It carries no authentication requirement (no [Authorize] or .RequireAuthorization()).

2. Response shape

GatewayHealthReply is a record with three fields:

  • Status — always "Healthy" (hardcoded string, not the ASP.NET Core HealthStatus enum)
  • DefaultBackend — value of GatewayContractInfo.DefaultBackendName (the configured backend name, useful for confirming which gateway instance a probe hit)
  • WorkerProtocolVersion — value of GatewayContractInfo.WorkerProtocolVersion (the gRPC protocol version the gateway expects from the worker, useful for version-skew detection)

The response is not HealthChecks.UI.Client JSON and is not the standard ASP.NET Core health response shape. It is a bespoke JSON record.

3. Probes

None. There is no IHealthCheck registered. The /health/live response does not reflect:

  • Whether the SQLite auth-store is reachable
  • Whether any active MXAccess session is functional
  • Whether the x86 worker named-pipe IPC is connected or the worker process is alive
  • Whether the gRPC service is actually accepting calls

The endpoint is purely a process liveness indicator.
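
For illustration, a probe covering the worker-IPC item above could look like the sketch below. It is a minimal example under stated assumptions, not code from the repository: the class name WorkerIpcHealthCheck and the isWorkerConnected delegate are hypothetical, and how the gateway actually tracks pipe state is not documented here. Only IHealthCheck, HealthCheckContext, and HealthCheckResult are real ASP.NET Core types.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical probe: reports whether the gateway currently considers the
// x86 worker's named-pipe channel connected.
public sealed class WorkerIpcHealthCheck : IHealthCheck
{
    // Assumption: some gateway component can answer "is the pipe connected?".
    // The delegate stands in for that component.
    private readonly Func<bool> _isWorkerConnected;

    public WorkerIpcHealthCheck(Func<bool> isWorkerConnected)
    {
        _isWorkerConnected = isWorkerConnected
            ?? throw new ArgumentNullException(nameof(isWorkerConnected));
    }

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        // Use the registration's failure status so the caller decides
        // whether a disconnected worker is Degraded or Unhealthy.
        HealthCheckResult result = _isWorkerConnected()
            ? HealthCheckResult.Healthy("Worker pipe connected.")
            : new HealthCheckResult(
                context.Registration.FailureStatus,
                "Worker pipe not connected.");

        return Task.FromResult(result);
    }
}
```

Returning context.Registration.FailureStatus rather than a fixed Unhealthy keeps the severity decision at the registration site, which is where tags such as "ready" would also be assigned.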

4. Tier coverage

| Tier | Endpoint | Status |
| --- | --- | --- |
| Process liveness | /health/live (raw MapGet) | present (but non-standard) |
| Readiness | /health/ready | absent |
| Active node | /health/active | absent (not Akka-based; not applicable as-is) |
| healthz convention | /healthz | absent |

MxAccessGateway is not an Akka.NET application — it has no cluster, no leader election, and no active-node concept. The "active" tier in the shared spec translates here to "is the worker process connected and the gRPC service ready to accept calls?" rather than cluster leadership.

5. x86 worker

ZB.MOM.WW.MxGateway.Worker is a .NET 4.8 console application communicating with the gateway over Windows named-pipe IPC. It has no HTTP server, no health endpoint, and no exposure to any probe mechanism. Its liveness must be inferred indirectly — either via the gateway process monitoring it (not currently implemented) or via the GrpcDependencyHealthCheck the gateway could use to probe the IPC channel.

6. Notable gaps

  • AddHealthChecks() at :61 is dead code. No IHealthCheck is ever registered via this call.
  • /health/live uses MapGet (a raw minimal-API handler) rather than MapHealthChecks. It bypasses the ASP.NET Core health-check middleware entirely, which means it does not participate in the standard health-check pipeline (no IHealthCheckPublisher, no HealthReport, no UI integration).
  • The hardcoded "Healthy" status means the endpoint cannot reflect real probe results even if probes were added later — the handler must be replaced, not just supplemented.
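  • No readiness gating: orchestrators (Kubernetes, Traefik) that rely on /health/ready returning 503 until the process is actually ready get no such signal from MxAccessGateway today. The /health/ready route is absent, so a probe pointed at it receives 404, and a probe pointed at /health/live receives an unconditional 200.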

Adoption plan → ZB.MOM.WW.Health

Replace /health/live + wire the shared tiers:

The AddHealthChecks() call at GatewayApplication.cs:61 is already present; it only needs probes registered against it. The raw MapGet("/health/live", ...) handler at :139-145 must be removed and replaced with app.MapZbHealth() from ZB.MOM.WW.Health.

Steps:

  1. Remove the inline MapGet("/health/live", ...) lambda (:139-145). The GatewayHealthReply record and its DefaultBackend/WorkerProtocolVersion metadata can be surfaced differently (e.g., a /info endpoint or as custom data on the health response).
  2. Register a GrpcDependencyHealthCheck (from ZB.MOM.WW.Health) that probes the named-pipe IPC channel to the x86 worker. Tag ["ready"]. This replaces the hardcoded liveness-only response with a real probe that reflects whether the worker is reachable.
  3. Optionally add a GrpcDependencyHealthCheck for any downstream gRPC dependency (e.g., the Galaxy Repository connection) if the gateway is expected to be healthy only when its upstreams are reachable. Tag ["ready"].
  4. Call app.MapZbHealth(). This maps /health/ready (tag ready), /health/active (tag active; initially empty, since MxGateway has no active-node concept), and /healthz (bare liveness). The /healthz endpoint takes over the semantic role that /health/live serves today.
  5. Do not add ZB.MOM.WW.Health.Akka — MxAccessGateway has no Akka dependency. The consumer matrix in the design specifies MxGateway uses the core package only.
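
The tier layout these steps describe can be approximated with the built-in ASP.NET Core health-check API alone. The sketch below is illustrative only: the signature and behavior of MapZbHealth and GrpcDependencyHealthCheck are defined in ZB.MOM.WW.Health and are not shown in this document, so neither is used here. The isWorkerConnected delegate and the check name "worker-ipc" are hypothetical stand-ins.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Hypothetical stand-in for whatever tracks the worker IPC channel.
Func<bool> isWorkerConnected = () => false;

// Register one probe and tag it "ready" so only the readiness tier runs it.
builder.Services.AddHealthChecks()
    .AddCheck(
        "worker-ipc",
        () => isWorkerConnected()
            ? HealthCheckResult.Healthy("Worker reachable.")
            : HealthCheckResult.Unhealthy("Worker not reachable."),
        tags: new[] { "ready" });

var app = builder.Build();

// Readiness: runs checks tagged "ready"; Unhealthy maps to 503 by default.
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("ready")
});

// Active tier: no checks carry the "active" tag, so the report is empty
// and therefore Healthy.
app.MapHealthChecks("/health/active", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("active")
});

// Bare liveness: the predicate excludes every check, so a 200 only means
// the process is serving requests.
app.MapHealthChecks("/healthz", new HealthCheckOptions
{
    Predicate = _ => false
});

app.Run();
```

With the stand-in delegate returning false, /health/ready responds 503 while /healthz and /health/active respond 200, which is the gating behavior orchestrators expect from a readiness tier.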

Keep bespoke:

  • The WorkerProtocolVersion / DefaultBackend metadata from GatewayHealthReply is MxAccessGateway-specific; keep it as a separate /info endpoint or embed it as Data on a custom probe rather than normalizing it into the shared contract.
  • The x86 worker itself (net48 console, named-pipe IPC, no HTTP) remains outside the shared health scheme. The GrpcDependencyHealthCheck observes the worker indirectly from the gateway side.
  • Per-gateway auth and TLS concerns on who may call health endpoints remain per-project.
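
One way to keep the gateway-specific metadata separate is a small /info endpoint, sketched below. Everything here is an assumption for illustration: the GatewayInfoReply record and the "GatewayInfo" route name are hypothetical, the string field types are guessed, and the GatewayContractInfo class is a placeholder stub standing in for the real one in the repository.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Metadata-only endpoint: it reports identity and version, never health.
app.MapGet(
        "/info",
        () => Results.Ok(new GatewayInfoReply(
            DefaultBackend: GatewayContractInfo.DefaultBackendName,
            WorkerProtocolVersion: GatewayContractInfo.WorkerProtocolVersion)))
    .WithName("GatewayInfo");

app.Run();

// Hypothetical reply type; field types are assumed to be strings.
public sealed record GatewayInfoReply(
    string DefaultBackend,
    string WorkerProtocolVersion);

// Placeholder stub so the sketch compiles on its own. The real class
// lives in the gateway code base and its values are not reproduced here.
public static class GatewayContractInfo
{
    public const string DefaultBackendName = "placeholder-backend";
    public const string WorkerProtocolVersion = "placeholder-version";
}
```

Separating this from the health tiers means the shared contract stays uniform across projects while a probe operator can still confirm which gateway instance and protocol version answered.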

Adoption is a follow-on task (tracked in GAPS.md), not part of the ZB.MOM.WW.Health library build. MxGateway is the highest-priority adopter (P1 gap — no probes/tiers today) and should be the first app wired up once the nupkg is available.