docs(health): spec + ZB.MOM.WW.Health shared contract
Authors components/health/spec/SPEC.md (normalized three-tier endpoint convention, probe catalog, response-writer contract, migration notes) and components/health/shared-contract/ZB.MOM.WW.Health.md (paper API for the 3-package library: core, Akka, EntityFrameworkCore).
@@ -0,0 +1,238 @@
# Proposed shared library: `ZB.MOM.WW.Health`

A contract on paper: the public surface to extract so the three projects stop re-implementing
health-check tiers, probe logic, and the active-node gating seam. It realizes
[`../spec/SPEC.md`](../spec/SPEC.md). **Not yet created.** Reference implementations already
exist: OtOpcUa `Health/` (three-tier + probes), ScadaBridge `Health/` (inline probes +
`ActiveNodeGate`).

## Packages (.NET 10)

```
ZB.MOM.WW.Health                      # core: tier convention, response writer, IActiveNodeGate, GrpcDependencyHealthCheck
ZB.MOM.WW.Health.Akka                 # AkkaClusterHealthCheck, ActiveNodeHealthCheck, AkkaActiveNodeGate
ZB.MOM.WW.Health.EntityFrameworkCore  # DatabaseHealthCheck<TContext>
```

All three target .NET 10. The split keeps Akka.Cluster and EF Core out of MxGateway's dependency
graph: MxGateway pulls only the core package. The packages are published to the Gitea NuGet feed
and versioned with SemVer, in lockstep to start. The x86 net48 mxaccessgw worker has no HTTP
surface, so net48 multi-targeting is **not** required.

## Packaging & distribution

**Three NuGet packages, one DLL each**, on the Gitea NuGet feed. These are **libraries** linked
into each app; there is no central health service. Consumers reference only what they need:

| Package (→ DLL) | Transitive deps | MxGateway | OtOpcUa | ScadaBridge |
|---|---|---|---|---|
| `…Health` | `Microsoft.Extensions.Diagnostics.HealthChecks`, ASP.NET Core abstractions | ✅ | ✅ | ✅ |
| `…Health.Akka` | Akka.Cluster | — | ✅ | ✅ |
| `…Health.EntityFrameworkCore` | EF Core | — | ✅ | ✅ |

**Why MxGateway takes only core:** it is not Akka-based and does not use EF Core. The
`GrpcDependencyHealthCheck` in the core package covers its only probe need (worker channel
reachability), so it avoids the Akka and EF transitive trees entirely.

## `ZB.MOM.WW.Health`

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Grpc.Net.Client;

namespace ZB.MOM.WW.Health;

/// Canonical tag constants; use these when calling AddCheck(..., tags: [ZbHealthTags.Ready]).
public static class ZbHealthTags
{
    public const string Ready = "ready";
    public const string Active = "active";
    public const string Live = "live";
}

/// Options for MapZbHealth(). All paths and the response writer are overridable.
public sealed class ZbHealthEndpointOptions
{
    public string ReadyPath { get; set; } = "/health/ready";
    public string ActivePath { get; set; } = "/health/active";
    public string LivePath { get; set; } = "/healthz";

    /// Defaults to ZbHealthWriter.WriteJsonAsync.
    public Func<HttpContext, HealthReport, Task>? ResponseWriter { get; set; }
}

/// Extension that maps all three health tiers in one call.
public static class ZbHealthEndpointExtensions
{
    /// Maps /health/ready (tag "ready"), /health/active (tag "active"), /healthz (tag "live").
    /// Does NOT call services.AddHealthChecks(); the caller is responsible for probe registration.
    public static IEndpointConventionBuilder MapZbHealth(
        this IEndpointRouteBuilder endpoints,
        ZbHealthEndpointOptions? options = null);

    /// Maps /health/ready (tag "ready"), /health/active (tag "active"), /healthz (tag "live").
    public static IEndpointConventionBuilder MapZbHealth(
        this IEndpointRouteBuilder endpoints,
        Action<ZbHealthEndpointOptions> configure);
}

/// Canonical JSON response writer. Shape: { status, totalDurationMs, entries: { name: { status, description, duration } } }.
public static class ZbHealthWriter
{
    public static Task WriteJsonAsync(HttpContext context, HealthReport report);
}

/// Single-property seam: is this node the active/leader node?
/// Attach to route groups via RequireActiveNode(). Implement with AkkaActiveNodeGate (Health.Akka)
/// or a project-specific implementation for non-Akka nodes.
public interface IActiveNodeGate
{
    bool IsActiveNode { get; }
}

/// Route convention that returns 503 on standby nodes. DI-resolves IActiveNodeGate.
public static class ActiveNodeGateExtensions
{
    public static IEndpointConventionBuilder RequireActiveNode(
        this IEndpointConventionBuilder builder);
}

/// Checks that a downstream gRPC channel is reachable.
public sealed class GrpcDependencyHealthCheck : IHealthCheck
{
    public GrpcDependencyHealthCheck(GrpcChannel channel, GrpcDependencyOptions? options = null);

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default);
}

/// Options for GrpcDependencyHealthCheck.
public sealed class GrpcDependencyOptions
{
    /// Override the default probe (GrpcChannel.ConnectAsync).
    /// Return true = reachable, false = unreachable.
    public Func<GrpcChannel, CancellationToken, Task<bool>>? Probe { get; set; }

    /// Human-readable name of the dependency, used in the HealthCheckResult description.
    public string? DependencyName { get; set; }

    public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(5);
}
```

## `ZB.MOM.WW.Health.Akka`

```csharp
using Akka.Actor;
using Akka.Cluster;
using Microsoft.Extensions.Diagnostics.HealthChecks;

namespace ZB.MOM.WW.Health.Akka;

/// Checks the local node's Akka cluster membership status.
/// Register to tag ZbHealthTags.Ready.
public sealed class AkkaClusterHealthCheck : IHealthCheck
{
    public AkkaClusterHealthCheck(
        ActorSystem system,
        AkkaClusterStatusPolicy policy);

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default);
}

/// Maps Akka MemberStatus values to HealthStatus.
/// Two named presets cover the two existing implementations; construct a custom instance for
/// project-specific overrides.
public sealed class AkkaClusterStatusPolicy
{
    public AkkaClusterStatusPolicy(Func<MemberStatus, HealthStatus> evaluate);

    /// ScadaBridge origin: Up/Joining→Healthy, Leaving/Exiting→Degraded, else Unhealthy.
    /// Convergence target for all projects.
    public static AkkaClusterStatusPolicy Default { get; }

    /// OtOpcUa origin: self-Up-among-reachable-members→Healthy, else Degraded.
    /// Provided for backward compatibility during OtOpcUa migration.
    public static AkkaClusterStatusPolicy OtOpcUaCompat { get; }
}

/// Checks whether this node is the designated leader / active node.
/// Optional role parameter scopes the check to nodes carrying that role.
/// Register to tag ZbHealthTags.Active.
public sealed class ActiveNodeHealthCheck : IHealthCheck
{
    /// Role-less constructor: Healthy = node is Up AND cluster leader (ScadaBridge ActiveNode pattern).
    public ActiveNodeHealthCheck(ActorSystem system);

    /// Role-filtered constructor: Healthy = (node lacks the role) OR (node carries role AND is role-singleton leader).
    /// Degraded = node carries role but is not the role-singleton leader (OtOpcUa AdminRoleLeader pattern).
    public ActiveNodeHealthCheck(ActorSystem system, string role);

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default);
}

/// IActiveNodeGate implementation backed by ActiveNodeHealthCheck.
/// Register as a singleton; resolves ActiveNodeHealthCheck from DI.
public sealed class AkkaActiveNodeGate : IActiveNodeGate
{
    public AkkaActiveNodeGate(ActiveNodeHealthCheck check);

    public bool IsActiveNode { get; }
}
```

## `ZB.MOM.WW.Health.EntityFrameworkCore`

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;

namespace ZB.MOM.WW.Health.EntityFrameworkCore;

/// Checks database reachability via an EF Core DbContext.
/// Default probe: context.Database.CanConnectAsync() (ScadaBridge pattern).
/// Supply a custom probe delegate for query-based validation (OtOpcUa "query Deployments" pattern).
/// Register to tag ZbHealthTags.Ready.
public sealed class DatabaseHealthCheck<TContext> : IHealthCheck
    where TContext : DbContext
{
    public DatabaseHealthCheck(
        IDbContextFactory<TContext> factory,
        DatabaseHealthCheckOptions<TContext>? options = null);

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default);
}

/// Options for DatabaseHealthCheck<TContext>.
public sealed class DatabaseHealthCheckOptions<TContext>
    where TContext : DbContext
{
    /// Override the default CanConnectAsync() probe.
    /// Throw to signal failure; return normally to signal success.
    public Func<TContext, CancellationToken, Task>? Probe { get; set; }

    public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(10);
}
```

## Consumer matrix summary

| Consumer | Packages | Notes |
|---|---|---|
| **MxGateway** | `ZB.MOM.WW.Health` (core only) | `GrpcDependencyHealthCheck` on the worker channel; all three tiers via `MapZbHealth()`; `IActiveNodeGate` not needed (not Akka-based) |
| **OtOpcUa** | All three | `AkkaClusterHealthCheck` + `OtOpcUaCompat` preset → `Default` on convergence; `ActiveNodeHealthCheck(role: "admin")`; `DatabaseHealthCheck<T>` with custom probe delegate |
| **ScadaBridge** | All three | `AkkaClusterHealthCheck` + `Default` policy; `ActiveNodeHealthCheck` (role-less); `DatabaseHealthCheck<T>` default probe; `AkkaActiveNodeGate` replaces inline `ActiveNodeGate` |

## Open contract questions

1. **`IActiveNodeGate` for non-Akka nodes:** MxGateway does not need active-node gating today.
   If a future MxGateway cluster requires it, the interface is in the core package and can be
   implemented without an Akka dependency. Validate whether a stub `AlwaysActiveGate` (returns
   `true`) should ship in core for single-node deployments.
2. **DI helpers:** decide whether `services.AddZbHealthChecks()` (a DI-registered convenience
   that pre-registers gRPC + DB + Akka probes via options) is worth adding, or whether explicit
   `services.AddHealthChecks().AddCheck<...>()` calls per project are clearer. The spec currently
   leaves probe registration entirely per-project.
3. **`AkkaActiveNodeGate` caching:** `IsActiveNode` is a synchronous property; the underlying
   `ActiveNodeHealthCheck.CheckHealthAsync` is async. Validate whether the gate should cache the
   last probe result on a short TTL (e.g. 5 s) or drive a background refresh, to avoid blocking
   synchronous callers.
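
To make question 3 concrete, here is a minimal sketch of the background-refresh option. It is one possible shape, not a decision: `CachedActiveNodeGateSketch` is a hypothetical name, the probe delegate stands in for whatever wraps `ActiveNodeHealthCheck.CheckHealthAsync`, and failing closed (treating a probe error as "not active") is an assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: caches the last probe result so IsActiveNode never blocks.
public sealed class CachedActiveNodeGateSketch : IDisposable
{
    private readonly Func<CancellationToken, Task<bool>> _probe;
    private readonly Timer _timer;
    private volatile bool _isActive; // false until the first probe completes

    public CachedActiveNodeGateSketch(
        Func<CancellationToken, Task<bool>> probe,
        TimeSpan refreshInterval)
    {
        _probe = probe;
        // Fire once immediately, then on every interval (e.g. 5 s).
        _timer = new Timer(state => { _ = RefreshAsync(); }, null, TimeSpan.Zero, refreshInterval);
    }

    /// Synchronous read of the cached value; stale by at most one refresh cycle.
    public bool IsActiveNode => _isActive;

    /// Runs the async probe and stores its result. Assumption: errors fail closed.
    public async Task RefreshAsync(CancellationToken cancellationToken = default)
    {
        try
        {
            _isActive = await _probe(cancellationToken);
        }
        catch
        {
            _isActive = false;
        }
    }

    public void Dispose() => _timer.Dispose();
}
```

The trade-off against a TTL cache is that reads never trigger a probe, at the cost of a timer running even when no request consults the gate.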

See [`../GAPS.md`](../GAPS.md) for the adoption order and effort/risk.
@@ -0,0 +1,184 @@
# Health — normalized target spec

Status: **Draft**. The single design the sister projects converge on. Derived from the
three code-verified current-state docs (`../current-state/`). Goal is *path to shared code*
(`../shared-contract/ZB.MOM.WW.Health.md`), so each normalized section maps to a shared library seam.

## 0. Scope

**Normalized here:** the three-tier endpoint convention (`/health/ready`, `/health/active`,
`/healthz`) with canonical tags `ready` / `active` / `live` and their semantics; the canonical
JSON response shape; the `IActiveNodeGate` request-gating seam; a configurable
`AkkaClusterHealthCheck` with two named policy presets that reconcile the diverging Akka logic in
OtOpcUa and ScadaBridge; a role-filtered `ActiveNodeHealthCheck` that unifies OtOpcUa's
`AdminRoleLeaderHealthCheck` and ScadaBridge's `ActiveNodeHealthCheck`; a generic
`DatabaseHealthCheck<TContext>` that covers both apps' EF Core probe patterns; a
`GrpcDependencyHealthCheck` for downstream gRPC reachability.

**Explicitly NOT normalized** (domain-specific; keep per project): which probes each app
registers and how it wires them to tags; orchestrator / Traefik routing rules and routing priorities;
ScadaBridge's `HealthMonitoring/` domain-aggregation pipeline. The last of these is a distributed,
actor-based domain-health telemetry system (background services + Akka actors that aggregate
site-cluster signals into a central health picture) and is **not** an ASP.NET health probe; it is
an independent concern that happens to share the word "health".

## 1. Tier convention

Three tiers, always served in this order, each filtered to a named tag:

| Tier | Endpoint | Tag | Semantics | Healthy→ | Degraded→ | Unhealthy→ |
|---|---|---|---|---|---|---|
| Ready | `/health/ready` | `ready` | Can this node serve its dependencies? Fails if a DB, gRPC dependency, or cluster membership check is unhealthy. Orchestrators use this to gate traffic. | 200 | 200 | 503 |
| Active | `/health/active` | `active` | Is this the leader / active node? Fails (503) on a standby or role-member-but-not-leader node. Used to route write traffic or admin requests to exactly one node. | 200 | 200 | 503 |
| Live | `/healthz` | `live` | Bare process liveness: is the process alive and not deadlocked? **No probes registered to this tag** (predicate `_ => false`). Always 200 as long as the process can handle HTTP. | 200 | 200 | 200 |

Notes:

- The `live` tier intentionally carries no probes. Registering a probe to `live` is an error:
  a liveness failure that kills the pod should be reserved for total process hangs, not probe failures.
- `Degraded` maps to HTTP 200 (not 503) for the `ready` and `active` tiers. Orchestrators use 503
  to remove a node from load-balancing; Degraded means "still up but degraded", so remove the node
  only on hard failure.
- The tag names (`ready`, `active`, `live`) are declared as constants in `ZbHealthTags` and used
  consistently across all three apps. Per-project probe registrations must filter by these tags.
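
The tier table can be expressed with the stock ASP.NET Core `MapHealthChecks` and one tag predicate per tier. The following is a sketch of that idea only, not the library's `MapZbHealth` implementation; `TierMappingSketch`, `OptionsFor`, and `MapTiers` are hypothetical names, and the paths and tags are the defaults from this spec.

```csharp
using System.Collections.Generic;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Routing;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical helper illustrating the tier table.
public static class TierMappingSketch
{
    /// Builds options for one tier. A null tag yields the live tier:
    /// the predicate matches no registration, so the report is empty and Healthy.
    public static HealthCheckOptions OptionsFor(string? tag) => new()
    {
        Predicate = registration => tag is not null && registration.Tags.Contains(tag),
        ResultStatusCodes = new Dictionary<HealthStatus, int>
        {
            [HealthStatus.Healthy] = StatusCodes.Status200OK,
            [HealthStatus.Degraded] = StatusCodes.Status200OK, // degraded nodes stay in rotation
            [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable,
        },
    };

    /// Maps the three canonical endpoints with their tag filters.
    public static void MapTiers(IEndpointRouteBuilder endpoints)
    {
        endpoints.MapHealthChecks("/health/ready", OptionsFor("ready"));
        endpoints.MapHealthChecks("/health/active", OptionsFor("active"));
        endpoints.MapHealthChecks("/healthz", OptionsFor(null));
    }
}
```

Because the live predicate never matches, `/healthz` can only report Healthy, which is what makes the "always 200" row hold without a special-case status mapping.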

## 2. Probe catalog

### 2.1 Database probe — `DatabaseHealthCheck<TContext>`

Wraps an EF Core `DbContext` to verify database reachability. Default behavior calls
`context.Database.CanConnectAsync()`, which matches ScadaBridge's pattern. An optional delegate
(`Func<TContext, CancellationToken, Task>`) overrides the default for more specific validation
(matches OtOpcUa's "query `Deployments`" pattern). Registered to the `ready` tag.
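
The probe flow described above could look roughly like the sketch below. `DatabaseProbeSketch<TContext>` is a hypothetical stand-in for the not-yet-written `DatabaseHealthCheck<TContext>`; passing the delegate and timeout directly (rather than through an options object) and reporting every failure as `Unhealthy` are simplifying assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical sketch of the default-vs-custom probe flow.
public sealed class DatabaseProbeSketch<TContext> : IHealthCheck
    where TContext : DbContext
{
    private readonly IDbContextFactory<TContext> _factory;
    private readonly Func<TContext, CancellationToken, Task>? _probe;
    private readonly TimeSpan _timeout;

    public DatabaseProbeSketch(
        IDbContextFactory<TContext> factory,
        Func<TContext, CancellationToken, Task>? probe = null,
        TimeSpan? timeout = null)
    {
        _factory = factory;
        _probe = probe;
        _timeout = timeout ?? TimeSpan.FromSeconds(10);
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        // Bound the probe by the configured timeout as well as the caller's token.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(_timeout);
        try
        {
            await using var db = await _factory.CreateDbContextAsync(cts.Token);
            if (_probe is not null)
            {
                // Custom probe: returning normally is success, throwing is failure.
                await _probe(db, cts.Token);
                return HealthCheckResult.Healthy("Custom database probe succeeded");
            }

            return await db.Database.CanConnectAsync(cts.Token)
                ? HealthCheckResult.Healthy("Database reachable")
                : HealthCheckResult.Unhealthy("Database unreachable");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Database probe failed", ex);
        }
    }
}
```

A query-based probe in the OtOpcUa style would be supplied as the delegate, for example a lambda that runs a cheap query against the context and lets any exception propagate.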

### 2.2 Akka cluster probe — `AkkaClusterHealthCheck`

Checks the local node's cluster membership status via Akka.Cluster. The status-to-health
mapping is **configurable** through `AkkaClusterStatusPolicy`.

**Two named policy presets reconcile the existing divergence:**

| Preset | Origin | `Up` / `Joining` | `Leaving` / `Exiting` | Other (`WeaklyUp`, `Down`, `Removed`, `Unknown`) |
|---|---|---|---|---|
| `AkkaClusterStatusPolicy.Default` | ScadaBridge `AkkaClusterHealthCheck.cs` | Healthy | Degraded | Unhealthy |
| `AkkaClusterStatusPolicy.OtOpcUaCompat` | OtOpcUa `AkkaClusterHealthCheck.cs` | Healthy (if self is `Up` among reachable members) | — | Degraded |

The `Default` preset is the convergence target. `OtOpcUaCompat` is provided for backward
compatibility during OtOpcUa's migration; it maps any non-`Up`-among-members state to Degraded
rather than Unhealthy. Registered to the `ready` tag.
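
The `Default` row of the preset table reduces to a small switch over `MemberStatus`. The sketch below shows that mapping and how it could be applied to the local member; `ClusterPolicySketch` and its members are hypothetical names, and `OtOpcUaCompat` is not shown because it also depends on member reachability.

```csharp
using System;
using Akka.Actor;
using Akka.Cluster;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical sketch of the Default policy preset.
public static class ClusterPolicySketch
{
    /// Up/Joining -> Healthy, Leaving/Exiting -> Degraded, everything else -> Unhealthy.
    public static HealthStatus DefaultPolicy(MemberStatus status) => status switch
    {
        MemberStatus.Up or MemberStatus.Joining => HealthStatus.Healthy,
        MemberStatus.Leaving or MemberStatus.Exiting => HealthStatus.Degraded,
        _ => HealthStatus.Unhealthy, // WeaklyUp, Down, Removed, and any other value
    };

    /// Applies a policy to the local node's current membership status.
    public static HealthStatus Evaluate(ActorSystem system, Func<MemberStatus, HealthStatus> policy)
        => policy(Cluster.Get(system).SelfMember.Status);
}
```

Keeping the policy a plain `Func<MemberStatus, HealthStatus>` is what lets a project swap presets without touching the check itself.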

### 2.3 Active / leader probe — `ActiveNodeHealthCheck`

Checks whether this node is the designated leader (active node). Accepts an optional Akka
cluster role name that scopes the check to nodes carrying that role.

**Two behaviors unify the existing divergence:**

| Mode | Role param | Origin | Healthy | Degraded | Unhealthy |
|---|---|---|---|---|---|
| Role-less | `null` | ScadaBridge `ActiveNodeHealthCheck` | Node is `Up` **and** cluster leader | — | Otherwise |
| Role-filtered | e.g. `"admin"` | OtOpcUa `AdminRoleLeaderHealthCheck` | Node does **not** carry the role (not a participant — ignore it) **or** node carries the role and is the role-singleton leader | Carries the role but is **not** the role-singleton leader (role member, not leader) | — |

The role-filtered variant maps "not a member of the role" to Healthy (transparent: the probe
is irrelevant for this node). This is the correct behavior for heterogeneous clusters where not
every node carries every role. Registered to the `active` tag.
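
The two modes in the table can be sketched as two pure decisions plus a thin adapter that reads cluster state. This is illustrative only: `ActiveNodeSketch` is a hypothetical name, and using the cluster-state role leader as a proxy for "role-singleton leader" is an assumption, since the existing OtOpcUa check may determine leadership differently.

```csharp
using Akka.Cluster;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical sketch of the two ActiveNodeHealthCheck modes.
public static class ActiveNodeSketch
{
    /// Role-less mode: Healthy only when the node is Up and the cluster leader.
    public static HealthStatus RoleLess(bool isUp, bool isClusterLeader)
        => isUp && isClusterLeader ? HealthStatus.Healthy : HealthStatus.Unhealthy;

    /// Role-filtered mode: nodes without the role are transparent (Healthy);
    /// role members that are not the leader are Degraded.
    public static HealthStatus RoleFiltered(bool carriesRole, bool isRoleLeader)
        => !carriesRole || isRoleLeader ? HealthStatus.Healthy : HealthStatus.Degraded;

    /// Reads the inputs from the cluster extension.
    /// Assumption: State.RoleLeader(role) stands in for the role-singleton leader.
    public static HealthStatus Check(Cluster cluster, string? role = null)
    {
        var self = cluster.SelfMember;
        if (role is null)
        {
            var isLeader = cluster.SelfAddress.Equals(cluster.State.Leader);
            return RoleLess(self.Status == MemberStatus.Up, isLeader);
        }

        var isRoleLeader = cluster.SelfAddress.Equals(cluster.State.RoleLeader(role));
        return RoleFiltered(self.HasRole(role), isRoleLeader);
    }
}
```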

### 2.4 gRPC dependency probe — `GrpcDependencyHealthCheck`

Checks that a downstream gRPC channel is reachable by invoking a caller-supplied probe
delegate (`Func<GrpcChannel, CancellationToken, Task<bool>>`). The default probe calls
`GrpcChannel.ConnectAsync`. Used by:

- OtOpcUa — checks the MxAccessGateway gRPC channel.
- MxGateway — checks the x86 worker gRPC channel.

Registered to the `ready` tag.
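
A minimal sketch of the probe-with-timeout flow follows. `GrpcProbeSketch` and `CheckAsync` are hypothetical names; the description strings and the choice to report a timeout or exception as `Unhealthy` are assumptions rather than part of the contract.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Grpc.Net.Client;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical sketch of the gRPC reachability probe.
public static class GrpcProbeSketch
{
    public static async Task<HealthCheckResult> CheckAsync(
        GrpcChannel channel,
        Func<GrpcChannel, CancellationToken, Task<bool>>? probe,
        TimeSpan timeout,
        string? dependencyName,
        CancellationToken cancellationToken = default)
    {
        var name = dependencyName ?? channel.Target;
        probe ??= DefaultProbe;

        // Bound the probe by the configured timeout as well as the caller's token.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(timeout);
        try
        {
            return await probe(channel, cts.Token)
                ? HealthCheckResult.Healthy($"{name} reachable")
                : HealthCheckResult.Unhealthy($"{name} unreachable");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy($"{name} probe failed", ex);
        }
    }

    // Default probe: a successful ConnectAsync counts as reachable.
    private static async Task<bool> DefaultProbe(GrpcChannel channel, CancellationToken ct)
    {
        await channel.ConnectAsync(ct);
        return true;
    }
}
```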

## 3. Response-writer contract

All health endpoints share one canonical JSON serializer. The shape is lifted from ScadaBridge's
`HealthChecks.UI.Client` style and becomes the library default (replacing per-project divergence).

**Content-type:** `application/json`

**Shape:**

```json
{
  "status": "Healthy",
  "totalDurationMs": 12,
  "entries": {
    "database": {
      "status": "Healthy",
      "description": "SQL Server reachable",
      "duration": "00:00:00.0120000"
    },
    "akka-cluster": {
      "status": "Healthy",
      "description": "Member status: Up",
      "duration": "00:00:00.0001000"
    }
  }
}
```

**Field rules:**

| Field | Type | Notes |
|---|---|---|
| `status` | string | `"Healthy"` \| `"Degraded"` \| `"Unhealthy"`; the aggregate across all filtered checks |
| `totalDurationMs` | long | Total wall-clock time for all probes in this tier, milliseconds |
| `entries` | object | Keyed by check registration name |
| `entries.<name>.status` | string | Per-check status |
| `entries.<name>.description` | string? | Human-readable detail (may be null) |
| `entries.<name>.duration` | string | TimeSpan `ToString()`; per-check elapsed time |

The writer is exposed as a static `Task WriteJsonAsync(HttpContext, HealthReport)` so consumers can
plug it into `MapHealthChecks` options and also call it from custom endpoints.
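
One way to produce the shape in the field-rules table is to project the `HealthReport` into an anonymous object and serialize it with `System.Text.Json`. This is a sketch under that assumption, with `HealthJsonSketch` as a hypothetical name; the real `ZbHealthWriter` may differ in serializer settings.

```csharp
using System.Linq;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical sketch of the canonical response writer.
public static class HealthJsonSketch
{
    public static Task WriteJsonAsync(HttpContext context, HealthReport report)
    {
        var payload = new
        {
            status = report.Status.ToString(),
            totalDurationMs = (long)report.TotalDuration.TotalMilliseconds,
            // Keyed by check registration name, one object per check.
            entries = report.Entries.ToDictionary(
                entry => entry.Key,
                entry => new
                {
                    status = entry.Value.Status.ToString(),
                    description = entry.Value.Description,   // may be null
                    duration = entry.Value.Duration.ToString(), // TimeSpan "c" format
                }),
        };

        context.Response.ContentType = "application/json";
        return JsonSerializer.SerializeAsync(
            context.Response.Body, payload, cancellationToken: context.RequestAborted);
    }
}
```

Because the signature matches `Func<HttpContext, HealthReport, Task>`, a method like this can be assigned directly to `HealthCheckOptions.ResponseWriter`.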

## 4. Active-node gating seam — `IActiveNodeGate`

`IActiveNodeGate` is a single-property interface (`bool IsActiveNode { get; }`) that expresses
whether the current node should accept write / active-role requests. The default implementation,
`AkkaActiveNodeGate`, delegates to `ActiveNodeHealthCheck`. A `RequireActiveNode()` extension on
`IEndpointConventionBuilder` attaches a policy that short-circuits with `503 Service Unavailable`
on standby nodes.

This seam is generalized from ScadaBridge's `ActiveNodeGate.cs`. It is in the core `ZB.MOM.WW.Health`
package (not the Akka satellite) so MxGateway can implement it without an Akka dependency if needed.
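
As an illustration of the seam, the sketch below gates an endpoint with an ASP.NET Core endpoint filter. It is one possible mechanism, not the chosen one: the interface is redeclared locally so the block stands alone, `ActiveNodeGateSketch` is a hypothetical name, `AlwaysActiveGate` is the stub floated in the contract's open questions, and the generic signature differs from the contract's non-generic one. Endpoint filters run only for route-handler (minimal API) endpoints, so other endpoint types would need middleware or a different convention.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

// Redeclared here for a self-contained sketch; the contract places it in ZB.MOM.WW.Health.
public interface IActiveNodeGate
{
    bool IsActiveNode { get; }
}

// Stub for single-node deployments (an open question in the contract, not a decision).
public sealed class AlwaysActiveGate : IActiveNodeGate
{
    public bool IsActiveNode => true;
}

// Hypothetical sketch of RequireActiveNode built on an endpoint filter.
public static class ActiveNodeGateSketch
{
    public static TBuilder RequireActiveNode<TBuilder>(this TBuilder builder)
        where TBuilder : IEndpointConventionBuilder
        => builder.AddEndpointFilter(async (invocation, next) =>
        {
            var gate = invocation.HttpContext.RequestServices
                .GetRequiredService<IActiveNodeGate>();

            // Standby nodes short-circuit with 503; the active node runs the handler.
            return gate.IsActiveNode
                ? await next(invocation)
                : Results.StatusCode(StatusCodes.Status503ServiceUnavailable);
        });
}
```

With this in place, a route group could be gated as `app.MapGroup("/admin").RequireActiveNode()`, assuming an `IActiveNodeGate` is registered in DI.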

## 5. Endpoint registration

`app.MapZbHealth()` maps all three tiers in one call:

```csharp
app.MapZbHealth();  // all three tiers, defaults

app.MapZbHealth(o => {
    o.ReadyPath = "/health/ready";   // override paths if needed
    o.ActivePath = "/health/active";
    o.LivePath = "/healthz";
    o.ResponseWriter = ZbHealthWriter.WriteJsonAsync;
});
```

The library does **not** call `services.AddHealthChecks()`; that is the app's responsibility, as
the probe set is per-project. `MapZbHealth` only maps the three endpoints with the correct tag
predicates and response writer.
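
The app-side half of that split, registering probes against the canonical tags, uses only the standard health-check builder. The sketch below shows the pattern with placeholder delegate checks; `ProbeRegistrationSketch`, the check names, and the descriptions are hypothetical, and a real project would register its own `IHealthCheck` types instead.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical sketch of per-project probe registration.
public static class ProbeRegistrationSketch
{
    public static IServiceCollection AddProjectProbes(this IServiceCollection services)
    {
        services.AddHealthChecks()
            // Placeholder probes; real apps register DB, Akka, or gRPC checks here.
            .AddCheck("example-dependency",
                () => HealthCheckResult.Healthy("placeholder dependency check"),
                tags: new[] { "ready" })
            .AddCheck("example-leader",
                () => HealthCheckResult.Healthy("placeholder leader check"),
                tags: new[] { "active" });
        // Nothing is tagged "live": that tier carries no probes by design.
        return services;
    }
}
```

Once the library exists, the string literals would be replaced by the `ZbHealthTags` constants so a typo in a tag cannot silently drop a probe from its tier.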

## 6. Migration notes

| Project | Current state | Gap | What normalizes |
|---|---|---|---|
| **OtOpcUa** | All three tiers present (`/health/ready`, `/health/active`, `/healthz`); `DatabaseHealthCheck`, `AkkaClusterHealthCheck`, `AdminRoleLeaderHealthCheck` inline. | Inline probes diverge from the shared policy model; no `IActiveNodeGate`. | Replace inline `AkkaClusterHealthCheck` with shared + `OtOpcUaCompat` preset; replace `AdminRoleLeaderHealthCheck` with shared `ActiveNodeHealthCheck(role: "admin")`; replace inline `DatabaseHealthCheck` with shared generic; call `app.MapZbHealth()`. |
| **ScadaBridge** | `/health/ready` + `/health/active` present; no `/healthz`; `DatabaseHealthCheck`, `AkkaClusterHealthCheck`, `ActiveNodeHealthCheck`, `ActiveNodeGate` inline. | Missing `/healthz` live tier; inline implementations. | Add `/healthz` via `MapZbHealth()`; replace inline probes with shared equivalents (Default policy); replace inline `ActiveNodeGate` with `AkkaActiveNodeGate`. |
| **MxGateway** | Only `/health/live` (custom `GatewayHealthReply`); `AddHealthChecks()` called but zero probes registered. | Missing `ready` and `active` tiers; no probes; not using standard health middleware. | Replace custom endpoint with `app.MapZbHealth()`; register `GrpcDependencyHealthCheck` for the x86 worker channel on the `ready` tag. |

## 7. Acceptance (what "converged" means)

A project is converged when: (a) it calls `app.MapZbHealth()` and exposes all three canonical
endpoints; (b) its Akka probes (if applicable) use the `AkkaClusterHealthCheck` + `ActiveNodeHealthCheck`
from `ZB.MOM.WW.Health.Akka` with the Default policy; (c) its DB probe uses `DatabaseHealthCheck<TContext>`
from `ZB.MOM.WW.Health.EntityFrameworkCore`; (d) its gRPC-dependency probe (if applicable) uses
`GrpcDependencyHealthCheck`; (e) its `IActiveNodeGate` implementation is `AkkaActiveNodeGate`
(or a project-specific implementation of the shared interface); (f) all health endpoints return the
canonical JSON shape defined in §3.