docs: implementation plan for ZB.MOM.WW.Health adoption across the 3 sister apps

Detailed task-by-task plan (publish to Gitea, then per-repo behaviour-preserving
probe swaps) incorporating recon findings that revised the design: MxGateway worker
IPC is named pipes (custom SQLite readiness probe instead of gRPC), ScadaBridge
ActorSystem is not in DI (transient bridge), downstream gRPC probes + IDbContextFactory
switch + ScadaBridge seam unification deferred.
Joseph Doherty
2026-06-01 13:15:48 -04:00
parent f72403d6f0
commit 5a965639f9
2 changed files with 853 additions and 0 deletions
@@ -0,0 +1,837 @@
# ZB.MOM.WW.Health Adoption Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
**Goal:** Adopt the shared `ZB.MOM.WW.Health` library into all three sister apps (OtOpcUa,
MxAccessGateway, ScadaBridge), replacing each app's bespoke health-check wiring with the shared
probes, canonical three-tier endpoints (`/health/ready`, `/health/active`, `/healthz`), and JSON
writer — behaviour-preserving.
**Architecture:** Distribution is via the Gitea NuGet registry (`dohertj2-gitea` feed). The shared
checks are registered with `AddTypeActivatedCheck<T>` (DI supplies `IServiceProvider`; extra
constructor args — policy / role / options — passed positionally) and tagged with `ZbHealthTags`;
`MapZbHealth()` routes each tier by tag. Each sister repo is its **own git repo** — branch, commit,
and (optionally) PR happen inside that repo, not in scadaproj. The three repo phases are mutually
independent after publish and may proceed in parallel.
**Tech Stack:** .NET 10, ASP.NET Core health checks (`Microsoft.Extensions.Diagnostics.HealthChecks`),
Akka.NET cluster, EF Core, `Microsoft.Data.Sqlite`, NuGet Central Package Management, Gitea NuGet feed.
---
## Context the executor MUST know
**This plan edits FOUR repos:**
- `~/Desktop/scadaproj` — only Phase 0 (verify publish) and Phase 4 (GAPS bookkeeping).
- `~/Desktop/MxAccessGateway` — Phase 1 (core package only).
- `~/Desktop/OtOpcUa` — Phase 2 (all three packages).
- `~/Desktop/ScadaBridge` — Phase 3 (all three packages).
**Per-repo git discipline:** each sister repo is independent. Before editing a sister repo, create a
branch `feat/adopt-zb-health`. Commit inside that repo. Never commit sister-repo changes from
scadaproj. Never skip hooks; never force-push.
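The branch rule above lends itself to a small guard that an executor could run before committing. This is only a sketch under the assumption that such a guard is wanted; `require_adoption_branch` is a hypothetical helper name, not part of any repo's tooling.

```bash
# Hypothetical guard: refuse to continue unless the given repo is on the
# adoption branch. `git symbolic-ref` also works before the first commit
# on a freshly created branch.
require_adoption_branch() {
  local repo="$1" expected="${2:-feat/adopt-zb-health}" current
  current=$(git -C "$repo" symbolic-ref --short HEAD 2>/dev/null) || {
    echo "not on a branch (or not a git repo): $repo" >&2
    return 2
  }
  if [ "$current" != "$expected" ]; then
    echo "expected branch $expected in $repo, found $current" >&2
    return 1
  fi
}
```

A typical use would be `require_adoption_branch ~/Desktop/OtOpcUa && git -C ~/Desktop/OtOpcUa commit ...`, so a commit made from the wrong branch fails early.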
**Shared registration idiom (used in every phase).** The shared checks need constructor args DI
can't supply alone, so register them with `AddTypeActivatedCheck<T>`:
```csharp
using Microsoft.Extensions.DependencyInjection; // AddTypeActivatedCheck
using Microsoft.Extensions.Diagnostics.HealthChecks; // HealthStatus
using ZB.MOM.WW.Health; // ZbHealthTags, MapZbHealth, ZbHealthWriter
// + ZB.MOM.WW.Health.Akka / .EntityFrameworkCore where used
```
`AddTypeActivatedCheck<T>(name, failureStatus, tags, params object[] args)` builds the check via
`ActivatorUtilities.CreateInstance`: `IServiceProvider` constructor params are satisfied from DI;
anything else (an `AkkaClusterStatusPolicy`, a role string, a `DatabaseHealthCheckOptions<T>`) is
taken from `args` by type. This is the canonical way to wire the shared checks.
**Library public API (verified, do not re-derive):**
- `endpoints.MapZbHealth(ZbHealthEndpointOptions? = null)` — maps ready/active/live; defaults
`/health/ready`, `/health/active`, `/healthz`; ready+active use `ZbHealthWriter.WriteJsonAsync`;
all anonymous. Does NOT call `AddHealthChecks()`.
- `ZbHealthTags.Ready` = `"ready"`, `.Active` = `"active"`, `.Live` = `"live"`.
- `DatabaseHealthCheck<TContext>(IServiceProvider, DatabaseHealthCheckOptions<TContext>? )`
default probe `CanConnectAsync`; `options.ProbeQuery = Func<TContext,CancellationToken,Task>` for
the stricter query probe; resolves an `IDbContextFactory<TContext>` if registered, else a scoped
`TContext` from a fresh scope (pool-safe).
- `AkkaClusterHealthCheck(IServiceProvider, AkkaClusterStatusPolicy)` — presets
`AkkaClusterStatusPolicy.Default` and `.OtOpcUaCompat`. Resolves `ActorSystem` from DI.
- `ActiveNodeHealthCheck(IServiceProvider)` (role-less) / `(IServiceProvider, string role)`.
Resolves `ActorSystem` from DI lazily; Degraded if not yet available.
- `AkkaActiveNodeGate(IServiceProvider) : IActiveNodeGate` — not used in this plan (ScadaBridge seam
unification is deferred).
**Scope deferrals (settled — do NOT implement here):** downstream gRPC dependency probes (no
host-level `GrpcChannel` exists in OtOpcUa or MxGateway); ScadaBridge `IDbContextFactory` switch
(the shared check self-scopes); ScadaBridge `IActiveNodeGate` seam unification (its interface is
`...InboundAPI.IActiveNodeGate`, wired into inbound-API gating — out of scope). These are recorded
as follow-ups in Phase 4.
---
## Phase 0 — Publish the Health packages (prerequisite)
### Task 0: Verify the three Health nupkgs are on the Gitea feed (publish if absent)
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (gates all other phases)
**Files:**
- Read: `~/Desktop/scadaproj/ZB.MOM.WW.Health/ZB.MOM.WW.Health.slnx`
- (No source edits — this is a pack/push/verify task.)
**Step 1: Check whether the packages already resolve from Gitea**
The library CLAUDE.md claims they are "published to the Gitea NuGet feed." Verify:
```bash
curl -s "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/v3/registration/ZB.MOM.WW.Health/index.json" -o /dev/null -w "%{http_code}\n"
```
Expected: `200` if already published. If `404`/`401`, publish (Steps 2-3). If credentials are
needed for the query, skip to Step 2 and rely on the push result.
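The decision in this step can be written down as a small function. The sketch below is illustrative only: `publish_decision` is a hypothetical name, and treating any status other than `200`, `401`, or `404` as a reason to stop is an assumption, not something the plan specifies.

```bash
# Hypothetical helper: translate the HTTP status from the registration-index
# query into the next action for Task 0.
publish_decision() {
  case "$1" in
    200)     echo "skip"    ;;  # already published: continue with Phase 1
    401|404) echo "publish" ;;  # absent, or the query itself needs credentials
    *)       echo "stop"    ;;  # assumption: surface anything unexpected
  esac
}

# Usage (network call left commented so the sketch runs offline):
# status=$(curl -s -o /dev/null -w '%{http_code}' "$REGISTRATION_URL")
# publish_decision "$status"
```

Here `REGISTRATION_URL` stands for the registration index URL shown in the `curl` command of this step.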
**Step 2: Pack (only if not already published)**
```bash
cd ~/Desktop/scadaproj/ZB.MOM.WW.Health
dotnet pack ZB.MOM.WW.Health.slnx -c Release -o ./artifacts
ls artifacts/*.nupkg
```
Expected: `ZB.MOM.WW.Health.0.1.0.nupkg`, `ZB.MOM.WW.Health.Akka.0.1.0.nupkg`,
`ZB.MOM.WW.Health.EntityFrameworkCore.0.1.0.nupkg`.
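Rather than checking the `ls` output by eye, the expected file list can be asserted. A minimal sketch, assuming the three package IDs and the `0.1.0` version named above; `missing_health_packages` is a hypothetical helper.

```bash
# Hypothetical check: print the file name of every expected package that is
# missing from the output directory; return non-zero if any is missing.
missing_health_packages() {
  local dir="$1" version="${2:-0.1.0}" id status=0
  for id in ZB.MOM.WW.Health ZB.MOM.WW.Health.Akka ZB.MOM.WW.Health.EntityFrameworkCore; do
    if [ ! -f "$dir/$id.$version.nupkg" ]; then
      echo "$id.$version.nupkg"
      status=1
    fi
  done
  return "$status"
}
```

Running `missing_health_packages ./artifacts` after the pack prints nothing and returns success when all three packages are present.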
**Step 3: Push to the Gitea feed**
Credentials are NOT in the repo. The developer/CI provides them. Push each package:
```bash
dotnet nuget push "artifacts/ZB.MOM.WW.Health*.0.1.0.nupkg" \
--source "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" \
--api-key "$GITEA_NUGET_TOKEN"
```
Expected: `Your package was pushed.` for each (or `409 Conflict` = already present = fine).
**Fallback (if Gitea is unreachable):** STOP and surface it. Do not silently switch mechanisms —
the fallback (local folder feed) changes only each repo's `nuget.config` source line, but that is a
plan amendment the user should approve.
**Step 4: Commit (none in scadaproj for this task)** — no source changed; proceed to Phase 1.
---
## Phase 1 — MxAccessGateway (core package only)
Repo: `~/Desktop/MxAccessGateway`. Branch: `feat/adopt-zb-health`. This repo has **no CPM and no
`nuget.config`** today. Readiness probe = a custom `AuthStoreHealthCheck` over the SQLite auth store
(the gateway authenticates every gRPC call against it).
### Task 1: Reference wiring — create `nuget.config`, add the package reference
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 4, Task 7 (different repos)
**Files:**
- Create: `~/Desktop/MxAccessGateway/nuget.config`
- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj` (ItemGroup, after line 13)
**Step 1: Create `nuget.config`** (this repo's first; nuget.org for everything, Gitea for Health)
```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <clear />
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
    <add key="dohertj2-gitea" value="https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" />
  </packageSources>
  <!-- nuget.org serves everything; the Gitea feed serves only the ZB.MOM.WW.* shared libs.
       Credentials are NOT committed: provide them per-developer via `dotnet nuget add source`
       (username + access token) or NuGet credential env vars in CI. -->
  <packageSourceMapping>
    <packageSource key="nuget.org">
      <package pattern="*" />
    </packageSource>
    <packageSource key="dohertj2-gitea">
      <!-- Exact ID for the core package: the prefix pattern below only matches IDs that
           continue past "ZB.MOM.WW.Health." and so does not cover the core ID itself. -->
      <package pattern="ZB.MOM.WW.Health" />
      <package pattern="ZB.MOM.WW.Health.*" />
    </packageSource>
  </packageSourceMapping>
</configuration>
```
**Step 2: Add the package reference** to the Server `.csproj`. Insert into the first `<ItemGroup>`
(the one ending at the current line 14):
```xml
<PackageReference Include="ZB.MOM.WW.Health" Version="0.1.0" />
```
(Direct versioned reference — this repo has no CPM. Do not introduce CPM.)
**Step 3: Restore to verify the feed resolves**
```bash
cd ~/Desktop/MxAccessGateway
dotnet restore src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
```
Expected: restore succeeds and pulls `ZB.MOM.WW.Health 0.1.0` from `dohertj2-gitea`. If it 401s,
the developer must add the Gitea source credentials (`dotnet nuget add source … -u … -p … --store-password-in-clear-text`).
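Because credentials may be added while working inside the repo, it can be worth confirming that the `nuget.config` about to be committed carries none. The sketch below is a hypothetical pre-commit guard; it looks only for the `packageSourceCredentials` section and the `Password` / `ClearTextPassword` keys that NuGet uses for stored credentials, and is not a general secret scanner.

```bash
# Hypothetical guard: succeed only when the given NuGet config contains no
# credential section and no password entries.
config_is_credential_free() {
  [ -f "$1" ] || return 2   # a missing file is an error, not a pass
  ! grep -Eiq 'packageSourceCredentials|ClearTextPassword|key="Password"' "$1"
}
```

For example, `config_is_credential_free nuget.config || echo "credentials found: do not commit"`.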
**Step 4: Commit**
```bash
cd ~/Desktop/MxAccessGateway && git checkout -b feat/adopt-zb-health
git add nuget.config src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
git commit -m "build: reference ZB.MOM.WW.Health from the Gitea feed"
```
### Task 2: Write the custom `AuthStoreHealthCheck` (TDD)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 4, Task 7
**Files:**
- Create: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/AuthStoreHealthCheck.cs`
- Test: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/AuthStoreHealthCheckTests.cs`
**Step 1: Write the failing tests**
```csharp
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.MxGateway.Server.Configuration;
using ZB.MOM.WW.MxGateway.Server.Diagnostics;
using ZB.MOM.WW.MxGateway.Server.Security.Authentication;

namespace ZB.MOM.WW.MxGateway.Tests.Diagnostics;

public sealed class AuthStoreHealthCheckTests
{
    private static AuthSqliteConnectionFactory FactoryFor(string sqlitePath)
    {
        var options = new GatewayOptions();
        options.Authentication.SqlitePath = sqlitePath;
        return new AuthSqliteConnectionFactory(Options.Create(options));
    }

    [Fact]
    public async Task Healthy_WhenStoreReachable()
    {
        var path = Path.Combine(Path.GetTempPath(), $"authcheck-{Guid.NewGuid():N}.db");
        try
        {
            var check = new AuthStoreHealthCheck(FactoryFor(path));
            var result = await check.CheckHealthAsync(new HealthCheckContext());
            Assert.Equal(HealthStatus.Healthy, result.Status);
        }
        finally { if (File.Exists(path)) File.Delete(path); }
    }

    [Fact]
    public async Task Unhealthy_WhenPathUnusable()
    {
        // A path whose parent cannot be created (a file used as a directory) forces open to fail.
        var bogus = Path.Combine(Path.GetTempPath(), $"authcheck-{Guid.NewGuid():N}");
        await File.WriteAllTextAsync(bogus, "x");
        try
        {
            var check = new AuthStoreHealthCheck(FactoryFor(Path.Combine(bogus, "store.db")));
            var result = await check.CheckHealthAsync(new HealthCheckContext());
            Assert.Equal(HealthStatus.Unhealthy, result.Status);
        }
        finally { if (File.Exists(bogus)) File.Delete(bogus); }
    }
}
```
**Step 2: Run, expect failure** (type does not exist)
```bash
cd ~/Desktop/MxAccessGateway
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~AuthStoreHealthCheckTests"
```
Expected: COMPILE ERROR / FAIL — `AuthStoreHealthCheck` not found.
**Step 3: Implement the check**
```csharp
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using ZB.MOM.WW.MxGateway.Server.Security.Authentication;

namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;

/// <summary>
/// Readiness probe: verifies the SQLite authentication store is reachable. The gateway
/// authenticates every gRPC call against this store, so its reachability gates readiness.
/// </summary>
public sealed class AuthStoreHealthCheck : IHealthCheck
{
    private readonly AuthSqliteConnectionFactory _connectionFactory;

    public AuthStoreHealthCheck(AuthSqliteConnectionFactory connectionFactory) =>
        _connectionFactory = connectionFactory ?? throw new ArgumentNullException(nameof(connectionFactory));

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            await using SqliteConnection connection =
                await _connectionFactory.OpenConnectionAsync(cancellationToken).ConfigureAwait(false);
            await using SqliteCommand command = connection.CreateCommand();
            command.CommandText = "SELECT 1;";
            await command.ExecuteScalarAsync(cancellationToken).ConfigureAwait(false);
            return HealthCheckResult.Healthy("Auth store is reachable.");
        }
        catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
        {
            // Caller-requested cancellation is not a health failure; let it propagate.
            throw;
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Auth store is unreachable.", ex);
        }
    }
}
```
**Step 4: Run, expect pass**
```bash
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~AuthStoreHealthCheckTests"
```
Expected: PASS (2 tests). If the `GatewayOptions.Authentication.SqlitePath` accessor differs, adjust
the test helper to match the real options shape (read `Configuration/GatewayOptions.cs` first).
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/Diagnostics/AuthStoreHealthCheck.cs \
src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/AuthStoreHealthCheckTests.cs
git commit -m "feat: add AuthStoreHealthCheck readiness probe"
```
### Task 3: Rewire `GatewayApplication` to the canonical tiers; fix the route test
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 4, Task 7
**Files:**
- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs:63-66` (the `AddHealthChecks()` line) and `:172-178` (the `/health/live` block)
- Modify: `~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs:14-27`
**Step 1: Replace the bare `AddHealthChecks()` (line 66) with the tagged readiness probe**
```csharp
builder.Services.AddHealthChecks()
    .AddTypeActivatedCheck<AuthStoreHealthCheck>(
        "auth-store",
        failureStatus: null,
        tags: new[] { ZbHealthTags.Ready });
```
Add `using ZB.MOM.WW.Health;`. `using ZB.MOM.WW.MxGateway.Server.Diagnostics;` is already imported at
line 9, so it needs no change.
**Step 2: Delete the `/health/live` block (lines 172-178) and map the canonical tiers**
Remove:
```csharp
endpoints.MapGet(
        "/health/live",
        () => Results.Ok(new GatewayHealthReply(
            Status: "Healthy",
            DefaultBackend: GatewayContractInfo.DefaultBackendName,
            WorkerProtocolVersion: GatewayContractInfo.WorkerProtocolVersion)))
    .WithName("LiveHealth");
```
Replace with:
```csharp
endpoints.MapZbHealth();
```
(`/health/ready` runs `auth-store`; `/health/active` runs no checks → 200; `/healthz` is bare
liveness. The `GatewayHealthReply` type may now be unused — if so, the C# compiler won't flag it;
leave it unless a "remove dead code" reviewer asks, to keep this change tight.)
**Step 3: Update the route test** (`GatewayApplicationTests.cs:14-27`) to assert the three tiers
instead of `/health/live`:
```csharp
/// <summary>Verifies that Build maps the canonical three health tiers.</summary>
[Fact]
public async Task Build_MapsCanonicalHealthEndpoints()
{
    await using WebApplication app = GatewayApplication.Build([]);
    var paths = ((IEndpointRouteBuilder)app).DataSources
        .SelectMany(dataSource => dataSource.Endpoints)
        .OfType<RouteEndpoint>()
        .Select(e => e.RoutePattern.RawText)
        .ToHashSet();
    Assert.Contains("/health/ready", paths);
    Assert.Contains("/health/active", paths);
    Assert.Contains("/healthz", paths);
    Assert.DoesNotContain("/health/live", paths);
}
```
**Step 4: Build + test the whole gateway**
```bash
cd ~/Desktop/MxAccessGateway
dotnet build src/MxGateway.sln
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj
```
Expected: build clean; all tests pass (the old `Build_MapsLiveHealthEndpoint` is replaced). If any
other test references `/health/live` or `LiveHealth`, update it the same way.
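Finding those leftover references can be done with a recursive search. The following is a sketch rather than a prescribed step; `stale_live_references` is a hypothetical helper, and `--include` assumes GNU or BSD `grep`.

```bash
# Hypothetical helper: list C# files under a directory that still mention the
# retired route or endpoint name. Prints nothing when the tree is clean.
stale_live_references() {
  grep -rlE --include='*.cs' '/health/live|LiveHealth' "$1" | sort
}
```

Run as `stale_live_references src`. The updated `GatewayApplicationTests.cs` is expected to appear in the output, because its new assertion names `/health/live` in order to prove the route is gone.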
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs \
src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
git commit -m "feat: map canonical ZB health tiers; replace bypassing /health/live"
```
---
## Phase 2 — OtOpcUa (all three packages)
Repo: `~/Desktop/OtOpcUa`. Branch: `feat/adopt-zb-health`. CPM present; `NuGet.config` has nuget.org
+ `local-mxgw` folder feed, NO source mapping. `ActorSystem` IS in DI (the bespoke
`AkkaClusterHealthCheck` injects it directly). This is the cleanest of the three.
### Task 4: Reference wiring — add Gitea source + mapping + CPM versions + package refs
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 7
**Files:**
- Modify: `~/Desktop/OtOpcUa/NuGet.config`
- Modify: `~/Desktop/OtOpcUa/Directory.Packages.props` (near line 99-100)
- Modify: `~/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj` (ItemGroup lines 16-30)
**Step 1: Add the Gitea source + source mapping** to `NuGet.config`. Because adding a mapping makes
ALL sources mapped explicitly, map the existing feeds too:
```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
    <add key="local-mxgw" value="./nuget-packages" />
    <add key="dohertj2-gitea" value="https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" />
  </packageSources>
  <packageSourceMapping>
    <packageSource key="nuget.org">
      <package pattern="*" />
    </packageSource>
    <packageSource key="local-mxgw">
      <package pattern="ZB.MOM.WW.MxGateway.*" />
    </packageSource>
    <packageSource key="dohertj2-gitea">
      <!-- Exact ID for the core package: the prefix pattern below only matches IDs that
           continue past "ZB.MOM.WW.Health." and so does not cover the core ID itself. -->
      <package pattern="ZB.MOM.WW.Health" />
      <package pattern="ZB.MOM.WW.Health.*" />
    </packageSource>
  </packageSourceMapping>
</configuration>
```
**Step 2: Add CPM versions** to `Directory.Packages.props` next to the existing `ZB.MOM.WW.*` lines:
```xml
<PackageVersion Include="ZB.MOM.WW.Health" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.Akka" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.EntityFrameworkCore" Version="0.1.0" />
```
**Step 3: Add package references** (no version — CPM) to the Host `.csproj` ItemGroup:
```xml
<PackageReference Include="ZB.MOM.WW.Health" />
<PackageReference Include="ZB.MOM.WW.Health.Akka" />
<PackageReference Include="ZB.MOM.WW.Health.EntityFrameworkCore" />
```
**Step 4: Restore**
```bash
cd ~/Desktop/OtOpcUa && git checkout -b feat/adopt-zb-health
dotnet restore ZB.MOM.WW.OtOpcUa.slnx
```
Expected: restore succeeds; the three Health packages come from `dohertj2-gitea`, MxGateway stays on
`local-mxgw`.
**Step 5: Commit**
```bash
git add NuGet.config Directory.Packages.props src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj
git commit -m "build: reference ZB.MOM.WW.Health packages from the Gitea feed"
```
### Task 5: Swap the three checks to shared probes; map tiers via `MapZbHealth`
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1, Task 7
**Files:**
- Rewrite: `~/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs`
- Delete: `Health/DatabaseHealthCheck.cs`, `Health/AkkaClusterHealthCheck.cs`, `Health/AdminRoleLeaderHealthCheck.cs`
- Verify call sites unchanged: `Program.cs:137` (`AddOtOpcUaHealth`), `Program.cs:159` (`MapOtOpcUaHealth`)
**Step 1: Rewrite `HealthEndpoints.cs`** to register the shared checks (preserving names + tags) and
map via `MapZbHealth()`:
```csharp
using Microsoft.AspNetCore.Routing;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.Health;
using ZB.MOM.WW.Health.Akka;
using ZB.MOM.WW.Health.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;

namespace ZB.MOM.WW.OtOpcUa.Host.Health;

public static class HealthEndpoints
{
    /// <summary>
    /// Registers the shared ZB.MOM.WW health probes. Tier semantics preserved from the bespoke
    /// implementation: configdb + akka on ready+active; admin-leader on active only.
    /// </summary>
    public static IServiceCollection AddOtOpcUaHealth(this IServiceCollection services)
    {
        services.AddHealthChecks()
            .AddTypeActivatedCheck<DatabaseHealthCheck<OtOpcUaConfigDbContext>>(
                "configdb",
                failureStatus: null,
                tags: new[] { ZbHealthTags.Ready, ZbHealthTags.Active },
                args: new DatabaseHealthCheckOptions<OtOpcUaConfigDbContext>
                {
                    // Preserve OtOpcUa's stricter schema-touching probe.
                    ProbeQuery = static (db, ct) => db.Deployments.AsNoTracking().Take(1).ToListAsync(ct),
                })
            .AddTypeActivatedCheck<AkkaClusterHealthCheck>(
                "akka",
                failureStatus: null,
                tags: new[] { ZbHealthTags.Ready, ZbHealthTags.Active },
                args: AkkaClusterStatusPolicy.OtOpcUaCompat)
            .AddTypeActivatedCheck<ActiveNodeHealthCheck>(
                "admin-leader",
                failureStatus: null,
                tags: new[] { ZbHealthTags.Active },
                args: "admin");
        return services;
    }

    /// <summary>Maps the canonical three-tier health endpoints.</summary>
    public static IEndpointRouteBuilder MapOtOpcUaHealth(this IEndpointRouteBuilder app)
    {
        app.MapZbHealth(); // /health/ready, /health/active, /healthz; all AllowAnonymous
        return app;
    }
}
```
Note: `args:` is the `params object[]` — pass a single options object / policy / string. If the
compiler binds the single-array overload oddly, wrap as `args: new object[] { … }`.
**Step 2: Delete the three bespoke check files**
```bash
cd ~/Desktop/OtOpcUa
git rm src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/DatabaseHealthCheck.cs \
src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AkkaClusterHealthCheck.cs \
src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AdminRoleLeaderHealthCheck.cs
```
(`IClusterRoleInfo` may now be unused by Health; leave its definition — it may be used elsewhere.)
**Step 3: Build**
```bash
dotnet build ZB.MOM.WW.OtOpcUa.slnx
```
Expected: clean. Fix any now-dangling `using ...Host.Health` references to the deleted types.
**Step 4: Run health-related tests**
```bash
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~Health"
```
Expected: pass. **Behaviour-parity checks the executor must confirm** (add/keep tests if missing):
- akka tier: self `Up` → Healthy; self not Up → Degraded (the `OtOpcUaCompat` preset reproduces the
self-Up scan).
- admin-leader: node without `admin` role → Healthy; admin member non-leader → Degraded; admin
leader → Healthy. (Shared check reads `Cluster.Get(system).SelfMember` + `RoleLeader("admin")`,
vs the old `IClusterRoleInfo`; verify equivalence on a formed test cluster or via the library's
own `ActiveNodeDecision` table, which the library's tests already cover.)
**Step 5: Commit**
```bash
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs
git commit -m "feat: adopt shared ZB.MOM.WW.Health probes (preserve tiers + OtOpcUaCompat policy)"
```
---
## Phase 3 — ScadaBridge (all three packages)
Repo: `~/Desktop/ScadaBridge`. Branch: `feat/adopt-zb-health`. CPM + Gitea feed already wired (just
extend mapping). **`ActorSystem` is NOT in DI** (owned by `AkkaHostedService`) — add a transient DI
bridge so the shared checks can resolve it. Keep the existing `ActiveNodeGate` (seam unification
deferred). No `IDbContextFactory` switch (shared check self-scopes).
### Task 6: Reference wiring — extend mapping + CPM versions + package refs
**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 4
**Files:**
- Modify: `~/Desktop/ScadaBridge/nuget.config` (source-mapping block, lines 13-20)
- Modify: `~/Desktop/ScadaBridge/Directory.Packages.props` (near lines 76-77)
- Modify: `~/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj` (ItemGroup lines 14-31)
**Step 1: Extend the Gitea source mapping** — add a second pattern under `dohertj2-gitea`:
```xml
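<packageSource key="dohertj2-gitea">
  <package pattern="ZB.MOM.WW.MxGateway.*" />
  <!-- Exact ID for the core package: the prefix pattern below only matches IDs that
       continue past "ZB.MOM.WW.Health." and so does not cover the core ID itself. -->
  <package pattern="ZB.MOM.WW.Health" />
  <package pattern="ZB.MOM.WW.Health.*" />
</packageSource>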
```
**Step 2: Add CPM versions** next to the existing `ZB.MOM.WW.*` lines in `Directory.Packages.props`:
```xml
<PackageVersion Include="ZB.MOM.WW.Health" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.Akka" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.EntityFrameworkCore" Version="0.1.0" />
```
**Step 3: Add package references** to the Host `.csproj` ItemGroup (no version — CPM):
```xml
<PackageReference Include="ZB.MOM.WW.Health" />
<PackageReference Include="ZB.MOM.WW.Health.Akka" />
<PackageReference Include="ZB.MOM.WW.Health.EntityFrameworkCore" />
```
**Step 4: Restore + commit**
```bash
cd ~/Desktop/ScadaBridge && git checkout -b feat/adopt-zb-health
dotnet restore ZB.MOM.WW.ScadaBridge.slnx
git add nuget.config Directory.Packages.props src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj
git commit -m "build: reference ZB.MOM.WW.Health packages from the Gitea feed"
```
### Task 7: Add the transient `ActorSystem` DI bridge (TDD)
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 1, Task 4
**Files:**
- Modify: `~/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/Program.cs` (near the Akka registration)
- Test: `~/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ActorSystemBridgeTests.cs`
**Why transient:** the shared checks call `sp.GetService<ActorSystem>()` **per probe** and treat
`null` as "not ready yet" (Degraded). A transient factory re-reads `AkkaHostedService.ActorSystem`
each resolve, returning `null` before startup and the live system after. A singleton would cache the
startup `null` forever.
**Step 1: Write the failing test**
```csharp
using Akka.Actor;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.ScadaBridge.Host.Actors;

namespace ZB.MOM.WW.ScadaBridge.Host.Tests;

public sealed class ActorSystemBridgeTests
{
    [Fact]
    public void ActorSystem_ResolvesNull_BeforeHostedServiceStarts()
    {
        var services = new ServiceCollection();
        services.AddSingleton<AkkaHostedService>(); // ActorSystem property is null pre-start
        services.AddTransient(sp => sp.GetRequiredService<AkkaHostedService>().ActorSystem!);
        using var provider = services.BuildServiceProvider();
        Assert.Null(provider.GetService<ActorSystem>()); // transient re-reads, so null is not cached
    }
}
```
If `AkkaHostedService` cannot be constructed without dependencies, register a minimal stub instead;
the assertion that matters is "transient bridge yields null before start." Read
`Actors/AkkaHostedService.cs` constructor first and adapt.
**Step 2: Run, expect failure**
```bash
cd ~/Desktop/ScadaBridge
dotnet test tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ZB.MOM.WW.ScadaBridge.Host.Tests.csproj --filter "FullyQualifiedName~ActorSystemBridgeTests"
```
Expected: FAIL (no `ActorSystem` registration → `GetService` returns null already, OR compile gap).
Adjust so the test meaningfully exercises the bridge registration you add in Step 3.
**Step 3: Register the bridge in `Program.cs`** (right after `AkkaHostedService` is registered):
```csharp
// The shared ZB.MOM.WW.Health Akka checks resolve ActorSystem from DI. ScadaBridge owns the
// ActorSystem inside AkkaHostedService (not a DI singleton), so bridge it as TRANSIENT: each
// resolve re-reads the current value, which is null while warming up (checks report Degraded)
// and the live system afterwards.
builder.Services.AddTransient<ActorSystem>(sp =>
    sp.GetRequiredService<AkkaHostedService>().ActorSystem!); // null before start; '!' is a hint only
```
**Caution:** the factory must NOT throw. The shared checks call `GetService<ActorSystem>()`, not
`GetRequiredService`, and `GetService` propagates factory exceptions instead of turning them into
`null`. Do not write `?? throw new InvalidOperationException(...)` in the factory; return the
(possibly null) property as shown above.
`GetService<ActorSystem>()` then returns `null` pre-start (Degraded) and the live system post-start.
Make the Step-1 test assert exactly this.
**Step 4: Run, expect pass**
```bash
dotnet test tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ZB.MOM.WW.ScadaBridge.Host.Tests.csproj --filter "FullyQualifiedName~ActorSystemBridgeTests"
```
Expected: PASS.
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.ScadaBridge.Host/Program.cs \
tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ActorSystemBridgeTests.cs
git commit -m "feat: bridge ActorSystem into DI (transient) for shared health checks"
```
### Task 8: Swap checks to shared probes; add `/healthz`; canonical writer
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Task 6 + Task 7)
**Files:**
- Modify: `~/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/Program.cs:114-117` (registration) and `:222-233` (endpoint mapping)
- Delete: `Health/DatabaseHealthCheck.cs`, `Health/AkkaClusterHealthCheck.cs`, `Health/ActiveNodeHealthCheck.cs`
- Keep: `Health/ActiveNodeGate.cs` (unchanged — seam unification deferred)
- Adjust: `tests/ZB.MOM.WW.ScadaBridge.Host.Tests/HealthCheckTests.cs`
**Step 1: Replace the registration block** (Program.cs lines 114-117):
```csharp
builder.Services.AddHealthChecks()
    .AddTypeActivatedCheck<DatabaseHealthCheck<ScadaBridgeDbContext>>(
        "database",
        failureStatus: null,
        tags: new[] { ZbHealthTags.Ready }) // default CanConnectAsync probe; self-scopes
    .AddTypeActivatedCheck<AkkaClusterHealthCheck>(
        "akka-cluster",
        failureStatus: null,
        tags: new[] { ZbHealthTags.Ready },
        args: AkkaClusterStatusPolicy.Default) // Up/Joining=Healthy, Leaving/Exiting=Degraded
    .AddTypeActivatedCheck<ActiveNodeHealthCheck>(
        "active-node",
        failureStatus: null,
        tags: new[] { ZbHealthTags.Active }); // role-less leader check
```
Add usings: `ZB.MOM.WW.Health`, `ZB.MOM.WW.Health.Akka`, `ZB.MOM.WW.Health.EntityFrameworkCore`,
`ZB.MOM.WW.ScadaBridge.ConfigurationDatabase` (for `ScadaBridgeDbContext`). Tag mapping preserves
the prior split: `database` + `akka-cluster` on ready; `active-node` on active.
**Step 2: Replace the endpoint mapping** (Program.cs lines 222-233 — the two `MapHealthChecks`
blocks using `UIResponseWriter`) with a single call:
```csharp
app.MapZbHealth(); // /health/ready (database+akka-cluster), /health/active (active-node), /healthz
```
This adds the previously-missing `/healthz` and switches both tiers to the canonical
`ZbHealthWriter`. Remove the now-unused `using` for `HealthChecks.UI.Client` /
`UIResponseWriter` if it becomes dead.
**Step 3: Delete the three bespoke checks**
```bash
cd ~/Desktop/ScadaBridge
git rm src/ZB.MOM.WW.ScadaBridge.Host/Health/DatabaseHealthCheck.cs \
src/ZB.MOM.WW.ScadaBridge.Host/Health/AkkaClusterHealthCheck.cs \
src/ZB.MOM.WW.ScadaBridge.Host/Health/ActiveNodeHealthCheck.cs
```
**Step 4: Build + test**
```bash
dotnet build ZB.MOM.WW.ScadaBridge.slnx
dotnet test tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ZB.MOM.WW.ScadaBridge.Host.Tests.csproj
```
Expected: build clean; tests pass. `HealthCheckTests.cs` likely references the deleted concrete
types or the old endpoint shape — retarget it to assert: `/health/ready`, `/health/active`, AND the
new `/healthz` are mapped; `database`+`akka-cluster` are tagged `ready`; `active-node` is tagged
`active`. The `Default` policy preserves ScadaBridge's `Joining`=Healthy classification — keep any
test asserting that.
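Beyond the unit tests, a running host can be smoke-checked against the three tiers. The sketch below rests on two assumptions: the host is listening at the base URL passed in, and it answers `404` for unmapped routes. `probe_tiers` and `http_status` are hypothetical names, and the other status codes depend on what the probes report at the time.

```bash
# Default fetcher: print only the HTTP status code for a URL.
http_status() { curl -s -o /dev/null -w '%{http_code}' "$1"; }

# Hypothetical smoke check: print "<path> <status>" for each canonical tier and
# return non-zero if any tier answers 404 (route not mapped). The fetch command
# is injectable so the function can be exercised without a running host.
probe_tiers() {
  local base="$1" fetch="${2:-http_status}" path code unmapped=0
  for path in /health/ready /health/active /healthz; do
    code=$("$fetch" "$base$path") || code="000"   # 000: no response at all
    echo "$path $code"
    if [ "$code" = "404" ]; then unmapped=1; fi
  done
  return "$unmapped"
}
```

For ScadaBridge the interesting line is `/healthz`, which did not exist before this task; `probe_tiers http://localhost:PORT` (with the host's real port substituted) should list it with a non-404 status.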
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.ScadaBridge.Host/Program.cs \
tests/ZB.MOM.WW.ScadaBridge.Host.Tests/HealthCheckTests.cs
git commit -m "feat: adopt shared ZB.MOM.WW.Health probes; add /healthz; canonical writer"
```
---
## Phase 4 — Bookkeeping (scadaproj)
### Task 9: Update the Health GAPS backlog to reflect adoption + deferrals
**Classification:** trivial
**Estimated implement time:** ~3 min
**Parallelizable with:** none (do last)
**Files:**
- Modify: `~/Desktop/scadaproj/components/health/GAPS.md` (adoption backlog table + a deferrals note)
**Step 1:** In `components/health/GAPS.md`, annotate the adoption-backlog rows as done for what
shipped (MxGateway tiers + `AuthStoreHealthCheck`; OtOpcUa shared probes; ScadaBridge shared probes
+ `/healthz` + canonical writer + ActorSystem bridge), and add a short "Deferred (verified
ill-fitting on adoption)" subsection capturing: downstream gRPC probes (no host-level channel),
ScadaBridge `IDbContextFactory` switch (shared check self-scopes), ScadaBridge `IActiveNodeGate`
seam unification (different InboundAPI interface), and MxGateway worker probe (named-pipe transport).
**Step 2: Commit (scadaproj)**
```bash
cd ~/Desktop/scadaproj
git add components/health/GAPS.md
git commit -m "docs(health): mark ZB.MOM.WW.Health adoption done; record verified deferrals"
```
---
## Execution notes
- **Order:** Task 0 first (gates everything). Then the three repo phases are independent — Tasks
1-3 (MxGateway), 4-5 (OtOpcUa), 6-8 (ScadaBridge) can run in parallel across repos; within a repo
they are sequential. Task 9 (scadaproj) last.
- **Per-repo green gate:** a phase is "done" only when that sister repo's full `dotnet build` +
`dotnet test` are green — not just the changed area.
- **Behaviour preservation is the acceptance bar:** the presets (`OtOpcUaCompat` / `Default`) and
the role filter (`"admin"` / role-less) exist to keep each app's Healthy/Degraded/Unhealthy
classifications identical. Any classification change is a defect, not an improvement.
- **No secrets in any diff** — the Gitea token / feed credentials are provided out-of-band; verify
no `nuget.config` or csproj change embeds them.
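The ordering rule in the first note can be checked mechanically against the `blockedBy` lists in the accompanying task file. A minimal sketch using the standard `tsort` utility; the edge list is transcribed by hand from those lists, and `plan_edges` is a hypothetical helper name.

```bash
# Each line is "<prerequisite> <task>", transcribed from the blockedBy lists.
plan_edges() {
  cat <<'EOF'
0 1
1 2
2 3
0 4
4 5
0 6
6 7
6 8
7 8
3 9
5 9
8 9
EOF
}

# tsort prints one valid execution order, one task id per line.
plan_edges | tsort
```

Several orders are valid because the three repo phases are independent, but every valid order starts with Task 0 and ends with Task 9.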
@@ -0,0 +1,16 @@
{
  "planPath": "docs/plans/2026-06-01-health-library-adoption.md",
  "tasks": [
    {"id": 0, "subject": "Task 0: Verify/publish Health nupkgs to Gitea", "status": "pending"},
    {"id": 1, "subject": "Task 1: MxGateway reference wiring", "status": "pending", "blockedBy": [0]},
    {"id": 2, "subject": "Task 2: MxGateway AuthStoreHealthCheck (TDD)", "status": "pending", "blockedBy": [1]},
    {"id": 3, "subject": "Task 3: MxGateway rewire to canonical tiers", "status": "pending", "blockedBy": [2]},
    {"id": 4, "subject": "Task 4: OtOpcUa reference wiring", "status": "pending", "blockedBy": [0]},
    {"id": 5, "subject": "Task 5: OtOpcUa swap to shared probes", "status": "pending", "blockedBy": [4]},
    {"id": 6, "subject": "Task 6: ScadaBridge reference wiring", "status": "pending", "blockedBy": [0]},
    {"id": 7, "subject": "Task 7: ScadaBridge ActorSystem DI bridge (TDD)", "status": "pending", "blockedBy": [6]},
    {"id": 8, "subject": "Task 8: ScadaBridge swap to shared probes", "status": "pending", "blockedBy": [6, 7]},
    {"id": 9, "subject": "Task 9: Update Health GAPS bookkeeping", "status": "pending", "blockedBy": [3, 5, 8]}
  ],
  "lastUpdated": "2026-06-01"
}