ZB.MOM.WW.Health Adoption Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
Goal: Adopt the shared ZB.MOM.WW.Health library into all three sister apps (OtOpcUa,
MxAccessGateway, ScadaBridge), replacing each app's bespoke health-check wiring with the shared
probes, canonical three-tier endpoints (/health/ready, /health/active, /healthz), and JSON
writer — behaviour-preserving.
Architecture: Distribution is via the Gitea NuGet registry (dohertj2-gitea feed). The shared
checks are registered with AddTypeActivatedCheck<T> (DI supplies IServiceProvider; extra
constructor args — policy / role / options — passed positionally) and tagged with ZbHealthTags;
MapZbHealth() routes each tier by tag. Each sister repo is its own git repo — branch, commit,
and (optionally) PR happen inside that repo, not in scadaproj. The three repo phases are mutually
independent after publish and may proceed in parallel.
Tech Stack: .NET 10, ASP.NET Core health checks (Microsoft.Extensions.Diagnostics.HealthChecks),
Akka.NET cluster, EF Core, Microsoft.Data.Sqlite, NuGet Central Package Management, Gitea NuGet feed.
Context the executor MUST know
This plan edits FOUR repos:
- ~/Desktop/scadaproj: only Phase 0 (verify publish) and Phase 4 (GAPS bookkeeping).
- ~/Desktop/MxAccessGateway: Phase 1 (core package only).
- ~/Desktop/OtOpcUa: Phase 2 (all three packages).
- ~/Desktop/ScadaBridge: Phase 3 (all three packages).
Per-repo git discipline: each sister repo is independent. Before editing a sister repo, create a
branch feat/adopt-zb-health. Commit inside that repo. Never commit sister-repo changes from
scadaproj. Never skip hooks; never force-push.
Distribution status (Task 0 already done): the three ZB.MOM.WW.Health 0.1.0 packages are
published to the dohertj2-gitea feed, and authenticated read credentials are configured at the
user level (~/.nuget/NuGet/NuGet.Config) — anonymous read is OFF, so restore needs them, and
they are already in place for every subagent. NEVER put the token in a repo file.
Source-mapping gotcha (verified): a ZB.MOM.WW.Health.* pattern does NOT match the core package
id ZB.MOM.WW.Health (no trailing dot). Every repo's packageSourceMapping for the Gitea feed MUST
list BOTH <package pattern="ZB.MOM.WW.Health" /> and <package pattern="ZB.MOM.WW.Health.*" />.
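The gotcha above can be checked mechanically before a restore. The following is a minimal shell sketch, not one of the plan's required steps. The function name check_health_mapping is hypothetical, and the check only looks for the two pattern attributes anywhere in the file, so it assumes they sit under the dohertj2-gitea source.

```shell
# Hypothetical helper: succeeds only when the given nuget.config lists BOTH
# Health patterns (the exact core id and the wildcard for the sub-packages).
check_health_mapping() {
  config="$1"
  # -F matches fixed strings, so the dots and the '*' are taken literally.
  grep -Fq 'pattern="ZB.MOM.WW.Health"' "$config" &&
    grep -Fq 'pattern="ZB.MOM.WW.Health.*"' "$config"
}
```

Possible usage: check_health_mapping ~/Desktop/OtOpcUa/NuGet.config || echo "Health source mapping is incomplete". A file that carries only the wildcard pattern fails the first grep, which is exactly the case the gotcha describes.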
Shared registration idiom (used in every phase). The shared checks need constructor args DI
can't supply alone, so register them with AddTypeActivatedCheck<T>:
using Microsoft.Extensions.DependencyInjection; // AddTypeActivatedCheck
using Microsoft.Extensions.Diagnostics.HealthChecks; // HealthStatus
using ZB.MOM.WW.Health; // ZbHealthTags, MapZbHealth, ZbHealthWriter
// + ZB.MOM.WW.Health.Akka / .EntityFrameworkCore where used
AddTypeActivatedCheck<T>(name, failureStatus, tags, params object[] args) builds the check via
ActivatorUtilities.CreateInstance: IServiceProvider constructor params are satisfied from DI;
anything else (an AkkaClusterStatusPolicy, a role string, a DatabaseHealthCheckOptions<T>) is
taken from args by type. This is the canonical way to wire the shared checks.
Library public API (verified, do not re-derive):
- endpoints.MapZbHealth(ZbHealthEndpointOptions? = null): maps ready/active/live; defaults /health/ready, /health/active, /healthz; ready + active use ZbHealthWriter.WriteJsonAsync; all anonymous. Does NOT call AddHealthChecks().
- ZbHealthTags.Ready = "ready", .Active = "active", .Live = "live".
- DatabaseHealthCheck<TContext>(IServiceProvider, DatabaseHealthCheckOptions<TContext>?): default probe CanConnectAsync; options.ProbeQuery = Func<TContext, CancellationToken, Task> for the stricter query probe; resolves an IDbContextFactory<TContext> if registered, else a scoped TContext from a fresh scope (pool-safe).
- AkkaClusterHealthCheck(IServiceProvider, AkkaClusterStatusPolicy): presets AkkaClusterStatusPolicy.Default and .OtOpcUaCompat. Resolves ActorSystem from DI.
- ActiveNodeHealthCheck(IServiceProvider) (role-less) / (IServiceProvider, string role). Resolves ActorSystem from DI lazily; Degraded if not yet available.
- AkkaActiveNodeGate(IServiceProvider) : IActiveNodeGate. Not used in this plan (ScadaBridge seam unification is deferred).
Scope deferrals (settled — do NOT implement here): downstream gRPC dependency probes (no
host-level GrpcChannel exists in OtOpcUa or MxGateway); ScadaBridge IDbContextFactory switch
(the shared check self-scopes); ScadaBridge IActiveNodeGate seam unification (its interface is
...InboundAPI.IActiveNodeGate, wired into inbound-API gating — out of scope). These are recorded
as follow-ups in Phase 4.
Phase 0 — Publish the Health packages (prerequisite)
Task 0: Verify the three Health nupkgs are on the Gitea feed (publish if absent)
Classification: small Estimated implement time: ~3 min Parallelizable with: none (gates all other phases)
Files:
- Read: ~/Desktop/scadaproj/ZB.MOM.WW.Health/ZB.MOM.WW.Health.slnx
- (No source edits: this is a pack/push/verify task.)
Step 1: Check whether the packages already resolve from Gitea
The library CLAUDE.md claims they are "published to the Gitea NuGet feed." Verify:
curl -s "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/v3/registration/ZB.MOM.WW.Health/index.json" -o /dev/null -w "%{http_code}\n"
Expected: 200 if already published. If 404/401, publish (Steps 2–3). If credentials are
needed for the query, skip to Step 2 and rely on the push result.
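The decision in Step 1 can be written down as a small lookup. This is only a sketch: the function name feed_next_step and the three action labels are hypothetical. Treating any other status as "stop" is an assumption that follows the plan's fallback rule for an unreachable Gitea (curl reports 000 when it cannot connect).

```shell
# Hypothetical helper: maps the HTTP status from the registration query
# to the next action described in Step 1.
feed_next_step() {
  case "$1" in
    200) echo "skip-publish" ;;        # packages already resolve from the feed
    401|404) echo "pack-and-push" ;;   # absent, or the query needs credentials
    *) echo "stop-and-report" ;;       # e.g. 000 when the host is unreachable
  esac
}
```

Feeding it the status code printed by the curl command above (for example feed_next_step 404) prints the label of the step to take next.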
Step 2: Pack (only if not already published)
cd ~/Desktop/scadaproj/ZB.MOM.WW.Health
dotnet pack ZB.MOM.WW.Health.slnx -c Release -o ./artifacts
ls artifacts/*.nupkg
Expected: ZB.MOM.WW.Health.0.1.0.nupkg, ZB.MOM.WW.Health.Akka.0.1.0.nupkg,
ZB.MOM.WW.Health.EntityFrameworkCore.0.1.0.nupkg.
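Rather than reading the ls output by eye, the expected file set can be asserted. A minimal sketch follows; check_health_artifacts is a hypothetical name, and the default version 0.1.0 is taken from the expected file names above.

```shell
# Hypothetical helper: verifies that all three expected nupkg files exist in
# the given directory, printing one "missing: ..." line per absent file.
check_health_artifacts() {
  dir="$1"
  version="${2:-0.1.0}"
  missing=0
  for id in ZB.MOM.WW.Health ZB.MOM.WW.Health.Akka ZB.MOM.WW.Health.EntityFrameworkCore; do
    if [ ! -f "$dir/$id.$version.nupkg" ]; then
      echo "missing: $id.$version.nupkg"
      missing=1
    fi
  done
  return "$missing"
}
```

Run from the library directory as check_health_artifacts ./artifacts; a non-zero exit means the pack step did not produce every package and the push should not start.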
Step 3: Push to the Gitea feed
Credentials are NOT in the repo. The developer/CI provides them. Push each package:
dotnet nuget push "artifacts/ZB.MOM.WW.Health*.0.1.0.nupkg" \
--source "https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" \
--api-key "$GITEA_NUGET_TOKEN"
Expected: Your package was pushed. for each (or 409 Conflict = already present = fine).
Fallback (if Gitea is unreachable): STOP and surface it. Do not silently switch mechanisms —
the fallback (local folder feed) changes only each repo's nuget.config source line, but that is a
plan amendment the user should approve.
Step 4: Commit (none in scadaproj for this task) — no source changed; proceed to Phase 1.
Phase 1 — MxAccessGateway (core package only)
Repo: ~/Desktop/MxAccessGateway. Branch: feat/adopt-zb-health. This repo has no CPM and no
nuget.config today. Readiness probe = a custom AuthStoreHealthCheck over the SQLite auth store
(the gateway authenticates every gRPC call against it).
Task 1: Reference wiring — create nuget.config, add the package reference
Classification: small Estimated implement time: ~3 min Parallelizable with: Task 4, Task 7 (different repos)
Files:
- Create: ~/Desktop/MxAccessGateway/nuget.config
- Modify: ~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj (ItemGroup, after line 13)
Step 1: Create nuget.config (this repo's first; nuget.org for everything, Gitea for Health)
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<packageSources>
<clear />
<add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
<add key="dohertj2-gitea" value="https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" />
</packageSources>
<!-- nuget.org serves everything; the Gitea feed serves only the ZB.MOM.WW.* shared libs.
Credentials are NOT committed: provide them per-developer via `dotnet nuget add source`
(username + access token) or NuGet credential env vars in CI. -->
<packageSourceMapping>
<packageSource key="nuget.org">
<package pattern="*" />
</packageSource>
<packageSource key="dohertj2-gitea">
<package pattern="ZB.MOM.WW.Health" />
<package pattern="ZB.MOM.WW.Health.*" />
</packageSource>
</packageSourceMapping>
</configuration>
Step 2: Add the package reference to the Server .csproj. Insert into the first <ItemGroup>
(the one ending at the current line 14):
<PackageReference Include="ZB.MOM.WW.Health" Version="0.1.0" />
(Direct versioned reference — this repo has no CPM. Do not introduce CPM.)
Step 3: Restore to verify the feed resolves
cd ~/Desktop/MxAccessGateway
dotnet restore src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
Expected: restore succeeds and pulls ZB.MOM.WW.Health 0.1.0 from dohertj2-gitea. If it 401s,
the developer must add the Gitea source credentials (dotnet nuget add source … -u … -p … --store-password-in-clear-text).
Step 4: Commit
cd ~/Desktop/MxAccessGateway && git checkout -b feat/adopt-zb-health
git add nuget.config src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
git commit -m "build: reference ZB.MOM.WW.Health from the Gitea feed"
Task 2: Write the custom AuthStoreHealthCheck (TDD)
Classification: standard Estimated implement time: ~5 min Parallelizable with: Task 4, Task 7
Files:
- Create: ~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/Diagnostics/AuthStoreHealthCheck.cs
- Test: ~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/AuthStoreHealthCheckTests.cs
Step 1: Write the failing tests
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.MxGateway.Server.Configuration;
using ZB.MOM.WW.MxGateway.Server.Diagnostics;
using ZB.MOM.WW.MxGateway.Server.Security.Authentication;
namespace ZB.MOM.WW.MxGateway.Tests.Diagnostics;
public sealed class AuthStoreHealthCheckTests
{
private static AuthSqliteConnectionFactory FactoryFor(string sqlitePath)
{
var options = new GatewayOptions();
options.Authentication.SqlitePath = sqlitePath;
return new AuthSqliteConnectionFactory(Options.Create(options));
}
[Fact]
public async Task Healthy_WhenStoreReachable()
{
var path = Path.Combine(Path.GetTempPath(), $"authcheck-{Guid.NewGuid():N}.db");
try
{
var check = new AuthStoreHealthCheck(FactoryFor(path));
var result = await check.CheckHealthAsync(new HealthCheckContext());
Assert.Equal(HealthStatus.Healthy, result.Status);
}
finally { if (File.Exists(path)) File.Delete(path); }
}
[Fact]
public async Task Unhealthy_WhenPathUnusable()
{
// A path whose parent cannot be created (a file used as a directory) forces open to fail.
var bogus = Path.Combine(Path.GetTempPath(), $"authcheck-{Guid.NewGuid():N}");
await File.WriteAllTextAsync(bogus, "x");
try
{
var check = new AuthStoreHealthCheck(FactoryFor(Path.Combine(bogus, "store.db")));
var result = await check.CheckHealthAsync(new HealthCheckContext());
Assert.Equal(HealthStatus.Unhealthy, result.Status);
}
finally { if (File.Exists(bogus)) File.Delete(bogus); }
}
}
Step 2: Run, expect failure (type does not exist)
cd ~/Desktop/MxAccessGateway
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~AuthStoreHealthCheckTests"
Expected: COMPILE ERROR / FAIL — AuthStoreHealthCheck not found.
Step 3: Implement the check
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using ZB.MOM.WW.MxGateway.Server.Security.Authentication;
namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;
/// <summary>
/// Readiness probe: verifies the SQLite authentication store is reachable. The gateway
/// authenticates every gRPC call against this store, so its reachability gates readiness.
/// </summary>
public sealed class AuthStoreHealthCheck : IHealthCheck
{
private readonly AuthSqliteConnectionFactory _connectionFactory;
public AuthStoreHealthCheck(AuthSqliteConnectionFactory connectionFactory) =>
_connectionFactory = connectionFactory ?? throw new ArgumentNullException(nameof(connectionFactory));
public async Task<HealthCheckResult> CheckHealthAsync(
HealthCheckContext context,
CancellationToken cancellationToken = default)
{
try
{
await using SqliteConnection connection =
await _connectionFactory.OpenConnectionAsync(cancellationToken).ConfigureAwait(false);
await using SqliteCommand command = connection.CreateCommand();
command.CommandText = "SELECT 1;";
await command.ExecuteScalarAsync(cancellationToken).ConfigureAwait(false);
return HealthCheckResult.Healthy("Auth store is reachable.");
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
throw;
}
catch (Exception ex)
{
return HealthCheckResult.Unhealthy("Auth store is unreachable.", ex);
}
}
}
Step 4: Run, expect pass
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~AuthStoreHealthCheckTests"
Expected: PASS (2 tests). If the GatewayOptions.Authentication.SqlitePath accessor differs, adjust
the test helper to match the real options shape (read Configuration/GatewayOptions.cs first).
Step 5: Commit
git add src/ZB.MOM.WW.MxGateway.Server/Diagnostics/AuthStoreHealthCheck.cs \
src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/AuthStoreHealthCheckTests.cs
git commit -m "feat: add AuthStoreHealthCheck readiness probe"
Task 3: Rewire GatewayApplication to the canonical tiers; fix the route test
Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 4, Task 7
Files:
- Modify: ~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs:63-66 (the AddHealthChecks() line) and :172-178 (the /health/live block)
- Modify: ~/Desktop/MxAccessGateway/src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs:14-27
Step 1: Replace the bare AddHealthChecks() (line 66) with the tagged readiness probe
builder.Services.AddHealthChecks()
.AddTypeActivatedCheck<AuthStoreHealthCheck>(
"auth-store",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready });
Add using ZB.MOM.WW.Health; and, only if it is missing, using ZB.MOM.WW.MxGateway.Server.Diagnostics;
(Diagnostics is already imported at line 9).
Step 2: Delete the /health/live block (lines 172-178) and map the canonical tiers
Remove:
endpoints.MapGet(
"/health/live",
() => Results.Ok(new GatewayHealthReply(
Status: "Healthy",
DefaultBackend: GatewayContractInfo.DefaultBackendName,
WorkerProtocolVersion: GatewayContractInfo.WorkerProtocolVersion)))
.WithName("LiveHealth");
Replace with:
endpoints.MapZbHealth();
(/health/ready runs auth-store; /health/active runs no checks → 200; /healthz is bare
liveness. The GatewayHealthReply type may now be unused — if so, the C# compiler won't flag it;
leave it unless a "remove dead code" reviewer asks, to keep this change tight.)
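Once the gateway runs with the new mapping, the three tiers can be spot-checked from a shell. This is a hedged sketch, not part of the plan: smoke_health is a hypothetical name, the base URL is whatever the gateway listens on locally, and treating 200 as the only passing status is an assumption about how the endpoints report failure.

```shell
# Hypothetical smoke check: requests each canonical tier and prints
# "<path> <status>". Returns non-zero if any tier answers with a status
# other than 200.
smoke_health() {
  base="$1"
  rc=0
  for path in /health/ready /health/active /healthz; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "$base$path")
    echo "$path $code"
    [ "$code" = "200" ] || rc=1
  done
  return "$rc"
}
```

Called as smoke_health with the gateway's local base URL, it prints one line per tier, so a failing readiness probe is visible next to the still-passing liveness endpoint.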
Step 3: Update the route test (GatewayApplicationTests.cs:14-27) to assert the three tiers
instead of /health/live:
/// <summary>Verifies that Build maps the canonical three health tiers.</summary>
[Fact]
public async Task Build_MapsCanonicalHealthEndpoints()
{
await using WebApplication app = GatewayApplication.Build([]);
var paths = ((IEndpointRouteBuilder)app).DataSources
.SelectMany(dataSource => dataSource.Endpoints)
.OfType<RouteEndpoint>()
.Select(e => e.RoutePattern.RawText)
.ToHashSet();
Assert.Contains("/health/ready", paths);
Assert.Contains("/health/active", paths);
Assert.Contains("/healthz", paths);
Assert.DoesNotContain("/health/live", paths);
}
Step 4: Build + test the whole gateway
cd ~/Desktop/MxAccessGateway
dotnet build src/MxGateway.sln
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj
Expected: build clean; all tests pass (the old Build_MapsLiveHealthEndpoint is replaced). If any
other test references /health/live or LiveHealth, update it the same way.
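Leftover references to the removed endpoint can be located with a recursive search before running the full suite. The sketch below uses a hypothetical function name, find_stale_live_refs; the two search strings come from the removed block.

```shell
# Hypothetical helper: lists every line under the given directory that still
# mentions the removed endpoint path or its route name.
find_stale_live_refs() {
  grep -rn -e '/health/live' -e 'LiveHealth' "$1"
}
```

Running find_stale_live_refs ~/Desktop/MxAccessGateway/src prints file:line:text for each hit and exits non-zero when nothing matches. One hit is expected after this task: the Assert.DoesNotContain("/health/live", paths) line in the updated route test.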
Step 5: Commit
git add src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs \
src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayApplicationTests.cs
git commit -m "feat: map canonical ZB health tiers; replace bypassing /health/live"
Phase 2 — OtOpcUa (all three packages)
Repo: ~/Desktop/OtOpcUa. Branch: feat/adopt-zb-health. CPM present; NuGet.config has nuget.org
plus a local-mxgw folder feed, NO source mapping. ActorSystem IS in DI (the bespoke AkkaClusterHealthCheck injects it directly). This is the cleanest of the three.
Task 4: Reference wiring — add Gitea source + mapping + CPM versions + package refs
Classification: small Estimated implement time: ~4 min Parallelizable with: Task 1, Task 7
Files:
- Modify: ~/Desktop/OtOpcUa/NuGet.config
- Modify: ~/Desktop/OtOpcUa/Directory.Packages.props (near line 99-100)
- Modify: ~/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj (ItemGroup lines 16-30)
Step 1: Add the Gitea source + source mapping to NuGet.config. Because adding a mapping makes
ALL sources mapped explicitly, map the existing feeds too:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<packageSources>
<add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
<add key="local-mxgw" value="./nuget-packages" />
<add key="dohertj2-gitea" value="https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" />
</packageSources>
<packageSourceMapping>
<packageSource key="nuget.org">
<package pattern="*" />
</packageSource>
<packageSource key="local-mxgw">
<package pattern="ZB.MOM.WW.MxGateway.*" />
</packageSource>
<packageSource key="dohertj2-gitea">
<package pattern="ZB.MOM.WW.Health" />
<package pattern="ZB.MOM.WW.Health.*" />
</packageSource>
</packageSourceMapping>
</configuration>
Step 2: Add CPM versions to Directory.Packages.props next to the existing ZB.MOM.WW.* lines:
<PackageVersion Include="ZB.MOM.WW.Health" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.Akka" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.EntityFrameworkCore" Version="0.1.0" />
Step 3: Add package references (no version — CPM) to the Host .csproj ItemGroup:
<PackageReference Include="ZB.MOM.WW.Health" />
<PackageReference Include="ZB.MOM.WW.Health.Akka" />
<PackageReference Include="ZB.MOM.WW.Health.EntityFrameworkCore" />
Step 4: Restore
cd ~/Desktop/OtOpcUa && git checkout -b feat/adopt-zb-health
dotnet restore ZB.MOM.WW.OtOpcUa.slnx
Expected: restore succeeds; the three Health packages come from dohertj2-gitea, MxGateway stays on
local-mxgw.
Step 5: Commit
git add NuGet.config Directory.Packages.props src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj
git commit -m "build: reference ZB.MOM.WW.Health packages from the Gitea feed"
Task 5: Swap the three checks to shared probes; map tiers via MapZbHealth
Classification: high-risk Estimated implement time: ~5 min Parallelizable with: Task 1, Task 7
Files:
- Rewrite: ~/Desktop/OtOpcUa/src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs
- Delete: Health/DatabaseHealthCheck.cs, Health/AkkaClusterHealthCheck.cs, Health/AdminRoleLeaderHealthCheck.cs
- Verify call sites unchanged: Program.cs:137 (AddOtOpcUaHealth), Program.cs:159 (MapOtOpcUaHealth)
Step 1: Rewrite HealthEndpoints.cs to register the shared checks (preserving names + tags) and
map via MapZbHealth():
using Microsoft.AspNetCore.Routing;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.Health;
using ZB.MOM.WW.Health.Akka;
using ZB.MOM.WW.Health.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
namespace ZB.MOM.WW.OtOpcUa.Host.Health;
public static class HealthEndpoints
{
/// <summary>
/// Registers the shared ZB.MOM.WW health probes. Tier semantics preserved from the bespoke
/// implementation: configdb + akka on ready+active; admin-leader on active only.
/// </summary>
public static IServiceCollection AddOtOpcUaHealth(this IServiceCollection services)
{
services.AddHealthChecks()
.AddTypeActivatedCheck<DatabaseHealthCheck<OtOpcUaConfigDbContext>>(
"configdb",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready, ZbHealthTags.Active },
args: new DatabaseHealthCheckOptions<OtOpcUaConfigDbContext>
{
// Preserve OtOpcUa's stricter schema-touching probe.
ProbeQuery = static (db, ct) => db.Deployments.AsNoTracking().Take(1).ToListAsync(ct),
})
.AddTypeActivatedCheck<AkkaClusterHealthCheck>(
"akka",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready, ZbHealthTags.Active },
args: AkkaClusterStatusPolicy.OtOpcUaCompat)
.AddTypeActivatedCheck<ActiveNodeHealthCheck>(
"admin-leader",
failureStatus: null,
tags: new[] { ZbHealthTags.Active },
args: "admin");
return services;
}
/// <summary>Maps the canonical three-tier health endpoints.</summary>
public static IEndpointRouteBuilder MapOtOpcUaHealth(this IEndpointRouteBuilder app)
{
app.MapZbHealth(); // /health/ready, /health/active, /healthz — all AllowAnonymous
return app;
}
}
Note: args: is the params object[] — pass a single options object / policy / string. If the
compiler binds the single-array overload oddly, wrap as args: new object[] { … }.
Step 2: Delete the three bespoke check files
cd ~/Desktop/OtOpcUa
git rm src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/DatabaseHealthCheck.cs \
src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AkkaClusterHealthCheck.cs \
src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/AdminRoleLeaderHealthCheck.cs
(IClusterRoleInfo may now be unused by Health; leave its definition — it may be used elsewhere.)
Step 3: Build
dotnet build ZB.MOM.WW.OtOpcUa.slnx
Expected: clean. Fix any now-dangling using ...Host.Health references to the deleted types.
Step 4: Run health-related tests
dotnet test ZB.MOM.WW.OtOpcUa.slnx --filter "FullyQualifiedName~Health"
Expected: pass. Behaviour-parity checks the executor must confirm (add/keep tests if missing):
- akka tier: self Up → Healthy; self not Up → Degraded (the OtOpcUaCompat preset reproduces the self-Up scan).
- admin-leader: node without admin role → Healthy; admin member non-leader → Degraded; admin leader → Healthy. (Shared check reads Cluster.Get(system).SelfMember + RoleLeader("admin"), vs the old IClusterRoleInfo; verify equivalence on a formed test cluster or via the library's own ActiveNodeDecision table, which is already covered in the library's tests.)
Step 5: Commit
git add src/Server/ZB.MOM.WW.OtOpcUa.Host/Health/HealthEndpoints.cs
git commit -m "feat: adopt shared ZB.MOM.WW.Health probes (preserve tiers + OtOpcUaCompat policy)"
Phase 3 — ScadaBridge (all three packages)
Repo: ~/Desktop/ScadaBridge. Branch: feat/adopt-zb-health. CPM + Gitea feed already wired (just
extend mapping). ActorSystem is NOT in DI (owned by AkkaHostedService) — add a transient DI
bridge so the shared checks can resolve it. Keep the existing ActiveNodeGate (seam unification
deferred). No IDbContextFactory switch (shared check self-scopes).
Task 6: Reference wiring — extend mapping + CPM versions + package refs
Classification: small Estimated implement time: ~4 min Parallelizable with: Task 1, Task 4
Files:
- Modify: ~/Desktop/ScadaBridge/nuget.config (source-mapping block, lines 13-20)
- Modify: ~/Desktop/ScadaBridge/Directory.Packages.props (near lines 76-77)
- Modify: ~/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj (ItemGroup lines 14-31)
Step 1: Extend the Gitea source mapping: add the two Health patterns under dohertj2-gitea, next to the existing MxGateway pattern:
<packageSource key="dohertj2-gitea">
<package pattern="ZB.MOM.WW.MxGateway.*" />
<package pattern="ZB.MOM.WW.Health" />
<package pattern="ZB.MOM.WW.Health.*" />
</packageSource>
Step 2: Add CPM versions next to the existing ZB.MOM.WW.* lines in Directory.Packages.props:
<PackageVersion Include="ZB.MOM.WW.Health" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.Akka" Version="0.1.0" />
<PackageVersion Include="ZB.MOM.WW.Health.EntityFrameworkCore" Version="0.1.0" />
Step 3: Add package references to the Host .csproj ItemGroup (no version — CPM):
<PackageReference Include="ZB.MOM.WW.Health" />
<PackageReference Include="ZB.MOM.WW.Health.Akka" />
<PackageReference Include="ZB.MOM.WW.Health.EntityFrameworkCore" />
Step 4: Restore + commit
cd ~/Desktop/ScadaBridge && git checkout -b feat/adopt-zb-health
dotnet restore ZB.MOM.WW.ScadaBridge.slnx
git add nuget.config Directory.Packages.props src/ZB.MOM.WW.ScadaBridge.Host/ZB.MOM.WW.ScadaBridge.Host.csproj
git commit -m "build: reference ZB.MOM.WW.Health packages from the Gitea feed"
Task 7: Add the transient ActorSystem DI bridge (TDD)
Classification: standard Estimated implement time: ~4 min Parallelizable with: Task 1, Task 4
Files:
- Modify: ~/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/Program.cs (near the Akka registration)
- Test: ~/Desktop/ScadaBridge/tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ActorSystemBridgeTests.cs
Why transient: the shared checks call sp.GetService<ActorSystem>() per probe and treat
null as "not ready yet" (Degraded). A transient factory re-reads AkkaHostedService.ActorSystem
each resolve, returning null before startup and the live system after. A singleton would cache the
startup null forever.
Step 1: Write the failing test
using Akka.Actor;
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.ScadaBridge.Host.Actors;
namespace ZB.MOM.WW.ScadaBridge.Host.Tests;
public sealed class ActorSystemBridgeTests
{
[Fact]
public void ActorSystem_ResolvesNull_BeforeHostedServiceStarts()
{
var services = new ServiceCollection();
services.AddSingleton<AkkaHostedService>(); // ActorSystem property is null pre-start
services.AddTransient(sp => sp.GetRequiredService<AkkaHostedService>().ActorSystem!);
using var provider = services.BuildServiceProvider();
Assert.Null(provider.GetService<ActorSystem>()); // transient re-reads → null, not cached
}
}
If AkkaHostedService cannot be constructed without dependencies, register a minimal stub instead;
the assertion that matters is "transient bridge yields null before start." Read
Actors/AkkaHostedService.cs constructor first and adapt.
Step 2: Run, expect failure
cd ~/Desktop/ScadaBridge
dotnet test tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ZB.MOM.WW.ScadaBridge.Host.Tests.csproj --filter "FullyQualifiedName~ActorSystemBridgeTests"
Expected: FAIL or a compile gap. Caveat: with no ActorSystem registration at all, GetService
already returns null, so a bare null assertion can pass vacuously. Adjust the test so it
meaningfully exercises the bridge registration you add in Step 3.
Step 3: Register the bridge in Program.cs (right after AkkaHostedService is registered):
// The shared ZB.MOM.WW.Health Akka checks resolve ActorSystem from DI. ScadaBridge owns the
// ActorSystem inside AkkaHostedService (not a DI singleton), so bridge it as TRANSIENT: each
// resolve re-reads the current value — null while warming up (checks → Degraded), live afterwards.
builder.Services.AddTransient(sp =>
sp.GetRequiredService<AkkaHostedService>().ActorSystem
?? throw new InvalidOperationException("ActorSystem not yet started."));
Caution: do NOT use the throwing factory above. The shared checks call GetService<ActorSystem>()
(which returns null when nothing resolves), NOT GetRequiredService, and GetService does not turn
a factory exception into null: it propagates the exception. The factory must therefore return
null instead of throwing. Use:
builder.Services.AddTransient<ActorSystem>(sp =>
sp.GetRequiredService<AkkaHostedService>().ActorSystem!); // null before start; '!' is a hint only
GetService<ActorSystem>() then returns null pre-start (Degraded) and the live system post-start.
Make the Step-1 test assert exactly this.
Step 4: Run, expect pass
dotnet test tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ZB.MOM.WW.ScadaBridge.Host.Tests.csproj --filter "FullyQualifiedName~ActorSystemBridgeTests"
Expected: PASS.
Step 5: Commit
git add src/ZB.MOM.WW.ScadaBridge.Host/Program.cs \
tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ActorSystemBridgeTests.cs
git commit -m "feat: bridge ActorSystem into DI (transient) for shared health checks"
Task 8: Swap checks to shared probes; add /healthz; canonical writer
Classification: high-risk Estimated implement time: ~5 min Parallelizable with: none (depends on Task 6 + Task 7)
Files:
- Modify: ~/Desktop/ScadaBridge/src/ZB.MOM.WW.ScadaBridge.Host/Program.cs:114-117 (registration) and :222-233 (endpoint mapping)
- Delete: Health/DatabaseHealthCheck.cs, Health/AkkaClusterHealthCheck.cs, Health/ActiveNodeHealthCheck.cs
- Keep: Health/ActiveNodeGate.cs (unchanged; seam unification deferred)
- Adjust: tests/ZB.MOM.WW.ScadaBridge.Host.Tests/HealthCheckTests.cs
Step 1: Replace the registration block (Program.cs lines 114-117):
builder.Services.AddHealthChecks()
.AddTypeActivatedCheck<DatabaseHealthCheck<ScadaBridgeDbContext>>(
"database",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready }) // default CanConnectAsync probe; self-scopes
.AddTypeActivatedCheck<AkkaClusterHealthCheck>(
"akka-cluster",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready },
args: AkkaClusterStatusPolicy.Default) // Up/Joining=Healthy, Leaving/Exiting=Degraded
.AddTypeActivatedCheck<ActiveNodeHealthCheck>(
"active-node",
failureStatus: null,
tags: new[] { ZbHealthTags.Active }); // role-less leader check
Add usings: ZB.MOM.WW.Health, ZB.MOM.WW.Health.Akka, ZB.MOM.WW.Health.EntityFrameworkCore,
ZB.MOM.WW.ScadaBridge.ConfigurationDatabase (for ScadaBridgeDbContext). Tag mapping preserves
the prior split: database + akka-cluster on ready; active-node on active.
Step 2: Replace the endpoint mapping (Program.cs lines 222-233 — the two MapHealthChecks
blocks using UIResponseWriter) with a single call:
app.MapZbHealth(); // /health/ready (database+akka-cluster), /health/active (active-node), /healthz
This adds the previously-missing /healthz and switches both tiers to the canonical
ZbHealthWriter. Remove the now-unused using for HealthChecks.UI.Client /
UIResponseWriter if it becomes dead.
Step 3: Delete the three bespoke checks
cd ~/Desktop/ScadaBridge
git rm src/ZB.MOM.WW.ScadaBridge.Host/Health/DatabaseHealthCheck.cs \
src/ZB.MOM.WW.ScadaBridge.Host/Health/AkkaClusterHealthCheck.cs \
src/ZB.MOM.WW.ScadaBridge.Host/Health/ActiveNodeHealthCheck.cs
Step 4: Build + test
dotnet build ZB.MOM.WW.ScadaBridge.slnx
dotnet test tests/ZB.MOM.WW.ScadaBridge.Host.Tests/ZB.MOM.WW.ScadaBridge.Host.Tests.csproj
Expected: build clean; tests pass. HealthCheckTests.cs likely references the deleted concrete
types or the old endpoint shape — retarget it to assert: /health/ready, /health/active, AND the
new /healthz are mapped; database+akka-cluster are tagged ready; active-node is tagged
active. The Default policy preserves ScadaBridge's Joining=Healthy classification — keep any
test asserting that.
Step 5: Commit
git add src/ZB.MOM.WW.ScadaBridge.Host/Program.cs \
tests/ZB.MOM.WW.ScadaBridge.Host.Tests/HealthCheckTests.cs
git commit -m "feat: adopt shared ZB.MOM.WW.Health probes; add /healthz; canonical writer"
Phase 4 — Bookkeeping (scadaproj)
Task 9: Update the Health GAPS backlog to reflect adoption + deferrals
Classification: trivial Estimated implement time: ~3 min Parallelizable with: none (do last)
Files:
- Modify: ~/Desktop/scadaproj/components/health/GAPS.md (adoption backlog table + a deferrals note)
Step 1: In components/health/GAPS.md, annotate the adoption-backlog rows as done for what
shipped (MxGateway tiers + AuthStoreHealthCheck; OtOpcUa shared probes; ScadaBridge shared probes
+ /healthz + canonical writer + ActorSystem bridge), and add a short "Deferred (verified
ill-fitting on adoption)" subsection capturing: downstream gRPC probes (no host-level channel),
ScadaBridge IDbContextFactory switch (shared check self-scopes), ScadaBridge IActiveNodeGate seam
unification (different InboundAPI interface), and MxGateway worker probe (named-pipe transport).
Step 2: Commit (scadaproj)
cd ~/Desktop/scadaproj
git add components/health/GAPS.md
git commit -m "docs(health): mark ZB.MOM.WW.Health adoption done; record verified deferrals"
Execution notes
- Order: Task 0 first (gates everything). Then the three repo phases are independent: Tasks 1-3 (MxGateway), 4-5 (OtOpcUa), 6-8 (ScadaBridge) can run in parallel across repos; within a repo they are sequential. Task 9 (scadaproj) last.
- Per-repo green gate: a phase is "done" only when that sister repo's full dotnet build + dotnet test are green, not just the changed area.
- Behaviour preservation is the acceptance bar: the presets (OtOpcUaCompat / Default) and the role filter ("admin" / role-less) exist to keep each app's Healthy/Degraded/Unhealthy classifications identical. Any classification change is a defect, not an improvement.
- No secrets in any diff: the Gitea token / feed credentials are provided out-of-band; verify no nuget.config or csproj change embeds them.
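The no-secrets note can be backed by a quick scan of the touched config files before each commit. The sketch below is an assumption-laden heuristic, not a complete secret scanner: contains_feed_credentials is a hypothetical name, and it only looks for the packageSourceCredentials section and the ClearTextPassword key that NuGet writes when credentials are stored in a config file. A raw token pasted elsewhere would not be caught.

```shell
# Hypothetical helper: exits 0 when any of the given files contains a marker
# that usually means feed credentials were written into it.
contains_feed_credentials() {
  grep -Eiq 'packageSourceCredentials|ClearTextPassword' "$@"
}
```

A possible pre-commit use inside a sister repo: if contains_feed_credentials nuget.config; then echo "credentials found, do not commit"; fi.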