fix(host): register ActorSystem as DI singleton so health-probe scopes don't dispose it (HOST-021)
Each health probe resolves ActorSystem from DI inside its own child scope. The bridge was registered with AddTransient and ActorSystem is IDisposable, so every probe scope captured the instance and disposed it when the probe finished. That terminated the live cluster node about 4s after boot, after which every singleton-proxy Ask hung for the full 30s QueryTimeout and the central report pages (/notifications, /site-calls, /monitoring/health) took about 30s to load.

Fix: bridge ActorSystem as a singleton through a new lazy AkkaHostedService.GetOrCreateActorSystem(), so child-scope disposal never touches it.

Verified: 0 post-startup terminates, healthy active/standby, report pages ~0.05s, Playwright 68 passed / 0 failed.
@@ -204,12 +204,17 @@ try
 builder.Services.AddSingleton<AkkaHostedService>();
 builder.Services.AddHostedService(sp => sp.GetRequiredService<AkkaHostedService>());
 
-// The shared ZB.MOM.WW.Health Akka checks resolve ActorSystem from DI. ScadaBridge owns the
-// ActorSystem inside AkkaHostedService (not a DI singleton), so bridge it as TRANSIENT: each
-// resolve re-reads the current value — null while warming up (checks → Degraded), live after.
-// The factory must NOT throw: GetService<ActorSystem>() must return null (not raise) pre-start.
-builder.Services.AddTransient<Akka.Actor.ActorSystem>(sp =>
-    sp.GetRequiredService<AkkaHostedService>().ActorSystem!);
+// HOST-021: bridge the AkkaHostedService-owned ActorSystem to DI as a SINGLETON via
+// GetOrCreateActorSystem(). The shared ZB.MOM.WW.Health Akka checks resolve ActorSystem
+// from DI, per probe, inside a child scope. ActorSystem is IDisposable, so a TRANSIENT
+// (or scoped) bridge is captured-and-disposed by each probe's scope — disposing the live
+// system mid-flight (CoordinatedShutdown/ActorSystemTerminateReason) and wedging the
+// central report pages at the 30s Ask timeout. A singleton is resolved from the root and
+// never disposed by a child scope; routing through GetOrCreateActorSystem (instead of a
+// plain singleton factory over .ActorSystem) means the first resolve CREATES the system
+// rather than caching a null if a probe wins the startup race.
+builder.Services.AddSingleton<Akka.Actor.ActorSystem>(sp =>
+    sp.GetRequiredService<AkkaHostedService>().GetOrCreateActorSystem());
 
 // InboundAPI-022: register the production IActiveNodeGate implementation so
 // standby-node gating is actually enforced (the InboundApiEndpointFilter
 
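
Illustration (not part of the commit): the scope-disposal behaviour described in the HOST-021 comment can be reproduced without Akka. The sketch below is a minimal stand-in, assuming only the Microsoft.Extensions.DependencyInjection package. `FakeSystem`, `FakeHost`, `GetOrCreate` and `Demo` are hypothetical names that mimic ActorSystem and AkkaHostedService; they are not the real ScadaBridge types, and the real GetOrCreateActorSystem() may be implemented differently.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-in for ActorSystem: a disposable owned by a long-lived host object.
public sealed class FakeSystem : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() => Disposed = true;
}

// Hypothetical stand-in for AkkaHostedService with a lazy, thread-safe accessor.
public sealed class FakeHost
{
    private readonly object _gate = new();
    private FakeSystem? _system;

    public FakeSystem GetOrCreate()
    {
        // Create on first use so an early resolve never observes null.
        lock (_gate) { return _system ??= new FakeSystem(); }
    }
}

public static class Demo
{
    // Resolves FakeSystem inside a child scope (as a health probe would),
    // disposes the scope, and reports whether the shared instance was disposed.
    public static bool DisposedAfterProbe(bool registerAsSingleton)
    {
        var services = new ServiceCollection();
        services.AddSingleton<FakeHost>();

        if (registerAsSingleton)
            services.AddSingleton<FakeSystem>(sp => sp.GetRequiredService<FakeHost>().GetOrCreate());
        else
            services.AddTransient<FakeSystem>(sp => sp.GetRequiredService<FakeHost>().GetOrCreate());

        // The root provider is deliberately left undisposed so that only
        // the child scope's disposal is being observed.
        var root = services.BuildServiceProvider();

        FakeSystem system;
        using (var scope = root.CreateScope())
        {
            system = scope.ServiceProvider.GetRequiredService<FakeSystem>();
        } // scope disposed here; it disposes any transient IDisposable it captured

        return system.Disposed;
    }

    public static void Main()
    {
        Console.WriteLine($"transient bridge disposed by probe scope: {DisposedAfterProbe(false)}");
        Console.WriteLine($"singleton bridge disposed by probe scope: {DisposedAfterProbe(true)}");
    }
}
```

With the transient registration the child scope should dispose the shared instance, even though the factory only handed back an object the host already owned. With the singleton registration the instance is tracked by the root provider and should survive the probe scope.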