RouteDebugSnapshot and RouteDebugViewSubscribe on DeploymentManagerActor
previously returned an empty DebugViewSnapshot for unknown instances,
indistinguishable from a deployed-but-empty instance. Callers had no way
to differentiate "not deployed here" from "deployed, no data yet."

Approach (additive field on the existing message contract):
Added `bool InstanceNotFound = false` as an optional trailing parameter
to DebugViewSnapshot (Commons). All existing positional constructor calls
and serialized wire frames are unaffected (default = false). A dedicated
new message type was considered but rejected: the ClusterClient channel
and the DebugStreamService TaskCompletionSource (TCS) are already typed on
DebugViewSnapshot, and a second reply union would require wider changes
with no gain in additive safety.

Changes:
- Commons/DebugViewSnapshot: add InstanceNotFound = false (additive)
- DeploymentManagerActor: set InstanceNotFound=true in both unknown-
instance branches (RouteDebugViewSubscribe, RouteDebugSnapshot)
- DebugStreamBridgeActor: when snapshot.InstanceNotFound, forward it to
_onEvent (resolves the TCS) then stop cleanly; no gRPC stream opened
- DebugView.razor: check session.InitialSnapshot.InstanceNotFound after
connect and show a clear "not deployed on this site" error toast
- 3 new tests in DeploymentManagerActorTests covering: unknown→snapshot,
unknown→subscribe, known-empty→InstanceNotFound stays false
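
A minimal sketch of the additive-field idea, not the actual Commons type: DebugViewSnapshotSketch, its InstanceName/Entries fields, and the SnapshotReplies helper are hypothetical stand-ins. Only the trailing `bool InstanceNotFound = false` parameter is taken from the change above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Commons message; the real record has its own fields.
public sealed record DebugViewSnapshotSketch(
    string InstanceName,
    IReadOnlyList<string> Entries,
    bool InstanceNotFound = false); // optional trailing parameter, defaults to false

public static class SnapshotReplies
{
    // Reply for an instance that is not deployed on this site.
    public static DebugViewSnapshotSketch NotFound(string instanceName) =>
        new(instanceName, Array.Empty<string>(), InstanceNotFound: true);

    // Reply for a deployed instance with no data yet. This is the pre-existing
    // positional call shape, which still compiles and leaves the flag false.
    public static DebugViewSnapshotSketch Empty(string instanceName) =>
        new(instanceName, Array.Empty<string>());
}
```

A caller can then branch on `snapshot.InstanceNotFound` before treating an empty snapshot as "deployed, no data yet."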

The UI script editor has no ExecutionTimeoutSeconds control (authoring deferred),
so a body edit silently cleared a timeout set via Transport import. Round-trip the
loaded value so UI edits preserve it. Add the missing AlarmExecutionActor null/<=0
fallback tests for symmetry with ScriptExecutionActor.

The spec promised a per-script timeout, but only the global
ScriptExecutionTimeoutSeconds existed. Add nullable
TemplateScript.ExecutionTimeoutSeconds, threaded through EF + flattening
(ResolvedScript) to ScriptExecutionActor/AlarmExecutionActor, which use
perScript ?? global for the execution CancellationTokenSource (CTS).
Includes the EF migration for the new column.
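
A rough sketch of the timeout resolution, under assumptions: the ExecutionTimeouts helper and its parameter names are hypothetical, and treating a per-script value <= 0 like null follows the null/<=0 fallback tests mentioned for the actors rather than code shown here. The global value is assumed to be positive.

```csharp
using System;
using System.Threading;

public static class ExecutionTimeouts
{
    // Per-script value wins when it is a positive number of seconds;
    // null or <= 0 falls back to the global setting.
    public static TimeSpan Resolve(int? perScriptSeconds, int globalSeconds)
    {
        int seconds = perScriptSeconds is int s && s > 0 ? s : globalSeconds;
        return TimeSpan.FromSeconds(seconds);
    }

    // The execution CTS cancels itself once the resolved timeout elapses.
    public static CancellationTokenSource CreateCts(int? perScriptSeconds, int globalSeconds) =>
        new CancellationTokenSource(Resolve(perScriptSeconds, globalSeconds));
}
```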

ScriptExecutionActor previously emitted only an Error 'script' event on failure.
It now also fire-and-forgets an Info 'script' event when execution starts (right
before RunAsync) and when it completes successfully — giving the operational log
the full started/completed/failed lifecycle. Uses the already-resolved
siteEventLogger; fire-and-forget so the event log can never block or fault the
script's own run.
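
One way to sketch the fire-and-forget pattern described above. The FireAndForget helper is hypothetical and takes the log call as a delegate because the real siteEventLogger signature is not shown here; the actual actor code may differ.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForget
{
    // Starts the log call without awaiting it. Neither a synchronous throw
    // nor a faulted task can propagate to the caller.
    public static void Run(Func<Task> logCall)
    {
        try
        {
            Task task = logCall();
            // Observe the exception on a continuation so a faulted logging
            // task is never left unobserved.
            _ = task.ContinueWith(
                t => { _ = t.Exception; },
                TaskContinuationOptions.OnlyOnFaulted);
        }
        catch (Exception)
        {
            // Logging must never fault the script's own run.
        }
    }
}
```

The caller wraps each lifecycle event (started, completed, failed) in `FireAndForget.Run(...)` and continues immediately.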

Extends the SingleServiceProvider test helper to also serve IServiceScopeFactory
(returning a self-scope) so ScriptExecutionActor's serviceProvider.CreateScope()
reaches the logging hot path in tests instead of throwing into the catch.

DeploymentManagerActor now fire-and-forgets a 'deployment' site operational
event on deploy/enable/disable/delete outcomes (Info on success, Error on
failure), source 'DeploymentManagerActor'. The disable/delete events are emitted
from the existing PipeTo continuations (safe: reads only the immutable
_serviceProvider and fire-and-forgets).

InstanceActor now emits an 'instance_lifecycle' Info event in PreStart (started)
and a new PostStop (stopped) — covering start/stop/enable/disable/redeploy/
failover transitions from the instance's own vantage point. Both actors already
hold _serviceProvider; no ctor change.

Resolution is optional and LogEventAsync is fire-and-forget so a logging failure
never affects the deployment pipeline or instance lifecycle.

AlarmActor (computed) and NativeAlarmActor (native mirror) now fire-and-forget
an 'alarm' site operational event on every state transition:
- raise/activate: Error when priority/severity >= 700, otherwise Warning
- clear/return-to-normal, ack, inter-band transition: Info
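
The level mapping above could be sketched as follows. The EventLevel and AlarmTransition enums and the AlarmEventLevels helper are hypothetical names for illustration; only the 700 threshold and the Error/Warning/Info split come from the list above.

```csharp
public enum EventLevel { Info, Warning, Error }

public enum AlarmTransition { Raised, Cleared, Acknowledged, BandChanged }

public static class AlarmEventLevels
{
    // Threshold at or above which a raise is logged as Error.
    public const int ErrorThreshold = 700;

    public static EventLevel For(AlarmTransition transition, int priority) =>
        transition switch
        {
            // Raise/activate: Error at or above the threshold, otherwise Warning.
            AlarmTransition.Raised when priority >= ErrorThreshold => EventLevel.Error,
            AlarmTransition.Raised => EventLevel.Warning,
            // Clear, ack, and inter-band transitions are always Info.
            _ => EventLevel.Info,
        };
}
```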

Both actors take a new optional IServiceProvider? ctor param (default null so
existing direct-construction tests still compile); InstanceActor passes its
_serviceProvider at the two Props.Create sites. Resolution is optional and the
LogEventAsync call is fire-and-forget, so a logging failure never affects alarm
evaluation. Rehydration replays are not re-logged.

Adds a capturing FakeSiteEventLogger test helper + SingleServiceProvider.