A routed inbound-API call (Route.To(inst).Call(script)) runs the script on
the Site and returns its value to Central inside RouteToCallResponse, which
crosses the Central<->Site PROCESS boundary. A script's natural
'return new { ... }' is a compiler-generated anonymous type that Akka's
cross-process serializer cannot reconstruct on the receiving node, so the
reply was silently dropped and the caller's Route.To().Call() Ask timed out
at 30s with 'Script execution timed out' -- even though the script completed
and all device writes committed.
DeploymentManagerActor.RouteInboundApiCall now projects the routed return
value to a plain CLR graph (Dictionary/List/string/long/double/bool/null)
via a JSON round-trip before placing it in RouteToCallResponse. The graph
round-trips the wire and re-serializes to the same JSON shape the inbound
API expects for the HTTP body / ReturnDefinition validation.
Diagnosed live: IpsenMESMoveIn writes committed + site_events showed the
IpsenMoveIn script completed in ~0.6s, yet the inbound POST returned 500 at
30s; Central's Akka serializer logged 'Writing value of type
<>f__AnonymousType0`1 as Json' at the timeout moment.
379/379 SiteRuntime tests green.
- NJ-3: widen per-row catch to Exception (an STJ encode failure can't abort startup); drop dead null-guard already excluded by the SQL filter
- NJ-4: capture logger/instanceName in locals for the fire-and-forget normalize continuation (match the sibling pattern in this actor)
- NJ-5: emit a warn-log when a malformed List value is imported verbatim; thread an optional ILogger<BundleImporter> to the sync re-import site
Inject ISiteEventLogger into ScriptRuntimeContext (additive optional ctor
param, defaulted null, all existing callers source-compatible). Add a single
private EmitRecursionLimitEventAsync helper that fires-and-forgets a
"script"/Error site event; called at both recursion guard sites (CallScript
at ~:332 and ScriptCallHelper.CallShared at ~:499). ScriptExecutionActor
threads the already-resolved siteEventLogger singleton into the context;
AlarmExecutionActor leaves it null (no siteEventLogger wired there).
Existing _logger.LogError + throw behaviour unchanged.
Tests: RecursionLimitSiteEventTests — 5 tests covering both CallScript and
CallShared (ISiteEventLogger.LogEventAsync called once with category "script",
severity "Error"; null logger path does not throw).
RouteDebugSnapshot and RouteDebugViewSubscribe on DeploymentManagerActor
previously returned an empty DebugViewSnapshot for unknown instances,
indistinguishable from a deployed-but-empty instance. Callers had no way
to differentiate "not deployed here" from "deployed, no data yet."
Approach — additive field on existing message contract:
Added `bool InstanceNotFound = false` as an optional trailing parameter
to DebugViewSnapshot (Commons). All existing positional constructor calls
and serialized wire frames are unaffected (default = false). A dedicated
new message type was considered but rejected: the ClusterClient channel
and DebugStreamService TCS are already typed on DebugViewSnapshot, and a
second reply union would require wider changes for zero additive-safety
gain.
Changes:
- Commons/DebugViewSnapshot: add InstanceNotFound = false (additive)
- DeploymentManagerActor: set InstanceNotFound=true in both unknown-
instance branches (RouteDebugViewSubscribe, RouteDebugSnapshot)
- DebugStreamBridgeActor: when snapshot.InstanceNotFound, forward it to
_onEvent (resolves the TCS) then stop cleanly; no gRPC stream opened
- DebugView.razor: check session.InitialSnapshot.InstanceNotFound after
connect and show a clear "not deployed on this site" error toast
- 3 new tests in DeploymentManagerActorTests covering: unknown→snapshot,
unknown→subscribe, known-empty→InstanceNotFound stays false
Spec promised a per-script timeout but only the global ScriptExecutionTimeoutSeconds
existed. Add nullable TemplateScript.ExecutionTimeoutSeconds threaded through EF +
flattening (ResolvedScript) to ScriptExecutionActor/AlarmExecutionActor, which use
perScript ?? global for the execution CTS. Includes the EF migration for the new column.
ScriptExecutionActor previously emitted only an Error 'script' event on failure.
It now also fire-and-forgets an Info 'script' event when execution starts (right
before RunAsync) and when it completes successfully — giving the operational log
the full started/completed/failed lifecycle. Uses the already-resolved
siteEventLogger; fire-and-forget so the event log can never block or fault the
script's own run.
Extends the SingleServiceProvider test helper to also serve IServiceScopeFactory
(returning a self-scope) so ScriptExecutionActor's serviceProvider.CreateScope()
reaches the logging hot path in tests instead of throwing into the catch.
DeploymentManagerActor now fire-and-forgets a 'deployment' site operational
event on deploy/enable/disable/delete outcomes (Info on success, Error on
failure), source 'DeploymentManagerActor'. The disable/delete events are emitted
from the existing PipeTo continuations (safe: reads only the immutable
_serviceProvider and fire-and-forgets).
InstanceActor now emits an 'instance_lifecycle' Info event in PreStart (started)
and a new PostStop (stopped) — covering start/stop/enable/disable/redeploy/
failover transitions from the instance's own vantage point. Both actors already
hold _serviceProvider; no ctor change.
Resolution is optional and LogEventAsync is fire-and-forget so a logging failure
never affects the deployment pipeline or instance lifecycle.
AlarmActor (computed) and NativeAlarmActor (native mirror) now fire-and-forget
an 'alarm' site operational event on every state transition:
- raise/activate: Error (priority/severity >= 700) or Warning
- clear/return-to-normal, ack, inter-band transition: Info
Both actors take a new optional IServiceProvider? ctor param (default null so
existing direct-construction tests still compile); InstanceActor passes its
_serviceProvider at the two Props.Create sites. Resolution is optional and the
LogEventAsync call is fire-and-forget, so a logging failure never affects alarm
evaluation. Rehydration replays are not re-logged.
Adds a capturing FakeSiteEventLogger test helper + SingleServiceProvider.
- DeploymentManagerActor.HandleDeployArtifacts read the Self property inside its
Task.Run lambda (line dispatching ApplyArtifactDataConnectionsToDcl). Self is
backed by the ambient ActorCell, null on a thread-pool thread, so it threw
'no active ActorContext' — surfaced the first time a data connection is
deployed via deploy-artifacts. Capture Self into a local first (as Sender
already was).
- seed-sites.sh: create a shared MxGateway data connection (10.100.0.48:5120)
on each site and deploy artifacts so the DCL establishes them.
- build.sh: nounset-safe empty-array expansion (bash 3.2).
DataConnectionFactory registers 'MxGateway' -> MxGatewayDataConnection over the
real client; AddDataConnectionLayer binds MxGatewayGlobalOptions; DeploymentManager
FlattenConnectionConfig gains an MxGateway arm using the typed serializer. Factory
test confirms Create("MxGateway") returns the adapter.
Adds a Test Bindings button to the Connection Bindings table on the Configure
Instance page that opens a modal showing the live current value of every bound
attribute. Reuses the routing path that the OPC UA tag browser landed on:
Central: TestBindingsDialog → IBindingTester → CommunicationService
→ ReadTagValuesCommand → SiteEnvelope (Ask)
Site: SiteCommunicationActor → DeploymentManagerActor singleton
→ DataConnectionManagerActor → child DataConnectionActor
→ _adapter.ReadBatchAsync
Split mirrors the browse handler:
• Manager owns ConnectionNotFound (only it sees the per-site connection set).
• Child owns ConnectionNotConnected (pre-call status check, never stash —
read is interactive design-time), Timeout (OperationCanceledException),
ServerError (any other exception). Per-tag failures from ReadBatchAsync
become failure TagReadOutcomes without aborting the batch.
CentralUI:
• IBindingTester / BindingTester — Design-role guard via HasClaim against
JwtTokenService.RoleClaimType (not IsInRole — see c1e16cf), typed
transport-failure translation.
• TestBindingsDialog — ShowAsync(siteId, rows, instanceLabel) method-arg
pattern (no Razor parameter race; see 2c138b6), groups rows by connection
and issues one ReadAsync per connection in parallel, per-row error subline
+ per-connection banner, Refresh button re-issues the reads.
• InstanceConfigure.razor — Test Bindings button next to Save Bindings,
disabled when no testable rows. OPC UA only today (other protocols have
no ReadTagValuesCommand wiring yet).
Tests:
• Commons: ReadTagValuesCommand discovered by ManagementCommandRegistry.
• DataConnectionLayer: unknown connection → ConnectionNotFound,
not-connected adapter → ConnectionNotConnected (ReadBatchAsync NOT called),
success-path mapping (Good/Bad + per-tag error), cancellation → Timeout.
• CentralUI: register IBindingTester (and the previously-missing
IOpcUaBrowseService) on the existing InstanceConfigureAuditDrillinTests
Bunit container so the page renders cleanly with the new dialog.