Commit Graph

29 Commits

Author SHA1 Message Date
Joseph Doherty af54c8ad11 merge: integrate WaitAsync/M5-audit (parallel session) with galaxy array-write + inbound-timeout fixes 2026-06-17 09:28:15 -04:00
Joseph Doherty bf2f481bb4 fix(siteruntime): normalize routed script return value for cross-process transport
A routed inbound-API call (Route.To(inst).Call(script)) runs the script on
the Site and returns its value to Central inside RouteToCallResponse, which
crosses the Central<->Site PROCESS boundary. A script's natural
'return new { ... }' is a compiler-generated anonymous type that Akka's
cross-process serializer cannot reconstruct on the receiving node, so the
reply was silently dropped and the caller's Route.To().Call() Ask timed out
at 30s with 'Script execution timed out' -- even though the script completed
and all device writes committed.

DeploymentManagerActor.RouteInboundApiCall now projects the routed return
value to a plain CLR graph (Dictionary/List/string/long/double/bool/null)
via a JSON round-trip before placing it in RouteToCallResponse. The graph
round-trips the wire and re-serializes to the same JSON shape the inbound
API expects for the HTTP body / ReturnDefinition validation.

Diagnosed live: IpsenMESMoveIn writes committed + site_events showed the
IpsenMoveIn script completed in ~0.6s, yet the inbound POST returned 500 at
30s; Central's Akka serializer logged 'Writing value of type
<>f__AnonymousType0`1 as Json' at the timeout moment.

379/379 SiteRuntime tests green.
2026-06-17 09:19:12 -04:00
Joseph Doherty c482cac110 feat(siteruntime): unpack routed RouteToWaitForAttributeRequest into InstanceActor (spec §6 site half) 2026-06-17 09:10:08 -04:00
Joseph Doherty 61048a4ecf feat(siteruntime): WaitForAsync/WaitResult + quality-gated WaitAsync (spec §3, §4.2) 2026-06-17 09:05:12 -04:00
Joseph Doherty 04e97f4a87 fix(siteruntime): harden WaitAsync — no spurious match on quality republish, guard throwing predicate, Ask-timeout returns false 2026-06-17 08:44:03 -04:00
Joseph Doherty 75ffa09b8f feat(siteruntime): event-driven Attributes.WaitAsync attribute-change helper
Adds InstanceActor one-shot waiter registry (fast-path + change-match + scheduled
timeout self-eviction), threads per-script timeout token through ScriptRuntimeContext,
and exposes Attributes.WaitAsync(value|predicate, timeout). Replaces handshake busy-poll.
Implements spec docs/plans/2026-06-17-waitfor-attribute-change-helper-spec.md §3-§5;
§6 routed variant + WaitForAsync + quality-only mode deferred.
2026-06-17 08:25:06 -04:00
Joseph Doherty 20760014c2 feat(audit): M5.4 ParentExecutionId tag-cascade for alarm + nested calls (T4) 2026-06-16 21:42:14 -04:00
Joseph Doherty feeae1371e fix(multivalue): NJ-3/NJ-4/NJ-5 review fixes
- NJ-3: widen per-row catch to Exception (an STJ encode failure can't abort startup); drop dead null-guard already excluded by the SQL filter
- NJ-4: capture logger/instanceName in locals for the fire-and-forget normalize continuation (match the sibling pattern in this actor)
- NJ-5: emit a warn-log when a malformed List value is imported verbatim; thread an optional ILogger<BundleImporter> to the sync re-import site
2026-06-16 18:25:42 -04:00
Joseph Doherty 5841cec958 feat(siteruntime): normalize old-form List static overrides to native JSON on load 2026-06-16 17:49:21 -04:00
Joseph Doherty 94be5e813b fix(siteruntime): decode List value to typed array before DCL write (OPC UA array write path) 2026-06-16 16:48:28 -04:00
Joseph Doherty ad6bfc8af9 fix(siteruntime): reject SetStaticAttribute with malformed list value (no silent poison persist) 2026-06-16 15:59:30 -04:00
Joseph Doherty 7f97780bb3 feat(siteruntime): decode static List attributes to typed lists in InstanceActor (load/override/set) 2026-06-16 15:52:29 -04:00
Joseph Doherty 96e817a7e1 fix(siteruntime): MV-8 review fixes (construct list inside try; dictionary attr lookup; test hygiene) 2026-06-16 15:48:25 -04:00
Joseph Doherty 4765706e94 feat(dcl): coerce OPC UA array reads to typed list attributes; Bad quality on element mismatch 2026-06-16 15:39:19 -04:00
Joseph Doherty f08038db23 feat(siteruntime): M2.12 (#25) — emit script Error site event on recursion-limit violation
Inject ISiteEventLogger into ScriptRuntimeContext (additive optional ctor
param, defaulted null, all existing callers source-compatible). Add a single
private EmitRecursionLimitEventAsync helper that fires-and-forgets a
"script"/Error site event; called at both recursion guard sites (CallScript
at ~:332 and ScriptCallHelper.CallShared at ~:499). ScriptExecutionActor
threads the already-resolved siteEventLogger singleton into the context;
AlarmExecutionActor leaves it null (no siteEventLogger wired there).

Existing _logger.LogError + throw behaviour unchanged.

Tests: RecursionLimitSiteEventTests — 5 tests covering both CallScript and
CallShared (ISiteEventLogger.LogEventAsync called once with category "script",
severity "Error"; null logger path does not throw).
2026-06-16 06:20:58 -04:00
Joseph Doherty dbf44b9e10 fix(siteruntime): M2.11 — unknown-instance debug snapshot returns InstanceNotFound=true (#24)
RouteDebugSnapshot and RouteDebugViewSubscribe on DeploymentManagerActor
previously returned an empty DebugViewSnapshot for unknown instances,
indistinguishable from a deployed-but-empty instance. Callers had no way
to differentiate "not deployed here" from "deployed, no data yet."

Approach — additive field on existing message contract:
  Added `bool InstanceNotFound = false` as an optional trailing parameter
  to DebugViewSnapshot (Commons). All existing positional constructor calls
  and serialized wire frames are unaffected (default = false). A dedicated
  new message type was considered but rejected: the ClusterClient channel
  and DebugStreamService TCS are already typed on DebugViewSnapshot, and a
  second reply union would require wider changes for zero additive-safety
  gain.

Changes:
  - Commons/DebugViewSnapshot: add InstanceNotFound = false (additive)
  - DeploymentManagerActor: set InstanceNotFound=true in both unknown-
    instance branches (RouteDebugViewSubscribe, RouteDebugSnapshot)
  - DebugStreamBridgeActor: when snapshot.InstanceNotFound, forward it to
    _onEvent (resolves the TCS) then stop cleanly; no gRPC stream opened
  - DebugView.razor: check session.InitialSnapshot.InstanceNotFound after
    connect and show a clear "not deployed on this site" error toast
  - 3 new tests in DeploymentManagerActorTests covering: unknown→snapshot,
    unknown→subscribe, known-empty→InstanceNotFound stays false
2026-06-16 06:08:21 -04:00
Joseph Doherty 3edef09f51 feat(runtime): per-script execution timeout overriding the global default (#9)
Spec promised a per-script timeout but only the global ScriptExecutionTimeoutSeconds
existed. Add nullable TemplateScript.ExecutionTimeoutSeconds threaded through EF +
flattening (ResolvedScript) to ScriptExecutionActor/AlarmExecutionActor, which use
perScript ?? global for the execution CTS. Includes the EF migration for the new column.
2026-06-15 14:40:38 -04:00
Joseph Doherty e5534fddca fix(siteeventlog): suppress snapshot-resync alarm re-emit + coverage + hardening (review) 2026-06-15 12:45:00 -04:00
Joseph Doherty e74c3aef23 feat(siteeventlog): emit script started/completed Info events (M1.8)
ScriptExecutionActor previously emitted only an Error 'script' event on failure.
It now also fire-and-forgets an Info 'script' event when execution starts (right
before RunAsync) and when it completes successfully — giving the operational log
the full started/completed/failed lifecycle. Uses the already-resolved
siteEventLogger; fire-and-forget so the event log can never block or fault the
script's own run.

Extends the SingleServiceProvider test helper to also serve IServiceScopeFactory
(returning a self-scope) so ScriptExecutionActor's serviceProvider.CreateScope()
reaches the logging hot path in tests instead of throwing into the catch.
2026-06-15 12:33:31 -04:00
Joseph Doherty 09b9e8f259 feat(siteeventlog): emit deployment + instance_lifecycle events (M1.6)
DeploymentManagerActor now fire-and-forgets a 'deployment' site operational
event on deploy/enable/disable/delete outcomes (Info on success, Error on
failure), source 'DeploymentManagerActor'. The disable/delete events are emitted
from the existing PipeTo continuations (safe: reads only the immutable
_serviceProvider and fire-and-forgets).

InstanceActor now emits an 'instance_lifecycle' Info event in PreStart (started)
and a new PostStop (stopped) — covering start/stop/enable/disable/redeploy/
failover transitions from the instance's own vantage point. Both actors already
hold _serviceProvider; no ctor change.

Resolution is optional and LogEventAsync is fire-and-forget so a logging failure
never affects the deployment pipeline or instance lifecycle.
2026-06-15 12:26:54 -04:00
Joseph Doherty a00e43c4f9 feat(siteeventlog): emit alarm-category events on alarm transitions (M1.5)
AlarmActor (computed) and NativeAlarmActor (native mirror) now fire-and-forget
an 'alarm' site operational event on every state transition:
- raise/activate: Error (priority/severity >= 700) or Warning
- clear/return-to-normal, ack, inter-band transition: Info

Both actors take a new optional IServiceProvider? ctor param (default null so
existing direct-construction tests still compile); InstanceActor passes its
_serviceProvider at the two Props.Create sites. Resolution is optional and the
LogEventAsync call is fire-and-forget, so a logging failure never affects alarm
evaluation. Rehydration replays are not re-logged.

Adds a capturing FakeSiteEventLogger test helper + SingleServiceProvider.
2026-06-15 12:23:04 -04:00
Joseph Doherty 6d318586d1 feat(siteruntime): InstanceActor spawns NativeAlarmActors + enriched alarm snapshot; clear native state on redeploy/undeploy 2026-05-31 02:06:39 -04:00
Joseph Doherty fda7ac9c50 feat(siteruntime): NativeAlarmActor mirrors source alarms (snapshot swap, retention, persistence) 2026-05-31 01:49:28 -04:00
Joseph Doherty bfd8b25108 fix(siteruntime): capture Self before Task.Run in artifact deploy; seed MxGateway connections
- DeploymentManagerActor.HandleDeployArtifacts read the Self property inside its
  Task.Run lambda (line dispatching ApplyArtifactDataConnectionsToDcl). Self is
  backed by the ambient ActorCell, null on a thread-pool thread, so it threw
  'no active ActorContext' — surfaced the first time a data connection is
  deployed via deploy-artifacts. Capture Self into a local first (as Sender
  already was).
- seed-sites.sh: create a shared MxGateway data connection (10.100.0.48:5120)
  on each site and deploy artifacts so the DCL establishes them.
- build.sh: nounset-safe empty-array expansion (bash 3.2).
2026-05-29 08:26:39 -04:00
Joseph Doherty 5461e4968e feat(dcl): register MxGateway protocol in factory + config flatten + options
DataConnectionFactory registers 'MxGateway' -> MxGatewayDataConnection over the
real client; AddDataConnectionLayer binds MxGatewayGlobalOptions; DeploymentManager
FlattenConnectionConfig gains an MxGateway arm using the typed serializer. Factory
test confirms Create("MxGateway") returns the adapter.
2026-05-29 07:58:51 -04:00
Joseph Doherty 9b7916bb2e refactor(browse): rename BrowseOpcUaNode* to protocol-agnostic BrowseNode*
Renames BrowseOpcUaNodeCommand/Result -> BrowseNodeCommand/Result and
CommunicationService.BrowseOpcUaNodeAsync -> BrowseNodeAsync across Commons,
Communication, SiteRuntime, DCL actors, and CentralUI. Wire manifest name
follows (BrowseOpcUaNode -> BrowseNode). Browse regression tests green.
2026-05-29 07:57:36 -04:00
Joseph Doherty 2a7dee4afa feat(centralui+dcl): Test Bindings popup — one-shot live read of bound tags
Adds a Test Bindings button to the Connection Bindings table on the Configure
Instance page that opens a modal showing the live current value of every bound
attribute. Reuses the routing path that the OPC UA tag browser landed on:

  Central:  TestBindingsDialog → IBindingTester → CommunicationService
            → ReadTagValuesCommand → SiteEnvelope (Ask)
  Site:     SiteCommunicationActor → DeploymentManagerActor singleton
            → DataConnectionManagerActor → child DataConnectionActor
            → _adapter.ReadBatchAsync

Split mirrors the browse handler:
  • Manager owns ConnectionNotFound (only it sees the per-site connection set).
  • Child owns ConnectionNotConnected (pre-call status check, never stash —
    read is interactive design-time), Timeout (OperationCanceledException),
    ServerError (any other exception). Per-tag failures from ReadBatchAsync
    become failure TagReadOutcomes without aborting the batch.

CentralUI:
  • IBindingTester / BindingTester — Design-role guard via HasClaim against
    JwtTokenService.RoleClaimType (not IsInRole — see c1e16cf), typed
    transport-failure translation.
  • TestBindingsDialog — ShowAsync(siteId, rows, instanceLabel) method-arg
    pattern (no Razor parameter race; see 2c138b6), groups rows by connection
    and issues one ReadAsync per connection in parallel, per-row error subline
    + per-connection banner, Refresh button re-issues the reads.
  • InstanceConfigure.razor — Test Bindings button next to Save Bindings,
    disabled when no testable rows. OPC UA only today (other protocols have
    no ReadTagValuesCommand wiring yet).

Tests:
  • Commons: ReadTagValuesCommand discovered by ManagementCommandRegistry.
  • DataConnectionLayer: unknown connection → ConnectionNotFound,
    not-connected adapter → ConnectionNotConnected (ReadBatchAsync NOT called),
    success-path mapping (Good/Bad + per-tag error), cancellation → Timeout.
  • CentralUI: register IBindingTester (and the previously-missing
    IOpcUaBrowseService) on the existing InstanceConfigureAuditDrillinTests
    Bunit container so the page renders cleanly with the new dialog.
2026-05-28 13:25:48 -04:00
Joseph Doherty f401a9ea0e fix(comm+site): route BrowseOpcUaNodeCommand via DeploymentManagerActor singleton 2026-05-28 12:51:45 -04:00
Joseph Doherty 7b0b9c7365 refactor: rename ScadaLink → ZB.MOM.WW.ScadaBridge (code + projects + namespaces)
Solution + 23 src projects + 26 test projects renamed; folders, csproj,
namespaces, and ScadaLinkDbContext/ScadaBridgeDbContext class updated.
ActorSystem "scadalink" → "scadabridge", Akka seed-node URLs migrated.
SQL roles/logins, LDAP domains, CLI command name, and CLI config dir
(~/.scadalink → ~/.scadabridge) also renamed.

Build green; 5 Host.Tests fail awaiting SQL login rename in next commit.
Pre-existing StaleTagMonitor timing flakes unchanged.

Rename script committed at tools/rename-to-scadabridge.sh.
2026-05-28 09:37:45 -04:00