Wraps IExternalSystemClient.CallAsync inside ScriptRuntimeContext's
ExternalSystemHelper so every script-initiated ExternalSystem.Call
produces exactly one ApiOutbound/ApiCall AuditEvent via IAuditWriter.
- Captures duration with Stopwatch.GetTimestamp() around the call.
- Builds the audit event with full provenance (SiteId, InstanceId,
SourceScript) and a fresh EventId; ForwardState=Pending.
- Maps Success → AuditStatus.Delivered, Failure (or a thrown exception)
→ Failed; parses HTTP {code} out of the ExternalSystemClient's error
message to populate HttpStatus.
- Audit emission is fully best-effort: event-build failures, sync
WriteAsync throws, AND async WriteAsync faults are all logged at
Warning and swallowed so the script's call path is never aborted
by an audit-write failure (alog.md §7).
- Original ExternalCallResult or original exception flows back to the
caller unchanged.
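A minimal sketch of the best-effort wrapper described above, not the
actual ExternalSystemHelper code. CallResult, AuditRecord, the delegate
parameters and the "HTTP <code>" message format are hypothetical
stand-ins for ExternalCallResult, AuditEvent, IAuditWriter and the
client's real error text, none of which are shown in this log.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical stand-ins for ExternalCallResult and AuditEvent.
public sealed record CallResult(bool Success, string? ErrorMessage);
public sealed record AuditRecord(Guid EventId, string Status, int? HttpStatus, long DurationMs);

public static class AuditedCall
{
    // Assumption: the client's error message contains text such as "HTTP 500".
    private static readonly Regex HttpCode = new(@"HTTP\s+(\d{3})");

    public static int? ParseHttpStatus(string? message)
    {
        if (message is null) return null;
        var match = HttpCode.Match(message);
        return match.Success ? (int?)int.Parse(match.Groups[1].Value) : null;
    }

    public static async Task<CallResult> CallAsync(
        Func<Task<CallResult>> call,          // the wrapped external call
        Func<AuditRecord, Task> writeAudit,   // stand-in for IAuditWriter.WriteAsync
        Action<Exception> logWarning)
    {
        long start = Stopwatch.GetTimestamp();
        CallResult? result = null;
        Exception? thrown = null;
        try
        {
            result = await call();
            return result;
        }
        catch (Exception ex)
        {
            thrown = ex;
            throw; // the original exception flows back to the caller unchanged
        }
        finally
        {
            try
            {
                long elapsedMs = (Stopwatch.GetTimestamp() - start) * 1000 / Stopwatch.Frequency;
                bool delivered = thrown is null && result is { Success: true };
                var record = new AuditRecord(
                    Guid.NewGuid(),
                    delivered ? "Delivered" : "Failed",
                    delivered ? null : ParseHttpStatus(thrown?.Message ?? result?.ErrorMessage),
                    elapsedMs);
                // Covers both a synchronous throw and a faulted task from the writer.
                await writeAudit(record);
            }
            catch (Exception auditEx)
            {
                logWarning(auditEx); // swallowed: auditing must never abort the script's call
            }
        }
    }
}
```

Building the record inside the inner try is what keeps event-build
failures on the same swallow-and-log path as write failures.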
ScriptExecutionActor resolves IAuditWriter from DI and threads it
into ScriptRuntimeContext alongside the existing site identity.
Adds ExternalSystemCallAuditEmissionTests covering: success →
Delivered, HTTP 500 → Failed+httpStatus, HTTP 400 → Failed+httpStatus,
client-thrown network exception → Failed with original exception
re-thrown, audit-writer throw → original result returned, provenance
populated from context, DurationMs recorded.
Refs Audit Log #23 M2 Bundle F.
FU3: thread the executing script identifier from the script-execution
context down to the Notify outbox API so NotifyTarget.Send stamps
NotificationSubmit.SourceScript instead of leaving it null.
- ScriptRuntimeContext / NotifyHelper / NotifyTarget take an optional
sourceScript value, carried through to NotificationSubmit.SourceScript.
- ScriptExecutionActor supplies "ScriptActor:<scriptName>", matching the
Site Event Logging "Source" convention used for script error events.
- AlarmExecutionActor builds the context without the S&F engine, so its
Notify API is inert; sourceScript defaults to null there.
The script-analysis sandbox Notify surface was stale after the Notification
Outbox change: SandboxNotifyTarget.Send returned Task<NotificationResult> and
there was no Status method, while production NotifyTarget.Send returns
Task<string> (a NotificationId) plus NotifyHelper.Status. A script that
test-ran cleanly in the sandbox would not compile against the real site
runtime.
- Move the NotificationDeliveryStatus record from ScadaLink.SiteRuntime.Scripts
into ScadaLink.Commons.Messages.Notification so both production and the
CentralUI sandbox reference the exact same type (CentralUI does not, and
should not, reference SiteRuntime). Production NotifyHelper.Status is
otherwise untouched.
- Rewrite SandboxNotifyHelper/SandboxNotifyTarget to be a signature-faithful
no-op fake: Send returns Task<string> (a fake NotificationId), Status returns
Task<NotificationDeliveryStatus>. Production now enqueues into the site S&F
engine, which has no central-side equivalent in the sandbox, so the fake no
longer carries an INotificationDeliveryService.
- Add script-analysis tests proving a script using the new Notify shape both
diagnoses clean and runs in the sandbox.
Notify.To(list).Send(subject,body) now generates a NotificationId GUID,
enqueues a Notification-category message into the site Store-and-Forward
Engine, and returns the NotificationId immediately (Task<string>). The
NotificationId is the single idempotency key end-to-end: it is the S&F
message Id, it is carried inside the buffered NotificationSubmit payload,
and it is the id the forwarder submits to central.
NotificationForwarder now deserializes the buffered payload as a
NotificationSubmit and reads NotificationId from it (re-stamping only the
site-owned SourceSiteId / SourceInstanceId), instead of deriving the id
from StoreAndForwardMessage.Id.
Adds NotifyHelper.Status(id): queries central via the site communication
actor; reports the site-local Forwarding state while the notification is
still buffered at the site, maps central's response when found, and
returns Unknown otherwise. Adds a NotificationDeliveryStatus record.
SiteCommunicationActor gains a NotificationStatusQuery forwarding handler
mirroring NotificationSubmit. StoreAndForwardService.EnqueueAsync gains an
optional messageId parameter and exposes GetMessageByIdAsync.
Conditional and Expression script triggers gain an optional `mode` field
in their TriggerConfiguration JSON:
- OnTrue (default): unchanged edge/per-change firing. An absent mode field
parses as OnTrue, so every existing trigger config behaves identically.
- WhileTrue: fires on the false->true edge, then re-fires on a periodic
timer while the condition holds; stops on the true->false edge. The
re-fire cadence is the script's MinTimeBetweenRuns; with none configured
the trigger degrades to a single edge fire and logs a warning.
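The two modes can be sketched as a pure decision function. This is an
illustration under assumptions, not the ScriptActor code: the
TriggerAction enum and both method names are hypothetical, and only the
edge-based path is modelled (per-change firing for Expression triggers
is left out).

```csharp
using System;

public enum TriggerMode { OnTrue, WhileTrue }

// Hypothetical: what the actor should do after re-evaluating the condition.
public enum TriggerAction { None, Fire, FireAndStartTimer, StopTimer }

public static class WhileTrueSketch
{
    // An absent or unrecognised mode parses as OnTrue, so configs written
    // before the field existed behave exactly as they did.
    public static TriggerMode ParseMode(string? raw) =>
        Enum.TryParse(raw, ignoreCase: true, out TriggerMode mode)
            && Enum.IsDefined(typeof(TriggerMode), mode)
            ? mode
            : TriggerMode.OnTrue;

    public static TriggerAction OnConditionEvaluated(
        TriggerMode mode, bool wasTrue, bool isTrue, TimeSpan? minTimeBetweenRuns)
    {
        if (!wasTrue && isTrue)
        {
            // false->true edge: always fire once; WhileTrue also arms the re-fire
            // timer, but only when a cadence is configured (otherwise it degrades
            // to a single edge fire).
            bool canRefire = mode == TriggerMode.WhileTrue
                && minTimeBetweenRuns is { } gap
                && gap > TimeSpan.Zero;
            return canRefire ? TriggerAction.FireAndStartTimer : TriggerAction.Fire;
        }

        if (wasTrue && !isTrue && mode == TriggerMode.WhileTrue)
            return TriggerAction.StopTimer; // true->false edge ends periodic re-firing

        return TriggerAction.None;
    }
}
```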
ScriptActor tracks condition truth state and manages a dedicated
"whiletrue-trigger" timer. ScriptTriggerConfigCodec and ScriptTriggerEditor
round-trip the mode and expose an OnTrue/WhileTrue selector for the two
trigger kinds. Design: docs/plans/2026-05-18-whiletrue-trigger-mode-design.md
Tests: 7 ScriptActor runtime tests (edge fire, timer re-fire, stop,
re-arm, no-MinTimeBetweenRuns degrade, OnTrue regressions) + 14 codec /
editor tests. SiteRuntime suite 206 green, CentralUI suite 295 green.
InstanceActor._tagPathToAttribute was a Dictionary<string,string> — one tag
path mapped to a single attribute. When two attributes reference the same PLC
node (e.g. two composed cooling-tank modules both reading ns=3;s=Tank.Level,
or a pump's TempSensor and AlarmSensor both reading ns=3;s=Sensor.Reading),
SubscribeToDcl's map assignment overwrote, so only the last-registered
attribute ever received values — the rest stayed permanently Uncertain.
The map is now Dictionary<string,List<string>>; HandleTagValueUpdate fans each
update out to every attribute referencing the tag path, and each distinct tag
path is still subscribed only once per connection.
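The one-to-many map can be sketched in isolation as follows. TagFanOut
and its method names are hypothetical; the real logic lives in
InstanceActor's SubscribeToDcl and HandleTagValueUpdate, whose bodies
are not shown here.

```csharp
using System;
using System.Collections.Generic;

public sealed class TagFanOut
{
    private readonly Dictionary<string, List<string>> _tagPathToAttributes = new();

    // Returns true only the first time a tag path is seen, which is the
    // caller's cue to subscribe that path on the connection.
    public bool Register(string tagPath, string attributeName)
    {
        if (_tagPathToAttributes.TryGetValue(tagPath, out var attributes))
        {
            if (!attributes.Contains(attributeName)) attributes.Add(attributeName);
            return false;
        }

        _tagPathToAttributes[tagPath] = new List<string> { attributeName };
        return true;
    }

    // Every attribute that should receive an update for this tag path.
    public IReadOnlyList<string> AttributesFor(string tagPath) =>
        _tagPathToAttributes.TryGetValue(tagPath, out var attributes)
            ? attributes
            : (IReadOnlyList<string>)Array.Empty<string>();
}
```

With the old Dictionary<string,string>, the second Register call for the
same path would have replaced the first attribute instead of joining it.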
Adds DeploymentStateQuery request/response contracts (Commons), a site-side
handler (SiteRuntime), a CommunicationService query method (Communication), and
reconciliation in DeploymentService. When a prior record is InProgress or
Failed-on-timeout, query the site: if it already holds the target revision
hash, mark the record Success without re-sending; on query failure, fall
through to a normal deploy (site-side stale-rejection is the safety net).
Move all package versions into Directory.Packages.props so every project
resolves a single consistent version. Consolidates the Roslyn packages
(Microsoft.CodeAnalysis.CSharp.Scripting/Workspaces) onto 5.0.0, which
resolves the pre-existing NU1608 version-skew error in the test projects.
The Test Run sandbox and Monaco analysis modelled a script API that had
drifted from the site runtime's ScriptGlobals, so real scripts failed to
compile in Test Run. Realign both to the runtime surface
(Instance/Scripts/ExternalSystem/Attributes/Children/Parent) and drop the
duplicate ScriptHost stub so the two cannot diverge again.
- Script calls (Scripts.CallShared, Instance.CallScript, Route.To().Call)
accept an anonymous object instead of a hand-built dictionary, via a
shared ScriptArgs normalizer; existing dictionary calls still compile.
- Test Run can optionally bind to a deployed instance, so Instance/
Attributes/CallScript route to it cross-site; adds site-side
RouteToGetAttributes/RouteToSetAttributes handlers.
- Adds Test Run panels to the API method and template script editors.
- Fixes the TestDatabaseQuery seed script, which queried a table that
never existed.
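For the ScriptArgs normalizer in the first bullet above, one plausible
shape is a reflection-based conversion from an anonymous object to a
dictionary. This is a sketch under assumptions: the class and method
names are hypothetical and the real ScriptArgs signature is not shown
in this log.

```csharp
using System.Collections.Generic;
using System.Reflection;

public static class ScriptArgsSketch
{
    public static IReadOnlyDictionary<string, object?> Normalize(object? args)
    {
        if (args is null) return new Dictionary<string, object?>();

        // Existing dictionary-based call sites pass straight through.
        if (args is IReadOnlyDictionary<string, object?> readOnly) return readOnly;
        if (args is IDictionary<string, object?> dictionary)
            return new Dictionary<string, object?>(dictionary);

        // Anonymous objects (and plain POCOs): one entry per public instance property.
        var result = new Dictionary<string, object?>();
        foreach (var property in args.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (property.GetIndexParameters().Length == 0)
                result[property.Name] = property.GetValue(args);
        }
        return result;
    }
}
```

Under this sketch, Normalize(new { Speed = 1200, Unit = "rpm" }) yields
the same two entries a caller previously had to build by hand.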
Also commits unrelated in-progress work already in the tree: the health
monitoring report loop, site streaming changes, and the Admin/Design
data-connection and SMTP page reorganization.
Script execution failures were only written to Serilog, never to the
site event log — SiteRuntime did not reference the SiteEventLogging
project. ScriptExecutionActor now resolves ISiteEventLogger and emits a
'script'/'Error' event on timeout and exception.
The event-log query handler was a per-node actor bound to that node's
local SQLite. A ClusterClient query could land on the standby (which
records no events) and return nothing. The handler is now a cluster
singleton with a proxy, so queries always reach the active node.
Adds a new HiLo alarm trigger type with four configurable setpoints
(LoLo / Lo / Hi / HiHi). Each setpoint carries an optional priority,
deadband (for hysteresis), and operator message. The site runtime emits
AlarmStateChanged with an AlarmLevel field so consumers can differentiate
warning vs critical bands.
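A sketch of how the four bands and a hysteresis deadband could interact.
It is illustrative only: the AlarmLevel member names, the inclusive
comparisons, and the single shared Deadband (the commit describes a
deadband per setpoint) are assumptions, not the AlarmActor's
actual evaluation.

```csharp
// Assumption: member names are hypothetical; only the enum's existence is stated above.
public enum AlarmLevel { None, LoLo, Lo, Hi, HiHi }

// Simplification: one deadband shared by all four setpoints.
public sealed record HiLoSetpoints(double LoLo, double Lo, double Hi, double HiHi, double Deadband = 0);

public static class HiLoSketch
{
    public static AlarmLevel Evaluate(double value, AlarmLevel current, HiLoSetpoints s)
    {
        // A band that is already active only clears once the value has
        // retreated past its setpoint by the deadband.
        double Relax(bool bandActive) => bandActive ? s.Deadband : 0;

        if (value >= s.HiHi - Relax(current == AlarmLevel.HiHi)) return AlarmLevel.HiHi;
        if (value >= s.Hi - Relax(current is AlarmLevel.Hi or AlarmLevel.HiHi)) return AlarmLevel.Hi;
        if (value <= s.LoLo + Relax(current == AlarmLevel.LoLo)) return AlarmLevel.LoLo;
        if (value <= s.Lo + Relax(current is AlarmLevel.Lo or AlarmLevel.LoLo)) return AlarmLevel.Lo;
        return AlarmLevel.None;
    }
}
```

With Hi = 80 and Deadband = 2, a value that rose to 80 stays in the Hi
band at 79 and clears only below 78, which avoids chatter around the
setpoint.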
Plumbing:
- new AlarmLevel enum + AlarmStateChanged.Level/Message init properties
- AlarmTriggerEditor (Blazor) gets a HiLo render with severity tinting
- AlarmTriggerConfigCodec extracted from the editor for testability
- sitestream.proto carries level + message over gRPC
- SemanticValidator enforces numeric attribute, setpoint ordering,
non-negative deadband
- on-trigger scripts get an Alarm global (Name/Level/Priority/Message)
so notification routing can branch by severity
- per-instance InstanceAlarmOverride entity + EF migration + flattening
step + CLI commands; HiLo overrides merge setpoint-by-setpoint, binary
types whole-replace
- DebugView shows a Level badge + per-band message tooltip
- App.razor auto-reloads on permanent Blazor circuit failure
- docker/regen-proto.sh automates the proto regen workflow (the linux/arm64
protoc segfault means generated files are checked in for now)
Replace raw-JSON text inputs with rich UI: script parameter/return types use
a JSON Schema builder (SchemaBuilder + JsonSchemaShapeParser, with a migration
to convert existing definitions); alarm trigger config uses a type-aware
editor with a flattened attribute picker (AlarmTriggerEditor). AlarmActor
gains optional direction (rising/falling/either) on RateOfChange triggers.
Phases 1+2 of the design at
docs/plans/2026-05-12-script-scope-access-design.md.
Adds ergonomic scope-aware accessors to compiled scripts. A script
on a composed TempSensor reads its own attribute via
Attributes["Temperature"]; reaches up to the parent via
Parent.Attributes["SpeedRPM"]; invokes a child script via
Children["TempSensor"].CallScript("Sample"). All resolve to the
existing flat Instance.GetAttribute / SetAttribute / CallScript
delegates by prepending the script's canonical path prefix.
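The prefixing step can be sketched as plain path arithmetic. The
ScriptScope record below is a local stand-in mirroring the SelfPath /
ParentPath shape named in this commit; the '.' separator and the helper
names are assumptions, since the canonical path format is not spelled
out here.

```csharp
// Local stand-in for Commons.Types.Scripts.ScriptScope.
public sealed record ScriptScope(string SelfPath, string ParentPath)
{
    public static readonly ScriptScope Root = new("", "");
}

public static class ScopePaths
{
    private const char Separator = '.'; // assumption: real separator not shown in this log

    // An empty prefix means the root scope, so the name is used as-is.
    public static string Combine(string prefix, string name) =>
        string.IsNullOrEmpty(prefix) ? name : prefix + Separator + name;

    // Attributes["x"] resolves against the script's own composition path.
    public static string Own(ScriptScope scope, string attribute) =>
        Combine(scope.SelfPath, attribute);

    // Parent.Attributes["x"] resolves against the parent's path.
    public static string Parent(ScriptScope scope, string attribute) =>
        Combine(scope.ParentPath, attribute);

    // Children["c"].CallScript("s") resolves one level below the script's own path.
    public static string Child(ScriptScope scope, string child, string member) =>
        Combine(Combine(scope.SelfPath, child), member);
}
```

Each resulting flat name is what would then be handed to the existing
Instance.GetAttribute / SetAttribute / CallScript delegates.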
Runtime types (SiteRuntime.Scripts.ScopeAccessors):
  AttributeAccessor    sync indexer + GetAsync / SetAsync
  CompositionAccessor  Attributes + CallScript
  ChildrenAccessor     Children["name"] => CompositionAccessor
ScriptGlobals gains Scope, Attributes, Children, Parent properties.
Sync indexer blocks on the Instance Actor Ask; explicit GetAsync /
SetAsync are also available for callers that want to await.
Plumbing:
- Commons.Types.Scripts.ScriptScope record (SelfPath / ParentPath).
- ResolvedScript.Scope (defaults to ScriptScope.Root for back-compat).
- FlatteningService emits new ScriptScope(prefix, "") for each
composed script so a script defined on TempSensor composed under
a parent gets SelfPath = "TempSensor".
- ScriptActor reads the Scope from its ResolvedScript and forwards
it through ScriptExecutionActor into ScriptGlobals on each call.
RevisionHashService not touched: the per-script canonical name
already encodes the composition path, so any structural change
already flips the hash.
10 new unit tests on the path arithmetic. Site/Template engine
suites stay green (129 + 199).
Editor surface (Phase 3: metadata fetch, Phase 4: completion +
SCADA006 / SCADA007 diagnostics) follows in the next commits.
Restore inside the docker build was failing because TreatWarningsAsErrors
promotes NU1902/NU1903/NU1904 (vulnerable package warnings) to errors.
Bump the flagged packages to advisory-free versions:
- MailKit 4.15.1 -> 4.16.0 (GHSA-9j88-vvj5-vhgr)
- Microsoft.AspNetCore.DataProtection.EFCore 10.0.5 -> 10.0.7
(GHSA-9mv3-2cwr-p262; transitively pulls the fixed
System.Security.Cryptography.Xml: GHSA-37gx-xxp4-5rgx,
GHSA-w3x6-4m5h-cxqf)
- OpenTelemetry.Api (transitive via Akka.Hosting) 1.9.0 -> 1.15.3
(GHSA-g94r-2vxg-569j, GHSA-8785-wc3w-h8q6), added as a direct
PackageReference in ScadaLink.Host to override the Akka.Hosting pin
To resolve the NU1605 downgrade chain triggered by DataProtection.EFCore
10.0.7 (which transitively requires Microsoft.EntityFrameworkCore >= 10.0.7
and friends), bump every Microsoft.* 10.0.5 reference across src/ and
tests/ to 10.0.7 in lockstep.
HandleConnectionQualityChanged now publishes AttributeValueChanged events
to the SiteStreamManager for all affected attributes. This ensures the
central UI debug view updates in real-time when a data connection
disconnects and attributes go bad quality.
Only publishes to the stream — does NOT notify script or alarm actors,
since the value hasn't changed and firing scripts/alarms on quality-only
changes would cause spurious evaluations.
Existing site databases created before the primary/backup data connections
feature lack the backup_configuration and failover_retry_count columns.
Added TryAddColumnAsync migration that runs on startup after table creation.
Replace raw dictionary casting with ScriptParameters wrapper that provides
Get<T>, Get<T?>, Get<T[]>, and Get<List<T>> with clear error messages,
numeric conversion, and JsonElement support for Inbound API parameters.
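A rough sketch of what such a wrapper might look like. The class name
and error wording are hypothetical, and collection support is shown only
for JsonElement inputs; the real ScriptParameters may handle arrays and
lists differently.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

public sealed class ScriptParametersSketch
{
    private readonly IReadOnlyDictionary<string, object?> _values;

    public ScriptParametersSketch(IReadOnlyDictionary<string, object?> values) => _values = values;

    public T Get<T>(string name)
    {
        if (!_values.TryGetValue(name, out var raw))
            throw new ArgumentException($"Parameter '{name}' was not supplied.");

        if (raw is null)
        {
            // Null is acceptable only for reference types and Nullable<T>.
            if (default(T) is null) return default!;
            throw new ArgumentException($"Parameter '{name}' is null but {typeof(T).Name} is required.");
        }

        try
        {
            if (raw is T typed) return typed;

            // Inbound API parameters arrive as JsonElement; let System.Text.Json convert them.
            if (raw is JsonElement json) return json.Deserialize<T>()!;

            // Numeric and string conversion, e.g. long -> int or "42" -> int.
            var target = Nullable.GetUnderlyingType(typeof(T)) ?? typeof(T);
            return (T)Convert.ChangeType(raw, target, CultureInfo.InvariantCulture);
        }
        catch (Exception ex) when (ex is InvalidCastException or FormatException
                                      or OverflowException or JsonException)
        {
            throw new ArgumentException(
                $"Parameter '{name}' could not be converted to {typeof(T).Name}.", ex);
        }
    }
}
```

Wrapping conversion failures in one exception type with the parameter
name is what replaces the opaque InvalidCastException a raw dictionary
cast would produce.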
Thread backup data connection fields through management command messages,
ManagementActor handlers, SiteService, site-side SQLite storage, and
deployment/replication actors. The old --configuration CLI flag is kept
as a hidden alias for backwards compatibility.
Update CreateConnectionCommand to carry PrimaryConnectionDetails,
BackupConnectionDetails, and FailoverRetryCount. Update all callers:
DataConnectionManagerActor, DataConnectionActor, DeploymentManagerActor,
FlatteningService, and ConnectionConfig. The actor stores both configs
but continues using primary only — failover logic comes in Task 3.
Switch site host to WebApplicationBuilder with Kestrel HTTP/2 gRPC server,
add GrpcPort/keepalive config, wire SiteStreamManager as ISiteStreamSubscriber,
expose gRPC ports in docker-compose, add site seed script, update all 10
requirement docs + CLAUDE.md + README.md for the new dual-transport architecture.
ClusterClient Sender refs are temporary proxies — valid for immediate reply
but not durable for future Tells. Events now flow as DebugStreamEvent through
SiteCommunicationActor → ClusterClient → CentralCommunicationActor → bridge
actor (same pattern as health reports). Also fix DebugStreamHub to use
IHubContext for long-lived callbacks instead of transient hub instance.
- Add JoeAppEngine folder to OPC UA nodes.json (BTCS, AlarmCntsBySeverity, Scheduler/ScanTime)
- Fix DataConnectionActor: capture Self in PreStart for use from non-actor threads,
preventing Self.Tell failure in Disconnected event handler
- Implement InstanceActor.HandleConnectionQualityChanged to mark attributes Bad on disconnect
- Fix LmxFakeProxy TagMapper to serialize arrays as JSON instead of "System.Int32[]"
- Allow DataType and DataSourceReference updates in TemplateService.UpdateAttributeAsync
- Update test_infra_opcua.md with JoeAppEngine documentation
NotificationRepository.GetAllNotificationListsAsync() was missing
.Include(Recipients), causing artifact deployments to push empty recipient
lists to sites. Also load shared scripts from SQLite on DeploymentManager
startup so they're available before Instance Actors compile their scripts.
Completes the Inbound API → site script call chain by adding RouteToCallRequest
handlers in SiteCommunicationActor and DeploymentManagerActor. Also replaces the
placeholder dispatch table in InboundScriptExecutor with Roslyn compilation of
API method scripts at startup, enabling user-defined inbound API methods to call
instance scripts across the cluster.
Add SiteReplicationActor (runs on every site node) to replicate deployed
configs and store-and-forward buffer operations to the standby peer via
cluster member discovery and fire-and-forget Tell. Wire ReplicationService
handler and pass replication actor to DeploymentManagerActor singleton.
Fix 5 pre-existing ConfigurationDatabase test failures: RowVersion NOT NULL
on SQLite, stale migration name assertion, and seed data count mismatch.
Three runtime bugs fixed:
- DataConnectionActor: TagValueReceived/TagResolutionSucceeded/Failed not
handled in any Become state — OPC UA values went to dead letters. Added
initial read after subscribe to seed current values immediately.
- AlarmActor: ParseEvalConfig expected "attributeName"/"matchValue"/"min"/
"max" keys but seed data uses "attribute"/"value"/"high"/"low". Added
support for both conventions and a "!=" prefix for not-equal matching.
- InstanceActor: snapshots now report all alarms (including unevaluated)
with correct priorities, and use source timestamps instead of current UTC.
Removed bogus Vibration template attribute that shadowed Speed's tag mapping.
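For the AlarmActor fix above, accepting both key conventions could look
roughly like this. It is a sketch, not ParseEvalConfig itself:
ValueMatchConfig and the helper names are hypothetical, and only the
attribute and match-value keys are covered (the range keys are omitted).

```csharp
using System;
using System.Text.Json;

// Hypothetical result shape for a value-match trigger.
public sealed record ValueMatchConfig(string? Attribute, string? MatchValue, bool Negate);

public static class EvalConfigSketch
{
    // Returns the first key that is present, as text regardless of its JSON type.
    private static string? ReadString(JsonElement root, params string[] keys)
    {
        foreach (var key in keys)
        {
            if (root.TryGetProperty(key, out var element) && element.ValueKind != JsonValueKind.Null)
                return element.ValueKind == JsonValueKind.String
                    ? element.GetString()
                    : element.GetRawText();
        }
        return null;
    }

    public static ValueMatchConfig Parse(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;

        // Original key first, then the seed-data spelling.
        var attribute = ReadString(root, "attributeName", "attribute");
        var match = ReadString(root, "matchValue", "value");

        // A leading "!=" turns the comparison into not-equal.
        bool negate = match is not null && match.StartsWith("!=", StringComparison.Ordinal);
        if (negate) match = match![2..];

        return new ValueMatchConfig(attribute, match, negate);
    }
}
```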
Adds `debug snapshot --id <int>` to query a running instance's current
attribute values and alarm states without the subscribe/stream overhead
of the debug view. Routes through ManagementActor → CommunicationService
→ site DeploymentManager → InstanceActor using the existing remote query
pattern.
IServiceProvider now flows through the actor chain (DeploymentManagerActor
→ InstanceActor → ScriptActor → ScriptExecutionActor) so scripts can
resolve IExternalSystemClient, IDatabaseGateway, and
INotificationDeliveryService from DI. ScriptGlobals exposes ExternalSystem,
Database, Notify, and Scripts as top-level properties so scripts can use
them without the Instance. prefix.
Both nodes of a site cluster were sending health reports. The standby
node (without the DeploymentManager singleton) reported 0 instances and
no connections, overwriting the active node's data in the aggregator.
Added IsActiveNode flag to ISiteHealthCollector, set by
DeploymentManagerActor on PreStart/PostStop. HealthReportSender skips
sending when the node is not active. Also ensured EnsureDclConnections
is called during startup batch creation so data connections survive
container restarts.
UpdateInstanceCounts() was only called before Instance Actors were
created (in HandleStartupConfigsLoaded), so the health dashboard showed
0 enabled. It is now also called after each batch in
HandleStartNextBatch to reflect the actual running actor count.
Wired ISiteHealthCollector calls for script errors (ScriptExecutionActor),
alarm eval errors (AlarmActor), dead letters (DeadLetterMonitorActor), and
S&F buffer depth placeholder. Added instance count tracking (deployed/
enabled/disabled) to SiteHealthReport via DeploymentManagerActor. Updated
Health Dashboard UI to show instance counts per site. All metrics flow
through the existing health report pipeline via ClusterClient.
DeploymentManagerActor deserialized connection config JSON as
Dictionary<string, string>, which silently failed on non-string values
like {"publishInterval":1000}. The OPC UA adapter then fell back to
localhost:4840 (unreachable in Docker). Now uses JsonDocument to handle
any JSON value type. OPC PLC Simulator connects successfully.
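A minimal sketch of the JsonDocument approach, assuming a flat JSON
object; the class and method names are hypothetical and the real
DeploymentManagerActor code may differ. Non-string values are kept as
their raw JSON text so that {"publishInterval":1000} survives as "1000".

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class ConnectionConfigSketch
{
    public static Dictionary<string, string> Flatten(string json)
    {
        var result = new Dictionary<string, string>();
        using var doc = JsonDocument.Parse(json);

        foreach (var property in doc.RootElement.EnumerateObject())
        {
            // Strings are unescaped; numbers, booleans and nested values keep their raw text.
            result[property.Name] = property.Value.ValueKind == JsonValueKind.String
                ? property.Value.GetString() ?? ""
                : property.Value.GetRawText();
        }
        return result;
    }
}
```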
Attributes bound to data connections now initialize with "Uncertain" quality,
distinguishing "never received a value" from "known good" or "connection lost."
Quality is tracked per attribute and included in GetAttributeResponse.
- SiteExternalSystemRepository and SiteNotificationRepository registered in Site DI
- Removed AddConfigurationDatabase from Site role in Program.cs
- Removed ConfigurationDb from appsettings.Site.json
- ArtifactDeploymentService collects all 6 artifact types including data connections and SMTP
- Add TagValueUpdate/ConnectionQualityChanged handlers to InstanceActor
- InstanceActor subscribes to DCL on PreStart based on DataSourceReference
- DeploymentManagerActor creates DCL connections on deploy and passes DCL ref
- AkkaHostedService creates DCL Manager Actor for tag subscriptions
- Move CreateConnectionCommand to Commons for cross-project access
- Add ConnectionConfig to FlattenedConfiguration for deployment packaging
DeploymentManagerActor now handles SubscribeDebugViewRequest and
UnsubscribeDebugViewRequest by forwarding to the appropriate Instance Actor.
This completes the debug view data flow from Central UI through to the site's
Instance Actor snapshot. Reduced refresh interval to 2s for responsiveness.