Patch request — event-driven "wait for attribute change (with timeout)" script helper

Date: 2026-06-17

Type: Source enhancement (small, additive) to the SiteRuntime script surface

Why now: the DELMIA/MES receiver re-implementation (2026-06-17-delmia-mes-receiver-templates-design.md, §9 risk #1) currently has to busy-poll for the handshake completion flag. This spec describes the gap and a precise, patch-ready design for a host-provided WaitAsync helper so scripts can wait event-driven for a tag/attribute to reach a value, bounded by a timeout.

All file paths, line numbers, message records, and signatures below were read from source on 2026-06-17. Treat line numbers as guides (they drift); the type/method names are the anchors.


1. The gap

The receiver handshake (and any request/response tag interaction) needs to wait until a data-sourced attribute reaches a value — e.g. wait up to 30 s for RecipeProcessedFlag == true or MoveInCompleteFlag == true after setting the trigger flag.

ScadaBridge's script surface today has read (Attributes.GetAsync / indexer) and write (Attributes.SetAsync / indexer), but no "wait for value" primitive. The only way to wait is a manual poll loop:

// current workaround — every handshake script repeats this
var deadline = DateTime.UtcNow.AddSeconds(30);
while (DateTime.UtcNow < deadline && !CancellationToken.IsCancellationRequested)
{
    if ((bool?)(await Attributes.GetAsync("RecipeProcessedFlag")) == true) break;
    await Task.Delay(200, CancellationToken);
}

Why this is unsatisfactory:

  • Latency: completion is detected up to one poll interval late (200 ms here).
  • Wasted work: each iteration is an actor Ask (GetAttributeRequest round-trip to the InstanceActor); N handshakes × M polls = a lot of needless messages. At a 200 ms interval, a single 30 s wait can issue up to 150 Asks.
  • Boilerplate: the same loop is copy-pasted into every handshake script and is easy to get wrong (forgetting CancellationToken, off-by-one on the deadline, not handling quality).
  • No quality awareness: the poll reads whatever value is cached regardless of OPC/MX quality.

Crucially, the data is already being pushed to the actor that owns it. A data-sourced attribute's value arrives from the DCL and is applied in the InstanceActor, which then raises AttributeValueChanged. So an event-driven waiter is natural and removes the poll entirely.


2. Where the change goes (verified wiring)

| Concern | Type / file | Notes |
|---|---|---|
| Change notification | AttributeValueChanged(InstanceUniqueName, AttributePath, AttributeName, Value, Quality, Timestamp), src/ZB.MOM.WW.ScadaBridge.Commons/Messages/Streaming/AttributeValueChanged.cs | raised on every change |
| Single choke point | InstanceActor.HandleAttributeValueChanged(...), src/…/SiteRuntime/Actors/InstanceActor.cs | both static writes (HandleSetStaticAttributeCore) and DCL/subscription updates (HandleTagValueUpdate / TagValueUpdate) funnel through here, then PublishAndNotifyChildren |
| Owner of state | InstanceActor (_attributes, _attributeQualities, _attributeTimestamps) | single-threaded; registration + current-value check is atomic here |
| Script read path | AttributeAccessor (ScopeAccessors.cs) → ScriptRuntimeContext.GetAttribute → Ask<GetAttributeResponse>(GetAttributeRequest) | the helper mirrors this |
| Script globals build | ScriptExecutionActor (src/…/SiteRuntime/Actors/ScriptExecutionActor.cs) builds ScriptRuntimeContext (passes instanceActor, self, _askTimeout) and ScriptGlobals (CancellationToken = cts.Token from the per-script timeout) | the script timeout token is NOT currently passed into ScriptRuntimeContext; this patch must thread it in |
| Helper idiom | ScriptRuntimeContext nested helpers (e.g. ExternalSystemHelper): ctor deps stored as readonly fields, exposed via an on-demand property | follow this idiom |
| Trust model | ScriptTrustPolicy (src/…/ScriptAnalysis/) | System.Threading.Tasks + CancellationToken/CancellationTokenSource are in AllowedExceptions; lambdas/Func<> are fine. No trust change needed: the wait runs in host code; the script just awaits a provided method. |

Design principle: do the wait inside the InstanceActor as a one-shot registered waiter, not in the script via polling. Because the actor is single-threaded and HandleAttributeValueChanged is the one place every change passes, a waiter that (a) checks the current value on registration and (b) is re-evaluated on each change cannot miss the edge between "read current" and "subscribe".


3. Proposed API (script-facing)

Add to the Attributes accessor (AttributeAccessor in ScopeAccessors.cs), so scope/composition path resolution (Resolve(name)) applies just like get/set:

// Wait until `name` equals targetValue (value-equality, codec-normalized). Returns true if matched
// within the timeout, false if it timed out. Honors the script CancellationToken.
Task<bool> Attributes.WaitAsync(string name, object? targetValue, TimeSpan timeout);

// Predicate form — site-local template scripts only (predicate is an in-process delegate).
Task<bool> Attributes.WaitAsync(string name, Func<object?, bool> predicate, TimeSpan timeout);

// Optional richer variant that also returns the matched value + quality. It needs its own
// name (WaitForAsync) because C# cannot overload on return type alone.
Task<WaitResult> Attributes.WaitForAsync(string name, object? targetValue, TimeSpan timeout);
// record WaitResult(bool Matched, object? Value, string Quality, bool TimedOut);

Return bool (not throw) for the common case — the handshake wants matched/timed-out, not an exception. The value-equality overload is the one the handshake needs and is the one that can also be exposed on the inbound/routed side (§6), because a value serializes and a delegate does not.

Handshake, rewritten (replaces the §1 poll loop):

// Script 1: recipe download handshake
await Attributes.SetAsync("RecipeDownloadFlag", true);                 // trigger
var ok = await Attributes.WaitAsync("RecipeProcessedFlag", true, TimeSpan.FromSeconds(30));
if (!ok) return new { Result = false, ResultText = "Timeout waiting for recipe to be processed" };
return new {
    Result     = (bool?)(await Attributes.GetAsync("RecipeProcessResult")) ?? false,
    ResultText = (string?)(await Attributes.GetAsync("RecipeProcessResultText")) ?? ""
};

// Script 2: move-in handshake (a separate script, same pattern)
await Attributes.SetAsync("MoveInFlag", true);                         // trigger
var ok = await Attributes.WaitAsync("MoveInCompleteFlag", true, TimeSpan.FromSeconds(30));
// … read MoveInSuccessfulFlag / MoveInErrorText / MoveInBatchID …

4. Implementation outline (the patch)

4.1 New messages (src/ZB.MOM.WW.ScadaBridge.Commons/Messages/…)

// actor protocol (site-local; delegate is fine because messaging is in-process)
public record WaitForAttributeRequest(
    string  CorrelationId,
    string  InstanceName,
    string  AttributeName,            // already scope-resolved by the accessor
    string? TargetValueEncoded,       // AttributeValueCodec.Encode(targetValue); null = "any change"
    Func<object?, bool>? Predicate,   // local-only; null when TargetValueEncoded is used
    TimeSpan Timeout,
    DateTimeOffset OccurredAtUtc);

public record WaitForAttributeResponse(
    string CorrelationId,
    bool   Matched,
    object? Value,
    string Quality,
    bool   TimedOut,
    string? ErrorMessage = null);

// internal self-message used to fire the timeout
public record WaitForAttributeTimeout(string CorrelationId);

4.2 InstanceActor (src/…/SiteRuntime/Actors/InstanceActor.cs)

  • Add a registry: Dictionary<string, PendingWait> _attributeWaiters keyed by CorrelationId, where PendingWait holds the attribute name, the match test (decoded target value or predicate), the original Sender (IActorRef), and the scheduled ICancelable timeout handle.
  • Handle WaitForAttributeRequest:
    1. Build the match test (decode TargetValueEncoded via AttributeValueCodec → equality test, or use Predicate).
    2. Fast path: if the current _attributes[name] already satisfies the test, reply WaitForAttributeResponse(Matched: true, Value, Quality) immediately and return.
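    3. Otherwise register the waiter and schedule the timeout: Context.System.Scheduler.ScheduleTellOnceCancelable(effectiveTimeout, Self, new WaitForAttributeTimeout(cid), Self), storing the returned ICancelable (the plain ScheduleTellOnce returns void, so the Cancelable variant is the one that yields a handle). effectiveTimeout is defined in step 4. Capture Sender now (it is invalid later).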
    4. Bound effectiveTimeout = min(request.Timeout, requestDeadlineFromCaller) (the caller's Ask already carries the script token; see §4.3). Optionally cap the number of concurrent waiters per instance (defensive; reply with ErrorMessage if exceeded).
  • In HandleAttributeValueChanged (after state is updated): iterate _attributeWaiters whose attribute matches the changed AttributeName; for any whose test now passes, cancel its timeout, reply WaitForAttributeResponse(Matched: true, …), and remove it. (Iterate over a snapshot to allow removal during enumeration.)
  • Handle WaitForAttributeTimeout: if still registered, reply WaitForAttributeResponse(Matched: false, TimedOut: true) and remove.
  • Optional: a quality == "Good"-only mode (parameter on the request) if a handshake must ignore Bad-quality transients.
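
The bookkeeping in the bullets above can be sketched as a plain, single-threaded model with no Akka dependency. This is an illustration under assumptions, not the InstanceActor code: AttributeWaiterRegistry and WaitOutcome are hypothetical names, the reply callback stands in for Sender.Tell, and the timeout is delivered by calling OnTimeout directly rather than by a scheduled self-message. Quality handling and the per-instance cap are left out.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for WaitForAttributeResponse (subset of its fields).
public record WaitOutcome(string CorrelationId, bool Matched, object? Value, bool TimedOut);

// Single-threaded model of the waiter registry. Deliberately not thread-safe:
// in the actor, all three entry points run on the actor's own message loop.
public sealed class AttributeWaiterRegistry
{
    private sealed record PendingWait(string Attribute, Func<object?, bool> Test, Action<WaitOutcome> Reply);

    private readonly Dictionary<string, object?> _attributes = new();
    private readonly Dictionary<string, PendingWait> _waiters = new();

    public int PendingCount => _waiters.Count;

    // Models handling WaitForAttributeRequest: fast path first, then register.
    public void Register(string cid, string attribute, Func<object?, bool> test, Action<WaitOutcome> reply)
    {
        if (_attributes.TryGetValue(attribute, out var current) && test(current))
        {
            reply(new WaitOutcome(cid, true, current, false));
            return;
        }
        _waiters[cid] = new PendingWait(attribute, test, reply);
        // In the actor, the timeout self-message would be scheduled here.
    }

    // Models HandleAttributeValueChanged: update state, then re-evaluate waiters.
    public void OnAttributeChanged(string attribute, object? value)
    {
        _attributes[attribute] = value;
        foreach (var (cid, wait) in _waiters.ToList()) // snapshot: entries are removed while iterating
        {
            if (wait.Attribute != attribute || !wait.Test(value)) continue;
            _waiters.Remove(cid);
            // In the actor, the scheduled timeout would be cancelled here.
            wait.Reply(new WaitOutcome(cid, true, value, false));
        }
    }

    // Models handling WaitForAttributeTimeout: a no-op if the waiter already matched.
    public void OnTimeout(string cid)
    {
        if (_waiters.Remove(cid, out var wait))
            wait.Reply(new WaitOutcome(cid, false, null, true));
    }
}
```

Because registration and change handling are ordinary method calls on one thread in this model, a value that is already correct is answered by the fast path, and a late timeout for an already-matched waiter finds nothing to remove.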

4.3 ScriptRuntimeContext (src/…/SiteRuntime/Scripts/ScriptRuntimeContext.cs)

  • Thread the script timeout token in. Add a CancellationToken scriptTimeoutToken constructor parameter (today only _askTimeout is available to helpers; the per-script cts.Token is not passed). ScriptExecutionActor already has cts.Token — pass it when constructing the context.
  • Add a method that the accessor calls:
    public async Task<bool> WaitAttribute(string name, string? targetValueEncoded,
                                          Func<object?,bool>? predicate, TimeSpan timeout)
    {
        var cid = Guid.NewGuid().ToString();
        var req = new WaitForAttributeRequest(cid, _instanceName, name, targetValueEncoded,
                                              predicate, timeout, DateTimeOffset.UtcNow);
        // Ask bounded by the script timeout token so a script-deadline abort cancels the await.
        var resp = await _instanceActor.Ask<WaitForAttributeResponse>(
                       req, timeout + _askTimeout /* small slack */, _scriptTimeoutToken);
        return resp.Matched;
    }
    

4.4 ScriptExecutionActor (src/…/SiteRuntime/Actors/ScriptExecutionActor.cs)

  • Pass cts.Token (the per-script timeout, created at the new CancellationTokenSource(timeout) site) into the new ScriptRuntimeContext constructor parameter from §4.3.

4.5 AttributeAccessor (src/…/SiteRuntime/Scripts/ScopeAccessors.cs)

public Task<bool> WaitAsync(string key, object? targetValue, TimeSpan timeout)
    => _ctx.WaitAttribute(Resolve(key), AttributeValueCodec.Encode(targetValue), null, timeout);

public Task<bool> WaitAsync(string key, Func<object?, bool> predicate, TimeSpan timeout)
    => _ctx.WaitAttribute(Resolve(key), null, predicate, timeout);

4.6 Trust model — no change

WaitAsync is a host-provided async method; the wait/scheduling happens in host code. The script only awaits it and may pass a Func<> (a normal closure, not reflection). System.Threading.Tasks + CancellationToken are already in ScriptTrustPolicy.AllowedExceptions. Verify the new helper type/members don't collide with ForbiddenIdentifiers (dynamic, Activator); they don't.

5. Correctness notes

  • No missed edge. Registration (current-value check) and change-handling both run on the InstanceActor's single thread, so a value that flips between "set trigger" and "register waiter" is caught by the fast-path check; a value that flips after registration is caught by HandleAttributeValueChanged. The poll-loop and this design are both correct; this one is event-driven and cheaper.
  • Timeout is authoritative and self-cleaning. The scheduled WaitForAttributeTimeout guarantees the waiter is removed and the caller answered even if the value never changes. Match cancels the scheduled timeout.
  • Cancellation. Bounding the helper Ask with the script timeout token means a script that hits its own ExecutionTimeoutSeconds abandons the wait; pair with a best-effort cancel message to the actor to evict the orphan waiter promptly (otherwise it self-evicts at its own timeout).
  • Concurrency / re-entrancy. Multiple waiters per instance are fine (keyed by CorrelationId). Consider a per-instance cap as a guard against a script leaking waiters in a loop.
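
The "best-effort cancel" idea in the Cancellation note could look roughly like this on the caller side. This is a sketch under assumptions: AwaitOrEvictAsync is a hypothetical helper, pending stands in for the Ask task, and evict stands in for sending a cancel message (not defined in this spec) that carries the CorrelationId to the InstanceActor. It relies on Task.WaitAsync(CancellationToken), which is available from .NET 6.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitCancellationSketch
{
    // Awaits `pending` (stand-in for the Ask to the InstanceActor). If `scriptToken`
    // fires first, invokes `evict` (stand-in for a hypothetical best-effort cancel
    // message) and then rethrows, so the script still observes the cancellation.
    public static async Task<bool> AwaitOrEvictAsync(
        Task<bool> pending, CancellationToken scriptToken, Action evict)
    {
        try
        {
            return await pending.WaitAsync(scriptToken); // .NET 6+
        }
        catch (OperationCanceledException) when (scriptToken.IsCancellationRequested)
        {
            evict(); // best effort: the waiter would otherwise self-evict at its own timeout
            throw;
        }
    }
}
```

Rethrowing keeps the behavior of the script-deadline abort unchanged; the only addition is that the orphan waiter is asked to leave early instead of lingering until its scheduled timeout.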

6. Optional: inbound / routed variant

For symmetry with RouteTarget.GetAttributes (src/…/InboundAPI/RouteHelper.cs), an inbound script could call Route.To(code).WaitForAttribute(name, targetValue, timeout). Mirror the existing routed pattern: add RouteToWaitForAttributeRequest/Response, an IInstanceRouter.RouteToWaitForAttributeAsync method, and unpack it on the site comms actor into the same WaitForAttributeRequest to the InstanceActor. Value-equality only across the wire — a Func<> predicate cannot be serialized, so the routed form takes the encoded target value (the predicate overload stays site-local). This is optional: the receiver handshake runs inside the template script (site-local), so §3–§5 alone fully cover the DELMIA/MES use case.


7. Acceptance criteria

  1. A template script can await Attributes.WaitAsync("Flag", true, TimeSpan.FromSeconds(30)) and it returns true promptly when the data-sourced attribute reaches true (driven by a DCL update), with no poll loop.
  2. Returns false (no throw) when the value never matches within the timeout.
  3. The wait is bounded by the script's own ExecutionTimeoutSeconds (a shorter script deadline wins).
  4. No AttributeValueChanged edge is missed across the register/change boundary (unit test: flip the value in the same actor step as registration, and one step after).
  5. Waiters are removed on match and on timeout (no leak; assert registry empty afterward).
  6. Scope/composition path resolution works (Children["DelmiaReceiver"]-scoped wait resolves to the composed child's attribute).
  7. Passes ScriptAnalysis trust validation unchanged.
  8. The DELMIA/MES handshake base scripts (design doc §4) compile and pass using WaitAsync in place of the poll loop.

Suggested tests: extend InstanceActor tests (waiter fast-path, change-match, timeout, removal) and the script-surface tests under tests/…/SiteRuntime*.