# Patch request: event-driven "wait for attribute change (with timeout)" script helper

**Date:** 2026-06-17
**Type:** Source enhancement (small, additive) to the SiteRuntime script surface
**Why now:** the DELMIA/MES receiver re-implementation
([`2026-06-17-delmia-mes-receiver-templates-design.md`](2026-06-17-delmia-mes-receiver-templates-design.md), §9 risk #1)
currently has to **busy-poll** for the handshake completion flag. This spec describes the gap
and a precise, patch-ready design for a host-provided `WaitAsync` helper so scripts can wait
**event-driven** for a tag/attribute to reach a value, bounded by a timeout.

> All file paths, line numbers, message records, and signatures below were read from source on
> 2026-06-17. Treat line numbers as guides (they drift); the type/method names are the anchors.

---

## 1. The gap

The receiver handshake (and any request/response tag interaction) needs to **wait until a
data-sourced attribute reaches a value**: for example, wait up to 30 s for `RecipeProcessedFlag == true`
or `MoveInCompleteFlag == true` after setting the trigger flag.

ScadaBridge's script surface today has **read** (`Attributes.GetAsync` / indexer) and **write**
(`Attributes.SetAsync` / indexer), but **no "wait for value" primitive**. The only way to wait is
a manual poll loop:

```csharp
// current workaround: every handshake script repeats this
var deadline = DateTime.UtcNow.AddSeconds(30);
while (DateTime.UtcNow < deadline && !CancellationToken.IsCancellationRequested)
{
    // each read is an Ask round-trip to the InstanceActor
    if ((bool?)(await Attributes.GetAsync("RecipeProcessedFlag")) == true) break;
    await Task.Delay(200, CancellationToken);   // poll interval
}
```

Why this is unsatisfactory:

- **Latency**: completion is detected up to one poll interval late (200 ms here).
- **Wasted work**: each iteration is an actor `Ask` (a `GetAttributeRequest` round-trip to the
  `InstanceActor`). A single 30 s wait at a 200 ms interval can issue up to 150 reads, and that
  cost repeats for every concurrent handshake.
- **Boilerplate**: the same loop is copy-pasted into every handshake script and is easy to get wrong
  (forgetting `CancellationToken`, off-by-one on the deadline, not handling quality).
- **No quality awareness**: the poll reads whatever value is cached regardless of OPC/MX quality.

Crucially, **the data is already being pushed to the actor that owns it.** A data-sourced
attribute's value arrives from the DCL and is applied in the `InstanceActor`, which then raises
`AttributeValueChanged`. So an event-driven waiter is natural and removes the poll entirely.

---

## 2. Where the change goes (verified wiring)

| Concern | Type / file | Notes |
|---|---|---|
| Change notification | `AttributeValueChanged(InstanceUniqueName, AttributePath, AttributeName, Value, Quality, Timestamp)` in `src/ZB.MOM.WW.ScadaBridge.Commons/Messages/Streaming/AttributeValueChanged.cs` | raised on **every** change |
| **Single choke point** | `InstanceActor.HandleAttributeValueChanged(...)` in `src/…/SiteRuntime/Actors/InstanceActor.cs` | both static writes (`HandleSetStaticAttributeCore`) **and** DCL/subscription updates (`HandleTagValueUpdate` ← `TagValueUpdate`) funnel through here, then `PublishAndNotifyChildren` |
| Owner of state | `InstanceActor` (`_attributes`, `_attributeQualities`, `_attributeTimestamps`) | **single-threaded**: registration + current-value check is atomic here |
| Script read path | `AttributeAccessor` (`ScopeAccessors.cs`) → `ScriptRuntimeContext.GetAttribute` → `Ask<GetAttributeResponse>(GetAttributeRequest)` | the helper mirrors this |
| Script globals build | `ScriptExecutionActor` (`src/…/SiteRuntime/Actors/ScriptExecutionActor.cs`) builds `ScriptRuntimeContext` (passes `instanceActor`, `self`, `_askTimeout`) and `ScriptGlobals` (`CancellationToken = cts.Token` from the per-script timeout) | **the script timeout token is NOT currently passed into `ScriptRuntimeContext`**; this patch must thread it in |
| Helper idiom | `ScriptRuntimeContext` nested helpers (e.g. `ExternalSystemHelper`): ctor deps stored as readonly fields, exposed via an on-demand property | follow this idiom |
| Trust model | `ScriptTrustPolicy` (`src/…/ScriptAnalysis/`) | `System.Threading.Tasks` + `CancellationToken`/`CancellationTokenSource` are in `AllowedExceptions`; lambdas/`Func<>` are fine. **No trust change needed**: the wait runs in host code; the script just `await`s a provided method. |

**Design principle:** do the wait **inside the `InstanceActor`** as a one-shot registered waiter,
not in the script via polling. Because the actor is single-threaded and `HandleAttributeValueChanged`
is the one place every change passes, a waiter that (a) checks the current value on registration and
(b) is re-evaluated on each change **cannot miss the edge** between "read current" and "subscribe".

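
The check-then-register idea can be sketched without any actor framework, assuming every call happens on one thread (as it would inside the `InstanceActor`). `AttributeWaiters`, `Register`, and `Apply` are hypothetical names used only for illustration, not types from the ScadaBridge source, and a `TaskCompletionSource` stands in for replying to the captured `Sender`. A value that flipped before `Register` is caught by the fast path; a value that flips afterwards is caught by `Apply`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the InstanceActor's waiter bookkeeping.
// Assumption: all calls are made from a single thread, as inside an actor.
public sealed class AttributeWaiters
{
    private readonly Dictionary<string, object?> _attributes = new();
    private readonly List<(string Name, Func<object?, bool> Test, TaskCompletionSource<bool> Done)> _waiters = new();

    public int PendingCount => _waiters.Count;

    // Check the current value first, then subscribe. Nothing can run between the two steps.
    public Task<bool> Register(string name, Func<object?, bool> test)
    {
        if (_attributes.TryGetValue(name, out var current) && test(current))
            return Task.FromResult(true);            // fast path: already satisfied

        var done = new TaskCompletionSource<bool>();
        _waiters.Add((name, test, done));
        return done.Task;
    }

    // Update state, then re-evaluate only the waiters registered on that attribute.
    public void Apply(string name, object? value)
    {
        _attributes[name] = value;
        foreach (var w in _waiters.ToArray())         // snapshot allows removal while iterating
        {
            if (w.Name != name || !w.Test(value)) continue;
            _waiters.Remove(w);
            w.Done.SetResult(true);
        }
    }
}
```
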
---

## 3. Proposed API (script-facing)

Add to the `Attributes` accessor (`AttributeAccessor` in `ScopeAccessors.cs`), so scope/composition
path resolution (`Resolve(name)`) applies just like get/set:

```csharp
// Wait until `name` equals targetValue (value-equality, codec-normalized). Returns true if matched
// within the timeout, false if it timed out. Honors the script CancellationToken.
Task<bool> Attributes.WaitAsync(string name, object? targetValue, TimeSpan timeout);

// Predicate form: site-local template scripts only (predicate is an in-process delegate).
Task<bool> Attributes.WaitAsync(string name, Func<object?, bool> predicate, TimeSpan timeout);

// Optional richer overload that also returns the matched value + quality.
Task<WaitResult> Attributes.WaitForAsync(string name, object? targetValue, TimeSpan timeout);
// record WaitResult(bool Matched, object? Value, string Quality, bool TimedOut);
```

Return **bool** (not throw) for the common case: the handshake wants matched/timed-out, not an
exception. The value-equality overload is the one the handshake needs and is the one that can also
be exposed on the inbound/routed side (§6), because a value serializes and a delegate does not.

Handshake, rewritten (replaces the §1 poll loop):

```csharp
await Attributes.SetAsync("RecipeDownloadFlag", true);   // trigger
var ok = await Attributes.WaitAsync("RecipeProcessedFlag", true, TimeSpan.FromSeconds(30));
if (!ok) return new { Result = false, ResultText = "Timeout waiting for recipe to be processed" };
return new {
    Result     = (bool?)(await Attributes.GetAsync("RecipeProcessResult")) ?? false,
    ResultText = (string?)(await Attributes.GetAsync("RecipeProcessResultText")) ?? ""
};
```

The move-in handshake follows the same shape:

```csharp
await Attributes.SetAsync("MoveInFlag", true);
var ok = await Attributes.WaitAsync("MoveInCompleteFlag", true, TimeSpan.FromSeconds(30));
// … read MoveInSuccessfulFlag / MoveInErrorText / MoveInBatchID …
```

---

## 4. Implementation outline (the patch)

### 4.1 New messages (`src/ZB.MOM.WW.ScadaBridge.Commons/Messages/…`)
```csharp
// actor protocol (site-local; delegate is fine because messaging is in-process)
public record WaitForAttributeRequest(
    string CorrelationId,
    string InstanceName,
    string AttributeName,              // already scope-resolved by the accessor
    string? TargetValueEncoded,        // AttributeValueCodec.Encode(targetValue); null with no Predicate = "any change"
    Func<object?, bool>? Predicate,    // local-only; null when TargetValueEncoded is used
    TimeSpan Timeout,
    DateTimeOffset OccurredAtUtc);

public record WaitForAttributeResponse(
    string CorrelationId,
    bool Matched,
    object? Value,
    string Quality,
    bool TimedOut,
    string? ErrorMessage = null);

// internal self-message used to fire the timeout
public record WaitForAttributeTimeout(string CorrelationId);
```

### 4.2 `InstanceActor` (`src/…/SiteRuntime/Actors/InstanceActor.cs`)
- Add a registry: `Dictionary<string, PendingWait> _attributeWaiters` keyed by `CorrelationId`, where
  `PendingWait` holds the attribute name, the match test (decoded target value **or** predicate),
  the original `Sender` (`IActorRef`), and the scheduled `ICancelable` timeout handle.
- **Handle `WaitForAttributeRequest`:**
  1. Build the match test (decode `TargetValueEncoded` via `AttributeValueCodec` → equality test, or
     use `Predicate`).
  2. **Fast path:** if the current `_attributes[name]` already satisfies the test, reply
     `WaitForAttributeResponse(Matched: true, Value, Quality)` immediately and return.
  3. Otherwise register the waiter and schedule the timeout:
     `Context.System.Scheduler.ScheduleTellOnceCancelable(effectiveTimeout, Self, new WaitForAttributeTimeout(cid), Self)`,
     storing the returned `ICancelable` (in Akka.NET the plain `ScheduleTellOnce` returns `void`, so the
     cancelable variant is the one that hands back a handle). Capture `Sender` at registration; it is
     only valid while the current message is being handled.
  4. Compute `effectiveTimeout = min(request.Timeout, requestDeadlineFromCaller)` before scheduling in
     step 3 (the caller's `Ask` already carries the script token; see §4.3). Optionally cap the number
     of concurrent waiters per instance (defensive; reply with `ErrorMessage` if exceeded).
- **In `HandleAttributeValueChanged` (after state is updated):** iterate `_attributeWaiters` whose
  attribute matches the changed `AttributeName`; for any whose test now passes, cancel its timeout,
  reply `WaitForAttributeResponse(Matched: true, …)`, and remove it. (Iterate over a snapshot to
  allow removal during enumeration.)
- **Handle `WaitForAttributeTimeout`:** if still registered, reply
  `WaitForAttributeResponse(Matched: false, TimedOut: true)` and remove.
- Optional: a `quality == "Good"`-only mode (parameter on the request) if a handshake must ignore
  Bad-quality transients.

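
The match/timeout race in the handlers above can be made harmless by keying everything on `CorrelationId` and treating "already removed" as a no-op. The following is a minimal sketch under that assumption: `PendingWaits`, `WaitReply`, and `EffectiveTimeout` are hypothetical names, an `Action<WaitReply>` stands in for the captured `Sender`, and no real scheduler is involved (the test simply calls `OnTimeout` by hand to simulate a late timer).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reply shape, loosely modelled on WaitForAttributeResponse.
public record WaitReply(string CorrelationId, bool Matched, bool TimedOut);

public sealed class PendingWaits
{
    // The callback stands in for the Sender captured at registration time.
    private readonly Dictionary<string, Action<WaitReply>> _pending = new();

    public int Count => _pending.Count;

    // The shorter of the requested wait and whatever the caller's own deadline leaves.
    public static TimeSpan EffectiveTimeout(TimeSpan requested, TimeSpan remainingForCaller)
        => requested < remainingForCaller ? requested : remainingForCaller;

    public void Register(string cid, Action<WaitReply> reply) => _pending[cid] = reply;

    public void OnMatch(string cid) => Complete(cid, matched: true);

    public void OnTimeout(string cid) => Complete(cid, matched: false);

    private void Complete(string cid, bool matched)
    {
        // Remove returns false when the waiter is already gone,
        // so whichever event arrives second does nothing.
        if (!_pending.Remove(cid, out var reply)) return;
        reply(new WaitReply(cid, matched, TimedOut: !matched));
    }
}
```

With this shape the caller receives exactly one reply per wait, and cancelling the scheduled timeout on match becomes an optimization rather than a correctness requirement.
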
### 4.3 `ScriptRuntimeContext` (`src/…/SiteRuntime/Scripts/ScriptRuntimeContext.cs`)
- **Thread the script timeout token in.** Add a `CancellationToken scriptTimeoutToken` constructor
  parameter (today only `_askTimeout` is available to helpers; the per-script `cts.Token` is **not**
  passed). `ScriptExecutionActor` already has `cts.Token`; pass it when constructing the context.
- Add a method that the accessor calls:
  ```csharp
  public async Task<bool> WaitAttribute(string name, string? targetValueEncoded,
                                        Func<object?, bool>? predicate, TimeSpan timeout)
  {
      var cid = Guid.NewGuid().ToString();
      var req = new WaitForAttributeRequest(cid, _instanceName, name, targetValueEncoded,
                                            predicate, timeout, DateTimeOffset.UtcNow);
      // Ask bounded by the script timeout token so a script-deadline abort cancels the await.
      var resp = await _instanceActor.Ask<WaitForAttributeResponse>(
          req, timeout + _askTimeout /* small slack */, _scriptTimeoutToken);
      return resp.Matched;
  }
  ```

### 4.4 `ScriptExecutionActor` (`src/…/SiteRuntime/Actors/ScriptExecutionActor.cs`)
- Pass `cts.Token` (the per-script timeout, created at the `new CancellationTokenSource(timeout)`
  site) into the new `ScriptRuntimeContext` constructor parameter from §4.3.

### 4.5 `AttributeAccessor` (`src/…/SiteRuntime/Scripts/ScopeAccessors.cs`)
```csharp
public Task<bool> WaitAsync(string key, object? targetValue, TimeSpan timeout)
    => _ctx.WaitAttribute(Resolve(key), AttributeValueCodec.Encode(targetValue), null, timeout);

public Task<bool> WaitAsync(string key, Func<object?, bool> predicate, TimeSpan timeout)
    => _ctx.WaitAttribute(Resolve(key), null, predicate, timeout);
```

### 4.6 Trust model: no change
`WaitAsync` is a host-provided async method; the wait/scheduling happens in host code. The script
only `await`s it and may pass a `Func<>` (a normal closure, not reflection). `System.Threading.Tasks`
and `CancellationToken` are already in `ScriptTrustPolicy.AllowedExceptions`. Verify the new helper
type/members don't collide with `ForbiddenIdentifiers` (`dynamic`, `Activator`); they don't.

---

## 5. Correctness notes

- **No missed edge.** Registration (current-value check) and change-handling both run on the
  `InstanceActor`'s single thread, so a value that flips between "set trigger" and "register waiter"
  is caught by the fast-path check; a value that flips after registration is caught by
  `HandleAttributeValueChanged`. The poll-loop and this design are both correct; this one is
  event-driven and cheaper.
- **Timeout is authoritative and self-cleaning.** The scheduled `WaitForAttributeTimeout` guarantees
  the waiter is removed and the caller answered even if the value never changes. Match cancels the
  scheduled timeout.
- **Cancellation.** Bounding the helper `Ask` with the script timeout token means a script that hits
  its own `ExecutionTimeoutSeconds` abandons the wait; pair with a best-effort cancel message to the
  actor to evict the orphan waiter promptly (otherwise it self-evicts at its own timeout).
- **Concurrency / re-entrancy.** Multiple waiters per instance are fine (keyed by `CorrelationId`).
  Consider a per-instance cap as a guard against a script leaking waiters in a loop.

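
The cancellation note can be illustrated with a small wrapper, offered as a sketch rather than the patch itself. It assumes .NET 6 or later (for `Task<T>.WaitAsync(CancellationToken)`); `WaitCancellation`, `AwaitOrEvict`, and the `evict` callback are hypothetical, with `evict` standing in for "tell the actor to drop this waiter". If the script token fires first, the wrapper triggers the eviction and lets the cancellation propagate; otherwise it simply returns the reply.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitCancellation
{
    // Awaits `reply`; if `scriptToken` is cancelled first, invokes `evict`
    // (a stand-in for a best-effort cancel message to the actor) and rethrows.
    public static async Task<bool> AwaitOrEvict(
        Task<bool> reply, CancellationToken scriptToken, Action evict)
    {
        try
        {
            return await reply.WaitAsync(scriptToken);   // requires .NET 6+
        }
        catch (OperationCanceledException)
        {
            // Best effort only: if this is lost, the actor-side timeout still removes the waiter.
            evict();
            throw;
        }
    }
}
```
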
---

## 6. Optional: inbound / routed variant

For symmetry with `RouteTarget.GetAttributes` (`src/…/InboundAPI/RouteHelper.cs`), an inbound script
could call `Route.To(code).WaitForAttribute(name, targetValue, timeout)`. Mirror the existing routed
pattern: add `RouteToWaitForAttributeRequest/Response`, an `IInstanceRouter.RouteToWaitForAttributeAsync`
method, and unpack it on the site comms actor into the same `WaitForAttributeRequest` to the
`InstanceActor`. **Value-equality only** across the wire: a `Func<>` predicate cannot be serialized,
so the routed form takes the encoded target value (the predicate overload stays site-local). This is
optional: the receiver handshake runs **inside** the template script (site-local), so §3–§5 alone
fully cover the DELMIA/MES use case.

---

## 7. Acceptance criteria

1. A template script can `await Attributes.WaitAsync("Flag", true, TimeSpan.FromSeconds(30))` and it
   returns `true` promptly when the data-sourced attribute reaches `true` (driven by a DCL update),
   with no poll loop.
2. Returns `false` (no throw) when the value never matches within the timeout.
3. The wait is bounded by the script's own `ExecutionTimeoutSeconds` (a shorter script deadline wins).
4. No `AttributeValueChanged` edge is missed across the register/change boundary (unit test: flip the
   value in the same actor step as registration, and one step after).
5. Waiters are removed on match and on timeout (no leak; assert registry empty afterward).
6. Scope/composition path resolution works (`Children["DelmiaReceiver"]`-scoped wait resolves to the
   composed child's attribute).
7. Passes `ScriptAnalysis` trust validation unchanged.
8. The DELMIA/MES handshake base scripts (design doc §4) compile and pass using `WaitAsync` in place
   of the poll loop.

Suggested tests: extend `InstanceActor` tests (waiter fast-path, change-match, timeout, removal) and
the script-surface tests under `tests/…/SiteRuntime*`.