asb: SampleInterval unit fix + F34 followup for Publish-decoder gap
rust / build / test / clippy / fmt (push) Has been cancelled

Investigation triggered by "Publish returns 0 values where .NET sees real
values" against the local AVEVA install.

Three findings:

1. SampleInterval unit: the wire field is **milliseconds**, not 100-ns
   ticks. The .NET reference (MxAsbDataClient.cs:441) defaults to
   `ulong sampleInterval = 1000` and the probe passes `subscribeSampleMs`
   directly through that surface. Sending 10_000_000 (1s in 100-ns ticks)
   makes MxDataProvider schedule the next sample ~2.8 hours out; Publish
   polls always come back empty until the misinterpreted timer expires.
   Fixed in `examples/asb-subscribe.rs` (sample_interval_ticks →
   sample_interval_ms = 1000) and clarified in
   `MinimalMonitoredItem.sample_interval`'s doc comment with the live-2026-05-06
   evidence.

2. result_code=32 is `AsbErrorCode.PublishComplete`
   (`AsbResultMapping.cs:37`) — informational, not a fatal error. .NET's
   `ToResult` (cs:122-129) explicitly treats it like Success.
   `ArchestrAResult.ErrorCode` and `ResultCode` are aliases for the same
   `resultCodeField` (cs:424-434), so `publish[i]_error=0x00000020` in
   the .NET probe trace = `result_code=Some(32)` in our trace = the same
   thing. Already handled correctly via the F26 narrower-bail fix
   (commit 983f029) — no code change needed.

3. **F34 filed** for the residual gap: with both sides seeing
   result_code=32 + success=false, .NET extracts a value but we extract
   zero. Three open hypotheses (wire-shape mismatch / payload-locator
   bug / MonitoredItemValue byte-layout bug); all need a middleman
   asb-relay.rs trace between the .NET probe and MxDataProvider to
   confirm. Adjacent symptom: AddMonitoredItemsResponse Status reads as
   0 items where .NET sees 1 — likely the same root cause; one fix
   should close both.

Live re-runs to validate the new sample-interval unit were blocked by
the documented F32 InvalidConnectionId transient (the
pending-connection table on MxDataProvider fills up after many
back-to-back test cycles; clears after a 30s+ cool-down).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-06 02:28:44 -04:00
parent 983f02921c
commit 0771664092
3 changed files with 56 additions and 3 deletions
+37
View File
@@ -74,6 +74,43 @@ move to `## Resolved` with a date + commit hash.
For the per-step body of every line listed in the cumulative execution log, see the matching commit message — each commit is a single F-number step with its own scope, fixtures, test count delta, and follow-up notes. The detailed per-step write-ups previously inlined here added little beyond what `git show <hash>` provides.
### F34 — Publish response: decoder reads 0 values where .NET sees real values
**Severity:** P2 — only affects the F26 stream's data flow against MxDataProvider; the canonical-XML signing for Publish itself is verified working (server accepts the request, returns a non-fault response).
**Source:** Live `cargo run -p mxaccess --example asb-subscribe` against the local AVEVA install, 2026-05-06.
**Evidence.** Side-by-side comparison after the .NET probe and Rust client both ran `--subscribe` against the same `TestChildObject.TestInt` tag:
```
.NET probe (working):
publish[0]_error=0x00000020 status=0x00000000 specific=0x00000000
published_event[0]=item:TestChildObject.TestInt id:18446462598732840962
type:4 quality:0x00C0 timestamp:... preview:99
publish[0]_value[0]=item: id:... id_specified:True type:4 length:4
payload_len:4 preview:99
Rust client (this followup):
poll 0: 0 value(s); result_code=Some(32) success=Some(false)
poll 1: 0 value(s); result_code=Some(32) success=Some(false)
... (8 polls total, 0 values each)
```
Both sides see the same `result_code=32` (= `AsbErrorCode.PublishComplete`, informational per `AsbResultMapping.cs:37` + `ToResult:122-129` which treats it like `Success`). Both sides see `successField=false`. **But .NET extracts a value while we extract zero.**
**Confirmed not the cause:**
-`SampleInterval` unit (was 10_000_000 in example, fixed to `1000` ms — matches `MxAsbDataClient.cs:441`'s `ulong sampleInterval = 1000` default).
- ✅ Treating `result_code=32` as informational (the F26 narrower-bail fix matches the .NET logic exactly).
- ✅ Canonical-XML signing of the `PublishRequest` (the server returns a typed response, not a SOAP fault).
-`MonitoredItemValue` IS `IAsbCustomSerializableType` (`AsbContracts.cs:1035`), so it travels via the `<Values><ASBIData>{bytes}</ASBIData></Values>` binary fast-path same as `Read`'s Values path which works.
**Open hypotheses (each needs wire-capture verification via `examples/asb-relay.rs` middleman with the .NET probe):**
1. Wire shape mismatch: server may emit Values via a different element/namespace than `<Values>` in this case (some kind of dynamic dict id we don't resolve), so our `collect_asbidata_payloads` walking by name misses it.
2. Status array shape: a 0-length Status (empty array, 4-byte `count=0`) may render as a populated `<ASBIData>` containing just `[0,0,0,0]`; our decoder reads `payloads[0]` as Status (correctly empty) and `payloads[1]` as Values — but if the server's wire only has 1 ASBIData (Status was omitted entirely, not empty-but-present), `payloads[1]` would BE the Values payload getting mis-read as Status.
3. `decode_monitored_item_value_array` (which builds on `MonitoredItemValue::decode` at `contracts.rs:277`) may have a subtle bug — `MonitoredItemValue` carries `ItemIdentity + RuntimeValue + AsbVariant userdata` per `cs:1064-1068`; if the wire bytes don't match that order or include extra fields, the count read at offset 0 would be wrong.
**Resolves when:** A middleman trace via `examples/asb-relay.rs` between the .NET probe and MxDataProvider for a working subscribe-flow exposes the actual wire bytes of a `PublishResponse` carrying values. Compare against `decode_publish_response`'s expectations and reconcile (likely a `MonitoredItemValue` byte-layout fix or a Values-payload-locator fix).
**Adjacent observation worth noting:** `AddMonitoredItemsResponse` shows the same symptom shape — our trace reports `add status: 0 item(s); result_code=Some(0) success=Some(true)` while the .NET probe reports `add_monitored_status[0]=item:TestChildObject.TestInt id:18446462598732840962 ...`. Same `IAsbCustomSerializableType`-wrapped Status array; same "0 items where .NET sees 1". These two decoders likely fail for the same root reason and a single fix should close both.
### F27 — Constant-time DH `mod_exp` (swap `num-bigint` → `crypto-bigint::BoxedUint`)
**Severity:** P2 (security regression vs the long-term Rust target — but at parity with the .NET reference today, so not a release-blocker)
**Source:** F23 (`crates/mxaccess-asb-nettcp/src/auth.rs:179,303`); originally flagged in `design/30-crate-topology.md:269-274` and the project's `review.md` MAJOR finding.
@@ -855,9 +855,19 @@ pub fn build_add_monitored_items_request_body(
/// also has optional Active, TimeDeadband, ValueDeadband, and UserData
/// fields. Those are deferred to a later F25 iteration once a live
/// capture confirms the wire-byte form.
///
/// **`sample_interval` unit is milliseconds**, NOT 100-ns ticks. The
/// .NET reference's `MxAsbDataClient.AddMonitoredItems` defaults to
/// `ulong sampleInterval = 1000` (= 1 second), passed straight to the
/// wire (`MxAsbDataClient.cs:441`). Sending tick-units (e.g.
/// `10_000_000` for "1 second in 100-ns ticks") makes MxDataProvider
/// schedule the next sample ~2.8 hours out — `Publish` polls then
/// always come back empty until the misinterpreted timer expires.
/// Verified live 2026-05-06.
#[derive(Debug, Clone, PartialEq)]
pub struct MinimalMonitoredItem {
pub item: ItemIdentity,
/// Sample interval in **milliseconds** (matches the .NET wire form).
pub sample_interval: u64,
pub buffered: bool,
}
@@ -132,9 +132,15 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
// -- Subscribe-flow ----------------------------------------------------
if env.run_subscribe {
eprintln!("creating subscription [canonical XML CreateSubscription] (max_queue=100, sample=1s)");
let sample_interval_ticks: u64 = 10_000_000; // 1 second
// SampleInterval is in **milliseconds** on the wire — the .NET
// reference's `MxAsbDataClient.CreateSubscription` /
// `AddMonitoredItems` default is `ulong sampleInterval = 1000`
// (`MxAsbDataClient.cs:396,441`). Sending 10_000_000 here would
// be interpreted as ~2.8 hours between samples and the publish
// poll would always come back empty.
let sample_interval_ms: u64 = 1000;
let max_queue_size: i64 = 100;
let sub_response = match client.create_subscription(max_queue_size, sample_interval_ticks).await {
let sub_response = match client.create_subscription(max_queue_size, sample_interval_ms).await {
Ok(r) => r,
Err(e) => {
eprintln!(" create_subscription failed: {e}");
@@ -151,7 +157,7 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
let monitored = vec![MinimalMonitoredItem::new(
ItemIdentity::absolute_by_name(&env.tag),
sample_interval_ticks,
sample_interval_ms,
)];
eprintln!("adding monitored items [canonical XML AddMonitoredItems]");