design/followups: move F34 to Resolved (live-verified closure)

The F34 entry's body had grown into a debugging notebook with five
"Open hypotheses" and a "Resolves when" speculation block — all of
which are now moot since the actual fix landed. Trim to the closure
verdict, the technical evidence (captured fixture dictionary, the
dual-format insight), and the bonus context discovered while
debugging (Active/SampleInterval/result_code 32 quirks). Move from
"## Open" to "## Resolved" with date + commit 101a8b1.

Open section now contains only F18 (M5 meta-tracker, resolved at the
top) and F3 (cross-domain NTLM fixture, permanently external-blocked
on this single-domain dev host).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-06 04:04:37 -04:00
parent 101a8b13f5
commit b1a5f5ff1e
+22 -97
View File
@@ -74,103 +74,6 @@ move to `## Resolved` with a date + commit hash.
For the per-step body of every line listed in the cumulative execution log, see the matching commit message — each commit is a single F-number step with its own scope, fixtures, test count delta, and follow-up notes. The detailed per-step write-ups previously inlined here added little beyond what `git show <hash>` provides.
### F34 — `MonitoredItem` wire format: DataContract field-suffix names, not XmlSerializer property names
**Status:** Resolved 2026-05-06 — `push_monitored_item_body` now emits the `[DataMember(Name = "...Field")]` private-field names from `AsbContracts.cs:940-965` under prefix `b` bound to the DC namespace. Live `cargo run -p mxaccess --example asb-subscribe` against the AVEVA install confirms `AddMonitoredItems` returns 1 status item with `error_code=0x0000`, and a subsequent `Publish` poll delivers the actual tag value (`AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }`). Both halves of F34 are now closed (response decoder + request body emitter). **Severity:** P2 — only affects the F26 stream's data flow against MxDataProvider; canonical-XML HMAC signing for the operation is verified working (server accepts the request, returns a non-fault response).
**Source:** Live `cargo run -p mxaccess --example asb-subscribe` + `examples/asb-relay.rs` capture, 2026-05-06.
**Two sub-issues, both closed.**
**Closed: `decode_publish_response` filtered empty `<ASBIData/>` placeholders out of the positional payload list.** Captured the full S→C bytes of a working `PublishResponse` via `examples/asb-relay.rs` between the .NET probe and MxDataProvider (fixture stashed at `crates/mxaccess-asb/tests/fixtures/publish-response-with-value.bin`). The wire shape is `<Status><ASBIData/></Status><Values><ASBIData>{bytes}</ASBIData></Values>` — Status is empty-but-present, Values carries the binary `MonitoredItemValue[]`. Our `collect_asbidata_payloads` previously skipped the empty Status, shifting Values down to index `0` where the decoder mis-read it as Status and corrupted the parse. Fix: always push every `<ASBIData>` element as a positional entry, empty or not. `tests/publish_capture.rs` runs the full decode chain over the real wire bytes and asserts `values.len() == 1`. **Verified 2026-05-06.**
**Closed: AddMonitoredItems / DeleteMonitoredItems request bodies now emit DataContract field-suffix names.** Rewrite of `push_monitored_item_body` (`crates/mxaccess-asb/src/operations.rs`) replaces the legacy XmlSerializer property names (`<MonitoredItem>`, `<Item>`, `<SampleInterval>`, `<Active>`, `<Buffered>`) with the WCF DataContract field-suffix names (`<b:MonitoredItem>`, `<b:activeField>`, `<b:activeFieldSpecified>`, `<b:bufferedField>`, `<b:itemField>` (with nested ItemIdentity DC fields), `<b:sampleIntervalField>`, `<b:timeDeadbandField>`, `<b:timeDeadbandFieldSpecified>`, `<b:userDataField>` (Variant), `<b:valueDeadbandField>` (Variant)) emitted under prefix `b` bound to `http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract`. The `<Items>` wrapper now declares `xmlns:b` + `xmlns:i` (XSI). Wire-byte choices match the captured fixture: `bool` → Bool record, `ulong` → Zero/One/Chars (decimal text via XmlConvert), `ushort` → Zero/One/Int8/Int16/Int32 (smallest-fit binary), `int32` → same. Empty `string?` and null `byte[]?` emit as empty elements (no `<i:nil>` attribute, matching the wire). Field order follows the `[DataMember(Order = N)]` declarations explicitly. The canonical-XML HMAC-signing emitter at `xml_canonical::emit_monitored_item` is unchanged (still XmlSerializer-property names) — F28 fixture-byte-equality holds for all 13 ops. Verified via:
1. New unit test `add_monitored_items_body_uses_data_contract_field_names` (asserts every DC field name appears under prefix `b` in `[DataMember(Order = N)]` sequence, with the legacy XmlSerializer names absent).
2. Live `cargo run -p mxaccess --example asb-subscribe` against AVEVA: `AddMonitoredItems` now returns 1 status item with `error_code=0x0000` (was 0 items previously); `Publish` poll #4 delivers the tag value `99` over the wire. Workspace `cargo test` 757 → 758 pass; clippy clean.
**Original observation that drove the fix:** Live capture of the .NET probe's `AddMonitoredItems` request exposes a per-session NBFX dictionary declaring these strings *(verbatim, in declaration order)*: `activeField`, `activeFieldSpecified`, `bufferedField`, `itemField`, `contextNameField`, `idField`, `idFieldSpecified`, `nameField`, `referenceTypeField`, `typeField`, `sampleIntervalField`, `timeDeadbandField`, `timeDeadbandFieldSpecified`, `userDataField`, `lengthField`, `payloadField`, `valueDeadbandField`. These are the `[DataMember(Name = "...")]` private-field names from `AsbContracts.cs:940-965` — the wire form chosen by `DataContractSerializer` for non-`IAsbCustomSerializableType` types like `MonitoredItem`. Our `build_add_monitored_items_request_body` (and `build_delete_monitored_items_request_body`) emits XmlSerializer-property names like `<Active>`, `<Buffered>`, `<Item>`, `<SampleInterval>` — those are the *canonical XML for HMAC* shape (XmlSerializer-derived), which is correct for the signing input but wrong for the binary wire payload. MxDataProvider silently fails to register monitored items whose field names don't match its DataContract schema, returns a 0-length `Status` array (`successField=true`, `resultCode=0`, but no items actually registered), and consequently the `Publish` poll loop forever returns empty `Values`.
**The dual-format world**: ASB requests have *two* element-name conventions on the wire:
- **HMAC canonical XML** (input to `AsbAuthenticator::Sign`): XmlSerializer-derived names — `<Active>`, `<Items>`, `<MonitoredItem>` etc. Driven by `[XmlElement(...)]` and property names. Our `xml_canonical` emitter is byte-equal to .NET here (F28 step 2 fixtures verify).
- **Binary NBFX body** (the actual wire request): DataContractSerializer-derived names — `<activeField>`, `<bufferedField>`, etc. Driven by private-field `[DataMember(Name = "...")]`. Our builders for AddMonitoredItems / DeleteMonitoredItems are wrong here.
For ops where the body is purely `IAsbCustomSerializableType` arrays (Read, Register, Unregister), no DataContract names appear — every payload is wrapped as `<Items><ASBIData>{bytes}</ASBIData></Items>` (binary fast-path) and our builders are correct. The DataContract schema only matters for ops carrying non-`IAsbCustomSerializable` types like `MonitoredItem` and `WriteValue`.
**Captured ground-truth dictionary (from `tests/fixtures/add-monitored-items-request-wire.bin` binary header at `tests/add_monitored_items_request_capture.rs`).** The .NET WCF binary writer pre-declares **23 strings** in the session dynamic dictionary at the start of each request, mapping wire id → string:
```
header[ 0] (wire-id 1) = "http://ASB.IDataV2:addMonitoredItemsIn"
header[ 1] (wire-id 3) = "AddMonitoredItemsRequest"
header[ 2] (wire-id 5) = "SubscriptionId"
header[ 3] (wire-id 7) = "Items"
header[ 4] (wire-id 9) = "http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract"
header[ 5] (wire-id 11) = "MonitoredItem"
header[ 6] (wire-id 13) = "activeField"
header[ 7] (wire-id 15) = "activeFieldSpecified"
header[ 8] (wire-id 17) = "bufferedField"
header[ 9] (wire-id 19) = "itemField"
header[10] (wire-id 21) = "contextNameField"
header[11] (wire-id 23) = "idField"
header[12] (wire-id 25) = "idFieldSpecified"
header[13] (wire-id 27) = "nameField"
header[14] (wire-id 29) = "referenceTypeField"
header[15] (wire-id 31) = "typeField"
header[16] (wire-id 33) = "sampleIntervalField"
header[17] (wire-id 35) = "timeDeadbandField"
header[18] (wire-id 37) = "timeDeadbandFieldSpecified"
header[19] (wire-id 39) = "userDataField"
header[20] (wire-id 41) = "lengthField"
header[21] (wire-id 43) = "payloadField"
header[22] (wire-id 45) = "valueDeadbandField"
```
That's **the entire DataContract field name set** plus the wrapper / array / namespace / action strings. The body then references these by wire id throughout — no inline strings needed for any of the field names. The `nameField` slot 13 (wire id 27) etc. are exactly what I'd misidentified as resolved namespace URLs in my earlier `decode_envelope` trace; the wire id resolution is actually working — it's just that the body's xmlns slots reference dict ids whose resolution lands on a string our decoder doesn't expect there. Both observations are consistent: WCF reuses the same dynamic dictionary for both element names AND namespace declarations.
**Resolves when:** Two prerequisites:
1. **F30 dynamic-dict resolution + body-dict accounting**`decode_envelope::resolve_dict_names_in_tokens` resolves dict-id-named elements correctly per the captured header; what's missing is **interpretation of which records auto-intern new strings into the dict** as the body decodes. WCF's binary writer (`XmlBinaryWriterSession.cs` in `dotnet/wcf`) auto-interns inline element/attribute names — the dynamic dict grows as the message decodes. For decoder/encoder parity we need the same auto-intern behaviour in `nbfx.rs::decode_tokens` and `encode_tokens`. The current codec leaves `_dynamic` parameter unused (intentional per its doc comment, "the codec doesn't auto-intern because `[MC-NBFX]` doesn't define a built-in `intern this string` record") — but that comment is wrong for WCF binary messages, where the writer DOES intern by convention. Fix: rewrite both halves to auto-intern inline names and to refer back to the dict on subsequent inline-or-dict choices.
2. **Builder rewrite** — once (1) lands and we can read the captured request structurally, rewrite `build_add_monitored_items_request_body` and `build_delete_monitored_items_request_body` to emit each `MonitoredItem` child as the DataContract field-suffix names (`activeField` / `activeFieldSpecified` / `bufferedField` / `itemField` / `sampleIntervalField` / `timeDeadbandField` / `timeDeadbandFieldSpecified` / `userDataField` / `valueDeadbandField`) under a `b` namespace prefix that maps to `http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract`. The nested `<itemField>` carries an ItemIdentity serialized via DataContract (NOT the binary `<ASBIData>` fast-path — that only kicks in at the outer body-member level) with children `contextNameField` / `idField` / `idFieldSpecified` / `nameField` / `referenceTypeField` / `typeField` under a different `b` prefix mapping to `http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBContract`. The Variant fields (`userDataField` / `valueDeadbandField`) carry `lengthField` / `payloadField` / `typeField` children. Same fix likely applies to `WriteBasicRequest`'s `WriteValue[]? Values` field (also non-`IAsbCustomSerializable`); needs its own capture-and-verify pass.
The dictionary-id pre-population that .NET's WCF binary writer uses is a perf optimisation; an inline-string emit will work for correctness once the structure is right.
**Bonus context discovered while debugging F34:**
- `MinimalMonitoredItem` gained an `active: Option<bool>` field with a `with_active(item, interval, active)` constructor. Without `<Active>true</Active>` on the wire, MxDataProvider treats the subscription as inactive even when AddMonitoredItems "succeeds" — F26 stream then never sees values. (Once the field-name fix lands, this becomes the determining factor for whether values flow.)
- `SampleInterval` unit corrected from "100-ns ticks" to **milliseconds** in the example + the `MinimalMonitoredItem.sample_interval` doc — matches `MxAsbDataClient.cs:441`'s `ulong sampleInterval = 1000` default.
- `result_code = 32` is `AsbErrorCode.PublishComplete` (`AsbResultMapping.cs:37`), informational not fatal — `ToResult:122-129` treats it like `Success`. F26 stream's `publish_loop` narrowed to bail only on `RESULT_CODE_INVALID_CONNECTION_ID = 1`.
**Evidence.** Side-by-side comparison after the .NET probe and Rust client both ran `--subscribe` against the same `TestChildObject.TestInt` tag:
```
.NET probe (working):
publish[0]_error=0x00000020 status=0x00000000 specific=0x00000000
published_event[0]=item:TestChildObject.TestInt id:18446462598732840962
type:4 quality:0x00C0 timestamp:... preview:99
publish[0]_value[0]=item: id:... id_specified:True type:4 length:4
payload_len:4 preview:99
Rust client (this followup):
poll 0: 0 value(s); result_code=Some(32) success=Some(false)
poll 1: 0 value(s); result_code=Some(32) success=Some(false)
... (8 polls total, 0 values each)
```
Both sides see the same `result_code=32` (= `AsbErrorCode.PublishComplete`, informational per `AsbResultMapping.cs:37` + `ToResult:122-129` which treats it like `Success`). Both sides see `successField=false`. **But .NET extracts a value while we extract zero.**
**Confirmed not the cause:**
-`SampleInterval` unit (was 10_000_000 in example, fixed to `1000` ms — matches `MxAsbDataClient.cs:441`'s `ulong sampleInterval = 1000` default).
- ✅ Treating `result_code=32` as informational (the F26 narrower-bail fix matches the .NET logic exactly).
- ✅ Canonical-XML signing of the `PublishRequest` (the server returns a typed response, not a SOAP fault).
-`MonitoredItemValue` IS `IAsbCustomSerializableType` (`AsbContracts.cs:1035`), so it travels via the `<Values><ASBIData>{bytes}</ASBIData></Values>` binary fast-path same as `Read`'s Values path which works.
**Open hypotheses (each needs wire-capture verification via `examples/asb-relay.rs` middleman with the .NET probe):**
1. Wire shape mismatch: server may emit Values via a different element/namespace than `<Values>` in this case (some kind of dynamic dict id we don't resolve), so our `collect_asbidata_payloads` walking by name misses it.
2. Status array shape: a 0-length Status (empty array, 4-byte `count=0`) may render as a populated `<ASBIData>` containing just `[0,0,0,0]`; our decoder reads `payloads[0]` as Status (correctly empty) and `payloads[1]` as Values — but if the server's wire only has 1 ASBIData (Status was omitted entirely, not empty-but-present), `payloads[1]` would BE the Values payload getting mis-read as Status.
3. `decode_monitored_item_value_array` (which builds on `MonitoredItemValue::decode` at `contracts.rs:277`) may have a subtle bug — `MonitoredItemValue` carries `ItemIdentity + RuntimeValue + AsbVariant userdata` per `cs:1064-1068`; if the wire bytes don't match that order or include extra fields, the count read at offset 0 would be wrong.
**Resolves when:** A middleman trace via `examples/asb-relay.rs` between the .NET probe and MxDataProvider for a working subscribe-flow exposes the actual wire bytes of a `PublishResponse` carrying values. Compare against `decode_publish_response`'s expectations and reconcile (likely a `MonitoredItemValue` byte-layout fix or a Values-payload-locator fix).
**Adjacent observation worth noting:** `AddMonitoredItemsResponse` shows the same symptom shape — our trace reports `add status: 0 item(s); result_code=Some(0) success=Some(true)` while the .NET probe reports `add_monitored_status[0]=item:TestChildObject.TestInt id:18446462598732840962 ...`. Same `IAsbCustomSerializableType`-wrapped Status array; same "0 items where .NET sees 1". These two decoders likely fail for the same root reason and a single fix should close both.
### F3 — Cross-domain NTLM Type1/2/3 fixture
**Severity:** P2
**Status:** Permanently out-of-scope on the current dev host (no second AD domain). Resolution requires external infrastructure not available here.
@@ -182,6 +85,28 @@ Both sides see the same `result_code=32` (= `AsbErrorCode.PublishComplete`, info
## Resolved
### F34 — `MonitoredItem` wire format: DataContract field-suffix names, not XmlSerializer property names
**Resolved:** 2026-05-06 (commit `101a8b1`). **Severity:** P2 — affected the F26 stream's data flow against MxDataProvider; canonical-XML HMAC signing for the operation was already verified working (server accepted the request, returned a non-fault response).
**Two halves, both closed:**
**Half 1 — Response decoder (closed earlier).** `decode_publish_response` previously filtered empty `<ASBIData/>` placeholders out of the positional payload list. Captured the full S→C bytes of a working `PublishResponse` via `examples/asb-relay.rs` between the .NET probe and MxDataProvider (fixture stashed at `crates/mxaccess-asb/tests/fixtures/publish-response-with-value.bin`). The wire shape is `<Status><ASBIData/></Status><Values><ASBIData>{bytes}</ASBIData></Values>` — Status is empty-but-present, Values carries the binary `MonitoredItemValue[]`. `collect_asbidata_payloads` previously skipped the empty Status, shifting Values down to index `0` where the decoder mis-read it as Status and corrupted the parse. Fix: always push every `<ASBIData>` element as a positional entry, empty or not. `tests/publish_capture.rs` runs the full decode chain over the real wire bytes and asserts `values.len() == 1`.
**Half 2 — Request body emitter (closed by this commit).** Rewrite of `push_monitored_item_body` (`crates/mxaccess-asb/src/operations.rs`) replaces the legacy XmlSerializer property names (`<MonitoredItem>`, `<Item>`, `<SampleInterval>`, `<Active>`, `<Buffered>`) with the WCF DataContract field-suffix names emitted under prefix `b` bound to `http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract`. Children: `<b:MonitoredItem>` with `<b:activeField>`, `<b:activeFieldSpecified>`, `<b:bufferedField>`, `<b:itemField>` (with nested ItemIdentity DC fields `<b:contextNameField>` / `<b:idField>` / `<b:idFieldSpecified>` / `<b:nameField>` / `<b:referenceTypeField>` / `<b:typeField>`), `<b:sampleIntervalField>`, `<b:timeDeadbandField>`, `<b:timeDeadbandFieldSpecified>`, `<b:userDataField>` (Variant), `<b:valueDeadbandField>` (Variant). The `<Items>` wrapper now declares `xmlns:b` + `xmlns:i` (XSI). Wire-byte type encoding matches the captured fixture: `bool` → Bool record; `ulong` → Zero/One/Chars (decimal text via XmlConvert); `ushort` → Zero/One/Int8/Int16/Int32 (smallest-fit binary); `int32` → same. Empty `string?` and null `byte[]?` emit as empty elements (no `<i:nil>` attribute, matching the wire). Field order follows the explicit `[DataMember(Order = N)]` declarations from `AsbContracts.cs:940-965`. The canonical-XML HMAC-signing emitter at `xml_canonical::emit_monitored_item` is unchanged (still XmlSerializer-property names) — F28 fixture-byte-equality holds for all 13 ops.
**The dual-format world** (the root insight that drove the fix): ASB requests have *two* element-name conventions on the wire — **HMAC canonical XML** (input to `AsbAuthenticator::Sign`) uses XmlSerializer-derived names (`<Active>`, `<Items>`, `<MonitoredItem>`); **binary NBFX body** (the actual wire request) uses DataContractSerializer-derived names (`<b:activeField>`, `<b:bufferedField>`, etc.). For ops where the body is purely `IAsbCustomSerializableType` arrays (Read, Register, Unregister), no DataContract names appear — every payload is wrapped as `<Items><ASBIData>{bytes}</ASBIData></Items>` (binary fast-path) and our builders were already correct. The DC schema only matters for ops carrying non-`IAsbCustomSerializable` types like `MonitoredItem` and (likely) `WriteValue`.
**Captured ground-truth dictionary** (from `tests/fixtures/add-monitored-items-request-wire.bin``tests/add_monitored_items_request_capture.rs` decodes it). The .NET WCF binary writer pre-declares 23 strings in the per-message dynamic dictionary including the wrapper / array / namespace strings plus all DC field names: `activeField`, `activeFieldSpecified`, `bufferedField`, `itemField`, `contextNameField`, `idField`, `idFieldSpecified`, `nameField`, `referenceTypeField`, `typeField`, `sampleIntervalField`, `timeDeadbandField`, `timeDeadbandFieldSpecified`, `userDataField`, `lengthField`, `payloadField`, `valueDeadbandField`. The dictionary-id pre-population that .NET's WCF binary writer uses is a perf optimisation; an inline-string emit works for correctness — and that's what our rewrite does.
**Verification:**
1. New unit test `add_monitored_items_body_uses_data_contract_field_names` (asserts every DC field name appears under prefix `b` in `[DataMember(Order = N)]` sequence, with the legacy XmlSerializer names absent).
2. Live `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` against the AVEVA install: `AddMonitoredItems` returns 1 status item with `error_code=0x0000` (was 0 items previously); `Publish` poll #4 delivers the actual tag value through the F26 stream as `AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }`. Workspace `cargo test` 757 → 758 pass; clippy clean.
**Bonus context discovered while debugging F34:**
- `MinimalMonitoredItem` gained an `active: Option<bool>` field with `with_active(item, interval, active)` constructor. Without `<Active>true</Active>` on the wire (or its DC equivalent `<b:activeField>true</>`+`<b:activeFieldSpecified>true</>`), MxDataProvider treats the subscription as inactive even when AddMonitoredItems "succeeds" — F26 stream then never sees values.
- `SampleInterval` unit corrected from "100-ns ticks" to **milliseconds** in the example + the `MinimalMonitoredItem.sample_interval` doc — matches `MxAsbDataClient.cs:441`'s `ulong sampleInterval = 1000` default.
- `result_code = 32` is `AsbErrorCode.PublishComplete` (`AsbResultMapping.cs:37`), informational not fatal — `ToResult:122-129` treats it like `Success`. F26 stream's `publish_loop` narrowed to bail only on `RESULT_CODE_INVALID_CONNECTION_ID = 1`.
### F28 — Canonical XML serialiser for `ConnectedRequest` signing (matches `XmlSerializer.Serialize` byte-for-byte)
**Resolved:** 2026-05-06 (commit `<this commit>`). All 13 `ConnectedRequest` shapes now sign over byte-identical canonical XML; the legacy NBFX-bytes fallback is gone from every `client::*` op. Hardens the ASB transport against deployments with a non-empty `hashAlgorithm` registry value (where the server's HMAC validation actually runs).