[M5] design: followups update — M5 functionally LIVE, F30/F31 resolved
F18 (M5 master) gains an "M5 STATUS" block right after the DoD checklist showing the live end-to-end win (commit `9063f10`, TestChildObject.TestInt round-trips with payload [99,0,0,0]) and ticking each DoD bullet:

- ✅ Live `asb-subscribe` succeeds.
- ⚠️ Wire request bytes match .NET byte-for-byte; response parity uses the F30 dict-id resolution post-pass + chunked-Bytes concatenation instead of strict byte equality (functionally equivalent: both decode to the same logical XML).
- ⚠️ Type matrix: only Int32 verified live; Bool/Float/Double/String/DateTime/Duration/arrays pending sample tags. Tracked under new F32.
- ✅ build/test/clippy green (711 tests).

Followup churn:

- F30 + F31 moved to ## Resolved with proper "Resolved: <date> (commit `<hash>`)" headers. F30 was the unblocker for F31: without read-side dict-id resolution we couldn't see `<resultCodeField>1</>` in the response.
- F28 status header updated to "PARTIALLY RESOLVED": the five [XmlSerializerFormat] ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus DH params + dynamic-dict management all landed; Read/Write/Subscribe/Publish still sign over NBFX wire bytes via the legacy fallback. Severity demoted P0 → P2 because the live registry has empty `hashAlgorithm` and unsigned ops work in practice; promote back if that changes.
- F29 reaffirmed P2 (latent NBFS dict-id drift, no live impact).
- New F32 captures the type-matrix expansion as the only remaining P1 item for full M5 closeout.

No code change in this commit; design doc only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
+24
-11
@@ -46,6 +46,18 @@ move to `## Resolved` with a date + commit hash.
**Resolves when:** F19-F26 are all closed and the four DoD bullets above pass.
**M5 STATUS (commit `9063f10`): functionally LIVE.** End-to-end, `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` completes Connect → AuthenticateMe → Register → Read → Disconnect against the live MxDataProvider and returns the real tag value over the wire (`type_id=4 length=4 payload=[99,0,0,0]`). DoD checklist:
1. ✅ Live `asb-subscribe` succeeds against the AVEVA endpoint.
2. ⚠️ Request bytes for AuthenticateMe / Register match .NET's byte-for-byte (verified via `asb-relay` middleman with the .NET probe routed through ClientVia); responses round-trip via the F30 dict-id resolution post-pass. Strict byte-identical parity on the response side is not guaranteed because WCF chunks `Bytes8/16/32` records at different boundaries; both forms are functionally equivalent, and `collect_asbidata_payloads` concatenates the chunks (commit `cf97eab`).
3. ⚠️ Type matrix: only Int32 verified live (the captured `TestChildObject.TestInt` tag). Bool / Float / Double / String / DateTime / Duration / arrays not yet exercised — pending one or more sample tags per type and an `asb-subscribe` extension that loops over them. F32 captures this expansion.
4. ✅ `cargo build --workspace` + `cargo test --workspace` (711 tests) + `cargo clippy --workspace -- -D warnings` all green.
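
The chunk-boundary caveat in bullet 2 can be illustrated with a minimal Rust sketch. This is not the project's `collect_asbidata_payloads`; the function names and the slice-of-chunks representation are assumptions made for illustration. It only shows why two captures that split the same bytes at different record boundaries compare equal once the chunks are joined.

```rust
/// Join the chunks of one logical byte payload in wire order.
/// `chunks` stands in for consecutive Bytes8/16/32 records of a single element
/// (hypothetical representation, not the real token type).
fn concat_chunks(chunks: &[&[u8]]) -> Vec<u8> {
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    let mut out = Vec::with_capacity(total);
    for chunk in chunks {
        out.extend_from_slice(chunk);
    }
    out
}

/// Two captures are treated as functionally equivalent when their joined
/// payloads match, regardless of where the sender split the records.
fn same_logical_payload(a: &[&[u8]], b: &[&[u8]]) -> bool {
    concat_chunks(a) == concat_chunks(b)
}
```

A parity check built this way compares decoded payloads rather than raw response bytes, which is the trade-off the bullet describes.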
**Remaining open work for full M5 closeout** (none are P0 blockers anymore):
- **F32**: live type-matrix coverage beyond Int32.
- **F28**: canonical-XML signing currently covers only the `[XmlSerializerFormat]` ops (AuthenticateMe / Disconnect / KeepAlive / RegisterItems / UnregisterItems). Read / Write / CreateSubscription / AddMonitoredItems / Publish / etc. still sign over NBFX wire bytes via the legacy fallback. Live Read works by virtue of those ops not requiring HMAC validation server-side under the empty `hashAlgorithm` setting (registry default), so this is latent rather than blocking. Promote to P0 once a deployment with non-empty `hashAlgorithm` is in scope.
- **F29**: `nbfs.rs` static dictionary IDs drift +20 from canonical `[MC-NBFS]` for the SOAP-fault subset. P2; doesn't affect any live path today.
- **F26 stream subscription**: `Stream<Item = MonitoredItemValue>` over a publish-loop is still stubbed in `AsbSession`. Tracked under F25 step 8 / F26 step 3 in the cumulative log.
**Cumulative execution log.** F19 + F23 (`ed17c07`); F24 (`7611d9e`); F20 (`9dfd193`); F22 (`43c10a1`); F21 (`5f98558`); F25 step 1 (`25dbd8d`); F25 step 2 (`a2b8989`); F25 step 3 (`c4bf0a0`); F25 step 4 (`1e59249`); F25 step 5 (`9b8133f`); F25 step 6 (`321b796`); F25 step 7 (`1b1ee1e`); F26 step 1 (`8a0f92b`); F26 step 2 (`14bb529`); example rewrite (`c6570dc`); F25 step 8 (`b543eb1`); F25 step 9 (`0441a2e`); F25 step 10 (`9876b4e`); F26 step 3 (`<previous>`); **F25 live-bring-up reconciliation** (this commit):
- F25 live-bring-up reconciliation: live `asb-subscribe` + `asb-relay` (TCP middleman) capture-and-diff against AVEVA's MxDataProvider on Windows. Five concrete fixes landed:
1. **NBFX `PrefixElement_a..z` (0x5E-0x77) and `PrefixAttribute_a..z` (0x26-0x3F) decode + encode arms**: WCF emits these single-letter-prefix records in responses, but our codec previously recognised only their dictionary-named cousins (`PrefixDictionaryElement_a..z` 0x44-0x5D, `PrefixDictionaryAttribute_a..z` 0x0C-0x25). The server's ConnectResponse hit `0x65 = PrefixElement_h` for a dynamically-named element (e.g. `<h:Foo>`) and our decoder bailed with `unknown NBFX record byte 0x65`. Both directions now round-trip; the encoder picks the short-form arm whenever `prefix_letter_offset(prefix).is_some()`.
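
As a rough illustration of the record ranges in fix 1, the sketch below classifies a record byte into its prefix family and derives the prefix letter. `PrefixRecord`, `classify_prefix_record`, and `element_record_byte` are hypothetical names, and the body of `prefix_letter_offset` is a guess at what such a helper might do; only the byte ranges come from the text above.

```rust
/// Which single-letter-prefix family a record byte belongs to (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PrefixRecord {
    DictionaryAttribute(char), // 0x0C..=0x25
    Attribute(char),           // 0x26..=0x3F
    DictionaryElement(char),   // 0x44..=0x5D
    Element(char),             // 0x5E..=0x77
}

/// Map a record byte to its family and prefix letter, or None if it is not
/// one of the four 26-entry prefix ranges.
fn classify_prefix_record(byte: u8) -> Option<PrefixRecord> {
    // Each range covers 'a'..='z', so the letter is the offset from the range base.
    let letter = |base: u8| (b'a' + (byte - base)) as char;
    match byte {
        0x0C..=0x25 => Some(PrefixRecord::DictionaryAttribute(letter(0x0C))),
        0x26..=0x3F => Some(PrefixRecord::Attribute(letter(0x26))),
        0x44..=0x5D => Some(PrefixRecord::DictionaryElement(letter(0x44))),
        0x5E..=0x77 => Some(PrefixRecord::Element(letter(0x5E))),
        _ => None,
    }
}

/// Assumed behaviour: Some(0..=25) only when the prefix is exactly one
/// lowercase ASCII letter, otherwise None.
fn prefix_letter_offset(prefix: &str) -> Option<u8> {
    let bytes = prefix.as_bytes();
    if bytes.len() == 1 && bytes[0].is_ascii_lowercase() {
        Some(bytes[0] - b'a')
    } else {
        None
    }
}

/// Encoder side: pick the short-form PrefixElement byte when one applies.
fn element_record_byte(prefix: &str) -> Option<u8> {
    prefix_letter_offset(prefix).map(|offset| 0x5E + offset)
}
```

Under these assumptions, `classify_prefix_record(0x65)` yields `Some(PrefixRecord::Element('h'))`, matching the `PrefixElement_h` record that tripped the decoder, and `element_record_byte("h")` maps back to `0x65`.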
@@ -134,19 +146,14 @@ move to `## Resolved` with a date + commit hash.
F25 (`mxaccess-asb` IASBIDataV2 client) and F26 (`mxaccess::Session` over `AsbTransport`) remain open. With F19-F24 landed, the M5 framing/encoder layer (streams A+B+C+D and the codec stream) is complete; F25 composes them into the `IASBIDataV2` wire client. F22's static dictionary subset is intentionally curated; expand entries as wire captures show new IDs. F27 (constant-time DH) is filed as a separate follow-up below.
### F30 — Resolve dict-id element/attribute names on the read side
**Severity:** P1 — blocks decoding any non-trivial WCF response.
**Source:** Live Register response decode (`MX_ASB_TRACE_REPLY` dump in `client.rs:172-190`).
**Why deferred:** When the server returns a response with the `RegisterItemsResponse` wrapper + `Result` fields, every element name (and most attribute names) is dict-encoded — `<b:Static(43)>false</b:Static(43)>` is `successField=false` on the wire. Our `decode_tokens` produces `NbfxName::Static(id)` tokens without resolving them; downstream consumers (`collect_asbidata_payloads`, `find_element_named`, `decode_register_items_response`) only match against `NbfxName::Inline(local)` and miss every dict-named element. The fault detection works because the SOAP fault Action header contains `/fault` (a literal string), but real success-response decoding is blind.
**Resolves when:** `decode_tokens` (or a post-pass over the token stream) substitutes `NbfxName::Static(id)` with `NbfxName::Inline(name)` whenever the dict id resolves to a known string. The dynamic dict (`read_dictionary`) accumulates session strings via `intern`; the read-path needs the parallel session counter to map wire ids to slots — wire ids are odd and session-cumulative across messages, mirroring the F28 fix on the write side. **Resolves**: F25 live data path (Read/Write/Subscribe responses are all dict-encoded too).
### F30 — Resolve dict-id element/attribute names on the read side (RESOLVED, commit `eb6c689`)
### F31 — InvalidConnectionId on first Register after AuthenticateMe — RESOLVED via retry
**Resolved:** `<this commit>`. Not a HMAC bug after all — `AsbErrorCode.InvalidConnectionId` (= 1) is a **transient race** condition that .NET's `MxAsbDataClient.RegisterMany` (`cs:191-204`) explicitly handles with a retry loop (`for (int attempt = 1; attempt < 5 && response.Result.ErrorCode == InvalidConnectionId; attempt++)` with `100*attempt` ms backoff). `AuthenticateMe` is one-way (`AsbContracts.cs:18`); the server commits auth state asynchronously after the request lands, and a Register that arrives too quickly sees the connection in pre-authenticated state. `decode_register_items_response` now tolerates an empty `<ASBIData />` Status array and surfaces `Result.resultCodeField` + `successField`; `AsbClient::register_items` retries up to 5 times on `RESULT_CODE_INVALID_CONNECTION_ID`, mirroring .NET. **Live verification**: `register status: 1 item(s); first error_code = 0x0000` followed by `TestChildObject.TestInt = AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }` — the real tag value `99` over the live wire, end-to-end.
### F32 — Live type-matrix coverage for `asb-subscribe`
**Severity:** P1 — final M5 DoD bullet (#3).
**Source:** F18 M5 status block.
**Why deferred:** The live bring-up loop verified Int32 end-to-end (`TestChildObject.TestInt = 99`). The remaining proven-on-.NET-side types — Boolean, Float, Double, String, DateTime, Duration, plus deployed array shapes per `work_remain.md:108-113` — need at least one sample tag per type in the Galaxy and a probe loop in `examples/asb-subscribe.rs` (or a new `asb-typematrix.rs`) that registers + reads each, asserting the decoded `AsbVariant` round-trips through the F24 codec.
**Resolves when:** A list of test tags (one per type) is provisioned in the live Galaxy and the matrix loop produces a clean run.
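
One possible shape for the matrix loop is sketched below. It is not the planned `asb-typematrix.rs`: the `read` closure stands in for register + read against the live endpoint, the `AsbVariant` field widths are assumed from the trace output, and only `type_id` 4 (Int32) is mapped because that is the only id confirmed live. Little-endian decoding is likewise an assumption, consistent with the value 99 arriving as `[99, 0, 0, 0]`.

```rust
/// Shape taken from the trace output; the field widths are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct AsbVariant {
    type_id: u16,
    length: u32,
    payload: Vec<u8>,
}

/// Only Int32 is mapped; other ids are reported rather than guessed.
#[derive(Debug, PartialEq)]
enum Decoded {
    Int32(i32),
    Unmapped { type_id: u16 },
}

fn decode(v: &AsbVariant) -> Result<Decoded, String> {
    if v.length as usize != v.payload.len() {
        return Err(format!("length {} != payload {}", v.length, v.payload.len()));
    }
    match v.type_id {
        4 => {
            if v.payload.len() != 4 {
                return Err("Int32 needs exactly 4 bytes".to_string());
            }
            let p = &v.payload;
            // Assumed little-endian, consistent with 99 -> [99, 0, 0, 0].
            Ok(Decoded::Int32(i32::from_le_bytes([p[0], p[1], p[2], p[3]])))
        }
        other => Ok(Decoded::Unmapped { type_id: other }),
    }
}

/// Run `read` for every tag and decode whatever comes back, keeping
/// per-tag failures so one bad tag does not abort the whole matrix.
fn run_matrix<F>(tags: &[&str], mut read: F) -> Vec<(String, Result<Decoded, String>)>
where
    F: FnMut(&str) -> Result<AsbVariant, String>,
{
    tags.iter()
        .map(|&tag| {
            let outcome = read(tag).and_then(|v| decode(&v));
            (tag.to_string(), outcome)
        })
        .collect()
}
```

Collecting a result per tag, instead of stopping at the first error, would let a single run report which types still lack sample tags.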
### F28 — Canonical XML serialiser for `ConnectedRequest` signing (matches `XmlSerializer.Serialize` byte-for-byte)
**Status: PARTIALLY RESOLVED.** The five `[XmlSerializerFormat]` ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus the per-action `ValidatorWireFormat` selector + DH-params-from-registry + dynamic-dict id management all landed in commits `f14580e` / `104efc4`. Live AuthenticateMe + RegisterItems work end-to-end (commit `9063f10`). Read / Write / CreateSubscription / AddMonitoredItems / Publish / DeleteMonitored / DeleteSubscription / PublishWriteComplete still sign over NBFX wire bytes via the legacy fallback. This works in practice because the live registry has an empty `hashAlgorithm` (no HMAC required for the unforced-MAC path), but it will break under any deployment that sets a real algorithm. **Severity now P2**; promote back to P0 if an environment with a non-empty `hashAlgorithm` comes into scope.
**Severity:** P0 — blocks every signed ASB operation (AuthenticateMe, RegisterItems, all data-plane RPCs).
**Source:** F25 live-bring-up; `AsbSystemAuthenticator.cs:79` + `AsbSerialization.cs:12-48`.
**Why deferred:** `AsbSystemAuthenticator.Sign` HMACs `Encoding.UTF8.GetBytes(request.ToXml())` — the XML text produced by .NET's `XmlSerializer.Serialize(writer, value)` with `XmlSerializerNamespaces` = `"urn:invensys.schemas"`, then re-parsed via `XDocument.Load` and re-saved to normalise xmlns attribute ordering (xsi before xsd; see `AsbSerialization.cs:36-47`). The HMAC must match the server's recomputation, which uses the same XmlSerializer on the deserialised request — so the Rust port has to produce byte-identical XML. We currently HMAC the NBFX wire bytes of the unsigned envelope, which never matches.
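
The split between canonical-XML signing and the legacy NBFX fallback might look roughly like the sketch below. `ValidatorWireFormat` is the name used in the status note, but its variants, the `wire_format_for` / `signing_input` helpers, and keying on the bare operation name are all assumptions for illustration. The HMAC step itself is left out; the sketch only selects which bytes would be fed to it.

```rust
/// Which bytes feed the HMAC for a given operation.
/// Variant names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ValidatorWireFormat {
    /// UTF-8 bytes of the XmlSerializer-compatible XML text.
    CanonicalXml,
    /// NBFX wire bytes of the unsigned envelope (legacy fallback).
    NbfxLegacy,
}

/// The five `[XmlSerializerFormat]` ops sign over canonical XML;
/// everything else still falls back to NBFX bytes.
fn wire_format_for(op: &str) -> ValidatorWireFormat {
    match op {
        "AuthenticateMe" | "Disconnect" | "KeepAlive" | "RegisterItems" | "UnregisterItems" => {
            ValidatorWireFormat::CanonicalXml
        }
        _ => ValidatorWireFormat::NbfxLegacy,
    }
}

/// Return the byte sequence that would be handed to the HMAC.
fn signing_input(op: &str, canonical_xml: &str, nbfx_bytes: &[u8]) -> Vec<u8> {
    match wire_format_for(op) {
        ValidatorWireFormat::CanonicalXml => canonical_xml.as_bytes().to_vec(),
        ValidatorWireFormat::NbfxLegacy => nbfx_bytes.to_vec(),
    }
}
```

Closing F28 fully would then amount to moving the remaining data-plane ops from the fallback arm to the canonical-XML arm once their serialiser output matches .NET's.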
@@ -245,6 +252,12 @@ The fixture is captured by `MxAsbClient.Probe --dump-deterministic-hmac` (`src/M
## Resolved
### F31 — InvalidConnectionId on first Register after AuthenticateMe
**Resolved:** 2026-05-05 (commit `9063f10`). Not a HMAC bug — `AsbErrorCode.InvalidConnectionId` (= 1) is a transient race that .NET's `MxAsbDataClient.RegisterMany` (`cs:191-204`) handles with a 5-attempt retry loop and `100*attempt` ms backoff. `AuthenticateMe` is one-way (`AsbContracts.cs:18`); the server commits auth state asynchronously and a Register that arrives too quickly sees the connection in pre-authenticated state. `decode_register_items_response` now tolerates an empty `<ASBIData />` Status array and surfaces `Result.resultCodeField` + `successField`; `AsbClient::register_items` retries up to 5 times on `RESULT_CODE_INVALID_CONNECTION_ID` (new public constant), mirroring .NET. Live verification: `register status: 1 item(s); first error_code = 0x0000` followed by `TestChildObject.TestInt = AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }` over the live wire.
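
The retry behaviour can be sketched as follows. This is not the body of `AsbClient::register_items`: the closure-based signature, the `u32` result-code type, and sleeping before each retry are assumptions, chosen so the loop can be exercised without a live endpoint. The loop bounds follow the quoted .NET `for` loop, which gives one initial call plus at most four retries.

```rust
/// Value taken from the note above (`InvalidConnectionId` = 1); the type is assumed.
const RESULT_CODE_INVALID_CONNECTION_ID: u32 = 1;
/// Total calls allowed, mirroring `attempt < 5` in the quoted .NET loop.
const MAX_ATTEMPTS: u32 = 5;

/// Call `call` until it stops returning InvalidConnectionId or the attempt
/// budget runs out. `sleep_ms` is injected so tests can record the backoff
/// instead of actually sleeping.
fn register_with_retry<C, S>(mut call: C, mut sleep_ms: S) -> u32
where
    C: FnMut() -> u32,
    S: FnMut(u64),
{
    let mut code = call();
    let mut attempt = 1;
    while attempt < MAX_ATTEMPTS && code == RESULT_CODE_INVALID_CONNECTION_ID {
        // Linear backoff: 100 ms, 200 ms, 300 ms, 400 ms.
        sleep_ms(u64::from(attempt) * 100);
        code = call();
        attempt += 1;
    }
    code
}
```

In production code the `sleep_ms` argument could simply wrap `std::thread::sleep(std::time::Duration::from_millis(ms))`, or an async timer if the client is async.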
### F30 — Resolve dict-id element/attribute names on the read side
**Resolved:** 2026-05-05 (commit `eb6c689`). `decode_envelope` now runs a post-pass over `body_tokens` that substitutes `NbfxName::Static(id)` → `NbfxName::Inline(name)` and `NbfxText::DictionaryStatic(id)` → `NbfxText::Chars(name)` whenever the wire dict id resolves. Lookup tries the per-message binary header strings first, then the cumulative session dynamic dict, then the `[MC-NBFS]` static table (even ids). Tokens with unresolvable ids stay opaque so trace output still reveals them. Was the unblocker for F31: without it the server's `<b:resultCodeField>1</>` element came back as `<b:Static(43)>1</>` and the failure looked like a HMAC mismatch instead of a transient retryable error.
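
A minimal sketch of such a post-pass is shown below, covering element/attribute names only (the `NbfxText` substitution would follow the same pattern). It is not the code in `decode_envelope`: `DictLookup`, `resolve_names`, and the use of three maps keyed directly by wire id are simplifications assumed for illustration, and the real mapping from wire ids to dictionary slots is not modelled.

```rust
use std::collections::HashMap;

/// Reduced stand-in for the codec's name token.
#[derive(Debug, Clone, PartialEq)]
enum NbfxName {
    Inline(String),
    Static(u32),
}

/// The three lookup sources, each keyed by wire id (a simplification).
struct DictLookup<'a> {
    message_strings: &'a HashMap<u32, String>,
    session_dynamic: &'a HashMap<u32, String>,
    static_table: &'a HashMap<u32, String>,
}

impl<'a> DictLookup<'a> {
    /// Try per-message strings first, then the session dict, then the static table.
    fn resolve(&self, id: u32) -> Option<&str> {
        self.message_strings
            .get(&id)
            .or_else(|| self.session_dynamic.get(&id))
            .or_else(|| self.static_table.get(&id))
            .map(|s| s.as_str())
    }
}

/// Replace every resolvable `Static(id)` with `Inline(name)`.
/// Unresolvable ids stay opaque so trace output still shows them.
fn resolve_names(tokens: &mut [NbfxName], dicts: &DictLookup) {
    for token in tokens.iter_mut() {
        if let NbfxName::Static(id) = *token {
            if let Some(name) = dicts.resolve(id) {
                *token = NbfxName::Inline(name.to_string());
            }
        }
    }
}
```

Running the substitution as a separate pass keeps the token decoder itself free of dictionary state, at the cost of one extra walk over the token stream.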
### F7 — Consolidate `Guid` type across `mxaccess-rpc`
**Resolved:** 2026-05-05 in this iteration's commit. `Guid` was hoisted from `objref::Guid` into the new shared `crate::guid::Guid` module. `objref` and `pdu` now re-export from there; M2 wave 2's `orpc`, `object_exporter`, and `rem_unknown` import it directly. The OXID-resolve dual-string decoder additionally needs an owned protocol label (`format!("protseq_0x{:04x}", tower_id)` per `ObjectExporterMessages.cs:120`) — `ComDualStringEntry::protocol` was upgraded from `&'static str` to `Cow<'static, str>` to support both decoders without the agent's interim `Box::leak` workaround.
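
The `Cow<'static, str>` change can be illustrated with a short sketch. The `protocol_label` helper is hypothetical, and mapping tower id `0x0007` to a static `"ncacn_ip_tcp"` label is an assumption about how known protocol sequences might be named; only the `format!` fallback string comes from the note above.

```rust
use std::borrow::Cow;

/// Return a protocol label without leaking memory: known ids borrow a
/// static string, unknown ids allocate an owned one.
fn protocol_label(tower_id: u16) -> Cow<'static, str> {
    match tower_id {
        // Assumed known entry; a real table would likely list more ids.
        0x0007 => Cow::Borrowed("ncacn_ip_tcp"),
        other => Cow::Owned(format!("protseq_0x{:04x}", other)),
    }
}

/// Reduced stand-in for the dual-string entry; only `protocol` matters here.
#[derive(Debug, Clone, PartialEq)]
struct ComDualStringEntry {
    protocol: Cow<'static, str>,
    network_address: String,
}
```

Both decoders can then build entries through the same field type, and callers that only read the label can treat it as a `&str` via deref.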