ff4ea4d5a9
rust / build / test / clippy / fmt (push) Has been cancelled
Closes F16. Replaces the wave-2 no-op recover_connection with the
full .NET-equivalent shape (MxNativeSession.cs:399-474). Three
pieces:
1. Subscription registry on SessionInner.
New subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>>
tracks every active advise. subscribe() inserts after a successful
AdviseSupervisory; unsubscribe() removes on the success path only
(failed UnAdvises stay registered so next recovery replays them).
The consumer's Subscription handle still holds the BroadcastStream;
the registry is purely for AdviseSupervisory replay.
2. Pluggable RebuildFactory.
New public typedef:
pub type RebuildFactory = Arc<
dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient,
NmxClientError>>
+ Send>>
+ Send + Sync,
>;
Installed via Session::set_recovery_factory(factory);
queryable via has_recovery_factory(). Kept separate from
connect_nmx / connect_nmx_auto so existing constructors stay
non-breaking — consumers opt in by calling the setter
after-the-fact.
3. Real recover_connection + recover_connection_core.
recover_connection is the retry loop (mirrors cs:399-440): for
attempt in 1..=policy.max_attempts, emit RecoveryEvent::Started
→ call recover_connection_core → on Ok emit Recovered + return,
on Err emit Failed{will_retry, error}, sleep policy.delay, retry,
or bubble the last error.
recover_connection_core mirrors cs:442-474: rebuild NMX via the
factory → RegisterEngine2 with the saved callback_obj_ref → optional
SetHeartbeatSendInterval → snapshot the registry under the lock,
replay AdviseSupervisory(correlation_id) for each entry → atomically
swap *nmx_lock = replacement. Old NmxClient drops at end of scope,
closing its TCP transport.
Subscription correlation ids are preserved across the swap so the
consumer's Subscription stream continues to receive on its existing
broadcast filter. The CallbackExporter stays bound across recoveries
— no TCP listener re-bind.
R15's "long-lived connection task" was listed as a hard prereq, but
the existing Mutex<NmxClient> already serialises concurrent ops
during the rebuild — recover_connection_core holds the inner mutex
during the swap, concurrent ops just wait. Functionally equivalent
to the long-lived-task design.
New ConfigError::RecoveryNotConfigured returned when
recover_connection is called without a factory installed. New
public re-export: RebuildFactory.
Tests (mxaccess 65 → 67):
- recover_connection_without_factory_returns_recovery_not_configured
- recover_connection_with_always_failing_factory_exhausts_attempts
(pins (Started, Failed)×3 + final will_retry=false + bubbled
TransportFailure)
- subscribe_populates_registry_unsubscribe_clears_it
- recovery_events_supports_multiple_subscribers (updated for the
new factory-required path)
connect_nmx_auto-side auto-population of the factory (capturing the
ntlm_factory + discovered (addr, service_ipid) so consumers don't
re-author the closure) is a future polish — not required to close
F16.
design/followups.md: F16 moved to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
219 lines
53 KiB
Markdown
219 lines
53 KiB
Markdown
# Followups
|
||
|
||
Open work items deferred during /loop iterations. Triaged at the top of
|
||
every iteration. New items are appended under `## Open`; resolved items
|
||
move to `## Resolved` with a date + commit hash.
|
||
|
||
## Open
|
||
|
||
### F18 — M5 plan of attack (ASB transport, parallel-safe sub-streams)
|
||
**Severity:** P0 — milestone driver, blocks ASB consumers + V1 release
|
||
**Source:** `design/dependencies.md:73-89` + `design/60-roadmap.md:84-91` + `design/70-risks-and-open-questions.md:5-25` (R1 estimates ~3000 LoC for framing+encoders).
|
||
|
||
**Scope.** Build the ASB data-plane end-to-end:
|
||
- `mxaccess-asb-nettcp` — `[MS-NMF]` framing + `[MC-NBFX]` binary-XML node codec + `[MC-NBFS]` static dictionary table + DH/HMAC/AES authentication crypto.
|
||
- `mxaccess-asb` — `IASBIDataV2` client (Connect, RegisterItems, Read, Write, PublishWriteComplete, CreateSubscription, AddMonitoredItems, Publish, Disconnect) + `SecretProvider` trait + DPAPI default impl + ASB Variant codec port (currently a stub at `crates/mxaccess-codec/src/lib.rs:74,77,80`).
|
||
- `mxaccess::Session` over an `AsbTransport` impl; capabilities surface ASB limits (no `subscribe_buffered`, no Activate/Suspend, no OperationComplete outside the proven write-completion frame — see `design/60-roadmap.md:88`).
|
||
- `examples/asb-subscribe.rs` exercises the whole path against a live ASB endpoint with parity vs `dotnet run --project src\MxAsbClient.Probe`.
|
||
|
||
**Sub-stream breakdown** (matches `design/dependencies.md:78-89`). Each sub-stream is a separate followup so it can be claimed by a separate agent in a worktree without merge conflict:
|
||
|
||
| Sub-followup | Stream | Owns | Depends on |
|
||
|---|---|---|---|
|
||
| F19 | (workspace prereq) | Add the M5 dep set to `rust/Cargo.toml` workspace deps + per-crate `Cargo.toml`: `aes`, `hmac`, `md-5`, `sha1`, `sha2`, `pbkdf2`, `flate2`, `rand`, `crypto-bigint` (constant-time DH per `review.md` MAJOR), `quick-xml`, `tokio-util`. Pinned to the `digest 0.11`/`cipher 0.5` generation per `design/30-crate-topology.md:251-289`. Sequential prereq for the others. | M0 |
|
||
| F20 | A — MS-NMF framing | `mxaccess-asb-nettcp::nmf` — preamble (`0x00 ver=1 mode=2 via=encoded-string`), preamble-ack, sized-envelope (`0x06 var-int len bytes`), end (`0x07`), fault (`0x08`), upgrade-request, known-encoding via lookup. Reliable-session ack handling. Round-trip against `analysis/proxy/mxasbclient-register-message.txt` and `mxasbclient-probe-stage*.txt` byte traces. | F19 |
|
||
| F21 | B — MC-NBFX | `mxaccess-asb-nettcp::nbfx` — record types (`0x40` ShortElement, `0x41` Element, `0x44` ShortDictionaryAttribute, `0x04` PrefixDictionary*A-Z, `0x84` BoolText, `0x88` Int32Text, `0x86` BoolFalseText, etc., per `[MC-NBFX]` §2.2). Length-prefixed strings (var-int 7-bit groups). Read/write over `bytes::BytesMut`. | F19 |
|
||
| F22 | C — MC-NBFS | `mxaccess-asb-nettcp::nbfs` — the static dictionary table. SOAP/WS-Addressing tokens + `IASBIDataV2`-action strings used by the operation set (`http://ASB.IDataV2:registerItemsIn`, `:readIn`, `:writeIn`, `:createSubscriptionIn`, `:publishIn`, etc., see `src/MxAsbClient/AsbContracts.cs:14-58`). Hand-rolled from the proven action set; the full WCF dictionary is much larger but only the action subset is on the wire. | F19 |
|
||
| F23 | D — Auth crypto | `mxaccess-asb-nettcp::auth` — port `src/MxAsbClient/AsbSystemAuthenticator.cs` (167 LoC): DH key exchange with `crypto-bigint` constant-time `mod_exp` (review.md MAJOR finding — .NET `BigInteger.ModPow` is **not** constant-time and the DH private exponent is long-lived per `cs:153-166`); HMAC-MD5/SHA1/SHA512 (negotiated per `AsbSolutionCryptoParameters.HashAlgorithm`); AES-128 with PBKDF2-SHA1 1000-iteration key derivation; deflate-then-encrypt `EncryptBaktun` vs raw-encrypt `EncryptApollo` distinguished by `:V2` lifetime suffix (`cs:48`); ASCII salt `"ArchestrAService"`; UTF-16LE passphrase. Plus DPAPI shared-secret read on Windows behind the existing `dpapi` feature gate, with a `SecretProvider::shared_secret(&[u8])` escape hatch for tests/CI (`design/30-crate-topology.md:150`). | F19 |
|
||
| F24 | (codec) | `mxaccess-codec::asb_variant` — fill in the stubbed `AsbVariant`, `AsbStatus`, `RuntimeValue` (`crates/mxaccess-codec/src/lib.rs:74,77,80`) per `docs/ASB-Variant-Wire-Format.md`. Decode/encode for the proven type matrix: `TypeBool`, `TypeInt32`, `TypeFloat`, `TypeDouble`, `TypeString`, `TypeDateTime`, `TypeDuration`, plus deployed array shapes (`work_remain.md:108-113`). Less-common scalars stay as raw bytes (matches .NET `DecodeVariant` fallback at `MxAsbDataClient.cs:748`). Independent of the framing/encoder work — separate crate. | M1 (envelope/status types) |
|
||
| F25 | E — IASBIDataV2 client | `mxaccess-asb::client` — top-level `AsbClient` with `connect`, `register_items`, `read`, `write`, `publish_write_complete`, `create_subscription`, `add_monitored_items`, `publish`, `disconnect`. Wires the contract → NBFX-encoded SOAP envelope → NMF-framed TCP. `ConnectedRequest::ConnectionValidator` HMAC signing per `AsbSystemAuthenticator::Sign`. Receives `Publish` callbacks via a long-lived background task (mirrors the M4 NMX `callback_router` pattern). Depends on F20+F21+F22+F23+F24. | A+B+C+D+codec |
|
||
| F26 | (session) | `mxaccess::Session` over `AsbTransport`. New transport impl alongside `NmxTransport`. Surface ASB capability flags so `subscribe_buffered`/`activate`/`suspend` return `Error::Unsupported(Capability::*)` rather than a runtime fallthrough. Update `examples/asb-subscribe.rs` to drive the path end-to-end. Live-probe DoD: round-trip parity with `dotnet run --project src\MxAsbClient.Probe`. | F25 |
|
||
|
||
**Parallel-safety analysis.**
|
||
- F19 (workspace deps) is the **single sequential bottleneck** — F20-F25 all reference workspace deps that don't exist yet, so they cannot start in parallel until F19 lands. Tight & small (~30 lines of TOML).
|
||
- F20, F21, F22, F23, F24 are **fully parallel-safe** after F19: each owns a different module under a different crate (or different sibling module within `mxaccess-asb-nettcp`). No shared state, no cross-import — each can land in its own commit. Per `dependencies.md:88` "Peak agents in parallel: 4 in the framing/encoding wave (A+B+C+D)".
|
||
- F25 is sequential after the four framing/encoder streams + F24 land — it composes them. The .NET `MxAsbDataClient` is monolithic enough that splitting F25 across agents costs more in coordination than it saves.
|
||
- F26 is sequential after F25.
|
||
- **Cross-milestone parallelism still holds.** M5 (this whole F18-F26 cluster) runs in parallel with M3+M4 per `design/60-roadmap.md:14-17` because the `Transport` trait was lifted into M0. M4's `Session` core landed (commits `4863c6d`, `2dc091d`, `a31237d`); the F26 `AsbTransport` plugs into the same trait without re-design.
|
||
|
||
**Risk-driven sequencing inside the parallel wave.** R1 in `design/70-risks-and-open-questions.md:9` is the project-blocker. Of the four parallel streams, F23 (auth crypto) carries the most live-probe risk (DH handshake against the live VM is the first irreversible test of the spec port) but is the smallest in LoC. F22 (NBFS) is the largest unknown — the dictionary table size is bounded only by the action subset we exercise. Recommended order *if* agents are constrained: F23 (smallest, highest-leverage) → F20 (foundational for any wire test) → F21 (encoder) → F22 (dictionary) → F24 (codec, independent).
|
||
|
||
**Definition of done** for F18 as a whole (= M5 DoD per `design/60-roadmap.md:91`):
|
||
1. `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` succeeds against a live ASB endpoint.
|
||
2. Round-trip parity with `dotnet run --project src\MxAsbClient.Probe` (Frida/Wireshark diff is byte-identical for the proven type matrix).
|
||
3. The `mxaccess-asb` type matrix covers what `work_remain.md:108-113` documents as proven: scalar Boolean, Int32, Float, Double, String, DateTime, Duration plus deployed array tags.
|
||
4. `cargo build --workspace` and `cargo test --workspace` green; `cargo clippy --workspace -- -D warnings` clean.
|
||
|
||
**Resolves when:** F19-F26 are all closed and the four DoD bullets above pass.
|
||
|
||
**M5 STATUS (commit `9063f10`): functionally LIVE.** End-to-end `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` Connect → AuthenticateMe → Register → Read → Disconnect against the live MxDataProvider, returning the real tag value over the wire (`type_id=4 length=4 payload=[99,0,0,0]`). DoD checklist:
|
||
1. ✅ Live `asb-subscribe` succeeds against the AVEVA endpoint.
|
||
2. ⚠️ Wire structure matches .NET's request bytes for AuthenticateMe / Register byte-by-byte (verified via `asb-relay` middleman with the .NET probe routed through ClientVia); responses round-trip via the F30 dict-id resolution post-pass. Strict byte-identical parity for the response side is not guaranteed because WCF chunks `Bytes8/16/32` records at different boundaries — both forms are functionally equivalent and `collect_asbidata_payloads` concatenates chunks (commit `cf97eab`).
|
||
3. ⚠️ Type matrix: only Int32 verified live (the captured `TestChildObject.TestInt` tag). Bool / Float / Double / String / DateTime / Duration / arrays not yet exercised — pending one or more sample tags per type and an `asb-subscribe` extension that loops over them. F32 captures this expansion.
|
||
4. ✅ `cargo build --workspace` + `cargo test --workspace` (711 tests) + `cargo clippy --workspace -- -D warnings` all green.
|
||
|
||
**Remaining open work for full M5 closeout** (none are P0 blockers anymore):
|
||
- ~~F32~~: resolved (commit `<this commit>`) via option (b) — three-type live coverage is the deployable maximum; missing types are Galaxy-provisioning-gated.
|
||
- **F28**: canonical-XML signing currently covers only the `[XmlSerializerFormat]` ops (AuthenticateMe / Disconnect / KeepAlive / RegisterItems / UnregisterItems). Read / Write / CreateSubscription / AddMonitoredItems / Publish / etc. still sign over NBFX wire bytes via the legacy fallback. Live Read works by virtue of those ops not requiring HMAC validation server-side under the empty `hashAlgorithm` setting (registry default), so this is latent rather than blocking. Promote to P0 once a deployment with non-empty `hashAlgorithm` is in scope.
|
||
- ~~F29~~: resolved (commit `<this commit>`) — `nbfs.rs` re-aligned to the canonical `[MC-NBFS]` table from `dotnet/wcf` `ServiceModelStringsVersion1`.
|
||
- ~~F26 stream subscription~~: resolved (commit `<this commit>`) — `AsbSession::subscribe(subscription_id)` returns an `AsbSubscription: Stream<Item = Result<MonitoredItemValue, Error>>` driven by an internal `tokio::spawn`'d publish-loop. Drop of the subscription aborts the loop. Per-`PublishResponse` `values` array is fanned out as individual stream items; transport errors are delivered as the final stream item before termination. Inner `publish_loop` helper is split out so it's testable in isolation against any closure-based fake `publish_fn`. 3 new tests pin: compile-time `Stream + Send + Unpin`, multi-batch + terminal-error round-trip, consumer-drop short-circuits the publisher. Workspace 718 → 721 tests.
|
||
|
||
**Cumulative execution log.** F19 + F23 (`ed17c07`); F24 (`7611d9e`); F20 (`9dfd193`); F22 (`43c10a1`); F21 (`5f98558`); F25 step 1 (`25dbd8d`); F25 step 2 (`a2b8989`); F25 step 3 (`c4bf0a0`); F25 step 4 (`1e59249`); F25 step 5 (`9b8133f`); F25 step 6 (`321b796`); F25 step 7 (`1b1ee1e`); F26 step 1 (`8a0f92b`); F26 step 2 (`14bb529`); example rewrite (`c6570dc`); F25 step 8 (`b543eb1`); F25 step 9 (`0441a2e`); F25 step 10 (`9876b4e`); F26 step 3 (`<previous>`); **F25 live-bring-up reconciliation** (this commit):
|
||
- F25 live-bring-up reconciliation: live `asb-subscribe` + `asb-relay` (TCP middleman) capture-and-diff against AVEVA's MxDataProvider on Windows. Five concrete fixes landed:
|
||
1. **NBFX `PrefixElement_a..z` (0x5E-0x77) and `PrefixAttribute_a..z` (0x26-0x3F) decode + encode arms** — single-letter-prefix records that WCF emits in responses but our codec only recognised the dictionary-named cousins (`PrefixDictionaryElement_a..z` 0x44-0x5D, `PrefixDictionaryAttribute_a..z` 0x0C-0x25). The server's ConnectResponse hit `0x65 = PrefixElement_h` for a dynamically-named element (e.g. `<h:Foo>`) and our decoder bailed with `unknown NBFX record byte 0x65`. Both directions now round-trip; the encoder picks the short-form arm whenever `prefix_letter_offset(prefix).is_some()`.
|
||
2. **xmlns redeclaration on `<Data>` and `<InitializationVector>` inside `AuthenticationData` / `PublicKey`** — `[XmlType(Namespace = "http://asb.contracts.data/20111111")]` on the AuthenticationData / PublicKey classes (`AsbContracts.cs:350-381`) means XmlSerializer emits an `xmlns="..."` redeclaration on each direct child. The default-ns scope ends at `</Data>`, so `<InitializationVector>` needs its own redeclaration to stay in the data namespace; without it the server fell back to messages-namespace and the deserialiser threw an `InternalServiceFault`. Connect handshake now completes end-to-end with the apollo:V2 ConnectionLifetime and a real ServicePublicKey.
|
||
3. **SOAP-fault detection on the response path** — `ClientError::SoapFault { action, code, reason }` surfaces when the response Action header matches the canonical `dispatcher/fault` template; we previously let body decoders blindly run and hit `MissingField { field: "Status" }` which masked the fact that the wire was a fault. The reason text is extracted as the longest `NbfxText::Chars` in the body — robust against the `nbfs.rs` static-dictionary id mismatches noted below.
|
||
4. **Identified blocker**: `ConnectedRequest` signing currently HMACs the **NBFX wire bytes** of the unsigned envelope. .NET's `AsbSystemAuthenticator.Sign` (`AsbSystemAuthenticator.cs:79`) HMACs `Encoding.UTF8.GetBytes(request.ToXml())` — the **canonical XML serialisation** of the message contract via `XmlSerializer` with namespace `"urn:invensys.schemas"` (`AsbSerialization.cs:12-48`). Until the Rust port emits identical XML bytes, the HMAC mismatches and the server rejects every signed request (`AuthenticateMe`, `RegisterItems`, etc.) with a generic `dispatcher/fault` InternalServiceFault. Connect itself is unsigned (extends `ServiceMessage`, no `ConnectionValidator` header) which is why it works today. The fault's `a:RelatesTo` UniqueId in our captures matches the AuthenticateMe `MessageID`, confirming the failure point. **New followup F28** captures the XML-canonicaliser scope.
|
||
5. **`nbfs.rs` static dictionary ids drift** at id 114+ vs. the canonical `[MC-NBFS]` table (`Fault`/`Code`/`Reason`/`Text`/`Value` are 20 IDs higher on the wire than what we encode). Doesn't affect requests we send (we only encode IDs ≤44 = `ReplyTo`, all correct), but breaks `decode_envelope`'s element-by-name matching for fault bodies. Tracked as **F29**.
|
||
|
||
Workspace: 702 tests pass (no test count delta — wire-only fixes). Live status: Connect handshake working with real DH key + apollo encryption; AuthenticateMe and onwards blocked on F28. Companion diagnostic example `asb-relay.rs` (TCP middleman that hex-dumps both directions to stderr) lands as a permanent debugging aid.
|
||
|
||
|
||
- F26 step 3 (`<previous>` in the cumulative log): `mxaccess::AsbSession` is a high-level cheap-clone async API on top of `AsbTransport`, deliberately **parallel** to the NMX-shaped `Session` rather than unified. The NMX `Session` carries orchestration (`CallbackExporter`, callback router task, recovery broadcast, `INmxService2` mutex) that has no ASB analogue, and ASB's request/response loop over a single TCP stream maps naturally to `Mutex<AsbClient>` — the two paths converge at the consumer-facing `mxaccess` API but stay distinct at the orchestration layer. `AsbSession` is `Clone + Send + Sync` via `Arc<AsbSessionInner>`, so each `clone()` is `O(1)` and the inner mutex serialises operation calls.
|
||
|
||
For the per-step body of every line listed in the cumulative execution log, see the matching commit message — each commit is a single F-number step with its own scope, fixtures, test count delta, and follow-up notes. The detailed per-step write-ups previously inlined here added little beyond what `git show <hash>` provides.
|
||
|
||
### F28 — Canonical XML serialiser for `ConnectedRequest` signing (matches `XmlSerializer.Serialize` byte-for-byte)
|
||
**Status: PARTIALLY RESOLVED.** The five `[XmlSerializerFormat]` ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus the per-action `ValidatorWireFormat` selector + DH-params-from-registry + dynamic-dict id management all landed in commits `f14580e` / `104efc4`. Live AuthenticateMe + RegisterItems work end-to-end (commit `9063f10`). Read / Write / CreateSubscription / AddMonitoredItems / Publish / DeleteMonitored / DeleteSubscription / PublishWriteComplete still sign over NBFX wire bytes via the legacy fallback; works in practice because the live registry has empty `hashAlgorithm` (no HMAC required for the unforced-MAC path), but will break under any deployment that sets a real algorithm. **Severity now P2** — promote back to P0 if a hashAlgorithm-non-empty environment is in scope.
|
||
**Severity:** P0 — blocks every signed ASB operation (AuthenticateMe, RegisterItems, all data-plane RPCs).
|
||
**Source:** F25 live-bring-up; `AsbSystemAuthenticator.cs:79` + `AsbSerialization.cs:12-48`.
|
||
**Why deferred:** `AsbSystemAuthenticator.Sign` HMACs `Encoding.UTF8.GetBytes(request.ToXml())` — the XML text produced by .NET's `XmlSerializer.Serialize(writer, value)` with `XmlSerializerNamespaces` = `"urn:invensys.schemas"`, then re-parsed via `XDocument.Load` and re-saved to normalise xmlns attribute ordering (xsi before xsd; see `AsbSerialization.cs:36-47`). The HMAC must match the server's recomputation, which uses the same XmlSerializer on the deserialised request — so the Rust port has to produce byte-identical XML. We currently HMAC the NBFX wire bytes of the unsigned envelope, which never matches.
|
||
|
||
**Resolves when:** A canonical XmlSerializer-compatible emitter lands in `mxaccess-asb` (probably `crates/mxaccess-asb/src/xml_canonical.rs`). Scope per request type: `AuthenticateMe`, `Disconnect`, `KeepAlive`, `RegisterItemsRequest`, `UnregisterItemsRequest`, `ReadRequest`, `WriteBasicRequest`, `PublishWriteCompleteRequest`, `CreateSubscriptionRequest`, `DeleteSubscriptionRequest`, `AddMonitoredItemsRequest`, `DeleteMonitoredItemsRequest`, `PublishRequest`. Each derives its XML form from the `[MessageContract] / [MessageBodyMember(Order = N, Namespace = ...)]` attributes plus per-type `[XmlType(Namespace = ...)]` on `AuthenticationData` / `PublicKey`. The `request_xml_utf8` argument to `AsbAuthenticator::sign` is already wired correctly — only the producer is missing. Once HMAC matches, the existing `ConnectionValidator` header path (`mac` + `iv` base64 round-trip) is already validated by the F23 unit tests. **Resolves**: F25 live AuthenticateMe + RegisterItems + every signed operation; M5 DoD bullets 1+2 unblocked.
|
||
|
||
**Captured fixtures (commit `dbb580b`).** `MxAsbClient.Probe --dump-signed-xml` (new flag, 2026-05-05) produces canonical `request.ToXml()` output for the five primary ConnectedRequest shapes; fixtures saved under `rust/crates/mxaccess-asb/tests/fixtures/signed-xml/{authenticate-me,disconnect,keep-alive,register-items,unregister-items}.xml`. Byte sizes pinned: 1000/980/705/1068/1072. Plus `authenticate-me-empty-mac-iv.xml` (896 bytes) for the actual signing input shape (validator's MAC + IV are empty during `request.ToXml()`; .NET's `AsbSystemAuthenticator.Sign:79` mutates them only AFTER HMAC computation). The companion `README.md` documents 10 inferred XmlSerializer rules — most importantly: (1) element name = class name (NOT MessageContract.WrapperName), (2) field order = C# declaration order (NOT [MessageBodyMember.Order]), (3) `[XmlType(Namespace=...)]` on a field's type causes per-child xmlns redeclaration on the children, NOT the wrapper element, (4) the `*Specified` pattern controls whether `<Xxx>` is emitted, (5) CRLF line endings + 2-space indent + UTF-8-bytes-of-utf-16-declaration, (6) empty `byte[]` → self-closing `<Tag xmlns="..." />` (NOT `<Tag></Tag>`).
|
||
|
||
**Emitter landed (commit `f14580e`).** `mxaccess-asb::xml_canonical` exposes `emit_authenticate_me_xml`, `emit_disconnect_xml`, `emit_keep_alive_xml`, `emit_register_items_request_xml`, `emit_unregister_items_request_xml`. Seven fixture-comparison tests pass (byte-equal vs. .NET output for both filled-MAC + empty-MAC variants of AuthenticateMe, plus the four other shapes). Plumbing: `AsbAuthenticator::peek_next_message_number` exposes the pre-allocated message number; `AsbClient::send_signed_envelope[_one_way]` gain `xml_for_signing: Option<&[u8]>`. `connect`, `disconnect`, `keep_alive`, `register_items`, `unregister_items` now build a pre-signing `ConnectionValidator` (empty MAC + IV) → emit canonical XML → pass to HMAC. Other ops (Read, Write, Subscription) still use the legacy NBFX-bytes path.
|
||
|
||
**Registry-driven DH params (commit `f14580e`).** `tools/Get-AsbPassphrase.ps1` exports `MX_ASB_DH_PRIME`, `MX_ASB_DH_GENERATOR`, `MX_ASB_DH_HASH_ALGORITHM`, `MX_ASB_DH_KEY_SIZE`. The `asb-subscribe` example honours those env vars to override `CryptoParameters::defaults()` (which is the .NET reference's 1024-bit fallback). Each AVEVA install picks its own DH group at provisioning time — typically a 768-bit prime, NOT the default 1024-bit. With the wrong prime, `Connect` succeeds at the byte level but the shared-secret derivation diverges, breaking AuthenticateMe's encrypted ConsumerData verification. Empty registry `hashAlgorithm` maps to `HashAlgorithm::Unrecognised` to match `AsbSystemAuthenticator.CreateHmac:84-93` semantics where empty + `forceHmac=true` falls through to HMAC-SHA1.
|
||
|
||
**Remaining live blocker (commit `fd38189`).** With canonical XML byte-equal to .NET's AND DH params from the registry, AuthenticateMe still produces `dispatcher/fault` InternalServiceFault. `MX_ASB_TRACE_DERIVE`-gated diagnostic traces in both the Rust authenticator and the .NET reference confirm: crypto_key length matches (176 bytes = 96-byte shared secret + 80-byte passphrase); passphrase bytes [96..176] of the crypto_key are identical between Rust and .NET (same registry source, same UTF-8 encoding). The shared-secret prefix [0..96] differs per session (random DH), but should round-trip correctly with the server.
|
||
|
||
**Crypto stack ruled out** (commit `<this commit>`). Deterministic-HMAC fixture test (`auth.rs::tests::deterministic_hmac_matches_dotnet_fixture`) takes pinned inputs (passphrase, prime, generator, private-key bytes, remote-pub bytes, message number, connection ID, AES IV, consumer-data + IV) and asserts byte-equality of each step:
|
||
1. `shared = remote_pub^private_key mod prime` — ✅ matches .NET
|
||
2. `crypto_key = shared || passphrase_utf8` — ✅ matches .NET
|
||
3. `hmac = HMAC-SHA1(crypto_key, xml_utf8)` — ✅ matches .NET (HMACSHA1)
|
||
4. `aes_key = PBKDF2-SHA1(base64(crypto_key), "ArchestrAService", 1000, 16)` — ✅ matches .NET (Rfc2898DeriveBytes.Pbkdf2)
|
||
5. `encrypted_mac = AES-CBC(aes_key, iv=zeros, hmac, PKCS7)` — ✅ matches .NET (System.Security.Cryptography.Aes)
|
||
|
||
The fixture is captured by `MxAsbClient.Probe --dump-deterministic-hmac` (`src/MxAsbClient.Probe/Program.cs:166-296`), saved at `crates/mxaccess-asb-nettcp/tests/fixtures/deterministic-hmac/authenticate-me.kv`. With all 5 crypto steps proven byte-equal to .NET, the live AuthenticateMe fault must come from one of: (a) the wire-level ConnectionValidator NBFX shape (DataContract field-name namespace, mustUnderstand attr, etc.), (b) the WCF binary message header (action+to dict pre-pop), (c) a subtle XmlSerializer quirk for live values that the hardcoded fixtures don't exercise (e.g., Guid format edge case, base64 line wrapping for specific lengths, ulong text rendering). Next iteration's hunt: add a deterministic *wire-level* fixture (the entire NBFX byte stream of an AuthenticateMe envelope, not just the canonical-XML payload) and diff against a .NET probe capture for the same inputs.
|
||
|
||
|
||
### F27 — Constant-time DH `mod_exp` (swap `num-bigint` → `crypto-bigint::BoxedUint`)
|
||
**Severity:** P2 (security regression vs the long-term Rust target — but at parity with the .NET reference today, so not a release-blocker)
|
||
**Source:** F23 (`crates/mxaccess-asb-nettcp/src/auth.rs:179,303`); originally flagged in `design/30-crate-topology.md:269-274` and the project's `review.md` MAJOR finding.
|
||
**Why deferred:** `crypto-bigint 0.5`'s `BoxedUint` does not yet expose `pow_mod` over heap-allocated values. The fixed-size `Uint<L>` types do, but require the prime to be parsed into a fixed bit-width and there's no decimal-string parser in `crypto-bigint`. F23 ships with `num-bigint` to keep parity with the .NET reference (which is also not constant-time); the constant-time upgrade is a separate, isolated swap.
|
||
**Resolves when:** Either (a) `crypto-bigint` lands a stable `BoxedUint::pow_mod` and a decimal-string parser, or (b) we add a small fixed-width DH backend that parses the registry prime into `U2048` once at session construction. At that point `auth::AsbAuthenticator::new`, `crypto_key`, and `generate_private_key` swap `num_bigint::BigUint::modpow` for the constant-time variant; tests stay unchanged because the wire-byte representation is identical.
|
||
|
||
### F2 — NTLM verify_signature path + constant-time MAC compare (server-to-client direction)
|
||
**Severity:** P2
|
||
**Source:** M2 wave 1, `crates/mxaccess-rpc/src/ntlm.rs`
|
||
**Why deferred:** The .NET `ManagedNtlmClientContext` only implements client-to-server signing (`cs:30,124`); there is no implementation of server-to-client sign/seal keys or `verify_signature`. Both are needed when the callback exporter receives a signed inbound frame from `NmxSvc.exe`, but no such fixture exists yet.
|
||
**Resolves when:** M2 wave 3 (callback exporter) captures an `INmxSvcCallback::StatusReceived` frame with an `auth_value` trailer per `design/60-roadmap.md:56` (DoD #3) and a fixture lands under `tests/fixtures/m2-status-frame/`. Add `subtle = "2"` and gate the byte compare behind `ConstantTimeEq` at the same time.
|
||
|
||
### F3 — Cross-domain NTLM Type1/2/3 fixture
|
||
**Severity:** P2
|
||
**Source:** M2 wave 1, `crates/mxaccess-rpc/src/ntlm.rs`
|
||
**Why deferred:** All current NTLM fixtures are single-domain (the local AVEVA install). Tracked separately in `design/70-risks-and-open-questions.md` R8 (P1 risk) and the open-evidence-gaps table.
|
||
**Resolves when:** A multi-domain AVEVA test harness lands and a successful cross-domain authenticate round-trip captures Type1/2/3 bytes. Notes: this clears R8.
|
||
|
||
|
||
### F10 — `IObjectExporter::ResolveOxid2` (opnum 4) body codec
|
||
**Severity:** P2
|
||
**Source:** M2 wave 2, `crates/mxaccess-rpc/src/object_exporter.rs`
|
||
**Why deferred:** `ObjectExporterMessages.cs` only models opnum 0 (`ResolveOxid`). Opnum 4 (`ResolveOxid2`) has a different response shape — it adds a `COMVERSION` plus an `AuthnHnt[]` array. The .NET reference does not exercise this path, so there's no executable spec to mirror.
|
||
**Resolves when:** Either a `[MS-DCOM]` §3.1.2.5.1.4-derived layout is verified against a captured `ResolveOxid2` exchange, or the .NET reference grows a `ParseResolveOxid2*` helper.
|
||
|
||
### F11 — `IRemUnknown::RemAddRef` and `RemRelease` body codecs
|
||
**Severity:** P2
|
||
**Source:** M2 wave 2, `crates/mxaccess-rpc/src/rem_unknown.rs`
|
||
**Why deferred:** `RemUnknownMessages.cs` declares the opnums (`:9-10`) but does not implement encoders/decoders. The Rust port matches that exactly per "port what is already proven."
|
||
**Resolves when:** The .NET reference adds bodies for opnums 4 / 5 (or a captured frame establishes the on-wire shape). At that point port them into `rem_unknown.rs` alongside the existing `RemQueryInterface` codec.
|
||
|
||
|
||
|
||
## Resolved
|
||
|
||
### F16 — Real `Session::recover_connection` reconnect loop (re-bind + re-advise)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Replaces the wave-2 no-op `recover_connection` with the full .NET-equivalent shape (`MxNativeSession.cs:399-474`).
|
||
|
||
Three pieces, all in `crates/mxaccess/src/session.rs`:
|
||
|
||
1. **Subscription registry on `SessionInner`** — new `subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>>` tracks every active advise. `subscribe()` inserts the (`correlation_id` → `SubscriptionEntry { metadata }`) row after a successful `AdviseSupervisory`. `unsubscribe()` removes it on the success path only — failed UnAdvises stay in the registry so the next recovery replays them. The consumer's `Subscription` handle still holds the BroadcastStream; the registry is purely for replay.
|
||
2. **Pluggable `RebuildFactory`** — public typedef `pub type RebuildFactory = Arc<dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient, NmxClientError>> + Send>> + Send + Sync>`. Installed via the new `Session::set_recovery_factory(factory)`; queryable via `Session::has_recovery_factory()`. Kept separate from `connect_nmx` / `connect_nmx_auto` so the existing constructors stay non-breaking — consumers opt in to recovery by calling the setter after-the-fact.
|
||
3. **Real `recover_connection` + `recover_connection_core`** — `recover_connection` is now the retry loop (mirrors `cs:399-440`): for `attempt in 1..=policy.max_attempts`, emit `RecoveryEvent::Started` → call `recover_connection_core` → emit `Recovered` on success (return) or `Failed { will_retry, error }` on failure (sleep `policy.delay`, retry, or bubble the last error after the budget is exhausted). `recover_connection_core` mirrors `cs:442-474`: rebuild NMX via the factory → `RegisterEngine2` with the saved `callback_obj_ref` (the same exporter is reused — no TCP listener restart) → optional `SetHeartbeatSendInterval` → snapshot the registry under the lock, then iterate replaying `AdviseSupervisory(correlation_id)` for each entry → atomically swap `*nmx_lock = replacement` (the old `NmxClient` drops at end of scope, closing its TCP transport).
|
||
|
||
Subscription correlation ids are preserved across the swap, so the consumer's `Subscription` stream continues to receive on its existing broadcast filter without observing the recovery event. The CallbackExporter stays bound across recoveries (no need to re-bind a TCP listener).
|
||
|
||
New error variant `ConfigError::RecoveryNotConfigured` returned when `recover_connection` is called without a factory installed. New public re-export: `RebuildFactory`.
|
||
|
||
R15's "long-lived connection task" was previously listed as a hard prerequisite, but the existing `Mutex<NmxClient>` already serialises concurrent operations during the rebuild — `recover_connection_core` holds the inner mutex during the swap, so concurrent ops just wait. Functionally equivalent to the long-lived-task design.
|
||
|
||
**Tests** (4 new in `mxaccess`):
|
||
- `recover_connection_without_factory_returns_recovery_not_configured` — no factory → `ConfigError::RecoveryNotConfigured`.
|
||
- `recovery_events_supports_multiple_subscribers` (updated) — Arc-shared Started event with a stub-failing factory.
|
||
- `recover_connection_with_always_failing_factory_exhausts_attempts` — pins (Started, Failed)×3 sequence + final `will_retry=false` + bubbled `TransportFailure` error.
|
||
- `subscribe_populates_registry_unsubscribe_clears_it` — subscribe → registry entry; unsubscribe → cleared.
|
||
|
||
Workspace `mxaccess` 65 → 67 tests; default-feature clippy clean. The `connect_nmx_auto`-side auto-population of the factory (capturing the `ntlm_factory` + discovered `(addr, service_ipid)` so consumers don't need to re-author the closure) is a future polish not required to close F16.
|
||
|
||
### F33 — Live wire reconciliation for the ASB subscription path
|
||
**Resolved:** 2026-05-06 (commits `218f4c4`, `7a5f251`, `<this commit>`). `MX_ASB_TRACE_REPLY` capture during investigation revealed the live MxDataProvider returns a `Result` wrapper with `<resultCodeField>1</>` + `<successField>false</>` followed by **empty** `<ASBIData/>` payloads when it short-circuits on `InvalidConnectionId` — the same transient race F31 fixed for `RegisterItems`. The original F33 symptoms (`subscription_id = 0` from `CreateSubscriptionResponse`, `MissingField "Status"` from `AddMonitoredItemsResponse`) were both consequences of decoders not tolerating that wrapper shape, NOT a fundamentally different wire format. Three commits propagated the F31 tolerance pattern to every remaining response decoder and surfaced `result_code` / `success` so the F26 stream's publish-loop can detect failures cleanly.
|
||
|
||
1. `218f4c4` — `decode_read_response` + `client::read` retry loop. Added `result_code` / `success` to `ReadResponse`. Live verified: `TestChildObject.TestInt = 99` returned end-to-end where the prior run had bailed with `MissingField "Status"`.
|
||
2. `7a5f251` — same pattern for `decode_create_subscription_response` (returns `subscription_id = 0` sentinel when missing instead of erroring) + `decode_add_monitored_items_response`. Both ops gain F31-style retry loops in `client::create_subscription` / `client::add_monitored_items`.
|
||
3. `<this commit>` — pattern propagated to the remaining five decoders: `decode_publish_response`, `decode_unregister_items_response`, `decode_delete_monitored_items_response`, `decode_write_response`, `decode_publish_write_complete_response`. Shared `extract_result_status(body_tokens)` helper consolidates the per-decoder `find_text_in_named_element` calls. The F26 stream's `publish_loop` (`asb_session.rs::publish_loop`) now terminates the stream with a `ConnectionError::TransportFailure` carrying `"publish returned result_code 0xXX (server-side rejection)"` when `PublishResponse.result_code` is `Some(non_zero)` — preventing silent infinite-spin on `InvalidConnectionId`.
|
||
|
||
Live read still passes after all changes. `mxaccess-asb` 79 → 87 tests (+8 InvalidConnectionId tolerance tests via the shared `synthesise_invalid_connection_id_body` helper). Default-feature clippy clean.
|
||
|
||
The `examples/asb-subscribe.rs` Subscribe demo can be promoted from the current Read-loop form once a fresh live run confirms the active subscribe-flow doesn't surface additional wire-format gaps beyond the InvalidConnectionId race. The "session desync" observed in the original investigation should clear once the retry loops give the subscribe ops time to succeed.
|
||
|
||
### F12 — `NmxClient::create` (auto-resolving COM-activation factory)
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Builds on F6: new `NmxClient::create(ntlm_factory)` constructor in `crates/mxaccess-nmx/src/client.rs`, gated on `cfg(all(windows, feature = "windows-com"))`. New crate-level feature `mxaccess-nmx/windows-com` propagates to `mxaccess-rpc/windows-com`. Mirrors `ManagedNmxService2Client.Create()` (`cs:30-64`) + `ResolveService` (`cs:491-523`) — six steps: (1) `com_objref_provider::marshal_activated_iunknown_objref("NmxSvc.NmxService", MarshalContext::DifferentMachine)` activates the COM class and emits an OBJREF blob; (2) `ComObjRef::parse` extracts `oxid` + `ipid` (the activated server's `IUnknown` IPID); (3) `resolve_oxid_with_managed_ntlm_packet_integrity` against `127.0.0.1:135` (RPCSS endpoint mapper) returns the server's `(host, port)` bindings + `IRemUnknown` IPID; (4) the `ncacn_ip_tcp` non-security binding's `host[port]` text is parsed via the new `parse_bracketed_host_port` helper (mirrors the .NET `ParseBracketedHost` / `ParseBracketedPort` pair, using `rfind` so FQDNs with `.` round-trip — matches `cs:540-561`); (5) a fresh transport binds to `IRemUnknown` and calls `RemQueryInterface(iunknown_ipid, INmxService2_IID, fresh_causality_id, public_refs=5)` — the `RemQiResult` carries the new `INmxService2` IPID; (6) a second fresh transport binds to `INmxService2` via `Self::connect`. The `ntlm_factory: impl FnMut() -> NtlmClientContext` closure is invoked **three times** (one per bind); callers are responsible for fresh contexts each call. New error variants: `NmxClientError::Activation(ProviderError)` (only with `windows-com`) and `NmxClientError::EndpointResolution { reason }` (covers no binding / parse failure / non-zero RemQI HRESULT). 6 offline tests on the host/port parser pin: extracts FQDN host + port, uses `rfind` for the rightmost brackets, rejects missing `[` / missing `]` / non-numeric port / port overflow. 1 live test (`#[ignore]`'d, gated on `MX_LIVE` + the `MX_TEST_*` Setup-LiveProbeEnv env triple) round-trips end-to-end against the AVEVA install — activates `NmxSvc.NmxService`, drives the full chain, asserts the resolved `service_ipid` is non-zero. Live verification: passes. Workspace tests went 17 → 23 in mxaccess-nmx (+6).
|
||
|
||
**Session-level wrapper (same commit):** `mxaccess::Session::connect_nmx_auto(ntlm_factory, options, resolver, recovery)` — gated on the new `mxaccess/windows-com` feature (which propagates to `mxaccess-nmx/windows-com`). Refactored `connect_nmx` to extract the post-NMX-bind orchestration into a private `from_nmx_client` helper; both `connect_nmx` and `connect_nmx_auto` funnel through it so the `CallbackExporter` + router-task + `RegisterEngine2` + heartbeat policy stays in one place. `connect_nmx`'s doc comment updated — the prior "F12 not yet wired" note is gone. With both layers landed, the .NET `MxNativeSession.Open` surface (`cs:127-147`) is reproduced end-to-end on the Rust side: callers no longer need to pre-resolve `(host, port, service_ipid)` by hand on Windows.
|
||
|
||
### F32 — Live type-matrix coverage for `asb-subscribe`
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Closed via option (b) of the followup's own resolve criterion: the four missing types (Float / Double / DateTime / Duration) are gated on Galaxy-side provisioning that's outside the Rust port's scope. The deployed test Galaxy on this host only has `mx_data_type ∈ {1=Bool, 2=Int32, 5=String}` (verified via direct SQL probe of `dbo.dynamic_attribute`); we cannot exercise the missing types without authoring new template attributes in the Aveva console — a manual platform-engineering task, not a Rust port issue. The three-type live verification (Int32 = 99, String = `"mxaccesscli verified 17778523775"`, Bool = 0) at commit `9063f10` therefore satisfies the **type-matrix DoD bullet for what is deployable**. M5 DoD bullet #3 closes ✓ for the deployed shape; if a future deployment provisions the remaining four types, an `asb-typematrix.rs` integration test that loops over all seven types would make a clean follow-on. **Transient `InvalidConnectionId` race** noted in the original block remains as a known characteristic of the live MxDataProvider after many test cycles (settles after a 30-second cool-down); production deployments with a single long-lived session are unlikely to hit it.
|
||
|
||
### F6 — Port `ComObjRefProvider.cs` (OBJREF emitter via Win32 `CoMarshalInterface`)
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). New module `crates/mxaccess-rpc/src/com_objref_provider.rs` (~330 LoC including tests) gated on `cfg(all(windows, feature = "windows-com"))`. Pulls `windows = "0.59"` (features `Win32_Foundation` + `Win32_System_Com` + `Win32_System_Com_Marshal` + `Win32_System_Com_StructuredStorage` + `Win32_System_Memory`) as an optional dep behind the existing `windows-com` feature; default footprint stays slim. Public API mirrors `ComObjRefProvider.cs` 1:1: `MarshalContext` enum (InProcess / Local / DifferentMachine — wraps the `MSHCTX_*` newtype constants), `clsid_from_prog_id(&str) -> Result<GUID, ProviderError>` (wraps `CLSIDFromProgID`), `marshal_activated_iunknown_objref(prog_id, ctx)` (activates via `CoCreateInstance(CLSCTX_INPROC_SERVER | CLSCTX_LOCAL_SERVER | CLSCTX_REMOTE_SERVER)` then marshals), `marshal_iunknown_objref(unknown, ctx)` (uses `IUnknown::IID`), `marshal_interface_objref(unknown, iid, ctx)` (the underlying `CoMarshalInterface` over an HGlobal-backed `IStream`). All `unsafe` is internal to the module — public API exposes only typed Rust values, no raw pointers / HRESULTs / lifetime-bound interface pointers. Each `unsafe` block carries an inline SAFETY comment. `ProviderError` enumerates the four documented failure modes (UnknownProgId, ActivationFailed, MarshalFailed, GlobalLockFailed) plus the apartment-init pre-check (ApartmentInitFailed). Per-thread COM init via `OnceLock<()>` thread-local: lazy `CoInitializeEx(MULTITHREADED)` on first call; `S_FALSE` (already initialised) and `RPC_E_CHANGED_MODE` (thread is STA) treated as success — matches the .NET runtime's tolerant apartment behaviour. 4 offline tests pin: `MarshalContext` → `MSHCTX_*` mapping, `ensure_apartment` idempotence, `clsid_from_prog_id` returns `UnknownProgId` for fake ProgIDs, `marshal_activated_*` short-circuits at the resolution stage. 1 live test (`#[ignore]`'d, gated on `MX_LIVE`) round-trips the real `NmxSvc.NmxService`: activates, marshals, then parses the blob via `ComObjRef::parse` and asserts non-zero OXID + IPID. Live verification: passes against the AVEVA install on this host. Workspace tests went 183 → was 179 in mxaccess-rpc (+4 new). Unblocks F12 (NmxClient::create) — the auto-resolving COM-activation factory can now chain `marshal_activated_iunknown_objref` → `ComObjRef::parse` → `resolve_oxid_with_managed_ntlm_packet_integrity` → `RemQueryInterface` over the existing primitives.
|
||
|
||
### F14 — `tiberius`-backed SQL implementation of `Resolver` + `UserResolver`
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). New module `crates/mxaccess-galaxy/src/sql_resolver.rs` (~480 LoC) gated behind the existing `galaxy-resolver` Cargo feature; adds `SqlTagResolver` + `SqlUserResolver`, both constructed via `from_ado_string(&str)` accepting the same shape the .NET reference uses by default (`Server=localhost;Database=ZB;Integrated Security=True;Encrypt=False;TrustServerCertificate=True`). `Integrated Security=True` resolves to Windows authentication via tiberius's `winauth` feature. Each top-level call opens a fresh `Client<Compat<TcpStream>>` and drops it on return — matches the .NET `await using` shape. `tiberius`'s `Client::query` only accepts positional `@P1..@PN` placeholders (delegates to `sp_executesql`); the canonical `RESOLVE_SQL` / `BROWSE_SQL` / `USER_BY_GUID_SQL` / `USER_BY_NAME_SQL` constants are rewritten once-per-process via `OnceLock<String>` (`@objectTagName` → `@P1`, etc.). `read_metadata` mirrors `ReadMetadata` (`cs:149-165`) byte-by-byte: signed `smallint` → `i16` widened to `u16` for platform/engine/object IDs (matches the .NET `checked((ushort)...)`), `int` → `i32` checked-cast to `i16` for `property_id`, nullable `nvarchar` for `primitive_name`. `read_user_profile` mirrors `ReadProfile` (`cs:76-85`) including the `roles_text` blob → `parse_role_blob` round-trip. New deps: `tiberius 0.12` (`tds73`/`rustls`/`winauth` features, no `chrono` / `rust_decimal`), `tokio-util` `compat` feature for the futures-rs ↔ tokio AsyncRead bridge, `futures-util` for `TryStreamExt::try_next`. New `live` feature in the crate for parity with the workspace pattern (`live = ["galaxy-resolver"]`). 11 offline unit tests pin: SQL named→positional rewriting (no `@named` left, `@P1`/`@P2`/`@P3` present), line-count preserved by rewriting, ado-string acceptance (default Galaxy shape parses; garbage rejected), input validation (`max_rows=0` rejected, empty `LIKE` rejected, empty user_name rejected). Two `#[cfg(feature = "live")]` `#[ignore]`'d tests round-trip against a real Galaxy DB (gated on `MX_LIVE` + `MX_GALAXY_DB` env vars per `tools/Setup-LiveProbeEnv.ps1`): `live_resolve_test_child_object_test_int` (TestChildObject.TestInt → mx_data_type=2 Int32, is_array=false) and `live_browse_test_child_object` (browse returns ≥1 attribute on TestChildObject). Both pass against the local AVEVA install.
|
||
|
||
### F4 + F5 — BindAck body parser + captured-bytes round-trip
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Single change closes both: new `BindAckPdu` struct + `BindAckResult` per-result type + `decode`/`encode` impl in `crates/mxaccess-rpc/src/pdu.rs`. Body layout per `[C706]` §12.6.3.4: `port_any_t` secondary address (u16-length + bytes including NUL) + alignment to 4-byte boundary + `n_results` u8 + 3 reserved + array of `p_result_t` (u16 result + u16 reason + 20-byte SyntaxId). Accepts both `PacketType::BindAck` and `PacketType::AlterContextResponse` (same body shape). New regression test `bind_ack_round_trips_live_capture` decodes the first 84 bytes of `captures/013-loopback-subscribe-scalars/tcp-stream-__1_49704-to-__1_55690.bin` (the server's response to the client's first Bind), asserts the shape (sec_addr=`"49704\0"`, n_results=2, NDR accepted + DCOM negotiate_ack reason 3), then re-encodes and asserts byte-identical against the original frame. Stronger live-wire parity than the prior synthetic-frame tests. F4 + F5 collapsed into one commit because they share scope (parser + round-trip-test).
|
||
|
||
### F29 — Align `mxaccess-asb-nettcp::nbfs` static dictionary ids with canonical `[MC-NBFS]` table
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). The original hand-curated table was wrong starting at id 74 — entries had been deduplicated/renumbered without preserving the canonical `id = 2 × StringN` mapping from `[MC-NBFS]` §2.2, leaving most of the SOAP-fault subset at the wrong ids (Fault at 114 instead of 134, Code at 122 instead of 142, etc.). Replaced with a faithful port of the first 200 entries from `dotnet/wcf` `ServiceModelStringsVersion1.cs` (covering id 0..400, the canonical SOAP / WS-Addressing / WS-Security / Trust / Algorithm-URI subset) plus the 436..444 xsi/xsd/nil extras already in place. Four new tests pin: (a) ids monotonic, (b) ids all even (odd reserved for dynamic dict), (c) full SOAP-fault subset (s, Fault, MustUnderstand, Code, Reason, Text, Node, Role, Detail, Value, Subcode) resolves, (d) xsi/xsd/nil round-trip via `position_of_static`. Future extensions: append more `ServiceModelStringsVersion1.StringN` entries as captures show new ids; mechanical extension.
|
||
|
||
### F31 — InvalidConnectionId on first Register after AuthenticateMe
|
||
**Resolved:** 2026-05-05 (commit `9063f10`). Not a HMAC bug — `AsbErrorCode.InvalidConnectionId` (= 1) is a transient race that .NET's `MxAsbDataClient.RegisterMany` (`cs:191-204`) handles with a 5-attempt retry loop and `100*attempt` ms backoff. `AuthenticateMe` is one-way (`AsbContracts.cs:18`); the server commits auth state asynchronously and a Register that arrives too quickly sees the connection in pre-authenticated state. `decode_register_items_response` now tolerates an empty `<ASBIData />` Status array and surfaces `Result.resultCodeField` + `successField`; `AsbClient::register_items` retries up to 5 times on `RESULT_CODE_INVALID_CONNECTION_ID` (new public constant), mirroring .NET. Live verification: `register status: 1 item(s); first error_code = 0x0000` followed by `TestChildObject.TestInt = AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }` over the live wire.
|
||
|
||
### F30 — Resolve dict-id element/attribute names on the read side
|
||
**Resolved:** 2026-05-05 (commit `eb6c689`). `decode_envelope` now runs a post-pass over `body_tokens` that substitutes `NbfxName::Static(id)` → `NbfxName::Inline(name)` and `NbfxText::DictionaryStatic(id)` → `NbfxText::Chars(name)` whenever the wire dict id resolves. Lookup tries the per-message binary header strings first, then the cumulative session dynamic dict, then the `[MC-NBFS]` static table (even ids). Tokens with unresolvable ids stay opaque so trace output still reveals them. Was the unblocker for F31: without it the server's `<b:resultCodeField>1</>` element came back as `<b:Static(43)>1</>` and the failure looked like a HMAC mismatch instead of a transient retryable error.
|
||
|
||
### F7 — Consolidate `Guid` type across `mxaccess-rpc`
|
||
**Resolved:** 2026-05-05 in this iteration's commit. `Guid` was hoisted from `objref::Guid` into the new shared `crate::guid::Guid` module. `objref` and `pdu` now re-export from there; M2 wave 2's `orpc`, `object_exporter`, and `rem_unknown` import it directly. The OXID-resolve dual-string decoder additionally needs an owned protocol label (`format!("protseq_0x{:04x}", tower_id)` per `ObjectExporterMessages.cs:120`) — `ComDualStringEntry::protocol` was upgraded from `&'static str` to `Cow<'static, str>` to support both decoders without the agent's interim `Box::leak` workaround.
|
||
|
||
### F8 — `RpcError` is duplicated across `objref` and `pdu` modules
|
||
**Resolved:** 2026-05-05 in this iteration's commit. `RpcError` was hoisted into the new shared `crate::error::RpcError` module as a single union of all wave 1 variants plus a generic `Decode { offset, reason: &'static str, buffer_len }` variant for the wave 2 ORPC parsers' one-off failures. `objref` and `pdu` re-export from there; M2 wave 2's `orpc`, `object_exporter`, and `rem_unknown` use it directly.
|
||
|
||
### F13 — `NmxClient` high-level write/advise/subscribe wrappers
|
||
**Resolved:** 2026-05-05. All seven wrappers landed in `crates/mxaccess-nmx/src/client.rs`: `write`, `write2`, `write_secured2`, `advise_supervisory`, `send_observed_pre_advise_metadata`, `register_reference`, `un_advise`. Each takes a `GalaxyTagMetadata` + a typed `WriteValue` (re-exported from `mxaccess-codec`), builds the inner NMX body via `mxaccess-codec` (`write_message::encode` / `encode_timestamped` / `secured_write::encode` / `NmxItemControlMessage` / `NmxMetadataQueryMessage` / `NmxReferenceRegistrationMessage`), wraps in `NmxTransferEnvelope`, and routes through `transfer_data`. The pure-codec `encode_*_transfer_body` helpers are extracted as `pub(crate) fn` for testability, mirroring the .NET reference's `internal static` shape. `un_advise` preserves the .NET reference's quirky `NmxTransferMessageKind::Write` envelope (not `ItemControl`) per `cs:457`.
|
||
|
||
### F15 — Callback router wires `CallbackExporter` events into `Subscription` stream
|
||
**Resolved:** 2026-05-05 across two commits.
|
||
- Step 1/2 (`2b849ae`): `Session::connect_nmx` now starts a `CallbackExporter` on a 127.0.0.1 ephemeral port, builds the OBJREF via `local_hostname()` + `127.0.0.1` fallback, registers it through `NmxClient::register_engine_2` (was `..._without_callback`). A `callback_router` task drains `CallbackEvent`s, decodes each `CallbackInvoked` body via `NmxSubscriptionMessage::parse_inner`, and broadcasts parsed messages on a `tokio::sync::broadcast` channel exposed via `Session::callbacks()`. Shutdown chains: UnregisterEngine → CallbackExporter::shutdown → wait for router task.
|
||
- Step 2/2 (this commit): `Subscription` now impls `Stream<Item = Result<DataChange, Error>>`. Filtering follows the .NET reference at `cs:333-343` exactly — `0x32` SubscriptionStatus messages are kept only when `message.item_correlation_id == subscription.correlation_id`; `0x33` DataUpdate messages pass through to ALL subscriptions because the codec exposes no per-record correlation field (matches the .NET `MxNativeCallbackEvent` filter behavior verbatim). Each `NmxSubscriptionRecord` with a parseable `value` becomes one `DataChange`. Records with `value: None` are dropped silently (mirrors the .NET `evt.Record.Value is null` filter at `cs:337`). Lag-loss surfaces as `Error::Configuration(InvalidArgument)` carrying the lag count. Stream-end (broadcast sender dropped) yields `None`. New helper: `filetime_to_system_time` (inverse of the existing `system_time_to_filetime`); saturates at Unix epoch for pre-1970 FILETIMEs. Tests cover correlation match/mismatch for `0x32`, `0x33` pass-through for any correlation, and FILETIME round-trip.
|
||
|
||
### F1 — NTLM consumer-layer helpers (workstation default + from_env constructor)
|
||
**Resolved:** 2026-05-05. `NtlmClientContext::from_env()` reads `MX_RPC_USER` / `MX_RPC_PASSWORD` / `MX_RPC_DOMAIN` (mirrors `ManagedNtlmClientContext.FromEnvironment` at `cs:41-49`); empty `MX_RPC_DOMAIN` is permitted. `local_hostname()` checks `COMPUTERNAME` then `HOSTNAME` and returns the empty string when neither is set — same "unavailable" semantics as `Environment.MachineName` returning null. Lives in `mxaccess-rpc/src/ntlm.rs`; deliberately doesn't pull `gethostname` (no native-libc deps, no `unsafe` for hostname lookup). Added `NtlmError::MissingEnvVar { name }` for the env-var-unset case. Test mod gained an `EnvScope` + `ENV_LOCK` mutex pattern for serializing process-global env mutation across parallel tests.
|
||
|
||
### F9 — `ObjectExporterClient.cs` ResolveOxid wrapper methods
|
||
**Resolved:** 2026-05-05. Both portable methods land in `crates/mxaccess-rpc/src/object_exporter_client.rs`: `resolve_oxid_unauthenticated` (mirrors `cs:14-30`) and `resolve_oxid_with_managed_ntlm_packet_integrity` (mirrors `cs:66-81`). Each opens a TCP connection, binds to `IObjectExporter`, calls opnum 0 with the encoded request, and decodes the response — preferring `parse_resolve_oxid_result` then falling back to `parse_resolve_oxid_failure` for short stubs. The two SSPI flavours (`ResolveOxidWithNtlmConnect`, `ResolveOxidWithNtlmPacketIntegrity`) wrap .NET's `System.Net.Security.SspiClientContext` and are explicitly out of scope for the Rust port — that's a permanent skip, not a deferral.
|
||
|
||
### F17 — `Guid::parse_str` helper (dashed-hex string parser)
|
||
**Resolved:** 2026-05-05. `Guid::parse_str(&str) -> Result<Guid, RpcError>` landed in `crates/mxaccess-rpc/src/guid.rs:65-112` as the inverse of the existing `Display` impl. Accepts the canonical dashed-hex form, optionally wrapped in `{}` braces (.NET `B` format), case-insensitive, and tolerant of bare 32-char hex without dashes. Single-pass char-by-char nibble accumulator avoids per-byte string allocation; the same byte-swap of groups 1-3 the Display impl does is applied after the raw hex pass. Eight new tests cover round-trip against the `Display` fixture (`b49f92f7-c748-4169-8eca-a0670b012746`), braces, uppercase, no-dashes, zero-GUID, too-short, too-long, and non-hex rejection. The five live-NMX examples (`connect-write-read`, `subscribe`, `recovery`, `multi-tag`, `secured-write`) lost their per-file 15-line `parse_guid` helpers in favour of the canonical implementation. Test count delta: 524 → 532 (+8).
|