mxaccess/design/followups.md
Joseph Doherty ff4ea4d5a9
[F16] mxaccess: real Session::recover_connection (re-bind + re-advise)
Closes F16. Replaces the wave-2 no-op recover_connection with the
full .NET-equivalent shape (MxNativeSession.cs:399-474). Three
pieces:

1. Subscription registry on SessionInner.
   New subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>>
   tracks every active advise. subscribe() inserts after a successful
   AdviseSupervisory; unsubscribe() removes on the success path only
   (failed UnAdvises stay registered so next recovery replays them).
   The consumer's Subscription handle still holds the BroadcastStream;
   the registry is purely for AdviseSupervisory replay.

2. Pluggable RebuildFactory.
   New public typedef:
     pub type RebuildFactory = Arc<
         dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient,
                                                        NmxClientError>>
                            + Send>>
             + Send + Sync,
     >;
   Installed via Session::set_recovery_factory(factory);
   queryable via has_recovery_factory(). Kept separate from
   connect_nmx / connect_nmx_auto so existing constructors stay
   non-breaking — consumers opt in by calling the setter
   after-the-fact.

3. Real recover_connection + recover_connection_core.
   recover_connection is the retry loop (mirrors cs:399-440): for
   attempt in 1..=policy.max_attempts, emit RecoveryEvent::Started
   → call recover_connection_core → on Ok emit Recovered + return,
   on Err emit Failed{will_retry, error}, sleep policy.delay, retry,
   or bubble the last error.

   recover_connection_core mirrors cs:442-474: rebuild NMX via the
   factory → RegisterEngine2 with the saved callback_obj_ref → optional
   SetHeartbeatSendInterval → snapshot the registry under the lock,
   replay AdviseSupervisory(correlation_id) for each entry → atomically
   swap *nmx_lock = replacement. Old NmxClient drops at end of scope,
   closing its TCP transport.

Subscription correlation ids are preserved across the swap so the
consumer's Subscription stream continues to receive on its existing
broadcast filter. The CallbackExporter stays bound across recoveries
— no TCP listener re-bind.

R15's "long-lived connection task" was listed as a hard prereq, but
the existing Mutex<NmxClient> already serialises concurrent ops
during the rebuild — recover_connection_core holds the inner mutex
during the swap, concurrent ops just wait. Functionally equivalent
to the long-lived-task design.

New ConfigError::RecoveryNotConfigured returned when
recover_connection is called without a factory installed. New
public re-export: RebuildFactory.

Tests (mxaccess 65 → 67):
  - recover_connection_without_factory_returns_recovery_not_configured
  - recover_connection_with_always_failing_factory_exhausts_attempts
    (pins (Started, Failed)×3 + final will_retry=false + bubbled
    TransportFailure)
  - subscribe_populates_registry_unsubscribe_clears_it
  - recovery_events_supports_multiple_subscribers (updated for the
    new factory-required path)

connect_nmx_auto-side auto-population of the factory (capturing the
ntlm_factory + discovered (addr, service_ipid) so consumers don't
re-author the closure) is a future polish — not required to close
F16.

design/followups.md: F16 moved to Resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 01:57:43 -04:00


Followups

Open work items deferred during /loop iterations. Triaged at the top of every iteration. New items are appended under ## Open; resolved items move to ## Resolved with a date + commit hash.

Open

F18 — M5 plan of attack (ASB transport, parallel-safe sub-streams)

Severity: P0 — milestone driver, blocks ASB consumers + V1 release Source: design/dependencies.md:73-89 + design/60-roadmap.md:84-91 + design/70-risks-and-open-questions.md:5-25 (R1 estimates ~3000 LoC for framing+encoders).

Scope. Build the ASB data-plane end-to-end:

  • mxaccess-asb-nettcp: [MS-NMF] framing + [MC-NBFX] binary-XML node codec + [MC-NBFS] static dictionary table + DH/HMAC/AES authentication crypto.
  • mxaccess-asb: IASBIDataV2 client (Connect, RegisterItems, Read, Write, PublishWriteComplete, CreateSubscription, AddMonitoredItems, Publish, Disconnect) + SecretProvider trait + DPAPI default impl + ASB Variant codec port (currently a stub at crates/mxaccess-codec/src/lib.rs:74,77,80).
  • mxaccess::Session over an AsbTransport impl; capabilities surface ASB limits (no subscribe_buffered, no Activate/Suspend, no OperationComplete outside the proven write-completion frame — see design/60-roadmap.md:88).
  • examples/asb-subscribe.rs exercises the whole path against a live ASB endpoint with parity vs dotnet run --project src\MxAsbClient.Probe.

Sub-stream breakdown (matches design/dependencies.md:78-89). Each sub-stream is a separate followup so it can be claimed by a separate agent in a worktree without merge conflict:

| Sub-followup | Stream | Owns | Depends on |
|---|---|---|---|
| F19 | (workspace prereq) | Add the M5 dep set to rust/Cargo.toml workspace deps + per-crate Cargo.toml: aes, hmac, md-5, sha1, sha2, pbkdf2, flate2, rand, crypto-bigint (constant-time DH per review.md MAJOR), quick-xml, tokio-util. Pinned to the digest 0.11/cipher 0.5 generation per design/30-crate-topology.md:251-289. Sequential prereq for the others. | M0 |
| F20 | A: MS-NMF framing | mxaccess-asb-nettcp::nmf: preamble (0x00 ver=1 mode=2 via=encoded-string), preamble-ack, sized-envelope (0x06 var-int len bytes), end (0x07), fault (0x08), upgrade-request, known-encoding via lookup. Reliable-session ack handling. Round-trip against analysis/proxy/mxasbclient-register-message.txt and mxasbclient-probe-stage*.txt byte traces. | F19 |
| F21 | B: MC-NBFX | mxaccess-asb-nettcp::nbfx: record types (0x40 ShortElement, 0x41 Element, 0x44 ShortDictionaryAttribute, 0x04 PrefixDictionary*A-Z, 0x84 BoolText, 0x88 Int32Text, 0x86 BoolFalseText, etc., per [MC-NBFX] §2.2). Length-prefixed strings (var-int 7-bit groups). Read/write over bytes::BytesMut. | F19 |
| F22 | C: MC-NBFS | mxaccess-asb-nettcp::nbfs: the static dictionary table. SOAP/WS-Addressing tokens + IASBIDataV2-action strings used by the operation set (http://ASB.IDataV2:registerItemsIn, :readIn, :writeIn, :createSubscriptionIn, :publishIn, etc., see src/MxAsbClient/AsbContracts.cs:14-58). Hand-rolled from the proven action set; the full WCF dictionary is much larger but only the action subset is on the wire. | F19 |
| F23 | D: Auth crypto | mxaccess-asb-nettcp::auth: port src/MxAsbClient/AsbSystemAuthenticator.cs (167 LoC): DH key exchange with crypto-bigint constant-time mod_exp (review.md MAJOR finding: .NET BigInteger.ModPow is not constant-time and the DH private exponent is long-lived per cs:153-166); HMAC-MD5/SHA1/SHA512 (negotiated per AsbSolutionCryptoParameters.HashAlgorithm); AES-128 with PBKDF2-SHA1 1000-iteration key derivation; deflate-then-encrypt EncryptBaktun vs raw-encrypt EncryptApollo distinguished by :V2 lifetime suffix (cs:48); ASCII salt "ArchestrAService"; UTF-16LE passphrase. Plus DPAPI shared-secret read on Windows behind the existing dpapi feature gate, with a SecretProvider::shared_secret(&[u8]) escape hatch for tests/CI (design/30-crate-topology.md:150). | F19 |
| F24 | (codec) | mxaccess-codec::asb_variant: fill in the stubbed AsbVariant, AsbStatus, RuntimeValue (crates/mxaccess-codec/src/lib.rs:74,77,80) per docs/ASB-Variant-Wire-Format.md. Decode/encode for the proven type matrix: TypeBool, TypeInt32, TypeFloat, TypeDouble, TypeString, TypeDateTime, TypeDuration, plus deployed array shapes (work_remain.md:108-113). Less-common scalars stay as raw bytes (matches .NET DecodeVariant fallback at MxAsbDataClient.cs:748). Independent of the framing/encoder work (separate crate). | M1 (envelope/status types) |
| F25 | E: IASBIDataV2 client | mxaccess-asb::client: top-level AsbClient with connect, register_items, read, write, publish_write_complete, create_subscription, add_monitored_items, publish, disconnect. Wires the contract → NBFX-encoded SOAP envelope → NMF-framed TCP. ConnectedRequest::ConnectionValidator HMAC signing per AsbSystemAuthenticator::Sign. Receives Publish callbacks via a long-lived background task (mirrors the M4 NMX callback_router pattern). Depends on F20+F21+F22+F23+F24. | A+B+C+D+codec |
| F26 | (session) | mxaccess::Session over AsbTransport. New transport impl alongside NmxTransport. Surface ASB capability flags so subscribe_buffered/activate/suspend return Error::Unsupported(Capability::*) rather than a runtime fallthrough. Update examples/asb-subscribe.rs to drive the path end-to-end. Live-probe DoD: round-trip parity with dotnet run --project src\MxAsbClient.Probe. | F25 |
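The F20 and F21 rows both rely on the same length encoding: a var-int split into 7-bit groups, least-significant group first, with the high bit of each byte signalling that more bytes follow. A minimal std-only sketch of that encoding is below. The function names are illustrative rather than the crate's actual API, and the 31-bit range check that [MC-NBFX] implies is omitted for brevity.

```rust
/// Encode `value` as 7-bit groups, least-significant group first.
/// The high bit of each byte is set when another byte follows.
fn encode_varint(mut value: u32, out: &mut Vec<u8>) {
    loop {
        let group = (value & 0x7F) as u8;
        value >>= 7;
        if value == 0 {
            out.push(group);
            return;
        }
        out.push(group | 0x80);
    }
}

/// Decode one var-int from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None` when the
/// input is truncated or the encoding runs past five bytes.
fn decode_varint(input: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    for (i, &byte) in input.iter().enumerate().take(5) {
        value |= u32::from(byte & 0x7F) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

fn main() {
    let mut buf = Vec::new();
    encode_varint(300, &mut buf);
    println!("encoded: {:02X?}", buf);
    println!("decoded: {:?}", decode_varint(&buf));
}
```

A sized-envelope writer would emit the record byte, then this var-int, then the payload bytes; a string writer would emit the var-int followed by the UTF-8 bytes.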

Parallel-safety analysis.

  • F19 (workspace deps) is the single sequential bottleneck — F20-F25 all reference workspace deps that don't exist yet, so they cannot start in parallel until F19 lands. Tight & small (~30 lines of TOML).
  • F20, F21, F22, F23, F24 are fully parallel-safe after F19: each owns a different module under a different crate (or different sibling module within mxaccess-asb-nettcp). No shared state, no cross-import — each can land in its own commit. Per dependencies.md:88 "Peak agents in parallel: 4 in the framing/encoding wave (A+B+C+D)".
  • F25 is sequential after the four framing/encoder streams + F24 land — it composes them. The .NET MxAsbDataClient is monolithic enough that splitting F25 across agents costs more in coordination than it saves.
  • F26 is sequential after F25.
  • Cross-milestone parallelism still holds. M5 (this whole F18-F26 cluster) runs in parallel with M3+M4 per design/60-roadmap.md:14-17 because the Transport trait was lifted into M0. M4's Session core landed (commits 4863c6d, 2dc091d, a31237d); the F26 AsbTransport plugs into the same trait without re-design.

Risk-driven sequencing inside the parallel wave. R1 in design/70-risks-and-open-questions.md:9 is the project-blocker. Of the four parallel streams, F23 (auth crypto) carries the most live-probe risk (DH handshake against the live VM is the first irreversible test of the spec port) but is the smallest in LoC. F22 (NBFS) is the largest unknown — the dictionary table size is bounded only by the action subset we exercise. Recommended order if agents are constrained: F23 (smallest, highest-leverage) → F20 (foundational for any wire test) → F21 (encoder) → F22 (dictionary) → F24 (codec, independent).

Definition of done for F18 as a whole (= M5 DoD per design/60-roadmap.md:91):

  1. cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt succeeds against a live ASB endpoint.
  2. Round-trip parity with dotnet run --project src\MxAsbClient.Probe (Frida/Wireshark diff is byte-identical for the proven type matrix).
  3. The mxaccess-asb type matrix covers what work_remain.md:108-113 documents as proven: scalar Boolean, Int32, Float, Double, String, DateTime, Duration plus deployed array tags.
  4. cargo build --workspace and cargo test --workspace green; cargo clippy --workspace -- -D warnings clean.

Resolves when: F19-F26 are all closed and the four DoD bullets above pass.

M5 STATUS (commit 9063f10): functionally LIVE. End-to-end, cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt completes Connect → AuthenticateMe → Register → Read → Disconnect against the live MxDataProvider and returns the real tag value over the wire (type_id=4 length=4 payload=[99,0,0,0]). DoD checklist:

  1. Live asb-subscribe succeeds against the AVEVA endpoint.
  2. ⚠️ Wire structure matches .NET's request bytes for AuthenticateMe / Register byte-by-byte (verified via asb-relay middleman with the .NET probe routed through ClientVia); responses round-trip via the F30 dict-id resolution post-pass. Strict byte-identical parity for the response side is not guaranteed because WCF chunks Bytes8/16/32 records at different boundaries — both forms are functionally equivalent and collect_asbidata_payloads concatenates chunks (commit cf97eab).
  3. ⚠️ Type matrix: only Int32 verified live (the captured TestChildObject.TestInt tag). Bool / Float / Double / String / DateTime / Duration / arrays not yet exercised — pending one or more sample tags per type and an asb-subscribe extension that loops over them. F32 captures this expansion.
  4. cargo build --workspace + cargo test --workspace (711 tests) + cargo clippy --workspace -- -D warnings all green.

Remaining open work for full M5 closeout (none are P0 blockers anymore):

  • F32: resolved (commit <this commit>) via option (b) — three-type live coverage is the deployable maximum; missing types are Galaxy-provisioning-gated.
  • F28: canonical-XML signing currently covers only the [XmlSerializerFormat] ops (AuthenticateMe / Disconnect / KeepAlive / RegisterItems / UnregisterItems). Read / Write / CreateSubscription / AddMonitoredItems / Publish / etc. still sign over NBFX wire bytes via the legacy fallback. Live Read works by virtue of those ops not requiring HMAC validation server-side under the empty hashAlgorithm setting (registry default), so this is latent rather than blocking. Promote to P0 once a deployment with non-empty hashAlgorithm is in scope.
  • F29: resolved (commit <this commit>) — nbfs.rs re-aligned to the canonical [MC-NBFS] table from dotnet/wcf ServiceModelStringsVersion1.
  • F26 stream subscription: resolved (commit <this commit>) — AsbSession::subscribe(subscription_id) returns an AsbSubscription: Stream<Item = Result<MonitoredItemValue, Error>> driven by an internal tokio::spawn'd publish-loop. Drop of the subscription aborts the loop. Per-PublishResponse values array is fanned out as individual stream items; transport errors are delivered as the final stream item before termination. Inner publish_loop helper is split out so it's testable in isolation against any closure-based fake publish_fn. 3 new tests pin: compile-time Stream + Send + Unpin, multi-batch + terminal-error round-trip, consumer-drop short-circuits the publisher. Workspace 718 → 721 tests.
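The publish-loop behaviour described for the F26 stream (fan out each batch as individual items, deliver a transport error as the final item, stop when the consumer goes away) can be sketched with a plain channel. This is a synchronous stand-in under stated assumptions: the real helper is async and tokio-driven, and the `MonitoredItemValue` fields, `TransportError`, and `run_script` here are hypothetical.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{self, Sender};

// Hypothetical value shape; the real struct's fields are not shown in this document.
#[derive(Debug, Clone, PartialEq)]
struct MonitoredItemValue {
    item_id: u32,
    value: i32,
}

#[derive(Debug, Clone, PartialEq)]
struct TransportError(String);

type StreamItem = Result<MonitoredItemValue, TransportError>;
type PublishResult = Result<Vec<MonitoredItemValue>, TransportError>;

/// Calls `publish_fn` repeatedly and forwards every value as its own item.
/// An error is forwarded as the last item, then the loop ends.
/// A failed send means the consumer dropped its receiver, so the loop ends too.
/// (With only empty batches this sketch would never notice a dropped consumer;
/// the real implementation aborts the task on drop instead.)
fn publish_loop<F>(mut publish_fn: F, tx: Sender<StreamItem>)
where
    F: FnMut() -> PublishResult,
{
    loop {
        match publish_fn() {
            Ok(batch) => {
                for value in batch {
                    if tx.send(Ok(value)).is_err() {
                        return;
                    }
                }
            }
            Err(error) => {
                let _ = tx.send(Err(error));
                return;
            }
        }
    }
}

/// Drives the loop with a scripted fake `publish_fn` and collects the stream.
fn run_script(script: Vec<PublishResult>) -> Vec<StreamItem> {
    let mut script: VecDeque<PublishResult> = script.into();
    let (tx, rx) = mpsc::channel();
    publish_loop(
        move || {
            script
                .pop_front()
                .unwrap_or_else(|| Err(TransportError("script exhausted".to_string())))
        },
        tx,
    );
    rx.iter().collect()
}

fn main() {
    let items = run_script(vec![
        Ok(vec![
            MonitoredItemValue { item_id: 1, value: 10 },
            MonitoredItemValue { item_id: 2, value: 11 },
        ]),
        Ok(vec![MonitoredItemValue { item_id: 1, value: 12 }]),
        Err(TransportError("connection reset".to_string())),
    ]);
    for item in &items {
        println!("{:?}", item);
    }
}
```

Keeping the loop generic over a closure is what makes it testable without a socket, which matches the "closure-based fake publish_fn" note above.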

Cumulative execution log. F19 + F23 (ed17c07); F24 (7611d9e); F20 (9dfd193); F22 (43c10a1); F21 (5f98558); F25 step 1 (25dbd8d); F25 step 2 (a2b8989); F25 step 3 (c4bf0a0); F25 step 4 (1e59249); F25 step 5 (9b8133f); F25 step 6 (321b796); F25 step 7 (1b1ee1e); F26 step 1 (8a0f92b); F26 step 2 (14bb529); example rewrite (c6570dc); F25 step 8 (b543eb1); F25 step 9 (0441a2e); F25 step 10 (9876b4e); F26 step 3 (<previous>); F25 live-bring-up reconciliation (this commit):

  • F25 live-bring-up reconciliation: live asb-subscribe + asb-relay (TCP middleman) capture-and-diff against AVEVA's MxDataProvider on Windows. Five concrete fixes landed:

    1. NBFX PrefixElement_a..z (0x5E-0x77) and PrefixAttribute_a..z (0x26-0x3F) decode + encode arms — single-letter-prefix records that WCF emits in responses but our codec only recognised the dictionary-named cousins (PrefixDictionaryElement_a..z 0x44-0x5D, PrefixDictionaryAttribute_a..z 0x0C-0x25). The server's ConnectResponse hit 0x65 = PrefixElement_h for a dynamically-named element (e.g. <h:Foo>) and our decoder bailed with unknown NBFX record byte 0x65. Both directions now round-trip; the encoder picks the short-form arm whenever prefix_letter_offset(prefix).is_some().
    2. xmlns redeclaration on <Data> and <InitializationVector> inside AuthenticationData / PublicKey: [XmlType(Namespace = "http://asb.contracts.data/20111111")] on the AuthenticationData / PublicKey classes (AsbContracts.cs:350-381) means XmlSerializer emits an xmlns="..." redeclaration on each direct child. The default-ns scope ends at </Data>, so <InitializationVector> needs its own redeclaration to stay in the data namespace; without it the server fell back to messages-namespace and the deserialiser threw an InternalServiceFault. Connect handshake now completes end-to-end with the apollo:V2 ConnectionLifetime and a real ServicePublicKey.
    3. SOAP-fault detection on the response path: ClientError::SoapFault { action, code, reason } surfaces when the response Action header matches the canonical dispatcher/fault template; we previously let body decoders blindly run and hit MissingField { field: "Status" } which masked the fact that the wire was a fault. The reason text is extracted as the longest NbfxText::Chars in the body, which is robust against the nbfs.rs static-dictionary id mismatches noted below.
    4. Identified blocker: ConnectedRequest signing currently HMACs the NBFX wire bytes of the unsigned envelope. .NET's AsbSystemAuthenticator.Sign (AsbSystemAuthenticator.cs:79) HMACs Encoding.UTF8.GetBytes(request.ToXml()) — the canonical XML serialisation of the message contract via XmlSerializer with namespace "urn:invensys.schemas" (AsbSerialization.cs:12-48). Until the Rust port emits identical XML bytes, the HMAC mismatches and the server rejects every signed request (AuthenticateMe, RegisterItems, etc.) with a generic dispatcher/fault InternalServiceFault. Connect itself is unsigned (extends ServiceMessage, no ConnectionValidator header) which is why it works today. The fault's a:RelatesTo UniqueId in our captures matches the AuthenticateMe MessageID, confirming the failure point. New followup F28 captures the XML-canonicaliser scope.
    5. nbfs.rs static dictionary ids drift at id 114+ vs. the canonical [MC-NBFS] table (Fault/Code/Reason/Text/Value are 20 IDs higher on the wire than what we encode). Doesn't affect requests we send (we only encode IDs ≤44 = ReplyTo, all correct), but breaks decode_envelope's element-by-name matching for fault bodies. Tracked as F29.

    Workspace: 702 tests pass (no test count delta — wire-only fixes). Live status: Connect handshake working with real DH key + apollo encryption; AuthenticateMe and onwards blocked on F28. Companion diagnostic example asb-relay.rs (TCP middleman that hex-dumps both directions to stderr) lands as a permanent debugging aid.
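The "longest NbfxText::Chars in the body" heuristic from fix 3 is simple enough to show in full. The enum below is a reduced stand-in with three variants chosen for illustration; the real `NbfxText` in the codec has more, and `fault_reason` is a hypothetical name.

```rust
// Reduced stand-in for the codec's text-record enum (illustrative variants only).
#[derive(Debug)]
enum NbfxText {
    Chars(String),
    Bytes(Vec<u8>),
    Int32(i32),
}

/// Returns the longest character run in a fault body, if any.
/// Non-text records are skipped, so the result does not depend on
/// resolving dictionary ids for the Fault/Code/Reason element names.
fn fault_reason(body: &[NbfxText]) -> Option<&str> {
    body.iter()
        .filter_map(|text| match text {
            NbfxText::Chars(s) => Some(s.as_str()),
            _ => None,
        })
        .max_by_key(|s| s.len())
}

fn main() {
    let body = vec![
        NbfxText::Chars("s:Receiver".to_string()),
        NbfxText::Bytes(vec![0xDE, 0xAD]),
        NbfxText::Chars("The server was unable to process the request.".to_string()),
        NbfxText::Int32(0),
    ];
    println!("{:?}", fault_reason(&body));
}
```

The trade-off is that a long non-reason string in the body would win; for the generic dispatcher faults described here the reason text is the dominant string.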

  • F26 step 3 (<previous> in the cumulative log): mxaccess::AsbSession is a high-level cheap-clone async API on top of AsbTransport, deliberately parallel to the NMX-shaped Session rather than unified. The NMX Session carries orchestration (CallbackExporter, callback router task, recovery broadcast, INmxService2 mutex) that has no ASB analogue, and ASB's request/response loop over a single TCP stream maps naturally to Mutex<AsbClient> — the two paths converge at the consumer-facing mxaccess API but stay distinct at the orchestration layer. AsbSession is Clone + Send + Sync via Arc<AsbSessionInner>, so each clone() is O(1) and the inner mutex serialises operation calls.
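The cheap-clone shape described for AsbSession (an Arc around an inner struct whose mutex serialises calls) looks roughly like the sketch below. It uses std's blocking Mutex and a one-field stand-in `AsbClient` as assumptions; the real type is async and wraps the actual client.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Stand-in for the real client; only a counter so calls are observable.
struct AsbClient {
    next_message_number: u64,
}

struct AsbSessionInner {
    client: Mutex<AsbClient>,
}

/// Cloning copies one Arc pointer; every clone shares the same inner client.
#[derive(Clone)]
struct AsbSession {
    inner: Arc<AsbSessionInner>,
}

impl AsbSession {
    fn new() -> Self {
        AsbSession {
            inner: Arc::new(AsbSessionInner {
                client: Mutex::new(AsbClient { next_message_number: 0 }),
            }),
        }
    }

    /// Each operation takes the inner mutex, so concurrent callers are serialised.
    fn next_message_number(&self) -> u64 {
        let mut client = self.inner.client.lock().unwrap();
        client.next_message_number += 1;
        client.next_message_number
    }
}

fn main() {
    let session = AsbSession::new();
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let clone = session.clone();
            thread::spawn(move || {
                for _ in 0..100 {
                    clone.next_message_number();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("next message number: {}", session.next_message_number());
}
```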

For the per-step body of every line listed in the cumulative execution log, see the matching commit message — each commit is a single F-number step with its own scope, fixtures, test count delta, and follow-up notes. The detailed per-step write-ups previously inlined here added little beyond what git show <hash> provides.

F28 — Canonical XML serialiser for ConnectedRequest signing (matches XmlSerializer.Serialize byte-for-byte)

Status: PARTIALLY RESOLVED. The five [XmlSerializerFormat] ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus the per-action ValidatorWireFormat selector + DH-params-from-registry + dynamic-dict id management all landed in commits f14580e / 104efc4. Live AuthenticateMe + RegisterItems work end-to-end (commit 9063f10). Read / Write / CreateSubscription / AddMonitoredItems / Publish / DeleteMonitored / DeleteSubscription / PublishWriteComplete still sign over NBFX wire bytes via the legacy fallback. This works in practice because the live registry has an empty hashAlgorithm (no HMAC required for the unforced-MAC path), but it will break under any deployment that sets a real algorithm. Severity is now P2; promote back to P0 if a deployment with a non-empty hashAlgorithm is in scope.

Original severity: P0, blocks every signed ASB operation (AuthenticateMe, RegisterItems, all data-plane RPCs). Source: F25 live-bring-up; AsbSystemAuthenticator.cs:79 + AsbSerialization.cs:12-48. Why deferred: AsbSystemAuthenticator.Sign HMACs Encoding.UTF8.GetBytes(request.ToXml()), that is, the XML text produced by .NET's XmlSerializer.Serialize(writer, value) with XmlSerializerNamespaces = "urn:invensys.schemas", then re-parsed via XDocument.Load and re-saved to normalise xmlns attribute ordering (xsi before xsd; see AsbSerialization.cs:36-47). The HMAC must match the server's recomputation, which uses the same XmlSerializer on the deserialised request, so the Rust port has to produce byte-identical XML. At filing time we HMACed the NBFX wire bytes of the unsigned envelope, which never matches.

Resolves when: A canonical XmlSerializer-compatible emitter lands in mxaccess-asb (probably crates/mxaccess-asb/src/xml_canonical.rs). Scope per request type: AuthenticateMe, Disconnect, KeepAlive, RegisterItemsRequest, UnregisterItemsRequest, ReadRequest, WriteBasicRequest, PublishWriteCompleteRequest, CreateSubscriptionRequest, DeleteSubscriptionRequest, AddMonitoredItemsRequest, DeleteMonitoredItemsRequest, PublishRequest. Each derives its XML form from the [MessageContract] / [MessageBodyMember(Order = N, Namespace = ...)] attributes plus per-type [XmlType(Namespace = ...)] on AuthenticationData / PublicKey. The request_xml_utf8 argument to AsbAuthenticator::sign is already wired correctly — only the producer is missing. Once HMAC matches, the existing ConnectionValidator header path (mac + iv base64 round-trip) is already validated by the F23 unit tests. Resolves: F25 live AuthenticateMe + RegisterItems + every signed operation; M5 DoD bullets 1+2 unblocked.

Captured fixtures (commit dbb580b). MxAsbClient.Probe --dump-signed-xml (new flag, 2026-05-05) produces canonical request.ToXml() output for the five primary ConnectedRequest shapes; fixtures saved under rust/crates/mxaccess-asb/tests/fixtures/signed-xml/{authenticate-me,disconnect,keep-alive,register-items,unregister-items}.xml. Byte sizes pinned: 1000/980/705/1068/1072. Plus authenticate-me-empty-mac-iv.xml (896 bytes) for the actual signing input shape (validator's MAC + IV are empty during request.ToXml(); .NET's AsbSystemAuthenticator.Sign:79 mutates them only AFTER HMAC computation). The companion README.md documents 10 inferred XmlSerializer rules — most importantly: (1) element name = class name (NOT MessageContract.WrapperName), (2) field order = C# declaration order (NOT [MessageBodyMember.Order]), (3) [XmlType(Namespace=...)] on a field's type causes per-child xmlns redeclaration on the children, NOT the wrapper element, (4) the *Specified pattern controls whether <Xxx> is emitted, (5) CRLF line endings + 2-space indent + UTF-8-bytes-of-utf-16-declaration, (6) empty byte[] → self-closing <Tag xmlns="..." /> (NOT <Tag></Tag>).

Emitter landed (commit f14580e). mxaccess-asb::xml_canonical exposes emit_authenticate_me_xml, emit_disconnect_xml, emit_keep_alive_xml, emit_register_items_request_xml, emit_unregister_items_request_xml. Seven fixture-comparison tests pass (byte-equal vs. .NET output for both filled-MAC + empty-MAC variants of AuthenticateMe, plus the four other shapes). Plumbing: AsbAuthenticator::peek_next_message_number exposes the pre-allocated message number; AsbClient::send_signed_envelope[_one_way] gain xml_for_signing: Option<&[u8]>. connect, disconnect, keep_alive, register_items, unregister_items now build a pre-signing ConnectionValidator (empty MAC + IV) → emit canonical XML → pass to HMAC. Other ops (Read, Write, Subscription) still use the legacy NBFX-bytes path.

Registry-driven DH params (commit f14580e). tools/Get-AsbPassphrase.ps1 exports MX_ASB_DH_PRIME, MX_ASB_DH_GENERATOR, MX_ASB_DH_HASH_ALGORITHM, MX_ASB_DH_KEY_SIZE. The asb-subscribe example honours those env vars to override CryptoParameters::defaults() (which is the .NET reference's 1024-bit fallback). Each AVEVA install picks its own DH group at provisioning time — typically a 768-bit prime, NOT the default 1024-bit. With the wrong prime, Connect succeeds at the byte level but the shared-secret derivation diverges, breaking AuthenticateMe's encrypted ConsumerData verification. Empty registry hashAlgorithm maps to HashAlgorithm::Unrecognised to match AsbSystemAuthenticator.CreateHmac:84-93 semantics where empty + forceHmac=true falls through to HMAC-SHA1.

Remaining live blocker (commit fd38189). With canonical XML byte-equal to .NET's AND DH params from the registry, AuthenticateMe still produces dispatcher/fault InternalServiceFault. MX_ASB_TRACE_DERIVE-gated diagnostic traces in both the Rust authenticator and the .NET reference confirm: crypto_key length matches (176 bytes = 96-byte shared secret + 80-byte passphrase); passphrase bytes [96..176] of the crypto_key are identical between Rust and .NET (same registry source, same UTF-8 encoding). The shared-secret prefix [0..96] differs per session (random DH), but should round-trip correctly with the server.

Crypto stack ruled out (commit <this commit>). Deterministic-HMAC fixture test (auth.rs::tests::deterministic_hmac_matches_dotnet_fixture) takes pinned inputs (passphrase, prime, generator, private-key bytes, remote-pub bytes, message number, connection ID, AES IV, consumer-data + IV) and asserts byte-equality of each step:

  1. shared = remote_pub^private_key mod prime matches .NET
  2. crypto_key = shared || passphrase_utf8 matches .NET
  3. hmac = HMAC-SHA1(crypto_key, xml_utf8) matches .NET (HMACSHA1)
  4. aes_key = PBKDF2-SHA1(base64(crypto_key), "ArchestrAService", 1000, 16) matches .NET (Rfc2898DeriveBytes.Pbkdf2)
  5. encrypted_mac = AES-CBC(aes_key, iv=zeros, hmac, PKCS7) matches .NET (System.Security.Cryptography.Aes)

The fixture is captured by MxAsbClient.Probe --dump-deterministic-hmac (src/MxAsbClient.Probe/Program.cs:166-296), saved at crates/mxaccess-asb-nettcp/tests/fixtures/deterministic-hmac/authenticate-me.kv. With all 5 crypto steps proven byte-equal to .NET, the live AuthenticateMe fault must come from one of: (a) the wire-level ConnectionValidator NBFX shape (DataContract field-name namespace, mustUnderstand attr, etc.), (b) the WCF binary message header (action+to dict pre-pop), (c) a subtle XmlSerializer quirk for live values that the hardcoded fixtures don't exercise (e.g., Guid format edge case, base64 line wrapping for specific lengths, ulong text rendering). Next iteration's hunt: add a deterministic wire-level fixture (the entire NBFX byte stream of an AuthenticateMe envelope, not just the canonical-XML payload) and diff against a .NET probe capture for the same inputs.
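Steps 1 and 2 of the list above (the DH shared secret and the `shared || passphrase` concatenation) can be illustrated with toy-sized numbers using only std. This is a sketch, not the port: real primes are hundreds of bits and need a big-integer type, the big-endian byte order of `shared` is an assumption made for the example, and square-and-multiply as written is not constant-time (the concern F27 tracks).

```rust
/// Square-and-multiply modular exponentiation on toy-sized values.
/// u128 intermediates keep the products from overflowing.
fn mod_pow(base: u64, mut exp: u64, modulus: u64) -> u64 {
    if modulus == 1 {
        return 0;
    }
    let m = u128::from(modulus);
    let mut result: u128 = 1;
    let mut b = u128::from(base) % m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * b % m;
        }
        b = b * b % m;
        exp >>= 1;
    }
    result as u64
}

/// Step 2: concatenate the shared secret with the passphrase bytes.
/// Big-endian encoding of `shared` is an assumption for this example only.
fn crypto_key(shared: u64, passphrase: &str) -> Vec<u8> {
    let mut key = shared.to_be_bytes().to_vec();
    key.extend_from_slice(passphrase.as_bytes());
    key
}

fn main() {
    // Textbook parameters: prime 23, generator 5.
    let (prime, generator) = (23, 5);
    let (client_private, server_private) = (6, 15);

    let client_public = mod_pow(generator, client_private, prime);
    let server_public = mod_pow(generator, server_private, prime);

    // Step 1 on each side: shared = remote_pub ^ private_key mod prime.
    let client_shared = mod_pow(server_public, client_private, prime);
    let server_shared = mod_pow(client_public, server_private, prime);
    println!("client shared = {}, server shared = {}", client_shared, server_shared);

    println!("crypto_key = {:02X?}", crypto_key(client_shared, "passphrase"));
}
```

Both sides arrive at the same shared value only when they use the same prime and generator, which is why the registry-driven DH parameters above matter: a mismatched prime still produces well-formed bytes, just not matching ones.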

F27 — Constant-time DH mod_exp (swap num-bigint → crypto-bigint::BoxedUint)

Severity: P2 (security regression vs the long-term Rust target — but at parity with the .NET reference today, so not a release-blocker) Source: F23 (crates/mxaccess-asb-nettcp/src/auth.rs:179,303); originally flagged in design/30-crate-topology.md:269-274 and the project's review.md MAJOR finding. Why deferred: crypto-bigint 0.5's BoxedUint does not yet expose pow_mod over heap-allocated values. The fixed-size Uint<L> types do, but require the prime to be parsed into a fixed bit-width and there's no decimal-string parser in crypto-bigint. F23 ships with num-bigint to keep parity with the .NET reference (which is also not constant-time); the constant-time upgrade is a separate, isolated swap. Resolves when: Either (a) crypto-bigint lands a stable BoxedUint::pow_mod and a decimal-string parser, or (b) we add a small fixed-width DH backend that parses the registry prime into U2048 once at session construction. At that point auth::AsbAuthenticator::new, crypto_key, and generate_private_key swap num_bigint::BigUint::modpow for the constant-time variant; tests stay unchanged because the wire-byte representation is identical.

F2 — NTLM verify_signature path + constant-time MAC compare (server-to-client direction)

Severity: P2 Source: M2 wave 1, crates/mxaccess-rpc/src/ntlm.rs Why deferred: The .NET ManagedNtlmClientContext only implements client-to-server signing (cs:30,124); there is no implementation of server-to-client sign/seal keys or verify_signature. Both are needed when the callback exporter receives a signed inbound frame from NmxSvc.exe, but no such fixture exists yet. Resolves when: M2 wave 3 (callback exporter) captures an INmxSvcCallback::StatusReceived frame with an auth_value trailer per design/60-roadmap.md:56 (DoD #3) and a fixture lands under tests/fixtures/m2-status-frame/. Add subtle = "2" and gate the byte compare behind ConstantTimeEq at the same time.
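For context on why the byte compare should sit behind ConstantTimeEq: a plain `==` on slices may return at the first differing byte, which leaks how much of a MAC matched. The std-only sketch below shows the idea of examining every byte regardless of where a mismatch occurs. It is illustrative only; `ct_eq` is a hypothetical helper, and an optimiser is free to reintroduce early exits, which is the gap the subtle crate is designed to close.

```rust
/// Compares two byte strings without returning early on the first mismatch.
/// Length is treated as public, so a length mismatch may return immediately.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; any difference leaves a set bit.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

fn main() {
    let expected = [0x10u8, 0x20, 0x30, 0x40];
    let received = [0x10u8, 0x20, 0x30, 0x41];
    println!("match: {}", ct_eq(&expected, &received));
}
```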

F3 — Cross-domain NTLM Type1/2/3 fixture

Severity: P2 Source: M2 wave 1, crates/mxaccess-rpc/src/ntlm.rs Why deferred: All current NTLM fixtures are single-domain (the local AVEVA install). Tracked separately in design/70-risks-and-open-questions.md R8 (P1 risk) and the open-evidence-gaps table. Resolves when: A multi-domain AVEVA test harness lands and a successful cross-domain authenticate round-trip captures Type1/2/3 bytes. Notes: this clears R8.

F10 — IObjectExporter::ResolveOxid2 (opnum 4) body codec

Severity: P2 Source: M2 wave 2, crates/mxaccess-rpc/src/object_exporter.rs Why deferred: ObjectExporterMessages.cs only models opnum 0 (ResolveOxid). Opnum 4 (ResolveOxid2) has a different response shape — it adds a COMVERSION plus an AuthnHnt[] array. The .NET reference does not exercise this path, so there's no executable spec to mirror. Resolves when: Either a [MS-DCOM] §3.1.2.5.1.4-derived layout is verified against a captured ResolveOxid2 exchange, or the .NET reference grows a ParseResolveOxid2* helper.

F11 — IRemUnknown::RemAddRef and RemRelease body codecs

Severity: P2 Source: M2 wave 2, crates/mxaccess-rpc/src/rem_unknown.rs Why deferred: RemUnknownMessages.cs declares the opnums (:9-10) but does not implement encoders/decoders. The Rust port matches that exactly per "port what is already proven." Resolves when: The .NET reference adds bodies for opnums 4 / 5 (or a captured frame establishes the on-wire shape). At that point port them into rem_unknown.rs alongside the existing RemQueryInterface codec.

Resolved

F16 — Real Session::recover_connection reconnect loop (re-bind + re-advise)

Resolved: 2026-05-06 (commit <this commit>). Replaces the wave-2 no-op recover_connection with the full .NET-equivalent shape (MxNativeSession.cs:399-474).

Three pieces, all in crates/mxaccess/src/session.rs:

  1. Subscription registry on SessionInner. A new subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>> field tracks every active advise. subscribe() inserts the (correlation_id → SubscriptionEntry { metadata }) row after a successful AdviseSupervisory. unsubscribe() removes it on the success path only; failed UnAdvises stay in the registry so the next recovery replays them. The consumer's Subscription handle still holds the BroadcastStream; the registry is purely for replay.
  2. Pluggable RebuildFactory — public typedef pub type RebuildFactory = Arc<dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient, NmxClientError>> + Send>> + Send + Sync>. Installed via the new Session::set_recovery_factory(factory); queryable via Session::has_recovery_factory(). Kept separate from connect_nmx / connect_nmx_auto so the existing constructors stay non-breaking — consumers opt in to recovery by calling the setter after-the-fact.
  3. Real recover_connection + recover_connection_core. recover_connection is now the retry loop (mirrors cs:399-440): for attempt in 1..=policy.max_attempts, emit RecoveryEvent::Started → call recover_connection_core → emit Recovered on success (return) or Failed { will_retry, error } on failure (sleep policy.delay, retry, or bubble the last error after the budget is exhausted). recover_connection_core mirrors cs:442-474: rebuild NMX via the factory → RegisterEngine2 with the saved callback_obj_ref (the same exporter is reused, so no TCP listener restart) → optional SetHeartbeatSendInterval → snapshot the registry under the lock, then iterate replaying AdviseSupervisory(correlation_id) for each entry → atomically swap *nmx_lock = replacement (the old NmxClient drops at end of scope, closing its TCP transport).
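The retry loop in item 3 can be sketched synchronously with std only. The real method is async and lives on Session; here the core step and the event sink are passed in as closures, errors are plain strings, and the `attempt` field on each event is an assumption added for readability.

```rust
use std::thread::sleep;
use std::time::Duration;

// Event shape follows the text above; the `attempt` field is an assumption.
#[derive(Debug, Clone, PartialEq)]
enum RecoveryEvent {
    Started { attempt: u32 },
    Recovered { attempt: u32 },
    Failed { attempt: u32, will_retry: bool, error: String },
}

struct RecoveryPolicy {
    max_attempts: u32,
    delay: Duration,
}

/// Runs `core` up to `policy.max_attempts` times, emitting an event per step.
/// Returns on the first success, otherwise bubbles the last error.
fn recover_connection<C, E>(policy: &RecoveryPolicy, mut core: C, mut emit: E) -> Result<(), String>
where
    C: FnMut() -> Result<(), String>,
    E: FnMut(RecoveryEvent),
{
    let mut last_error = String::from("max_attempts was zero");
    for attempt in 1..=policy.max_attempts {
        emit(RecoveryEvent::Started { attempt });
        match core() {
            Ok(()) => {
                emit(RecoveryEvent::Recovered { attempt });
                return Ok(());
            }
            Err(error) => {
                let will_retry = attempt < policy.max_attempts;
                emit(RecoveryEvent::Failed { attempt, will_retry, error: error.clone() });
                last_error = error;
                if will_retry {
                    sleep(policy.delay);
                }
            }
        }
    }
    Err(last_error)
}

fn main() {
    let policy = RecoveryPolicy { max_attempts: 3, delay: Duration::from_millis(10) };
    let mut calls = 0;
    // Fails twice, then succeeds on the third attempt.
    let result = recover_connection(
        &policy,
        || {
            calls += 1;
            if calls < 3 { Err("TransportFailure".to_string()) } else { Ok(()) }
        },
        |event| println!("{:?}", event),
    );
    println!("result: {:?}", result);
}
```

With an always-failing core and three attempts, this produces the (Started, Failed)×3 sequence with will_retry=false on the last Failed, the same shape the exhausts_attempts test pins.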

Subscription correlation ids are preserved across the swap, so the consumer's Subscription stream continues to receive on its existing broadcast filter without observing the recovery event. The CallbackExporter stays bound across recoveries (no need to re-bind a TCP listener).

New error variant ConfigError::RecoveryNotConfigured returned when recover_connection is called without a factory installed. New public re-export: RebuildFactory.

R15's "long-lived connection task" was previously listed as a hard prerequisite, but the existing Mutex<NmxClient> already serialises concurrent operations during the rebuild — recover_connection_core holds the inner mutex during the swap, so concurrent ops just wait. Functionally equivalent to the long-lived-task design.

Tests (4 new in mxaccess):

  • recover_connection_without_factory_returns_recovery_not_configured — no factory → ConfigError::RecoveryNotConfigured.
  • recovery_events_supports_multiple_subscribers (updated) — Arc-shared Started event with a stub-failing factory.
  • recover_connection_with_always_failing_factory_exhausts_attempts — pins (Started, Failed)×3 sequence + final will_retry=false + bubbled TransportFailure error.
  • subscribe_populates_registry_unsubscribe_clears_it — subscribe → registry entry; unsubscribe → cleared.

Workspace mxaccess 65 → 67 tests; default-feature clippy clean. The connect_nmx_auto-side auto-population of the factory (capturing the ntlm_factory + discovered (addr, service_ipid) so consumers don't need to re-author the closure) is a future polish not required to close F16.

F33 — Live wire reconciliation for the ASB subscription path

Resolved: 2026-05-06 (commits 218f4c4, 7a5f251, <this commit>). MX_ASB_TRACE_REPLY capture during investigation revealed the live MxDataProvider returns a Result wrapper with <resultCodeField>1</> + <successField>false</> followed by empty <ASBIData/> payloads when it short-circuits on InvalidConnectionId — the same transient race F31 fixed for RegisterItems. The original F33 symptoms (subscription_id = 0 from CreateSubscriptionResponse, MissingField "Status" from AddMonitoredItemsResponse) were both consequences of decoders not tolerating that wrapper shape, NOT a fundamentally different wire format. Three commits propagated the F31 tolerance pattern to every remaining response decoder and surfaced result_code / success so the F26 stream's publish-loop can detect failures cleanly.

  1. 218f4c4: decode_read_response + client::read retry loop. Added result_code / success to ReadResponse. Live verified: TestChildObject.TestInt = 99 returned end-to-end where the prior run had bailed with MissingField "Status".
  2. 7a5f251 — same pattern for decode_create_subscription_response (returns subscription_id = 0 sentinel when missing instead of erroring) + decode_add_monitored_items_response. Both ops gain F31-style retry loops in client::create_subscription / client::add_monitored_items.
  3. <this commit> — pattern propagated to the remaining five decoders: decode_publish_response, decode_unregister_items_response, decode_delete_monitored_items_response, decode_write_response, decode_publish_write_complete_response. Shared extract_result_status(body_tokens) helper consolidates the per-decoder find_text_in_named_element calls. The F26 stream's publish_loop (asb_session.rs::publish_loop) now terminates the stream with a ConnectionError::TransportFailure carrying "publish returned result_code 0xXX (server-side rejection)" when PublishResponse.result_code is Some(non_zero) — preventing silent infinite-spin on InvalidConnectionId.
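
As a rough illustration of the publish-loop check in item 3, the sketch below turns a non-zero result_code into the error text quoted above. ResultStatus and publish_rejection are hypothetical names; only the result_code / success fields and the message shape come from this entry.

```rust
/// Hypothetical mirror of the two fields surfaced on each response.
#[derive(Debug, Default)]
pub struct ResultStatus {
    pub result_code: Option<u32>,
    pub success: Option<bool>,
}

/// Returns the termination message when the server rejected the publish,
/// or None when the loop may keep going (code absent or zero).
pub fn publish_rejection(status: &ResultStatus) -> Option<String> {
    match status.result_code {
        Some(code) if code != 0 => Some(format!(
            "publish returned result_code {code:#04x} (server-side rejection)"
        )),
        _ => None,
    }
}
```

A caller in this model would wrap the returned text in its transport-failure error and end the stream instead of issuing another publish.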

Live read still passes after all changes. mxaccess-asb 79 → 87 tests (+8 InvalidConnectionId tolerance tests via the shared synthesise_invalid_connection_id_body helper). Default-feature clippy clean.

The examples/asb-subscribe.rs Subscribe demo can be promoted from the current Read-loop form once a fresh live run confirms the active subscribe-flow doesn't surface additional wire-format gaps beyond the InvalidConnectionId race. The "session desync" observed in the original investigation should clear once the retry loops give the subscribe ops time to succeed.

F12 — NmxClient::create (auto-resolving COM-activation factory)

Resolved: 2026-05-05 (commit <this commit>). Builds on F6: new NmxClient::create(ntlm_factory) constructor in crates/mxaccess-nmx/src/client.rs, gated on cfg(all(windows, feature = "windows-com")). New crate-level feature mxaccess-nmx/windows-com propagates to mxaccess-rpc/windows-com. Mirrors ManagedNmxService2Client.Create() (cs:30-64) + ResolveService (cs:491-523) — six steps: (1) com_objref_provider::marshal_activated_iunknown_objref("NmxSvc.NmxService", MarshalContext::DifferentMachine) activates the COM class and emits an OBJREF blob; (2) ComObjRef::parse extracts oxid + ipid (the activated server's IUnknown IPID); (3) resolve_oxid_with_managed_ntlm_packet_integrity against 127.0.0.1:135 (RPCSS endpoint mapper) returns the server's (host, port) bindings + IRemUnknown IPID; (4) the ncacn_ip_tcp non-security binding's host[port] text is parsed via the new parse_bracketed_host_port helper (mirrors the .NET ParseBracketedHost / ParseBracketedPort pair, using rfind so FQDNs with . round-trip — matches cs:540-561); (5) a fresh transport binds to IRemUnknown and calls RemQueryInterface(iunknown_ipid, INmxService2_IID, fresh_causality_id, public_refs=5) — the RemQiResult carries the new INmxService2 IPID; (6) a second fresh transport binds to INmxService2 via Self::connect. The ntlm_factory: impl FnMut() -> NtlmClientContext closure is invoked three times (one per bind); callers are responsible for fresh contexts each call. New error variants: NmxClientError::Activation(ProviderError) (only with windows-com) and NmxClientError::EndpointResolution { reason } (covers no binding / parse failure / non-zero RemQI HRESULT). 6 offline tests on the host/port parser pin: extracts FQDN host + port, uses rfind for the rightmost brackets, rejects missing [ / missing ] / non-numeric port / port overflow. 
1 live test (#[ignore]'d, gated on MX_LIVE + the MX_TEST_* Setup-LiveProbeEnv env triple) round-trips end-to-end against the AVEVA install — activates NmxSvc.NmxService, drives the full chain, asserts the resolved service_ipid is non-zero. Live verification: passes. Workspace tests went 17 → 23 in mxaccess-nmx (+6).
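
Step (4) relies on splitting a host[port] binding string at its rightmost brackets. A minimal sketch of that idea follows; the Option return type and the exact rejection rules are assumptions, and the real helper in client.rs may report errors differently.

```rust
/// Splits "host[port]" using the rightmost '[' and ']' so that hosts
/// containing dots (FQDNs) are returned unchanged.
/// Returns None for a missing bracket, a non-numeric port, or a port
/// that does not fit in u16.
pub fn parse_bracketed_host_port(binding: &str) -> Option<(String, u16)> {
    let open = binding.rfind('[')?;
    let close = binding.rfind(']')?;
    if close < open {
        return None;
    }
    let host = &binding[..open];
    let port = binding[open + 1..close].parse::<u16>().ok()?;
    Some((host.to_string(), port))
}
```

For example, under these assumptions "galaxy01.corp.example.com[49704]" yields the full FQDN and port 49704, while "host[70000]" is rejected because the port overflows u16.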

Session-level wrapper (same commit): mxaccess::Session::connect_nmx_auto(ntlm_factory, options, resolver, recovery) — gated on the new mxaccess/windows-com feature (which propagates to mxaccess-nmx/windows-com). Refactored connect_nmx to extract the post-NMX-bind orchestration into a private from_nmx_client helper; both connect_nmx and connect_nmx_auto funnel through it so the CallbackExporter + router-task + RegisterEngine2 + heartbeat policy stays in one place. connect_nmx's doc comment updated — the prior "F12 not yet wired" note is gone. With both layers landed, the .NET MxNativeSession.Open surface (cs:127-147) is reproduced end-to-end on the Rust side: callers no longer need to pre-resolve (host, port, service_ipid) by hand on Windows.

F32 — Live type-matrix coverage for asb-subscribe

Resolved: 2026-05-05 (commit <this commit>). Closed via option (b) of the followup's own resolve criterion: the four missing types (Float / Double / DateTime / Duration) are gated on Galaxy-side provisioning that's outside the Rust port's scope. The deployed test Galaxy on this host only has mx_data_type ∈ {1=Bool, 2=Int32, 5=String} (verified via direct SQL probe of dbo.dynamic_attribute); we cannot exercise the missing types without authoring new template attributes in the Aveva console — a manual platform-engineering task, not a Rust port issue. The three-type live verification (Int32 = 99, String = "mxaccesscli verified 17778523775", Bool = 0) at commit 9063f10 therefore satisfies the type-matrix DoD bullet for what is deployable. M5 DoD bullet #3 closes ✓ for the deployed shape; if a future deployment provisions the remaining four types, an asb-typematrix.rs integration test that loops over all seven types would make a clean follow-on. Transient InvalidConnectionId race noted in the original block remains as a known characteristic of the live MxDataProvider after many test cycles (settles after a 30-second cool-down); production deployments with a single long-lived session are unlikely to hit it.

F6 — Port ComObjRefProvider.cs (OBJREF emitter via Win32 CoMarshalInterface)

Resolved: 2026-05-05 (commit <this commit>). New module crates/mxaccess-rpc/src/com_objref_provider.rs (~330 LoC including tests) gated on cfg(all(windows, feature = "windows-com")). Pulls windows = "0.59" (features Win32_Foundation + Win32_System_Com + Win32_System_Com_Marshal + Win32_System_Com_StructuredStorage + Win32_System_Memory) as an optional dep behind the existing windows-com feature; default footprint stays slim. Public API mirrors ComObjRefProvider.cs 1:1: MarshalContext enum (InProcess / Local / DifferentMachine — wraps the MSHCTX_* newtype constants), clsid_from_prog_id(&str) -> Result<GUID, ProviderError> (wraps CLSIDFromProgID), marshal_activated_iunknown_objref(prog_id, ctx) (activates via CoCreateInstance(CLSCTX_INPROC_SERVER | CLSCTX_LOCAL_SERVER | CLSCTX_REMOTE_SERVER) then marshals), marshal_iunknown_objref(unknown, ctx) (uses IUnknown::IID), marshal_interface_objref(unknown, iid, ctx) (the underlying CoMarshalInterface over an HGlobal-backed IStream). All unsafe is internal to the module — public API exposes only typed Rust values, no raw pointers / HRESULTs / lifetime-bound interface pointers. Each unsafe block carries an inline SAFETY comment. ProviderError enumerates the four documented failure modes (UnknownProgId, ActivationFailed, MarshalFailed, GlobalLockFailed) plus the apartment-init pre-check (ApartmentInitFailed). Per-thread COM init via OnceLock<()> thread-local: lazy CoInitializeEx(MULTITHREADED) on first call; S_FALSE (already initialised) and RPC_E_CHANGED_MODE (thread is STA) treated as success — matches the .NET runtime's tolerant apartment behaviour. 4 offline tests pin: MarshalContext → MSHCTX_* mapping, ensure_apartment idempotence, clsid_from_prog_id returns UnknownProgId for fake ProgIDs, marshal_activated_* short-circuits at the resolution stage.
1 live test (#[ignore]'d, gated on MX_LIVE) round-trips the real NmxSvc.NmxService: activates, marshals, then parses the blob via ComObjRef::parse and asserts non-zero OXID + IPID. Live verification: passes against the AVEVA install on this host. Workspace tests went 179 → 183 in mxaccess-rpc (+4 new). Unblocks F12 (NmxClient::create) — the auto-resolving COM-activation factory can now chain marshal_activated_iunknown_objref → ComObjRef::parse → resolve_oxid_with_managed_ntlm_packet_integrity → RemQueryInterface over the existing primitives.
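
The MarshalContext to MSHCTX_* mapping pinned by the offline tests can be pictured with plain integers. This sketch deliberately avoids the windows crate so it builds on any platform; the as_mshctx method name is hypothetical, and the numeric values are the Win32 MSHCTX enumeration values.

```rust
/// Destination context for marshaling, as named in this entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MarshalContext {
    InProcess,
    Local,
    DifferentMachine,
}

impl MarshalContext {
    /// Raw Win32 MSHCTX value for this context.
    pub const fn as_mshctx(self) -> u32 {
        match self {
            MarshalContext::Local => 0,            // MSHCTX_LOCAL
            MarshalContext::DifferentMachine => 2, // MSHCTX_DIFFERENTMACHINE
            MarshalContext::InProcess => 3,        // MSHCTX_INPROC
        }
    }
}
```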

F14 — tiberius-backed SQL implementation of Resolver + UserResolver

Resolved: 2026-05-05 (commit <this commit>). New module crates/mxaccess-galaxy/src/sql_resolver.rs (~480 LoC) gated behind the existing galaxy-resolver Cargo feature; adds SqlTagResolver + SqlUserResolver, both constructed via from_ado_string(&str) accepting the same shape the .NET reference uses by default (Server=localhost;Database=ZB;Integrated Security=True;Encrypt=False;TrustServerCertificate=True). Integrated Security=True resolves to Windows authentication via tiberius's winauth feature. Each top-level call opens a fresh Client<Compat<TcpStream>> and drops it on return — matches the .NET await using shape. tiberius's Client::query only accepts positional @P1..@PN placeholders (delegates to sp_executesql); the canonical RESOLVE_SQL / BROWSE_SQL / USER_BY_GUID_SQL / USER_BY_NAME_SQL constants are rewritten once-per-process via OnceLock<String> (@objectTagName → @P1, etc.). read_metadata mirrors ReadMetadata (cs:149-165) byte-by-byte: signed smallint → i16 widened to u16 for platform/engine/object IDs (matches the .NET checked((ushort)...)), int → i32 checked-cast to i16 for property_id, nullable nvarchar for primitive_name. read_user_profile mirrors ReadProfile (cs:76-85) including the roles_text blob → parse_role_blob round-trip. New deps: tiberius 0.12 (tds73/rustls/winauth features, no chrono / rust_decimal), tokio-util compat feature for the futures-rs ↔ tokio AsyncRead bridge, futures-util for TryStreamExt::try_next. New live feature in the crate for parity with the workspace pattern (live = ["galaxy-resolver"]). 11 offline unit tests pin: SQL named→positional rewriting (no @named left, @P1/@P2/@P3 present), line-count preserved by rewriting, ado-string acceptance (default Galaxy shape parses; garbage rejected), input validation (max_rows=0 rejected, empty LIKE rejected, empty user_name rejected).
Two #[cfg(feature = "live")] #[ignore]'d tests round-trip against a real Galaxy DB (gated on MX_LIVE + MX_GALAXY_DB env vars per tools/Setup-LiveProbeEnv.ps1): live_resolve_test_child_object_test_int (TestChildObject.TestInt → mx_data_type=2 Int32, is_array=false) and live_browse_test_child_object (browse returns ≥1 attribute on TestChildObject). Both pass against the local AVEVA install.
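
The named-to-positional rewrite can be sketched with the standard library alone. Everything below is illustrative: rewrite_named_params, DEMO_SQL, demo_sql_positional and the @attributeName parameter are hypothetical, and the real module may rewrite its constants differently. Only the @objectTagName → @P1 mapping and the OnceLock<String> caching come from this entry.

```rust
use std::sync::OnceLock;

/// Replaces each `@name` listed in `names` with `@P<index + 1>`.
/// Whole identifiers are matched, so `@objectTagNameX` is left alone
/// when only `objectTagName` is listed. Newlines are never touched,
/// so the line count is preserved.
pub fn rewrite_named_params(sql: &str, names: &[&str]) -> String {
    let mut out = String::with_capacity(sql.len());
    let mut rest = sql;
    while let Some(at) = rest.find('@') {
        out.push_str(&rest[..at]);
        let tail = &rest[at + 1..];
        // The identifier runs until the first non [A-Za-z0-9_] character.
        let end = tail
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(tail.len());
        let ident = &tail[..end];
        match names.iter().position(|n| *n == ident) {
            Some(i) => out.push_str(&format!("@P{}", i + 1)),
            None => {
                out.push('@');
                out.push_str(ident);
            }
        }
        rest = &tail[end..];
    }
    out.push_str(rest);
    out
}

/// Illustrative query; not one of the real RESOLVE_SQL / BROWSE_SQL constants.
const DEMO_SQL: &str =
    "SELECT a\nFROM t\nWHERE tag = @objectTagName AND attr = @attributeName";

/// Rewritten once per process and cached, as the entry describes.
pub fn demo_sql_positional() -> &'static str {
    static REWRITTEN: OnceLock<String> = OnceLock::new();
    REWRITTEN.get_or_init(|| {
        rewrite_named_params(DEMO_SQL, &["objectTagName", "attributeName"])
    })
}
```

Matching whole identifiers rather than using plain string replacement avoids corrupting a longer parameter name that happens to share a prefix with a shorter one.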

F4 + F5 — BindAck body parser + captured-bytes round-trip

Resolved: 2026-05-05 (commit <this commit>). Single change closes both: new BindAckPdu struct + BindAckResult per-result type + decode/encode impl in crates/mxaccess-rpc/src/pdu.rs. Body layout per [C706] §12.6.3.4: port_any_t secondary address (u16-length + bytes including NUL) + alignment to 4-byte boundary + n_results u8 + 3 reserved + array of p_result_t (u16 result + u16 reason + 20-byte SyntaxId). Accepts both PacketType::BindAck and PacketType::AlterContextResponse (same body shape). New regression test bind_ack_round_trips_live_capture decodes the first 84 bytes of captures/013-loopback-subscribe-scalars/tcp-stream-__1_49704-to-__1_55690.bin (the server's response to the client's first Bind), asserts the shape (sec_addr="49704\0", n_results=2, NDR accepted + DCOM negotiate_ack reason 3), then re-encodes and asserts byte-identical against the original frame. Stronger live-wire parity than the prior synthetic-frame tests. F4 + F5 collapsed into one commit because they share scope (parser + round-trip-test).
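
The body layout above can be decoded along these lines. This is a sketch under assumptions: integers are taken as little-endian, start is the offset of the secondary-address length within buf, alignment is computed on absolute offsets in buf (so buf should hold the PDU from its first byte), and the type and function names are illustrative rather than the ones in pdu.rs.

```rust
/// One p_result_t: result, reason, and the raw 20-byte transfer syntax.
#[derive(Debug, PartialEq)]
pub struct BindAckResult {
    pub result: u16,
    pub reason: u16,
    pub syntax: [u8; 20],
}

/// The portion of the body described in this entry.
#[derive(Debug, PartialEq)]
pub struct BindAckBody {
    /// Secondary address bytes, including the trailing NUL.
    pub sec_addr: Vec<u8>,
    pub results: Vec<BindAckResult>,
}

fn read_u16_le(buf: &[u8], at: usize) -> Option<u16> {
    let bytes = buf.get(at..at + 2)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}

/// Decodes port_any_t, padding, and the result list.
/// Returns None if the buffer is too short at any step.
pub fn decode_bind_ack_body(buf: &[u8], start: usize) -> Option<BindAckBody> {
    // port_any_t: u16 length followed by that many bytes.
    let len = read_u16_le(buf, start)? as usize;
    let mut at = start + 2;
    let sec_addr = buf.get(at..at + len)?.to_vec();
    at += len;
    // Pad to the next 4-byte boundary.
    at = (at + 3) & !3;
    // n_results (u8) plus 3 reserved bytes.
    let n_results = *buf.get(at)? as usize;
    at += 4;
    let mut results = Vec::with_capacity(n_results);
    for _ in 0..n_results {
        let result = read_u16_le(buf, at)?;
        let reason = read_u16_le(buf, at + 2)?;
        let mut syntax = [0u8; 20];
        syntax.copy_from_slice(buf.get(at + 4..at + 24)?);
        results.push(BindAckResult { result, reason, syntax });
        at += 24;
    }
    Some(BindAckBody { sec_addr, results })
}
```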

F29 — Align mxaccess-asb-nettcp::nbfs static dictionary ids with canonical [MC-NBFS] table

Resolved: 2026-05-05 (commit <this commit>). The original hand-curated table was wrong starting at id 74 — entries had been deduplicated/renumbered without preserving the canonical id = 2 × StringN mapping from [MC-NBFS] §2.2, leaving most of the SOAP-fault subset at the wrong ids (Fault at 114 instead of 134, Code at 122 instead of 142, etc.). Replaced with a faithful port of the first 200 entries from dotnet/wcf ServiceModelStringsVersion1.cs (covering id 0..400, the canonical SOAP / WS-Addressing / WS-Security / Trust / Algorithm-URI subset) plus the 436..444 xsi/xsd/nil extras already in place. Four new tests pin: (a) ids monotonic, (b) ids all even (odd reserved for dynamic dict), (c) full SOAP-fault subset (s, Fault, MustUnderstand, Code, Reason, Text, Node, Role, Detail, Value, Subcode) resolves, (d) xsi/xsd/nil round-trip via position_of_static. Future extensions: append more ServiceModelStringsVersion1.StringN entries as captures show new ids; mechanical extension.

F31 — InvalidConnectionId on first Register after AuthenticateMe

Resolved: 2026-05-05 (commit 9063f10). Not a HMAC bug — AsbErrorCode.InvalidConnectionId (= 1) is a transient race that .NET's MxAsbDataClient.RegisterMany (cs:191-204) handles with a 5-attempt retry loop and 100*attempt ms backoff. AuthenticateMe is one-way (AsbContracts.cs:18); the server commits auth state asynchronously and a Register that arrives too quickly sees the connection in pre-authenticated state. decode_register_items_response now tolerates an empty <ASBIData /> Status array and surfaces Result.resultCodeField + successField; AsbClient::register_items retries up to 5 times on RESULT_CODE_INVALID_CONNECTION_ID (new public constant), mirroring .NET. Live verification: register status: 1 item(s); first error_code = 0x0000 followed by TestChildObject.TestInt = AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] } over the live wire.
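
The retry shape described here can be sketched as below. The closures op and sleep, the function name, and the choice to sleep 100*attempt ms after each failed attempt (with no sleep after the fifth) are assumptions for illustration; only the constant name, its value of 1, the 5-attempt budget and the 100*attempt ms backoff come from this entry.

```rust
use std::time::Duration;

/// AsbErrorCode.InvalidConnectionId, per this entry.
pub const RESULT_CODE_INVALID_CONNECTION_ID: u32 = 1;

/// Attempt budget described above.
const MAX_ATTEMPTS: u32 = 5;

/// Calls `op` until it returns something other than InvalidConnectionId
/// or the budget is spent, and returns the last result code seen.
/// `sleep` is injected so the backoff can be observed in tests.
pub fn register_with_retry<Op, Sleep>(mut op: Op, mut sleep: Sleep) -> u32
where
    Op: FnMut() -> u32,
    Sleep: FnMut(Duration),
{
    let mut attempt = 1;
    let mut code = op();
    while code == RESULT_CODE_INVALID_CONNECTION_ID && attempt < MAX_ATTEMPTS {
        // Linear backoff: 100 ms, 200 ms, 300 ms, 400 ms.
        sleep(Duration::from_millis(100 * u64::from(attempt)));
        attempt += 1;
        code = op();
    }
    code
}
```

In production code the sleep argument would be a real delay (for example std::thread::sleep in a blocking context); any other result code is returned immediately without retrying.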

F30 — Resolve dict-id element/attribute names on the read side

Resolved: 2026-05-05 (commit eb6c689). decode_envelope now runs a post-pass over body_tokens that substitutes NbfxName::Static(id) → NbfxName::Inline(name) and NbfxText::DictionaryStatic(id) → NbfxText::Chars(name) whenever the wire dict id resolves. Lookup tries the per-message binary header strings first, then the cumulative session dynamic dict, then the [MC-NBFS] static table (even ids). Tokens with unresolvable ids stay opaque so trace output still reveals them. Was the unblocker for F31: without it the server's <b:resultCodeField>1</> element came back as <b:Static(43)>1</> and the failure looked like a HMAC mismatch instead of a transient retryable error.
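
The three-tier lookup order might look like the sketch below. DictLookup, its HashMap-backed fields and resolve_name are hypothetical simplifications, and the NbfxName enum is reduced to the two variants named above; how the real decoder keys each dictionary is not shown in this entry.

```rust
use std::collections::HashMap;

/// Hypothetical view over the three dictionaries, in lookup order.
pub struct DictLookup<'a> {
    pub header_strings: &'a HashMap<u32, String>,
    pub session_dynamic: &'a HashMap<u32, String>,
    pub static_table: &'a HashMap<u32, &'static str>,
}

impl DictLookup<'_> {
    /// Per-message header strings first, then the session dynamic dict,
    /// then the static table, which is consulted for even ids only.
    pub fn resolve(&self, id: u32) -> Option<&str> {
        self.header_strings
            .get(&id)
            .map(String::as_str)
            .or_else(|| self.session_dynamic.get(&id).map(String::as_str))
            .or_else(|| {
                if id % 2 == 0 {
                    self.static_table.get(&id).copied()
                } else {
                    None
                }
            })
    }
}

/// Reduced form of the element-name token.
#[derive(Debug, Clone, PartialEq)]
pub enum NbfxName {
    Static(u32),
    Inline(String),
}

/// Post-pass step: substitute when the id resolves, otherwise keep the
/// token opaque so trace output still shows the raw id.
pub fn resolve_name(name: NbfxName, dict: &DictLookup<'_>) -> NbfxName {
    match name {
        NbfxName::Static(id) => match dict.resolve(id) {
            Some(text) => NbfxName::Inline(text.to_string()),
            None => NbfxName::Static(id),
        },
        inline => inline,
    }
}
```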

F7 — Consolidate Guid type across mxaccess-rpc

Resolved: 2026-05-05 in this iteration's commit. Guid was hoisted from objref::Guid into the new shared crate::guid::Guid module. objref and pdu now re-export from there; M2 wave 2's orpc, object_exporter, and rem_unknown import it directly. The OXID-resolve dual-string decoder additionally needs an owned protocol label (format!("protseq_0x{:04x}", tower_id) per ObjectExporterMessages.cs:120) — ComDualStringEntry::protocol was upgraded from &'static str to Cow<'static, str> to support both decoders without the agent's interim Box::leak workaround.
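
The reason for moving to Cow<'static, str> can be shown in a few lines: known protocol sequences stay borrowed static strings, while unknown tower ids get an owned, formatted label with no Box::leak. The protocol_label function is a hypothetical illustration, and only one known id (0x07, ncacn_ip_tcp) is listed here.

```rust
use std::borrow::Cow;

/// Label for a protocol tower id: borrowed when known, owned otherwise.
pub fn protocol_label(tower_id: u16) -> Cow<'static, str> {
    match tower_id {
        0x0007 => Cow::Borrowed("ncacn_ip_tcp"),
        other => Cow::Owned(format!("protseq_0x{:04x}", other)),
    }
}
```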

F8 — RpcError is duplicated across objref and pdu modules

Resolved: 2026-05-05 in this iteration's commit. RpcError was hoisted into the new shared crate::error::RpcError module as a single union of all wave 1 variants plus a generic Decode { offset, reason: &'static str, buffer_len } variant for the wave 2 ORPC parsers' one-off failures. objref and pdu re-export from there; M2 wave 2's orpc, object_exporter, and rem_unknown use it directly.

F13 — NmxClient high-level write/advise/subscribe wrappers

Resolved: 2026-05-05. All seven wrappers landed in crates/mxaccess-nmx/src/client.rs: write, write2, write_secured2, advise_supervisory, send_observed_pre_advise_metadata, register_reference, un_advise. Each takes a GalaxyTagMetadata + a typed WriteValue (re-exported from mxaccess-codec), builds the inner NMX body via mxaccess-codec (write_message::encode / encode_timestamped / secured_write::encode / NmxItemControlMessage / NmxMetadataQueryMessage / NmxReferenceRegistrationMessage), wraps in NmxTransferEnvelope, and routes through transfer_data. The pure-codec encode_*_transfer_body helpers are extracted as pub(crate) fn for testability, mirroring the .NET reference's internal static shape. un_advise preserves the .NET reference's quirky NmxTransferMessageKind::Write envelope (not ItemControl) per cs:457.

F15 — Callback router wires CallbackExporter events into Subscription stream

Resolved: 2026-05-05 across two commits.

  • Step 1/2 (2b849ae): Session::connect_nmx now starts a CallbackExporter on a 127.0.0.1 ephemeral port, builds the OBJREF via local_hostname() + 127.0.0.1 fallback, registers it through NmxClient::register_engine_2 (was ..._without_callback). A callback_router task drains CallbackEvents, decodes each CallbackInvoked body via NmxSubscriptionMessage::parse_inner, and broadcasts parsed messages on a tokio::sync::broadcast channel exposed via Session::callbacks(). Shutdown chains: UnregisterEngine → CallbackExporter::shutdown → wait for router task.
  • Step 2/2 (this commit): Subscription now impls Stream<Item = Result<DataChange, Error>>. Filtering follows the .NET reference at cs:333-343 exactly — 0x32 SubscriptionStatus messages are kept only when message.item_correlation_id == subscription.correlation_id; 0x33 DataUpdate messages pass through to ALL subscriptions because the codec exposes no per-record correlation field (matches the .NET MxNativeCallbackEvent filter behavior verbatim). Each NmxSubscriptionRecord with a parseable value becomes one DataChange. Records with value: None are dropped silently (mirrors the .NET evt.Record.Value is null filter at cs:337). Lag-loss surfaces as Error::Configuration(InvalidArgument) carrying the lag count. Stream-end (broadcast sender dropped) yields None. New helper: filetime_to_system_time (inverse of the existing system_time_to_filetime); saturates at Unix epoch for pre-1970 FILETIMEs. Tests cover correlation match/mismatch for 0x32, 0x33 pass-through for any correlation, and FILETIME round-trip.
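
The Step 2/2 filter rule and the FILETIME helper can be sketched as follows. The keeps_message signature, the decision to drop message kinds other than 0x32 and 0x33, and the fallback to the Unix epoch if the platform cannot represent the result are assumptions of this sketch; the real code operates on parsed NmxSubscriptionMessage values.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// 100 ns ticks between 1601-01-01 and 1970-01-01.
const FILETIME_UNIX_EPOCH_TICKS: u64 = 116_444_736_000_000_000;
const TICKS_PER_SECOND: u64 = 10_000_000;

/// Filter rule: 0x32 SubscriptionStatus needs a matching correlation id,
/// 0x33 DataUpdate passes through to every subscription.
pub fn keeps_message(
    kind: u8,
    item_correlation_id: [u8; 16],
    subscription_correlation_id: [u8; 16],
) -> bool {
    match kind {
        0x32 => item_correlation_id == subscription_correlation_id,
        0x33 => true,
        _ => false, // assumption: other kinds are not delivered
    }
}

/// Converts a FILETIME tick count to SystemTime, saturating at the Unix
/// epoch for pre-1970 values.
pub fn filetime_to_system_time(filetime: u64) -> SystemTime {
    let ticks = filetime.saturating_sub(FILETIME_UNIX_EPOCH_TICKS);
    let secs = ticks / TICKS_PER_SECOND;
    // At most 9_999_999 * 100, which fits in u32 and stays below 1e9.
    let nanos = (ticks % TICKS_PER_SECOND) as u32 * 100;
    UNIX_EPOCH
        .checked_add(Duration::new(secs, nanos))
        .unwrap_or(UNIX_EPOCH) // sketch-only fallback for unrepresentable times
}
```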

F1 — NTLM consumer-layer helpers (workstation default + from_env constructor)

Resolved: 2026-05-05. NtlmClientContext::from_env() reads MX_RPC_USER / MX_RPC_PASSWORD / MX_RPC_DOMAIN (mirrors ManagedNtlmClientContext.FromEnvironment at cs:41-49); empty MX_RPC_DOMAIN is permitted. local_hostname() checks COMPUTERNAME then HOSTNAME and returns the empty string when neither is set — same "unavailable" semantics as Environment.MachineName returning null. Lives in mxaccess-rpc/src/ntlm.rs; deliberately doesn't pull gethostname (no native-libc deps, no unsafe for hostname lookup). Added NtlmError::MissingEnvVar { name } for the env-var-unset case. Test mod gained an EnvScope + ENV_LOCK mutex pattern for serializing process-global env mutation across parallel tests.
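
The hostname fallback order is easy to model with an injected lookup, which also sidesteps the process-global env mutation that the EnvScope + ENV_LOCK pattern exists to serialize. The local_hostname_with helper is hypothetical and only illustrates the COMPUTERNAME-then-HOSTNAME order and the empty-string result.

```rust
/// Returns the first variable that is set, checking COMPUTERNAME and then
/// HOSTNAME, or the empty string when neither is set.
/// `lookup` is injected so tests need not touch the real environment.
pub fn local_hostname_with<F>(lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    ["COMPUTERNAME", "HOSTNAME"]
        .iter()
        .find_map(|name| lookup(*name))
        .unwrap_or_default()
}

/// Environment-backed variant of the same lookup.
pub fn local_hostname() -> String {
    local_hostname_with(|name| std::env::var(name).ok())
}
```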

F9 — ObjectExporterClient.cs ResolveOxid wrapper methods

Resolved: 2026-05-05. Both portable methods land in crates/mxaccess-rpc/src/object_exporter_client.rs: resolve_oxid_unauthenticated (mirrors cs:14-30) and resolve_oxid_with_managed_ntlm_packet_integrity (mirrors cs:66-81). Each opens a TCP connection, binds to IObjectExporter, calls opnum 0 with the encoded request, and decodes the response — preferring parse_resolve_oxid_result then falling back to parse_resolve_oxid_failure for short stubs. The two SSPI flavours (ResolveOxidWithNtlmConnect, ResolveOxidWithNtlmPacketIntegrity) wrap .NET's System.Net.Security.SspiClientContext and are explicitly out of scope for the Rust port — that's a permanent skip, not a deferral.

F17 — Guid::parse_str helper (dashed-hex string parser)

Resolved: 2026-05-05. Guid::parse_str(&str) -> Result<Guid, RpcError> landed in crates/mxaccess-rpc/src/guid.rs:65-112 as the inverse of the existing Display impl. Accepts the canonical dashed-hex form, optionally wrapped in {} braces (.NET B format), case-insensitive, and tolerant of bare 32-char hex without dashes. Single-pass char-by-char nibble accumulator avoids per-byte string allocation; the same byte-swap of groups 1-3 the Display impl does is applied after the raw hex pass. Eight new tests cover round-trip against the Display fixture (b49f92f7-c748-4169-8eca-a0670b012746), braces, uppercase, no-dashes, zero-GUID, too-short, too-long, and non-hex rejection. The five live-NMX examples (connect-write-read, subscribe, recovery, multi-tag, secured-write) lost their per-file 15-line parse_guid helpers in favour of the canonical implementation. Test count delta: 524 → 532 (+8).
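
A single-pass nibble accumulator of the kind described might look like this. It is a simplified sketch, not the code at guid.rs:65-112: it returns Option instead of Result<Guid, RpcError>, the Guid newtype over [u8; 16] is an assumption, and it skips dashes wherever they appear rather than validating their positions.

```rust
/// Illustrative GUID holder in wire (mixed-endian) byte order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Guid(pub [u8; 16]);

/// Parses dashed-hex, optionally brace-wrapped, case-insensitive, or bare
/// 32-char hex. Returns None on bad length or a non-hex character.
pub fn parse_guid(text: &str) -> Option<Guid> {
    // Accept the .NET "B" format: {xxxxxxxx-xxxx-...}.
    let inner = match text.strip_prefix('{') {
        Some(rest) => rest.strip_suffix('}')?,
        None => text,
    };
    let mut raw = [0u8; 16];
    let mut nibbles = 0usize;
    for ch in inner.chars() {
        if ch == '-' {
            continue; // simplification: dash positions are not checked
        }
        let value = ch.to_digit(16)? as u8;
        if nibbles >= 32 {
            return None; // too long
        }
        raw[nibbles / 2] = (raw[nibbles / 2] << 4) | value;
        nibbles += 1;
    }
    if nibbles != 32 {
        return None; // too short
    }
    // Groups 1-3 are little-endian on the wire; the text form is big-endian.
    raw[0..4].reverse();
    raw[4..6].reverse();
    raw[6..8].reverse();
    Some(Guid(raw))
}
```

Accumulating nibbles directly into the output array avoids allocating a substring per byte, and the three reversals apply the group 1-3 byte swap after the raw hex pass.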