Closes F16. Replaces the wave-2 no-op recover_connection with the
full .NET-equivalent shape (MxNativeSession.cs:399-474). Three
pieces:
1. Subscription registry on SessionInner.
New subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>>
tracks every active advise. subscribe() inserts after a successful
AdviseSupervisory; unsubscribe() removes on the success path only
(failed UnAdvises stay registered so next recovery replays them).
The consumer's Subscription handle still holds the BroadcastStream;
the registry is purely for AdviseSupervisory replay.
2. Pluggable RebuildFactory.
New public typedef:
pub type RebuildFactory = Arc<
dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient,
NmxClientError>>
+ Send>>
+ Send + Sync,
>;
Installed via Session::set_recovery_factory(factory);
queryable via has_recovery_factory(). Kept separate from
connect_nmx / connect_nmx_auto so existing constructors stay
non-breaking — consumers opt in by calling the setter
after-the-fact.
3. Real recover_connection + recover_connection_core.
recover_connection is the retry loop (mirrors cs:399-440): for
attempt in 1..=policy.max_attempts, emit RecoveryEvent::Started
→ call recover_connection_core → on Ok emit Recovered + return,
on Err emit Failed{will_retry, error}, sleep policy.delay, retry,
or bubble the last error.
recover_connection_core mirrors cs:442-474: rebuild NMX via the
factory → RegisterEngine2 with the saved callback_obj_ref → optional
SetHeartbeatSendInterval → snapshot the registry under the lock,
replay AdviseSupervisory(correlation_id) for each entry → atomically
swap *nmx_lock = replacement. Old NmxClient drops at end of scope,
closing its TCP transport.
Subscription correlation ids are preserved across the swap so the
consumer's Subscription stream continues to receive on its existing
broadcast filter. The CallbackExporter stays bound across recoveries
— no TCP listener re-bind.
R15's "long-lived connection task" was listed as a hard prereq, but
the existing Mutex<NmxClient> already serialises concurrent ops
during the rebuild — recover_connection_core holds the inner mutex
during the swap, concurrent ops just wait. Functionally equivalent
to the long-lived-task design.
New ConfigError::RecoveryNotConfigured returned when
recover_connection is called without a factory installed. New
public re-export: RebuildFactory.
Tests (mxaccess 65 → 67):
- recover_connection_without_factory_returns_recovery_not_configured
- recover_connection_with_always_failing_factory_exhausts_attempts
(pins (Started, Failed)×3 + final will_retry=false + bubbled
TransportFailure)
- subscribe_populates_registry_unsubscribe_clears_it
- recovery_events_supports_multiple_subscribers (updated for the
new factory-required path)
connect_nmx_auto-side auto-population of the factory (capturing the
ntlm_factory + discovered (addr, service_ipid) so consumers don't
re-author the closure) is a future polish — not required to close
F16.
design/followups.md: F16 moved to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
design/ — Rust port architectural plan
This folder is the design contract for the Rust replacement of AVEVA/Wonderware MXAccess. It is the gap between the .NET reference in src/ and the Rust crates that will be written under a sibling rust/ workspace (per CLAUDE.md).
The folder is structured as a small set of focused documents. Read in order; each builds on the previous.
| File | Purpose |
|---|---|
00-overview.md |
Mission, two-layer goal, architectural principles, non-goals |
10-raw-layer.md |
Byte-accurate raw MXAccess layer (codec + transport + session) |
20-async-layer.md |
Idiomatic Tokio async layer on top of the raw layer |
30-crate-topology.md |
Cargo workspace, crates, dependencies, build/test commands |
40-protocol-invariants.md |
Bill of materials: IIDs, opnums, envelope/handle bytes |
50-error-model.md |
MxStatus, error types, panic/cancellation policy |
60-roadmap.md |
Milestones M0..M6, validation strategy |
70-risks-and-open-questions.md |
Parity gaps, unproven flows, cross-platform constraints |
dependencies.md |
Cross- and within-milestone parallelism map; agent budget per phase |
review.md |
Adversarial review log (BLOCKER/MAJOR/MINOR/NIT findings, all resolved) |
prompt.md |
/loop driver prompt for autonomous M2–M6 execution |
followups.md |
Open / resolved deferred work items; auto-triaged by prompt.md Step 0 (created on first /loop run if missing) |
The design is grounded in the .NET reference at src/ and the protocol artifacts in docs/, analysis/, and captures/. Do not introduce protocol behavior in these documents that is not already proven in the reference. When adding a new claim about wire format, cite either:
- a
.csfile path insrc/MxNativeCodec/,src/MxNativeClient/, orsrc/MxAsbClient/, or - a
docs/*.mdspec file, or - a
captures/0NN-frida-*directory oranalysis/frida/*.tsvrow.
This folder is documentation, not code. When the Rust workspace is created, the design here is the contract it must satisfy. When evidence in captures/ invalidates a design decision here, update the design first, then the code.
Reading order
- New contributor: 00 → 30 → 10 → 40 → 20 → 50 → 60 → 70.
- Protocol question: 40 first, then the relevant section of 10.
- API question: 20 first, then 50.
- Planning a milestone: 60 first, cross-reference 70 for blockers.
- Scheduling concurrent work:
dependencies.mdfor the per-phase parallelism map. - Driving M2–M6 autonomously via
/loop:prompt.md(and thefollowups.mdtriage log it maintains).