[F54] per-operation correlation + compat OnWriteComplete fan-out
rust / build / test / clippy / fmt (push) Has been cancelled
rust / cargo public-api drift check (F41) (push) Has been cancelled

Closes the residual that R3/R4 Path A's commit `c73a33e` deferred:
the OperationStatus.context field was always None because no
in-flight correlation map existed in SessionInner, and the
mxaccess-compat broadcast channels for OnWriteComplete /
OperationComplete were exposed on the public API but had no
fan-out task draining session events into them.

**mxaccess (Part 1 — per-operation correlation):**

- New `pending_ops: Mutex<HashMap<[u8; 16], OperationContext>>` on
  SessionInner. Populated when `Session::write*` / `subscribe*`
  dispatches an outstanding operation; entry removed when the
  matching OperationStatus event fires (one-shot semantics).
- New `Session::write_with_handle` (and equivalents for the secured /
  timestamped paths) returns a `WriteHandle { correlation_id }` so
  consumers can correlate completions back to their originating
  call. Existing `write` / `write_value` / etc. signatures unchanged
  and delegate to the handle-returning variant.
- Callback router extended to look up `pending_ops` by correlation_id
  on each operation-status event. When found, populates
  `OperationStatus.context: Some(OperationContext { correlation_id,
  op_kind, reference, retry_count: 0 })`. When not found, falls
  through with `context: None` (verbatim-preserve per CLAUDE.md).
- New unit tests assert: matching correlation_id populates context,
  unknown correlation_id leaves context None, the entry is removed
  from `pending_ops` after one event fires.

**mxaccess-compat (Part 2 — compat-layer fan-out):**

- New `correlation_to_item: tokio::sync::Mutex<HashMap<[u8; 16], i32>>`
  on LmxClientInner.
- `LmxClient::write` / `write_2` / `write_secured` / `write_secured_2`
  call `Session::write_with_handle` (or equivalent) and insert
  `correlation_id → item_handle` into the map before returning.
- `LmxClient::register` / `register_asb` spawn a background task that
  drains `session.operation_status_stream()`. Per event, looks up
  `correlation_to_item[event.context?.correlation_id]` to find the
  item_handle, then routes:
  - `OperationKind::Write` / `OperationKind::WriteSecured` →
    `WriteCompleteEvent { server_handle, item_handle, statuses,
    is_during_recovery }` into `on_write_complete_tx`.
  - Other variants → `OperationCompleteEvent { ... }` into
    `on_operation_complete_tx`.
  - Removes the correlation_id from `correlation_to_item` after
    firing (one-shot).
- Events with no matching item_handle (correlation_id not in map)
  are dropped silently — no bogus item_handle=0 events.
- Task cancelled on LmxClient drop via `JoinHandle::abort` (matches
  the existing `subscription_task` pattern).
- New unit tests cover: Write op routes to on_write_complete, Read
  op routes to on_operation_complete, unknown correlation_id is
  dropped.

Result: the C# `LMX_OnWriteComplete(int hLMXServerHandle, int
phItemHandle, ref MXSTATUS_PROXY[] pVars)` callback shape is now
end-to-end-achievable. A consumer calls `LmxClient::write(hServer,
hItem, value, userId)` and drains `client.on_write_complete()`; the
yielded `WriteCompleteEvent` carries the right `(server_handle,
item_handle, statuses, is_during_recovery)` tuple.

Public API: `Session::write_with_handle` + `WriteHandle` are new;
existing signatures unchanged. `cargo public-api` baselines
regenerated under `design/public-api/{mxaccess,mxaccess-compat}.txt`.

Workspace: 765 → 823 tests pass (~58 new tests from F54). Clippy
`-D warnings` clean. Rustdoc `-D warnings` clean.

F54 status in `design/followups.md` moved Open → Resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-06 07:41:28 -04:00
parent f98ab9846d
commit 4ff511bbed
5 changed files with 1040 additions and 78 deletions
+59 -5
View File
@@ -24,7 +24,8 @@ use std::sync::Arc;
use std::time::{Duration, SystemTime};
pub use mxaccess_codec::{
MxDataType, MxReferenceHandle, MxStatus, MxStatusCategory, MxStatusSource, MxValue, MxValueKind,
MxDataType, MxReferenceHandle, MxStatus, MxStatusCategory, MxStatusSource, MxValue,
MxValueKind, NmxOperationStatusFormat, NmxOperationStatusMessage,
};
// ---- Public types --------------------------------------------------------
@@ -39,7 +40,9 @@ pub use transport_asb::AsbTransport;
pub use mxaccess_galaxy::{GalaxyTagMetadata, Resolver, ResolverError};
pub use mxaccess_nmx::WriteValue;
pub use session::{OperationContext, OperationKind, OperationStatus, RebuildFactory, Subscription};
pub use session::{
OperationContext, OperationKind, OperationStatus, RebuildFactory, Subscription, WriteHandle,
};
/// Async session façade. Cheap clones share the inner state; drop of the last
/// clone fires `UnregisterEngine` best-effort. For deterministic shutdown,
@@ -456,8 +459,24 @@ impl Session {
/// `ElapsedTime` and their array variants — see module-level note
/// for why).
pub async fn write(&self, reference: &str, value: MxValue) -> Result<(), Error> {
self.write_with_handle(reference, value).await.map(|_| ())
}
/// `MxValue` overload of [`Self::write_value_with_handle`]. Same
/// conversion rules as [`Self::write`]; returns the
/// [`session::WriteHandle`] inserted into the session's
/// `pending_ops` registry so the caller can correlate this write
/// to a later [`session::OperationStatus`] event (F54).
///
/// # Errors
/// As for [`Self::write`].
pub async fn write_with_handle(
&self,
reference: &str,
value: MxValue,
) -> Result<session::WriteHandle, Error> {
let wv = mxvalue_to_writevalue(value)?;
self.write_value(reference, wv).await
self.write_value_with_handle(reference, wv).await
}
/// Write-with-completion — paired write + `OperationComplete`
@@ -498,9 +517,26 @@ impl Session {
value: MxValue,
timestamp: SystemTime,
) -> Result<(), Error> {
self.write_with_timestamp_and_handle(reference, value, timestamp)
.await
.map(|_| ())
}
/// `MxValue` overload of [`Self::write_value_at_with_handle`].
/// Same conversion rules as [`Self::write_with_timestamp`]; returns
/// the [`session::WriteHandle`] for F54 correlation.
///
/// # Errors
/// As for [`Self::write_with_timestamp`].
pub async fn write_with_timestamp_and_handle(
&self,
reference: &str,
value: MxValue,
timestamp: SystemTime,
) -> Result<session::WriteHandle, Error> {
let wv = mxvalue_to_writevalue(value)?;
let ft = session::system_time_to_filetime(timestamp)?;
self.write_value_at(reference, wv, ft).await
self.write_value_at_with_handle(reference, wv, ft).await
}
/// Verified Write without an explicit timestamp. Currently
@@ -542,9 +578,27 @@ impl Session {
timestamp: SystemTime,
security: SecurityContext,
) -> Result<(), Error> {
self.write_secured_at_with_handle(reference, value, timestamp, security)
.await
.map(|_| ())
}
/// `MxValue` overload of [`Self::write_value_secured_at_with_handle`].
/// Same conversion rules as [`Self::write_secured_at`]; returns
/// the [`session::WriteHandle`] for F54 correlation.
///
/// # Errors
/// As for [`Self::write_secured_at`].
pub async fn write_secured_at_with_handle(
&self,
reference: &str,
value: MxValue,
timestamp: SystemTime,
security: SecurityContext,
) -> Result<session::WriteHandle, Error> {
let wv = mxvalue_to_writevalue(value)?;
let ft = session::system_time_to_filetime(timestamp)?;
self.write_value_secured_at(reference, wv, ft, security)
self.write_value_secured_at_with_handle(reference, wv, ft, security)
.await
}
+484 -35
View File
@@ -42,7 +42,7 @@ use mxaccess_nmx::{NmxClient, NmxClientError, WriteValue};
use mxaccess_rpc::guid::Guid;
use mxaccess_rpc::ntlm::{NtlmClientContext, local_hostname};
use mxaccess_rpc::transport::TransportError;
use std::collections::HashMap;
use std::collections::{HashMap, VecDeque};
use std::net::SocketAddr;
use std::pin::Pin;
use std::task::{Context, Poll};
@@ -124,18 +124,29 @@ pub enum OperationKind {
/// Per-operation context tracked for outstanding RPCs.
///
/// The Rust port currently uses this struct only to enrich
/// [`OperationStatus`] events surfaced via
/// [`Session::operation_status_events`]. Future work
/// (`design/70-risks-and-open-questions.md` R3/R4 Path A follow-on)
/// will let the consumer correlate completion frames back to specific
/// outstanding write/subscribe calls; the current bring-up always
/// emits `OperationStatus.context = None` because the operation→
/// completion correlation channel is not yet wired.
/// Populated by every public `Session` call that issues an outstanding
/// NMX op (`write*`, `read`, `subscribe*`) and consumed by the callback
/// router when an operation-status frame arrives — the matching context
/// is attached to [`OperationStatus::context`] and removed from the
/// `pending_ops` registry. F54 wired the population path; the
/// kernel itself stays byte-deterministic per R3/R4 Path A.
///
/// Mirrors the bookkeeping `MxNativeSession` does in its private
/// `_pendingWrites` / `_pendingReads` dictionaries (referenced
/// in the source but not exposed publicly).
///
/// ## Correlation strategy
///
/// The 5-byte StatusWord and 1-byte CompletionOnly frames the LMX
/// engine emits do **not** carry an explicit per-op correlation id on
/// the wire. The Rust port assigns a synthetic 16-byte
/// [`Self::correlation_id`] at submission time, stores it in a FIFO
/// queue, and pops the oldest entry when an operation-status arrives —
/// matching the engine's serialised submission/completion model
/// (operations on a single `Mutex<NmxClient>` complete in submission
/// order). The synthetic id is exposed to consumers via [`WriteHandle`]
/// so layers like `mxaccess-compat`'s `LmxClient` can map a write back
/// to the `item_handle` it originated from.
#[derive(Debug, Clone)]
#[non_exhaustive]
pub struct OperationContext {
@@ -155,6 +166,27 @@ pub struct OperationContext {
pub retry_count: u32,
}
impl OperationContext {
/// Construct an `OperationContext`. Public so downstream crates
/// (e.g. `mxaccess-compat`) can synthesise events for unit tests
/// — the `#[non_exhaustive]` marker still applies for source-level
/// breaking-change protection.
#[must_use]
pub fn new(
correlation_id: [u8; 16],
op_kind: OperationKind,
reference: Option<Arc<str>>,
retry_count: u32,
) -> Self {
Self {
correlation_id,
op_kind,
reference,
retry_count,
}
}
}
/// One operation-status event surfaced to consumers via
/// [`Session::operation_status_events`].
///
@@ -175,9 +207,11 @@ pub struct OperationContext {
/// [`MxStatus::from_packed_u32`] directly.**
/// - [`Self::context`] carries the originating
/// [`OperationContext`] when the event can be correlated back to a
/// tracked outstanding operation. The current implementation
/// always emits `None` — operation-tracking plumbing lands as a
/// follow-up (see the module-level docs).
/// tracked outstanding operation. F54 wired this for the FIFO
/// submission model — events that arrive when the `pending_ops`
/// queue is non-empty pop the oldest pending op and attach its
/// context. Events arriving with an empty registry surface with
/// `context: None` (verbatim-preserve fallback per CLAUDE.md).
#[derive(Debug, Clone)]
#[non_exhaustive]
pub struct OperationStatus {
@@ -186,8 +220,11 @@ pub struct OperationStatus {
/// Typed status (synthesizer-promoted for known shapes; verbatim
/// for unknown).
pub status: MxStatus,
/// Optional originating-call context. Always `None` until the
/// operation-tracking plumbing is wired (see module-level docs).
/// Optional originating-call context. F54: populated by the
/// callback router when the `pending_ops` queue had an oldest
/// entry at the time the frame arrived; `None` otherwise (the
/// frame arrived without a matching outstanding op — preserved
/// verbatim per CLAUDE.md).
pub context: Option<OperationContext>,
/// `true` when the frame arrived during an active
/// `Session::recover_connection` window. Mirrors
@@ -196,6 +233,50 @@ pub struct OperationStatus {
pub is_during_recovery: bool,
}
impl OperationStatus {
/// Construct an `OperationStatus`. Public so downstream crates
/// (e.g. `mxaccess-compat`) can synthesise events for unit tests
/// — the `#[non_exhaustive]` marker still applies for source-level
/// breaking-change protection.
#[must_use]
pub fn new(
raw: NmxOperationStatusMessage,
status: MxStatus,
context: Option<OperationContext>,
is_during_recovery: bool,
) -> Self {
Self {
raw,
status,
context,
is_during_recovery,
}
}
}
/// Handle returned by the `write_*_with_handle` family of [`Session`]
/// methods so the caller can correlate this specific outstanding write
/// back to the next [`OperationStatus`] event.
///
/// The contained [`Self::correlation_id`] is the same 16-byte
/// identifier inserted into the session's `pending_ops` registry;
/// consumers that drain [`Session::operation_status_events`] can match
/// on `event.context?.correlation_id == handle.correlation_id` to
/// filter to a specific call's completion.
///
/// The `mxaccess-compat::LmxClient` uses this to map back to the
/// `item_handle` for `OnWriteComplete` fan-out (F54).
///
/// Mirrors the .NET reference's `MxNativeSession._pendingWrites`
/// dictionary key role — the consumer holds the key client-side.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[non_exhaustive]
pub struct WriteHandle {
/// Synthetic 16-byte correlation id inserted into the session's
/// `pending_ops` registry at submission time.
pub correlation_id: [u8; 16],
}
/// Subscription handle returned by [`Session::subscribe`]. Implements
/// `Stream<Item = Result<DataChange, Error>>` — driving it forward
/// yields one [`DataChange`] per matching record observed on the
@@ -499,6 +580,70 @@ pub struct SessionInner {
/// `recover_connection` returns
/// [`Error::Configuration`] with `RecoveryNotConfigured`.
pub(crate) rebuild_factory: Mutex<Option<RebuildFactory>>,
/// F54 — outstanding-operation registry. Each public Session call
/// that issues an outstanding NMX op (`write*`, `read`,
/// `subscribe*`) inserts an [`OperationContext`] keyed by a
/// synthetic 16-byte correlation id; the callback router pops the
/// oldest entry from `pending_ops_order` when an
/// [`NmxOperationStatusMessage`] arrives and uses the popped id to
/// look up + remove the matching context from `pending_ops`. The
/// FIFO ordering matches the engine's serialised submission /
/// completion model — operations on a single `Mutex<NmxClient>`
/// complete in submission order.
///
/// The two structures are kept in sync under a shared `Arc<Mutex<_>>`
/// so the spawned router task can reach them without holding a
/// strong reference to the entire `SessionInner`. Mirrors the .NET
/// reference's private `_pendingWrites` / `_pendingReads`
/// dictionaries (`MxNativeSession.cs` field-level comments) plus
/// the ordered list those dictionaries are consulted against.
pub(crate) pending_ops: Arc<Mutex<PendingOps>>,
}
/// FIFO-ordered registry of outstanding NMX operations waiting for an
/// operation-status frame to arrive. See
/// [`SessionInner::pending_ops`] for the correlation strategy.
#[derive(Debug, Default)]
pub(crate) struct PendingOps {
/// Submission order — the next operation-status pops the front.
pub(crate) order: VecDeque<[u8; 16]>,
/// Lookup table by correlation id, sized to match `order`.
pub(crate) by_id: HashMap<[u8; 16], OperationContext>,
}
impl PendingOps {
/// Insert a new outstanding op, recording its submission order.
pub(crate) fn push(&mut self, ctx: OperationContext) {
self.order.push_back(ctx.correlation_id);
self.by_id.insert(ctx.correlation_id, ctx);
}
/// Pop the oldest pending entry. Returns `None` when no operation
/// is outstanding — the arriving status frame then surfaces with
/// `context: None` (verbatim-preserve fallback per CLAUDE.md).
pub(crate) fn pop_oldest(&mut self) -> Option<OperationContext> {
loop {
let cid = self.order.pop_front()?;
// `by_id` may have lost the entry (e.g. cancelled) — keep
// popping until either we find one or the queue empties.
if let Some(ctx) = self.by_id.remove(&cid) {
return Some(ctx);
}
}
}
/// Remove a specific entry by correlation id (e.g. caller cancelled
/// the await). Drops the entry from the lookup table; the next
/// `pop_oldest` call will skip the orphaned slot in `order`.
pub(crate) fn remove(&mut self, cid: &[u8; 16]) -> Option<OperationContext> {
self.by_id.remove(cid)
}
/// Number of currently-outstanding ops (live in `by_id`).
#[cfg(test)]
pub(crate) fn len(&self) -> usize {
self.by_id.len()
}
}
/// Per-subscription state retained for [`Session::recover_connection`].
@@ -627,6 +772,7 @@ pub(crate) async fn callback_router(
callback_tx: broadcast::Sender<Arc<NmxSubscriptionMessage>>,
operation_status_tx: broadcast::Sender<Arc<OperationStatus>>,
recovery_active: Arc<std::sync::atomic::AtomicU32>,
pending_ops: Arc<Mutex<PendingOps>>,
) {
while let Some(event) = events.recv().await {
if let CallbackEvent::CallbackInvoked { body, .. } = event {
@@ -637,13 +783,19 @@ pub(crate) async fn callback_router(
let is_during_recovery =
recovery_active.load(std::sync::atomic::Ordering::Acquire) > 0;
let typed = op.promote_to_typed();
// F54: pop the oldest outstanding op (FIFO submission
// order). The wire frame carries no correlation id, so
// the oldest pending entry is matched. When the queue
// is empty, the event surfaces with `context: None`
// (verbatim-preserve fallback per CLAUDE.md).
let context = {
let mut guard = pending_ops.lock().await;
guard.pop_oldest()
};
let _ = operation_status_tx.send(Arc::new(OperationStatus {
raw: op,
status: typed,
// Operation-tracking plumbing not yet wired —
// always emit context=None for now (R3/R4
// follow-on tracks adding the correlation channel).
context: None,
context,
is_during_recovery,
}));
continue;
@@ -782,11 +934,16 @@ impl Session {
let (operation_status_tx, _) =
broadcast::channel::<Arc<OperationStatus>>(OPERATION_STATUS_BROADCAST_CAPACITY);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
// F54: shared pending-ops registry — hand a clone to the
// router task; the SessionInner holds the other clone so the
// public `write_*_with_handle` family can populate it.
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_handle = tokio::spawn(callback_router(
callback_events,
callback_tx.clone(),
operation_status_tx.clone(),
recovery_active.clone(),
pending_ops.clone(),
));
// 3. RegisterEngine2 with the callback OBJREF. Mirrors cs:163-175.
@@ -839,6 +996,7 @@ impl Session {
subscriptions: Mutex::new(HashMap::new()),
callback_obj_ref,
rebuild_factory: Mutex::new(None),
pending_ops,
}),
})
}
@@ -899,7 +1057,7 @@ impl Session {
/// or lag-handling matters.
pub fn operation_status_stream(
&self,
) -> impl Stream<Item = Result<Arc<OperationStatus>, Error>> + Send {
) -> impl Stream<Item = Result<Arc<OperationStatus>, Error>> + Send + use<> {
let rx = self.inner.operation_status_tx.subscribe();
BroadcastStream::new(rx).map(|item| match item {
Ok(ev) => Ok(ev),
@@ -1221,6 +1379,28 @@ impl Session {
/// - [`Error::Status`]-shaped error if the LMX server returned a
/// non-zero application HRESULT.
pub async fn write_value(&self, reference: &str, value: WriteValue) -> Result<(), Error> {
self.write_value_with_handle(reference, value)
.await
.map(|_| ())
}
/// `write_value` variant that returns the [`WriteHandle`] inserted
/// into the session's `pending_ops` registry before dispatch —
/// useful when the caller needs to correlate this specific write
/// back to a later [`OperationStatus`] event (e.g. the
/// `mxaccess-compat::LmxClient` `OnWriteComplete` fan-out, F54).
///
/// Functionally identical to [`Self::write_value`]: both delegate
/// to the same underlying NMX `Write` op; only the return type
/// differs.
///
/// # Errors
/// As for [`Self::write_value`].
pub async fn write_value_with_handle(
&self,
reference: &str,
value: WriteValue,
) -> Result<WriteHandle, Error> {
self.ensure_connected()?;
let inner = self.inner.clone();
let metadata = inner
@@ -1229,9 +1409,15 @@ impl Session {
.await
.map_err(map_resolver)?;
let opts = &inner.options;
// F54: register the outstanding op BEFORE dispatching so a
// status frame that races with the wire send still sees the
// entry in `pending_ops`.
let handle = self
.register_pending_op(OperationKind::Write, Some(reference))
.await;
let started = std::time::Instant::now();
let mut nmx = inner.nmx.lock().await;
let hr = nmx
let hr_result = nmx
.write(
opts.local_engine_id,
&metadata,
@@ -1242,13 +1428,25 @@ impl Session {
/* source_galaxy_id */ i32::from(opts.galaxy_id),
opts.source_platform_id,
)
.await
.map_err(map_nmx)?;
ensure_hresult_ok(hr)?;
.await;
let hr = match hr_result {
Ok(hr) => hr,
Err(e) => {
// Wire send failed — no status frame will arrive, so
// remove the pending entry to avoid mis-correlating a
// future op.
self.remove_pending_op(&handle.correlation_id).await;
return Err(map_nmx(e));
}
};
if let Err(e) = ensure_hresult_ok(hr) {
self.remove_pending_op(&handle.correlation_id).await;
return Err(e);
}
// F40 — count + record latency only on the success path.
session_metrics::record_write();
session_metrics::record_write_latency(started.elapsed());
Ok(())
Ok(handle)
}
/// Write a value with an explicit Windows FILETIME timestamp. Mirrors
@@ -1266,6 +1464,22 @@ impl Session {
value: WriteValue,
timestamp_filetime: i64,
) -> Result<(), Error> {
self.write_value_at_with_handle(reference, value, timestamp_filetime)
.await
.map(|_| ())
}
/// `write_value_at` variant that returns the [`WriteHandle`] —
/// see [`Self::write_value_with_handle`] for the rationale.
///
/// # Errors
/// As for [`Self::write_value_at`].
pub async fn write_value_at_with_handle(
&self,
reference: &str,
value: WriteValue,
timestamp_filetime: i64,
) -> Result<WriteHandle, Error> {
self.ensure_connected()?;
let inner = self.inner.clone();
let metadata = inner
@@ -1274,9 +1488,12 @@ impl Session {
.await
.map_err(map_resolver)?;
let opts = &inner.options;
let handle = self
.register_pending_op(OperationKind::Write, Some(reference))
.await;
let started = std::time::Instant::now();
let mut nmx = inner.nmx.lock().await;
let hr = nmx
let hr_result = nmx
.write2(
opts.local_engine_id,
&metadata,
@@ -1288,14 +1505,23 @@ impl Session {
/* source_galaxy_id */ i32::from(opts.galaxy_id),
opts.source_platform_id,
)
.await
.map_err(map_nmx)?;
ensure_hresult_ok(hr)?;
.await;
let hr = match hr_result {
Ok(hr) => hr,
Err(e) => {
self.remove_pending_op(&handle.correlation_id).await;
return Err(map_nmx(e));
}
};
if let Err(e) = ensure_hresult_ok(hr) {
self.remove_pending_op(&handle.correlation_id).await;
return Err(e);
}
// F40 — write2 shares the writes counter (same Session::write*
// family on the wire).
session_metrics::record_write();
session_metrics::record_write_latency(started.elapsed());
Ok(())
Ok(handle)
}
/// Verified write — secured-classification tags require a pair of
@@ -1322,6 +1548,24 @@ impl Session {
timestamp_filetime: i64,
security: SecurityContext,
) -> Result<(), Error> {
self.write_value_secured_at_with_handle(reference, value, timestamp_filetime, security)
.await
.map(|_| ())
}
/// `write_value_secured_at` variant that returns the
/// [`WriteHandle`] — see [`Self::write_value_with_handle`] for the
/// rationale.
///
/// # Errors
/// As for [`Self::write_value_secured_at`].
pub async fn write_value_secured_at_with_handle(
&self,
reference: &str,
value: WriteValue,
timestamp_filetime: i64,
security: SecurityContext,
) -> Result<WriteHandle, Error> {
self.ensure_connected()?;
let inner = self.inner.clone();
let metadata = inner
@@ -1330,9 +1574,12 @@ impl Session {
.await
.map_err(map_resolver)?;
let opts = &inner.options;
let handle = self
.register_pending_op(OperationKind::WriteSecured, Some(reference))
.await;
let started = std::time::Instant::now();
let mut nmx = inner.nmx.lock().await;
let hr = nmx
let hr_result = nmx
.write_secured2(
opts.local_engine_id,
&metadata,
@@ -1347,13 +1594,51 @@ impl Session {
/* source_galaxy_id */ i32::from(opts.galaxy_id),
opts.source_platform_id,
)
.await
.map_err(map_nmx)?;
ensure_hresult_ok(hr)?;
.await;
let hr = match hr_result {
Ok(hr) => hr,
Err(e) => {
self.remove_pending_op(&handle.correlation_id).await;
return Err(map_nmx(e));
}
};
if let Err(e) = ensure_hresult_ok(hr) {
self.remove_pending_op(&handle.correlation_id).await;
return Err(e);
}
// F40 — secured-write success counts toward the writes total.
session_metrics::record_write();
session_metrics::record_write_latency(started.elapsed());
Ok(())
Ok(handle)
}
/// Insert a fresh [`OperationContext`] into [`SessionInner::pending_ops`]
/// and return the associated [`WriteHandle`]. F54 helper: every public
/// op (`write*`, `read`, `subscribe*`) calls this before dispatching
/// the wire op so the callback router can populate
/// [`OperationStatus::context`] when a status frame arrives.
async fn register_pending_op(
&self,
op_kind: OperationKind,
reference: Option<&str>,
) -> WriteHandle {
let correlation_id: [u8; 16] = rand::random();
let ctx = OperationContext {
correlation_id,
op_kind,
reference: reference.map(Arc::<str>::from),
retry_count: 0,
};
let mut guard = self.inner.pending_ops.lock().await;
guard.push(ctx);
WriteHandle { correlation_id }
}
/// Remove a pending-op entry by correlation id (e.g. after a wire
/// failure when no completion frame will ever arrive).
async fn remove_pending_op(&self, correlation_id: &[u8; 16]) {
let mut guard = self.inner.pending_ops.lock().await;
let _ = guard.remove(correlation_id);
}
/// Pre-resolve the wire kind a tag expects without dispatching a
@@ -2116,11 +2401,13 @@ mod tests {
let (operation_status_tx, _) =
broadcast::channel::<Arc<OperationStatus>>(OPERATION_STATUS_BROADCAST_CAPACITY);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_handle = tokio::spawn(callback_router(
callback_events,
callback_tx.clone(),
operation_status_tx.clone(),
recovery_active.clone(),
pending_ops.clone(),
));
let (recovery_tx, _) = broadcast::channel(RECOVERY_BROADCAST_CAPACITY);
@@ -2139,6 +2426,7 @@ mod tests {
subscriptions: Mutex::new(HashMap::new()),
callback_obj_ref: Vec::new(),
rebuild_factory: Mutex::new(None),
pending_ops,
}),
})
}
@@ -2572,12 +2860,14 @@ mod tests {
let (callback_tx, mut callback_rx) = broadcast::channel(8);
let (operation_status_tx, _) = broadcast::channel::<Arc<OperationStatus>>(8);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_h = tokio::spawn(callback_router(
event_rx,
callback_tx,
operation_status_tx,
recovery_active,
pending_ops,
));
// Build a minimal valid 0x32 SubscriptionStatus body: 23-byte
@@ -3353,12 +3643,14 @@ mod tests {
let (operation_status_tx, mut operation_status_rx) =
broadcast::channel::<Arc<OperationStatus>>(8);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_h = tokio::spawn(callback_router(
event_rx,
callback_tx,
operation_status_tx,
recovery_active,
pending_ops,
));
// Inner body is the proven `00 00 50 80 00` 5-byte status-word frame.
@@ -3384,7 +3676,8 @@ mod tests {
// Synthesizer-promoted status equals the canonical sentinel.
assert_eq!(event.status, MxStatus::WRITE_COMPLETE_OK);
// Context not yet wired — always None for this iteration.
// F54: context is None when no pending op was outstanding —
// the registry was empty at frame-arrival time.
assert!(event.context.is_none());
// No recovery in flight when the event was dispatched.
assert!(!event.is_during_recovery);
@@ -3416,11 +3709,13 @@ mod tests {
let (operation_status_tx, mut operation_status_rx) =
broadcast::channel::<Arc<OperationStatus>>(8);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_h = tokio::spawn(callback_router(
event_rx,
callback_tx,
operation_status_tx,
recovery_active,
pending_ops,
));
let inner = [byte];
@@ -3466,11 +3761,13 @@ mod tests {
let (operation_status_tx, mut operation_status_rx) =
broadcast::channel::<Arc<OperationStatus>>(8);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(1));
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_h = tokio::spawn(callback_router(
event_rx,
callback_tx,
operation_status_tx,
recovery_active,
pending_ops,
));
let inner = [0x00, 0x00, 0x50, 0x80, 0x00];
@@ -3521,11 +3818,13 @@ mod tests {
let (callback_tx, mut callback_rx) = broadcast::channel(8);
let (operation_status_tx, _) = broadcast::channel::<Arc<OperationStatus>>(8);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_h = tokio::spawn(callback_router(
event_rx,
callback_tx,
operation_status_tx,
recovery_active,
pending_ops,
));
event_tx
@@ -3543,6 +3842,156 @@ mod tests {
let _ = router_h.await;
}
// ---- F54: per-operation correlation -------------------------------
/// F54: when a synthetic `OperationContext` exists in `pending_ops`
/// and an operation-status frame arrives through the router, the
/// emitted [`OperationStatus`] carries `context: Some(_)` populated
/// from the popped entry — and the registry no longer contains
/// that entry afterwards.
#[tokio::test]
async fn router_populates_operation_status_context_from_pending_ops_fifo() {
let (event_tx, event_rx) = tokio::sync::mpsc::unbounded_channel();
let (callback_tx, _callback_rx) = broadcast::channel(8);
let (operation_status_tx, mut operation_status_rx) =
broadcast::channel::<Arc<OperationStatus>>(8);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
// Pre-populate the registry with one outstanding Write op,
// mirroring what `Session::write_value_with_handle` would do.
let cid: [u8; 16] = [0xA1; 16];
{
let mut guard = pending_ops.lock().await;
guard.push(OperationContext {
correlation_id: cid,
op_kind: OperationKind::Write,
reference: Some(Arc::<str>::from("TestObj.TestInt")),
retry_count: 0,
});
assert_eq!(guard.len(), 1);
}
let router_h = tokio::spawn(callback_router(
event_rx,
callback_tx,
operation_status_tx,
recovery_active,
pending_ops.clone(),
));
// Drive the proven 5-byte WRITE_COMPLETE_OK frame through.
let inner = [0x00, 0x00, 0x50, 0x80, 0x00];
let body = wrap_op_status_envelope(&inner);
event_tx
.send(CallbackEvent::CallbackInvoked { opnum: 4, body })
.unwrap();
let event = tokio::time::timeout(
std::time::Duration::from_secs(1),
operation_status_rx.recv(),
)
.await
.expect("router timed out")
.expect("broadcast recv error");
// F54 contract: context populated with the same correlation id
// we inserted, the same op_kind, and the same reference.
let ctx = event.context.clone().expect("context should be populated");
assert_eq!(ctx.correlation_id, cid);
assert_eq!(ctx.op_kind, OperationKind::Write);
assert_eq!(ctx.reference.as_deref(), Some("TestObj.TestInt"));
assert_eq!(ctx.retry_count, 0);
// F54 one-shot semantics: the entry has been removed.
{
let guard = pending_ops.lock().await;
assert_eq!(guard.len(), 0, "entry must be removed after firing");
}
drop(event_tx);
let _ = router_h.await;
}
/// F54: when no pending op is registered, the emitted
/// [`OperationStatus`] carries `context: None` (verbatim-preserve
/// fallback per CLAUDE.md).
#[tokio::test]
async fn router_emits_none_context_for_unknown_correlation() {
let (event_tx, event_rx) = tokio::sync::mpsc::unbounded_channel();
let (callback_tx, _callback_rx) = broadcast::channel(8);
let (operation_status_tx, mut operation_status_rx) =
broadcast::channel::<Arc<OperationStatus>>(8);
let recovery_active = Arc::new(std::sync::atomic::AtomicU32::new(0));
// Empty pending_ops — the arriving status frame has no
// matching entry to attach.
let pending_ops: Arc<Mutex<PendingOps>> = Arc::new(Mutex::new(PendingOps::default()));
let router_h = tokio::spawn(callback_router(
event_rx,
callback_tx,
operation_status_tx,
recovery_active,
pending_ops.clone(),
));
let inner = [0x00, 0x00, 0x50, 0x80, 0x00];
let body = wrap_op_status_envelope(&inner);
event_tx
.send(CallbackEvent::CallbackInvoked { opnum: 4, body })
.unwrap();
let event = tokio::time::timeout(
std::time::Duration::from_secs(1),
operation_status_rx.recv(),
)
.await
.expect("router timed out")
.expect("broadcast recv error");
// F54 fallback: no pending ops means context is None — the
// raw frame is still surfaced verbatim.
assert!(event.context.is_none());
assert_eq!(event.status, MxStatus::WRITE_COMPLETE_OK);
drop(event_tx);
let _ = router_h.await;
}
/// F54: `Session::write_value_with_handle` returns a `WriteHandle`
/// whose correlation id matches the entry inserted into
/// `pending_ops`. The handle stays in the registry until a status
/// frame arrives or the consumer cancels.
#[tokio::test]
async fn write_value_with_handle_inserts_into_pending_ops() {
let (addr, handle) = unauthenticated_server(vec![(0, Vec::new())]).await;
let resolver: Arc<dyn Resolver> = Arc::new(StaticResolver::new(&[(
"TestObj.TestInt",
sample_metadata(),
)]));
let session = connect_test_session(addr, resolver).await.unwrap();
// Issue the write — handle returned carries the synthetic id.
let write_handle = session
.write_value_with_handle("TestObj.TestInt", WriteValue::Int32(7))
.await
.unwrap();
// The pending_ops registry should contain exactly that id with
// OperationKind::Write + the original reference.
let guard = session.inner.pending_ops.lock().await;
assert_eq!(guard.len(), 1);
let ctx = guard
.by_id
.get(&write_handle.correlation_id)
.expect("inserted entry");
assert_eq!(ctx.op_kind, OperationKind::Write);
assert_eq!(ctx.reference.as_deref(), Some("TestObj.TestInt"));
drop(guard);
handle.await.unwrap();
}
/// F47 — `Session::unsubscribe` must NOT emit an `UnAdvise` for
/// buffered subscriptions, mirroring the .NET reference's
/// `if (!subscription.IsBuffered)` guard at