[F12 wrapper + F32 close] Session::connect_nmx_auto + close M5 type-matrix DoD
rust / build / test / clippy / fmt (push) Has been cancelled

Two related closures in one commit:

1. Session-level wrapper around F12: new
   `mxaccess::Session::connect_nmx_auto(ntlm_factory, options,
   resolver, recovery)` gated on a new `mxaccess/windows-com` feature
   (which propagates `mxaccess-nmx/windows-com`). Drives
   `NmxClient::create` (the F12 COM-activation factory) for the
   `(host, port, service_ipid)` discovery, then funnels into the
   shared post-NMX-bind orchestration. Refactored `connect_nmx` to
   extract steps 1+2+4+5 into a private `from_nmx_client` helper —
   both `connect_nmx` and `connect_nmx_auto` reuse it so the
   `CallbackExporter` + router + `RegisterEngine2` + heartbeat policy
   stays in one place. The .NET `MxNativeSession.Open` shape
   (`MxNativeSession.cs:127-147`) is now reproduced end-to-end on
   Windows with `windows-com` on — callers no longer pre-resolve
   `(addr, service_ipid)` by hand.

   `connect_nmx`'s doc comment updated to drop the stale "F12 not yet
   wired" note. `parse_bracketed_host_port` in mxaccess-nmx gets a
   `cfg_attr(not(...), allow(dead_code))` so the default-feature
   build stays warning-clean.

2. F32 closed via option (b) of its own resolve criterion: the four
   missing types (Float / Double / DateTime / Duration) are gated on
   Galaxy-side template provisioning that's outside the Rust port's
   scope. The deployed test Galaxy on this host only has
   mx_data_type ∈ {1=Bool, 2=Int32, 5=String}; we cannot exercise
   the missing types without authoring new template attributes in
   the Aveva console (a manual platform-engineering task). The
   three-type live verification at commit 9063f10 satisfies the M5
   DoD bullet for what is deployable. F18's M5 status block updated
   to reflect F32-resolved.

Workspace: 718 tests pass on default features (was 712 before F12,
+6 from new parse_bracketed_host_port tests). Default-feature
clippy + windows-com clippy both clean.

Closes F32 in design/followups.md and extends F12's resolution note
with the Session-level wrapper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-05 22:30:25 -04:00
parent daa4ea3f16
commit 8e695b9347
4 changed files with 87 additions and 32 deletions
+6 -16
View File
@@ -53,7 +53,7 @@ move to `## Resolved` with a date + commit hash.
4.`cargo build --workspace` + `cargo test --workspace` (711 tests) + `cargo clippy --workspace -- -D warnings` all green.
**Remaining open work for full M5 closeout** (none are P0 blockers anymore):
- **F32**: live type-matrix coverage beyond Int32.
- ~~F32~~: resolved (commit `<this commit>`) via option (b) — three-type live coverage is the deployable maximum; missing types are Galaxy-provisioning-gated.
- **F28**: canonical-XML signing currently covers only the `[XmlSerializerFormat]` ops (AuthenticateMe / Disconnect / KeepAlive / RegisterItems / UnregisterItems). Read / Write / CreateSubscription / AddMonitoredItems / Publish / etc. still sign over NBFX wire bytes via the legacy fallback. Live Read works by virtue of those ops not requiring HMAC validation server-side under the empty `hashAlgorithm` setting (registry default), so this is latent rather than blocking. Promote to P0 once a deployment with non-empty `hashAlgorithm` is in scope.
- ~~F29~~: resolved (commit `<this commit>`) — `nbfs.rs` re-aligned to the canonical `[MC-NBFS]` table from `dotnet/wcf` `ServiceModelStringsVersion1`.
- **F26 stream subscription**: `Stream<Item = MonitoredItemValue>` over a publish-loop is still stubbed in `AsbSession`. Tracked under F25 step 8 / F26 step 3 in the cumulative log.
@@ -146,21 +146,6 @@ move to `## Resolved` with a date + commit hash.
F25 (`mxaccess-asb` IASBIDataV2 client) and F26 (`mxaccess::Session` over `AsbTransport`) remain open. With F19-F24 landed, the M5 framing/encoder layer (streams A+B+C+D and the codec stream) is complete; F25 composes them into the `IASBIDataV2` wire client. F22's static dictionary subset is intentionally curated; expand entries as wire captures show new IDs. F27 (constant-time DH) is filed as a separate follow-up below.
### F32 — Live type-matrix coverage for `asb-subscribe`
**Severity:** P1 — final M5 DoD bullet (#3).
**Source:** F18 M5 status block.
**Live coverage so far (commits `9063f10`, `<this commit>`):** three types round-trip end-to-end against the live MxDataProvider on this Windows host:
-**Int32** (type_id 4) — `TestChildObject.TestInt = 99`
-**String** (type_id 10) — `TestChildObject.TestString = "mxaccesscli verified 17778523775"` (UTF-16LE on the wire)
-**Bool** (type_id 17) — `DelmiaReceiver_001.TestAttribute = 0`
A SQL probe of the Galaxy DB (`SELECT mx_data_type, MIN(...) FROM gobject g INNER JOIN package p ON p.package_id = g.deployed_package_id INNER JOIN dynamic_attribute da ON da.package_id = p.package_id WHERE g.is_template = 0 GROUP BY da.mx_data_type`) shows the live Galaxy only has tags of `mx_data_type ∈ {1=Bool, 2=Int32, 5=String}`. Float (3), Double (4), DateTime (6), Duration (7), and array shapes are not deployed in this Galaxy, so we cannot exercise them without provisioning new attributes.
**Transient flakiness observed:** the `InvalidConnectionId` race after one-way `AuthenticateMe` is not deterministic — even with the `MAX_ATTEMPTS=10`, `BACKOFF_BASE_MS=200` retry loop in `register_items` (commit `<this commit>`) and a 250 ms post-auth settle in `connect`, individual runs occasionally exhaust the retry budget. The server appears to enter a degraded mode after many test runs (presumably pending-connection table fills), and a 30-second cool-down restores reliability. Each tag works in some runs and fails in others; the failure is not tag-specific. Production deployments with a single long-lived session are unlikely to hit this.
**Resolves when:** Either (a) the Galaxy is augmented with sample tags for the four missing types and an `asb-typematrix.rs` integration test loops over all seven proven types, OR (b) the four-missing-types coverage is acknowledged as gated on Galaxy-side provisioning that's outside the Rust port's scope and the followup is closed with the three-type live verification as the M5 DoD ✓.
### F28 — Canonical XML serialiser for `ConnectedRequest` signing (matches `XmlSerializer.Serialize` byte-for-byte)
**Status: PARTIALLY RESOLVED.** The five `[XmlSerializerFormat]` ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus the per-action `ValidatorWireFormat` selector + DH-params-from-registry + dynamic-dict id management all landed in commits `f14580e` / `104efc4`. Live AuthenticateMe + RegisterItems work end-to-end (commit `9063f10`). Read / Write / CreateSubscription / AddMonitoredItems / Publish / DeleteMonitored / DeleteSubscription / PublishWriteComplete still sign over NBFX wire bytes via the legacy fallback; works in practice because the live registry has empty `hashAlgorithm` (no HMAC required for the unforced-MAC path), but will break under any deployment that sets a real algorithm. **Severity now P2** — promote back to P0 if a hashAlgorithm-non-empty environment is in scope.
**Severity:** P0 — blocks every signed ASB operation (AuthenticateMe, RegisterItems, all data-plane RPCs).
@@ -230,6 +215,11 @@ The fixture is captured by `MxAsbClient.Probe --dump-deterministic-hmac` (`src/M
### F12 — `NmxClient::create` (auto-resolving COM-activation factory)
**Resolved:** 2026-05-05 (commit `<this commit>`). Builds on F6: new `NmxClient::create(ntlm_factory)` constructor in `crates/mxaccess-nmx/src/client.rs`, gated on `cfg(all(windows, feature = "windows-com"))`. New crate-level feature `mxaccess-nmx/windows-com` propagates to `mxaccess-rpc/windows-com`. Mirrors `ManagedNmxService2Client.Create()` (`cs:30-64`) + `ResolveService` (`cs:491-523`) — six steps: (1) `com_objref_provider::marshal_activated_iunknown_objref("NmxSvc.NmxService", MarshalContext::DifferentMachine)` activates the COM class and emits an OBJREF blob; (2) `ComObjRef::parse` extracts `oxid` + `ipid` (the activated server's `IUnknown` IPID); (3) `resolve_oxid_with_managed_ntlm_packet_integrity` against `127.0.0.1:135` (RPCSS endpoint mapper) returns the server's `(host, port)` bindings + `IRemUnknown` IPID; (4) the `ncacn_ip_tcp` non-security binding's `host[port]` text is parsed via the new `parse_bracketed_host_port` helper (mirrors the .NET `ParseBracketedHost` / `ParseBracketedPort` pair, using `rfind` so FQDNs with `.` round-trip — matches `cs:540-561`); (5) a fresh transport binds to `IRemUnknown` and calls `RemQueryInterface(iunknown_ipid, INmxService2_IID, fresh_causality_id, public_refs=5)` — the `RemQiResult` carries the new `INmxService2` IPID; (6) a second fresh transport binds to `INmxService2` via `Self::connect`. The `ntlm_factory: impl FnMut() -> NtlmClientContext` closure is invoked **three times** (one per bind); callers are responsible for fresh contexts each call. New error variants: `NmxClientError::Activation(ProviderError)` (only with `windows-com`) and `NmxClientError::EndpointResolution { reason }` (covers no binding / parse failure / non-zero RemQI HRESULT). 6 offline tests on the host/port parser pin: extracts FQDN host + port, uses `rfind` for the rightmost brackets, rejects missing `[` / missing `]` / non-numeric port / port overflow. 1 live test (`#[ignore]`'d, gated on `MX_LIVE` + the `MX_TEST_*` Setup-LiveProbeEnv env triple) round-trips end-to-end against the AVEVA install — activates `NmxSvc.NmxService`, drives the full chain, asserts the resolved `service_ipid` is non-zero. Live verification: passes. Workspace tests went 17 → 23 in mxaccess-nmx (+6).
**Session-level wrapper (same commit):** `mxaccess::Session::connect_nmx_auto(ntlm_factory, options, resolver, recovery)` — gated on the new `mxaccess/windows-com` feature (which propagates to `mxaccess-nmx/windows-com`). Refactored `connect_nmx` to extract the post-NMX-bind orchestration into a private `from_nmx_client` helper; both `connect_nmx` and `connect_nmx_auto` funnel through it so the `CallbackExporter` + router-task + `RegisterEngine2` + heartbeat policy stays in one place. `connect_nmx`'s doc comment updated — the prior "F12 not yet wired" note is gone. With both layers landed, the .NET `MxNativeSession.Open` surface (`cs:127-147`) is reproduced end-to-end on the Rust side: callers no longer need to pre-resolve `(host, port, service_ipid)` by hand on Windows.
### F32 — Live type-matrix coverage for `asb-subscribe`
**Resolved:** 2026-05-05 (commit `<this commit>`). Closed via option (b) of the followup's own resolve criterion: the four missing types (Float / Double / DateTime / Duration) are gated on Galaxy-side provisioning that's outside the Rust port's scope. The deployed test Galaxy on this host only has `mx_data_type ∈ {1=Bool, 2=Int32, 5=String}` (verified via direct SQL probe of `dbo.dynamic_attribute`); we cannot exercise the missing types without authoring new template attributes in the Aveva console — a manual platform-engineering task, not a Rust port issue. The three-type live verification (Int32 = 99, String = `"mxaccesscli verified 17778523775"`, Bool = 0) at commit `9063f10` therefore satisfies the **type-matrix DoD bullet for what is deployable**. M5 DoD bullet #3 closes ✓ for the deployed shape; if a future deployment provisions the remaining four types, an `asb-typematrix.rs` integration test that loops over all seven types would make a clean follow-on. **Transient `InvalidConnectionId` race** noted in the original block remains as a known characteristic of the live MxDataProvider after many test cycles (settles after a 30-second cool-down); production deployments with a single long-lived session are unlikely to hit it.
### F6 — Port `ComObjRefProvider.cs` (OBJREF emitter via Win32 `CoMarshalInterface`)
**Resolved:** 2026-05-05 (commit `<this commit>`). New module `crates/mxaccess-rpc/src/com_objref_provider.rs` (~330 LoC including tests) gated on `cfg(all(windows, feature = "windows-com"))`. Pulls `windows = "0.59"` (features `Win32_Foundation` + `Win32_System_Com` + `Win32_System_Com_Marshal` + `Win32_System_Com_StructuredStorage` + `Win32_System_Memory`) as an optional dep behind the existing `windows-com` feature; default footprint stays slim. Public API mirrors `ComObjRefProvider.cs` 1:1: `MarshalContext` enum (InProcess / Local / DifferentMachine — wraps the `MSHCTX_*` newtype constants), `clsid_from_prog_id(&str) -> Result<GUID, ProviderError>` (wraps `CLSIDFromProgID`), `marshal_activated_iunknown_objref(prog_id, ctx)` (activates via `CoCreateInstance(CLSCTX_INPROC_SERVER | CLSCTX_LOCAL_SERVER | CLSCTX_REMOTE_SERVER)` then marshals), `marshal_iunknown_objref(unknown, ctx)` (uses `IUnknown::IID`), `marshal_interface_objref(unknown, iid, ctx)` (the underlying `CoMarshalInterface` over an HGlobal-backed `IStream`). All `unsafe` is internal to the module — public API exposes only typed Rust values, no raw pointers / HRESULTs / lifetime-bound interface pointers. Each `unsafe` block carries an inline SAFETY comment. `ProviderError` enumerates the four documented failure modes (UnknownProgId, ActivationFailed, MarshalFailed, GlobalLockFailed) plus the apartment-init pre-check (ApartmentInitFailed). Per-thread COM init via `OnceLock<()>` thread-local: lazy `CoInitializeEx(MULTITHREADED)` on first call; `S_FALSE` (already initialised) and `RPC_E_CHANGED_MODE` (thread is STA) treated as success — matches the .NET runtime's tolerant apartment behaviour. 4 offline tests pin: `MarshalContext``MSHCTX_*` mapping, `ensure_apartment` idempotence, `clsid_from_prog_id` returns `UnknownProgId` for fake ProgIDs, `marshal_activated_*` short-circuits at the resolution stage. 1 live test (`#[ignore]`'d, gated on `MX_LIVE`) round-trips the real `NmxSvc.NmxService`: activates, marshals, then parses the blob via `ComObjRef::parse` and asserts non-zero OXID + IPID. Live verification: passes against the AVEVA install on this host. Workspace tests went 183 → was 179 in mxaccess-rpc (+4 new). Unblocks F12 (NmxClient::create) — the auto-resolving COM-activation factory can now chain `marshal_activated_iunknown_objref``ComObjRef::parse``resolve_oxid_with_managed_ntlm_packet_integrity``RemQueryInterface` over the existing primitives.
+5 -1
View File
@@ -127,7 +127,11 @@ fn fresh_orpc_this() -> OrpcThis {
/// expects (`cs:540-561`). The host is everything before the **last** `[`,
/// the port is the decimal text between that `[` and the **last** `]`.
///
/// Used by [`NmxClient::create`] only.
/// Used by [`NmxClient::create`] only — gated on `windows-com`.
#[cfg_attr(
not(all(windows, feature = "windows-com")),
allow(dead_code)
)]
fn parse_bracketed_host_port(binding: &str) -> Result<(String, u16), NmxClientError> {
let open = binding.rfind('[').ok_or_else(|| NmxClientError::EndpointResolution {
reason: format!("binding {binding:?} has no '['"),
+3
View File
@@ -36,6 +36,9 @@ serde = ["mxaccess-codec/serde"]
# `live` gates integration tests that hit a running AVEVA install. Driven by
# the `MX_LIVE` env var via `tools/Setup-LiveProbeEnv.ps1`.
live = []
# Pulls F12's `Session::connect_nmx_auto` constructor — the auto-resolving
# COM-activation path. Propagates to `mxaccess-nmx/windows-com`.
windows-com = ["mxaccess-nmx/windows-com"]
[lints]
workspace = true
+73 -15
View File
@@ -393,14 +393,18 @@ pub(crate) async fn callback_router(
}
impl Session {
/// Open a session over the NMX transport. Mirrors the wire-side of
/// `MxNativeSession.Open` (`MxNativeSession.cs:127-147`) — `Open`
/// itself is .NET-side: COM-activates `NmxSvc.NmxService`, marshals
/// an OBJREF, calls ResolveOxid + RemQI to discover `(host, port,
/// service_ipid)`, then calls `RegisterEngine2`. The Rust port
/// requires the caller to pre-resolve those because COM activation
/// is not yet wired (followup F12); the call sequence after that is
/// identical.
/// Open a session over the NMX transport with caller-supplied
/// `(addr, service_ipid)`. Mirrors the wire-side of
/// `MxNativeSession.Open` (`MxNativeSession.cs:127-147`).
///
/// The .NET shape of `Open` includes a COM-activation pre-step
/// (`NmxSvc.NmxService` → `CoMarshalInterface` → `ResolveOxid` →
/// `IRemUnknown::RemQueryInterface`) that auto-resolves
/// `(host, port, service_ipid)` before `RegisterEngine2`. This
/// constructor takes the resolved triple by hand — useful for
/// tests, deterministic probes, and non-Windows builds. Use
/// [`Self::connect_nmx_auto`] (Windows + `windows-com` feature) to
/// drive the auto-resolving path.
///
/// On success: a `RegisterEngine2` round-trip has completed and the
/// LMX server has acknowledged the engine registration. The
@@ -421,7 +425,66 @@ impl Session {
recovery: RecoveryPolicy,
) -> Result<Self, Error> {
recovery.validate()?;
let nmx = NmxClient::connect(addr, service_ipid, ntlm)
.await
.map_err(map_nmx)?;
Self::from_nmx_client(nmx, options, resolver).await
}
/// Auto-resolving NMX session bring-up. Mirrors `MxNativeSession.Open`'s
/// .NET shape (`MxNativeSession.cs:127-147`) end-to-end — including
/// the COM-activation step that
/// [`Self::connect_nmx`] expects the caller to have done by hand.
///
/// Internally this drives [`mxaccess_nmx::NmxClient::create`] (F12)
/// to discover `(host, port, service_ipid)` via:
/// `marshal_activated_iunknown_objref("NmxSvc.NmxService")` →
/// `ComObjRef::parse` → `ResolveOxid` against `127.0.0.1:135` →
/// `IRemUnknown::RemQueryInterface` → bind `INmxService2`. Then
/// runs the same post-NMX setup as [`Self::connect_nmx`]
/// (`CallbackExporter`, router task, `RegisterEngine2`, optional
/// heartbeat). Only available on Windows with the `windows-com`
/// feature.
///
/// `ntlm_factory` is invoked **three times** (once per bind in the
/// COM-activation chain). Callers typically capture credentials
/// from `MX_TEST_*` env vars and produce a fresh
/// [`NtlmClientContext`] each call.
///
/// # Errors
///
/// [`Error::Connection`] for any failure in the COM-activation /
/// OXID-resolution / RemQI chain (mapped from
/// `NmxClientError::Activation` and `NmxClientError::EndpointResolution`),
/// plus the same failures `connect_nmx` can return after the NMX
/// client is built.
#[cfg(all(windows, feature = "windows-com"))]
pub async fn connect_nmx_auto(
ntlm_factory: impl FnMut() -> NtlmClientContext,
options: SessionOptions,
resolver: Arc<dyn Resolver>,
recovery: RecoveryPolicy,
) -> Result<Self, Error> {
recovery.validate()?;
let nmx = NmxClient::create(ntlm_factory).await.map_err(map_nmx)?;
Self::from_nmx_client(nmx, options, resolver).await
}
/// Shared post-NMX-bind orchestration: stand up the
/// `CallbackExporter` + router task, call `RegisterEngine2` with
/// the local callback OBJREF, optionally configure heartbeats, and
/// assemble [`SessionInner`].
///
/// Mirrors steps 1, 2, 4, 5 of the .NET `MxNativeSession.Open`
/// shape (`cs:148-170`). Both [`Self::connect_nmx`] and
/// [`Self::connect_nmx_auto`] funnel through here so the
/// callback-exporter / register / heartbeat policy stays in one
/// place.
async fn from_nmx_client(
mut nmx: NmxClient,
options: SessionOptions,
resolver: Arc<dyn Resolver>,
) -> Result<Self, Error> {
// 1. Bind a local CallbackExporter on an OS-assigned ephemeral
// port, then build the OBJREF advertising it. Hostname comes
// from `local_hostname()` (env-var lookup); falls back to
@@ -447,12 +510,7 @@ impl Session {
let (callback_tx, _) = broadcast::channel(CALLBACK_BROADCAST_CAPACITY);
let router_handle = tokio::spawn(callback_router(callback_events, callback_tx.clone()));
// 3. Open the NMX transport + bind.
let mut nmx = NmxClient::connect(addr, service_ipid, ntlm)
.await
.map_err(map_nmx)?;
// 4. RegisterEngine2 with the callback OBJREF. Mirrors cs:163-175.
// 3. RegisterEngine2 with the callback OBJREF. Mirrors cs:163-175.
let hr = nmx
.register_engine_2(
options.local_engine_id,
@@ -468,7 +526,7 @@ impl Session {
return Err(Error::Connection(ConnectionError::EngineNotRegistered));
}
// 5. Optional heartbeat-interval setup (cs:165-167).
// 4. Optional heartbeat-interval setup (cs:165-167).
if let Some(ticks) = options.heartbeat_ticks_per_beat {
let hr = nmx
.set_heartbeat_send_interval(ticks, options.heartbeat_max_missed_ticks)