ceeaeefa71
Adds `write_message::encode_into_bytes_mut` (and the timestamped
variant) which writes the encoded body into a caller-supplied
`BytesMut`. The buffer is cleared and resized in place each call;
once it has grown to the largest body the session will produce, it
allocates nothing further.
A session that holds a single `BytesMut` and reuses it across writes:
- Int32 / Float32 / Float64: 2 → 1 allocs/op
(only the `encode_scalar_value` scratch `Vec<u8>` remains)
- Boolean: 1 → 0 allocs/op
(no per-value scratch — the literal payload is a stack `[u8; 4]`)
Bench delta in `design/M6-bench-baseline.md` § F52.3. The
`encode_scalar_value` Vec is the remaining 1 alloc/op for fixed-width
scalars; eliminating it would require inlining the LE-bytes write
into the body slice (left for a follow-up since the F52 spec only
asks for 2 → 1).
Resolves F52 (all three optimisations landed: 4e76b44 F52.1,
a0fa5be F52.2, this commit F52.3). Existing `encode` / `encode_to_bytes_mut`
public surface unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
459 lines
96 KiB
Markdown
459 lines
96 KiB
Markdown
# Followups
|
||
|
||
Open work items deferred during /loop iterations. Triaged at the top of
|
||
every iteration. New items are appended under `## Open`; resolved items
|
||
move to `## Resolved` with a date + commit hash.
|
||
|
||
## Open
|
||
|
||
### F48 — Execute `cargo publish` for the V1 release cut
|
||
**Status:** **Out of scope — internal usage only, no crates.io publish planned.** Confirmed 2026-05-06 by maintainer. The workspace stays at `version = "0.0.0"` indefinitely; consumers depend via path or git, not crates.io. F43's dry-run validation (`design/F48-publish-dry-run.md`) is retained as a workspace-hygiene check (each crate's `cargo package --list` produces a clean tarball, no accidental captures/big files), not as release prep.
|
||
|
||
If this changes (e.g. internal consumer wants registry-style versioning via a private cargo registry), the V1 publish recipe in `design/F48-publish-dry-run.md` describes the steps. For now: no work needed.
|
||
|
||
### F50 — Run the F46 Suspend/Activate Frida capture live
|
||
**Status:** **Resolved 2026-05-06.** Two captures landed under `captures/123-frida-suspend-advised-instrumented/` (suspend-advised scenario) and `captures/124-frida-activate-advised-instrumented/` (activate-advised scenario). Per-byte evidence in `docs/F50-suspend-activate-evidence.md`; R5 in `design/70-risks-and-open-questions.md` moved to settled.
|
||
|
||
**Verdict:**
|
||
- **Suspend** is server-side: emits NMX `PutRequest` with command `0x2D` ~140ms after the LMX-proxy entry, body `2d 01 00 + correlation_id + 22 bytes` (same shape family as `0x1F` AdviseSupervisory).
|
||
- **Activate** against a non-suspended item is client-side only — no wire traffic, returns Success synchronously. The harness `activate-advised` scenario doesn't sequence Suspend-then-Activate; if direct evidence for Activate-after-Suspend is needed later, add a new scenario to `MxTraceHarness/Program.cs`.
|
||
|
||
**Severity:** P3 — residual from F46 (script ready, capture not yet run).
|
||
**Source:** F46 closeout (`design/followups.md`) + `analysis/frida/mx-nmx-trace.js` header procedure.
|
||
|
||
**Scope.** Run the Frida script against a live `MxTraceHarness.exe` exercising the suspend-advised + activate-advised scenarios on `TestChildObject.ScanState`. Save under `captures/NNN-frida-suspend-activate-instrumented/`. If the new `mx.suspend.*` / `mx.activate.*` events accompany NMX traffic in the same time window: document the wire opnum + body shape in `docs/M6-buffered-evidence.md` and `analysis/proxy/nmxsvcps-procedures.tsv`. If no NMX traffic accompanies the hook fires: update `design/70-risks-and-open-questions.md` R5 to "settled — client-side only".
|
||
|
||
**Definition of done:** R5 is fully settled (either with a documented wire opnum or a "client-side only" verdict backed by capture).
|
||
|
||
**Resolves when:** the capture lands and R5's status is updated.
|
||
|
||
### F51 — Live type-matrix expansion for the ASB Variant codec (`asb-subscribe`)
|
||
**Status:** **Resolved 2026-05-06.** Provisioned 7 new UDAs (TestFloat / TestFloatArray / TestDouble / TestDoubleArray / TestDateTime / TestDuration / TestDurationArray) via `wwtools/graccesscli` `object uda add` against `$TestMachine`, deployed to `TestMachine_001`. New `crates/mxaccess/examples/asb-type-matrix.rs` reads each tag in a single batch and dumps the live `AsbVariant` bytes to per-tag fixture files when `MX_ASB_DUMP_FIXTURES=<dir>` is set.
|
||
|
||
Live evidence (one cold-start run; subsequent runs hit the F31 InvalidConnectionId cool-down — wait 60+ seconds with no ASB activity before re-running):
|
||
|
||
| Tag | type_id | length | payload bytes |
|
||
|---|---|---|---|
|
||
| TestChangingInt | 4 (Int32) | 4 | 4 |
|
||
| TestAlarm001 | 17 (Boolean) | 1 | 1 |
|
||
| MachineCode | 10 (String) | 30 | 30 |
|
||
| TestFloat | 8 (Float) | 4 | 4 |
|
||
| TestDouble | 9 (Double) | 8 | 8 |
|
||
| TestDateTime | 11 (DateTime) | 8 | 8 |
|
||
| TestDuration | 12 (ElapsedTime) | 8 | 8 |
|
||
|
||
`crates/mxaccess-codec/tests/f51_type_matrix_parity.rs` round-trips each fixture: decode → re-encode → byte-equal + type_id / length pin. Per-fixture .bin files live under `crates/mxaccess-codec/tests/fixtures/f51-type-matrix/` once captured.
|
||
|
||
Array tags (`TestIntArray`, `TestBoolArray`, etc.) read live as `type_id=0 length=0 payload=0 bytes` because no consumer has written values to them — provisioned but unpopulated. Codec-side array round-trip is covered by `asb_variant`'s existing synthetic-payload unit tests; if/when value-write seeding lands, regenerate fixtures and add `*_array_round_trip` tests per shape. `docs/galaxy-test-fixtures.md` documents the full provisioning + regeneration recipe.
|
||
|
||
|
||
**Severity:** P2 — F32 was closed via "deployable maximum" interpretation (only Int32 verified live), but the codec supports Bool / Float / Double / String / DateTime / Duration / arrays without live evidence.
|
||
**Source:** F32 closeout (`design/followups.md`); `work_remain.md:108-113` documents the proven matrix from .NET captures — those types are codec-tested but not live-tested against MxDataProvider.
|
||
|
||
**Scope.** Provision sample tags on the local Galaxy for each missing type (Bool, Float, Double, String, DateTime, Duration, plus 1-2 representative array shapes). Extend `examples/asb-subscribe.rs` with a per-type loop that registers + reads + subscribes against each. Capture the wire bytes via `examples/asb-relay.rs` middleman and add round-trip parity tests in `crates/mxaccess-asb/tests/` for each type.
|
||
|
||
**Definition of done:**
|
||
1. Per-type Galaxy fixture documented in `docs/galaxy-test-fixtures.md` (which child object names to provision, expected attribute types).
|
||
2. `cargo run -p mxaccess --example asb-subscribe -- --type-matrix` exercises all proven types and reports per-type wire bytes + decoded value.
|
||
3. Round-trip test per type in `crates/mxaccess-asb/tests/` pinning the captured wire bytes.
|
||
|
||
**Resolves when:** every proven type from `work_remain.md:108-113` has a live wire fixture + a passing round-trip test.
|
||
|
||
### F52 — Codec performance optimisations deferred from F39
|
||
**Severity:** P3 — R12 < 5 allocs/write target is already met; these are nice-to-haves.
|
||
**Source:** `design/M6-bench-baseline.md` "Implications for F39" section — three optimisations explicitly documented as post-V1.
|
||
|
||
**Scope.** Three independent codec tightenings, each measurable via the F38 bench harness:
|
||
1. **`bytes::BytesMut` output buffer** on the encoder side. Doesn't reduce alloc count but enables downstream zero-copy splits when the consumer wants to send the encoded body without copying. ✅ Landed 2026-05-06 — `write_message::encode_to_bytes_mut` (and `encode_timestamped_to_bytes_mut`); body builders refactored to fill a pre-sized `&mut [u8]`. Bench delta in `design/M6-bench-baseline.md` § F52.1.
|
||
2. **Per-handle name-signature cache** in `MxReferenceHandle::from_names`. Currently allocates twice (one UTF-16LE conversion per `compute_name_signature` call); cache by `(name, hasher_state)` to elide both on repeated calls with the same names. ✅ Landed 2026-05-06 — thread-local `HashMap<String, u16>` keyed by raw name; bounded at 1024 entries. `MxReferenceHandle::from_names` drops 2 → 0 allocs/op once warm. Bench delta in `design/M6-bench-baseline.md` § F52.2.
|
||
3. **Session-level scratch pool** for the per-write encode buffer. Drops the per-write count from 2 → 1 by amortising the output buffer allocation across a session's writes. ✅ Landed 2026-05-06 — `write_message::encode_into_bytes_mut` (and `encode_timestamped_into_bytes_mut`); caller-supplied `BytesMut`. Pooled Int32 = 1 alloc/op (was 2); pooled Boolean = 0 allocs/op (was 1). Bench delta in `design/M6-bench-baseline.md` § F52.3.
|
||
|
||
**Definition of done:**
|
||
1. ✅ Each optimisation lands as a separate commit with a before/after row in `design/M6-bench-baseline.md` showing the alloc-count delta. (commits `4e76b44` F52.1, `a0fa5be` F52.2, this commit F52.3)
|
||
2. ✅ No correctness regressions in the round-trip fixture suite. (267 tests pass)
|
||
3. ✅ Default API surface unchanged. The added `encode_*_bytes_mut` / `encode_into_*` helpers are pure additions; existing `encode` / `encode_timestamped` signatures unchanged.
|
||
|
||
**Resolved 2026-05-06:** all three optimisations landed.
|
||
|
||
### F53 — Enable `#![warn(missing_docs)]` workspace-wide
|
||
**Status:** Consumer crates resolved 2026-05-06: `#![warn(missing_docs)]` enabled on `mxaccess` and `mxaccess-compat` lib roots, every public item now carries at least a one-line doc comment, `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` clean. Protocol crates deliberately deferred per the strategy paragraph below — measured the magnitude on 2026-05-06 by enabling the lint on each:
|
||
|
||
| Crate | Missing-docs warnings |
|
||
|---|---|
|
||
| `mxaccess-asb` | 422 |
|
||
| `mxaccess-nmx` | 398 |
|
||
| `mxaccess-callback` | 371 |
|
||
| `mxaccess-galaxy` | 229 |
|
||
| `mxaccess-codec` | 205 |
|
||
| `mxaccess-rpc` | 147 |
|
||
| `mxaccess-asb-nettcp` | 111 |
|
||
| **Total** | **1883** |
|
||
|
||
Most of those are protocol-internal types (struct fields, enum variants on wire-shape records) whose meaning is already documented at the consumer-facing layer. Filling 1883 one-liners adds noise without consumer value, and turning them into errors (`RUSTDOCFLAGS="-D warnings"`) would block routine `cargo doc` runs. Lint stays off on protocol crates indefinitely; if a future contributor wants per-crate enforcement, they can re-introduce on a per-module basis with `#![allow(missing_docs)]` exemptions for the protocol-internal modules.
|
||
**Severity:** P3 — doc-coverage tightening; not a correctness or release blocker.
|
||
**Source:** F42 closeout — the missing-docs lint was deferred because enabling it surfaces hundreds of low-priority public-item gaps that are out of scope for that F-number.
|
||
|
||
**Scope.** Per crate root, add `#![warn(missing_docs)]` (or `#![deny(missing_docs)]` for the consumer-facing `mxaccess` + `mxaccess-compat`). Then walk each warning and add at minimum a one-line doc comment per public item. Strategy: do the consumer-facing crates first (`mxaccess`, `mxaccess-compat`); the protocol crates (`mxaccess-codec`, `mxaccess-rpc`, etc.) can land later since their consumers are the higher-level crates which already document the surfaces they re-export.
|
||
|
||
**Definition of done:**
|
||
1. `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` continues to pass with the lints enabled.
|
||
2. Every public item in `mxaccess` + `mxaccess-compat` has at least a one-line doc comment.
|
||
3. Protocol crates either get the lint enabled too or have an inline `#[allow(missing_docs)]` with a reason that points at this followup.
|
||
|
||
**Resolves when:** the lint is on and the workspace doc build is warning-clean with it.
|
||
|
||
### F56 — `subscribe` / `subscribe_buffered` complete on the wire but never receive `0x33` DataUpdate frames
|
||
**Status:** **Resolved 2026-05-06.** See Resolved section below for the full closeout.
|
||
|
||
Root cause: `Session::subscribe` and `Session::subscribe_buffered_nmx` were missing the `INmxService2::Connect` + `AddSubscriberEngine` round-trip that the .NET reference's `MxNativeSession.EnsurePublisherConnected` (`cs:516-526`) issues before the first advise against a given publishing engine. Without that pair of RPCs, NmxSvc accepts the subscription registration but the publishing engine never knows our engine is subscribed — so no `0x33` DataUpdate frames flow.
|
||
|
||
Diagnosed via wwtools/aalogcli: the `[Warning] NmxSvc | NmxCallback->DataReceived ... failed with error 0x{N}` log lines turned out to be NmxSvc's normal log spam where N is the bufferSize, NOT an actual error — the .NET reference's own probe triggers identical entries while still receiving `0x33` DataUpdate frames successfully. The real issue was that those frames never started being sent in the first place.
|
||
|
||
Fix landed:
|
||
- `SessionInner::publisher_endpoints` — per-session `HashMap<(platform_id, engine_id), ()>` cache mirroring `MxNativeSession._publisherEndpoints`.
|
||
- `Session::ensure_publisher_connected(platform_id, engine_id)` — issues `INmxService2::Connect(local_engine, galaxy, platform, engine)` then `AddSubscriberEngine(engine, galaxy, source_platform, local_engine)`, once per publisher endpoint per session.
|
||
- `Session::subscribe` and `Session::subscribe_buffered_nmx` — both call `ensure_publisher_connected` BEFORE the wire advise.
|
||
- `subscribe_buffered_nmx` — additionally issues `AdviseSupervisory` after `RegisterReference`. The .NET reference's `RegisterBufferedItemAsync` only calls RegisterReference, but on this AVEVA install RegisterReference alone produces the registration result + heartbeat callbacks without ever starting DataUpdate dispatch; AdviseSupervisory unblocks the dispatch. Difference may be version-specific.
|
||
|
||
Live verification passes for both paths against `TestMachine_001.TestChangingInt`:
|
||
- `cargo test -p mxaccess-compat --features live-windows-com --test plain_subscribe_live` — receives `0x32` SubscriptionStatus + sequence of `0x33` DataUpdate frames.
|
||
- `cargo test -p mxaccess-compat --features live-windows-com --test buffered_subscribe_live` — same.
|
||
|
||
Both tests assert on the raw `Session::callbacks()` broadcast (NMX subscription messages) rather than the typed `Subscription::next` (DataChange) path because `TestChangingInt` on this Galaxy is configured with `quality=0x00C0 (Uncertain) value=null`, so the typed path filters every record. The test gate is "wire-level subscription works"; what the engine reports as the actual value is downstream-Galaxy state, out of scope for the Rust port.
|
||
|
||
Real codec fixes ALSO landed in this session as part of F56 investigation (independent from the resolution above):
|
||
- `NmxSubscriptionMessage::try_parse_process_data_received_body` — peels the `ProcessDataReceived` envelope before calling `parse_inner`. The router previously called `parse_inner` directly on wire bytes, which would have silently dropped any `0x33` even if one arrived.
|
||
- `NmxReferenceRegistrationResultMessage::try_parse_process_data_received_body` + router branch — drops `0x11` registration-result frames cleanly.
|
||
- `Session::subscribe_buffered_nmx` — split-form (object, attribute) wire body + per-session monotonic `item_handle` counter (mirrors `MxNativeCompatibilityServer.AddBufferedItemAsync`'s `_nextItemHandle++`).
|
||
|
||
**Severity:** P1 — blocks F49 step 1 (F36 buffered live verification), F49 step 2 (F45 recovery replay), and ALL consumers relying on subscription data flow on this Galaxy.
|
||
|
||
**Updated 2026-05-06.** Initial diagnosis suspected a buffered-specific wire-body gap; ruled out:
|
||
- Wire body proven byte-identical to the .NET reference's by `crates/mxaccess-codec/tests/buffered_register_reference_parity.rs` (which forward-builds the message from `Session::subscribe_buffered`'s inputs and compares against `captures/082-frida-add-buffered-plain-advise-testint/`).
|
||
- Test now uses real Galaxy DB metadata via `mxaccess_galaxy::SqlTagResolver` (engine_id=2, attribute_id=155, etc.) instead of the hardcoded StaticResolver shim.
|
||
- Item-handle, item_definition, item_context all switched to the .NET-reference split form (handle=1 + per-session counter, definition="<attr>.property(buffer)", context="<object_tag>").
|
||
|
||
**Plain subscribe also fails.** Added `crates/mxaccess-compat/tests/plain_subscribe_live.rs` driving `Session::subscribe` (NOT buffered). Same symptom: `AdviseSupervisory` returns HRESULT 0, the engine acks every write with a 51-byte op-status frame, but no `0x33` DataUpdate ever arrives. So this is **not buffered-specific** — the entire inbound DataUpdate path is silent on this machine.
|
||
|
||
**Likely revised root cause:**
|
||
- The engine generates `0x33` DataUpdate frames into a different transport channel than the one our DCOM sink listens on. The .NET reference's `INmxSvcCallback` has two opnums — `DataReceivedRaw` (3) and `StatusReceivedRaw` (4). We only ever observe opnum=3 callbacks. If the engine routes data updates through a different IID or different DCOM stub on this install, our sink never sees them.
|
||
- Alternatively, the engine on this Galaxy install is configured such that local Object scanning is disabled / the deployed objects aren't actively producing value-change events. The `OnWriteComplete` round-trip works (proves write-path + callback-path); a passive subscription doesn't produce updates if no source changes the value.
|
||
|
||
**Action items (for whoever picks F56 up):**
|
||
1. Compare the **C# DcomCallbackSink** (`src/MxNativeClient/NmxCallbackSink.cs`) to the Rust port's `mxaccess-callback::dcom_sink` — verify it implements **only** `INmxSvcCallback` and that the IID + vtable layout match. There may be a third method or a sibling interface (e.g. `INmxSvcCallback2`) that the engine also calls into for high-cadence DataUpdate dispatch.
|
||
2. Try the same live test against a tag that has known active scanning (e.g. a bound-to-PLC InputSource attribute) — rule out static-UDA hypothesis.
|
||
3. Run `MxNativeClient.Probe --probe-session-subscribe --tag=TestChildObject.TestInt --subscribe-hold-seconds=30` (the .NET reference's working live probe) and confirm `0x33` DataUpdates fire on THIS machine. If they do, capture the wire bytes via Frida and diff against the Rust port's exact body.
|
||
|
||
**What landed in this session (real router/codec fixes, NOT F56-resolving):**
|
||
- `NmxSubscriptionMessage::try_parse_process_data_received_body` — peels the `ProcessDataReceived` envelope before calling `parse_inner`. The router previously called `parse_inner` directly on wire bytes, which would have silently dropped any `0x33` even if one arrived.
|
||
- `NmxReferenceRegistrationResultMessage::try_parse_process_data_received_body` + router branch — drops `0x11` registration-result frames cleanly instead of logging "unexpected opcode 0x11".
|
||
- `Session::subscribe_buffered_nmx` — split-form (object, attribute) wire body + per-session monotonic `item_handle` counter (mirrors `MxNativeCompatibilityServer.AddBufferedItemAsync`'s `_nextItemHandle++`).
|
||
**Source:** F49 step 1 live attempt 2026-05-06. Test `cargo test -p mxaccess-compat --features live-windows-com --test buffered_subscribe_live -- --ignored --nocapture` (added in this session) connects via `Session::connect_nmx_auto` (F55-proven), issues `subscribe_buffered(TestChildObject.TestInt, 1000ms)` against the live engine, and runs a background writer at 500ms cadence. RegisterReference returns HRESULT 0; the engine then fires:
|
||
- One 46-byte heartbeat envelope (header-only, empty inner)
|
||
- One 51-byte op-status frame for the `RegisterReference` completion
|
||
- One 87-byte `0x11` `NmxReferenceRegistrationResultMessage` carrying the assigned `item_handle`
|
||
- One 51-byte op-status frame **per write** (60 frames over 30s — perfectly clocked to the writer cadence)
|
||
|
||
But **zero `0x33` `DataUpdate` frames** ever arrive — verified end-to-end via `RUST_LOG=trace mxaccess_callback=trace`. The .NET reference's `MxNativeSession.SubscribeBufferedAsync` does deliver DataUpdates against the same engine + same tag (per F36 wave 1 evidence at `captures/094-frida-buffered-separate-writer/`), so this is a Rust-port-specific gap.
|
||
|
||
**Likely causes (in priority order):**
|
||
1. The `NmxReferenceRegistrationMessage` body the Rust port sends differs in some field from the .NET reference's. Specifically: `subscribe: true` is set, but other fields (e.g. `item_handle = 0`, `reserved_*`, `source_galaxy_id`) may need different values to trigger DataUpdate dispatch. **Action**: capture the wire bytes from the Rust port's RegisterReference and diff against `captures/094-frida-buffered-separate-writer/` per-byte.
|
||
2. Some additional client-side step is required after RegisterReference — e.g. an ACK of the `0x11` registration result via the assigned `item_handle`, or a separate RPC the .NET reference dispatches that we miss. The F36 wave 1 evidence said no `SetBufferedUpdateInterval` is sent, but there may be another op. **Action**: capture .NET reference's outbound calls during `subscribe-buffered` end-to-end and compare to ours.
|
||
3. The `0x11` registration-result body might carry a status code we should be checking (see `NmxReferenceRegistrationResultMessage::status_category` / `status_detail`). If non-zero, the engine may have rejected the subscription silently. **Action**: log the parsed `0x11` body and check the status fields.
|
||
|
||
**What's already wired (this session):** `NmxSubscriptionMessage::try_parse_process_data_received_body` (envelope-peeling helper) was added — the previous router called `parse_inner` directly on wire bytes and would have silently dropped any `0x33` that did arrive. This was a real bug fix; without it F56 would have stayed invisible. Same for `NmxReferenceRegistrationResultMessage::try_parse_process_data_received_body` + the `0x11` path in the router.
|
||
|
||
**Does not affect:** `Session::write` round-trip (proven by F55 live test); plain `Session::subscribe` (not yet live-tested but uses `AdviseSupervisory` not `RegisterReference`).
|
||
|
||
**Definition of done:** F49 step 1 passes — `cargo test -p mxaccess-compat --features live-windows-com --test buffered_subscribe_live -- --ignored --nocapture` reports at least 3 `DataChange` arrivals at the configured cadence, with monotonically-increasing values matching the writer.
|
||
|
||
**Resolves when:** the missing field / step / status check is identified, the fix lands in `Session::subscribe_buffered_nmx` (or upstream), and the live test passes.
|
||
|
||
### F55 — Hand-rolled callback exporter rejected by `RegisterEngine2` on this AVEVA install
|
||
**Status:** Resolved 2026-05-06 by Path A (DCOM-managed `INmxSvcCallback` sink in `mxaccess-callback::dcom_sink`, wired into `Session::from_nmx_client` behind the `windows-com` feature). Live test `cargo test -p mxaccess-compat --features live-windows-com --test lmx_write_complete_live -- --ignored --nocapture` passes end-to-end: RegisterEngine2 succeeds, write round-trips, OnWriteComplete fires with status from the wire. The hand-rolled `CallbackExporter` is retained for unit tests that exercise the exporter against an in-process fake NMX peer.
|
||
**Severity:** P1 — blocks F49 live verification of every M6 feature that needs an `Engine` registered (i.e. all of them).
|
||
**Source:** Live attempt 2026-05-06 against the local AVEVA install. Both the Rust port and the .NET reference's `--probe-register-managed-callback` (which uses the same hand-rolled-exporter approach as the Rust port) fail `RegisterEngine2` with HRESULT `0x800706BA` (`RPC_S_SERVER_UNAVAILABLE` wrapped as Win32 HRESULT). The .NET reference's `--probe-session-write` SUCCEEDS because it goes through `MxNativeSession.Open` → `CreateRegisteredService` (`MxNativeSession.cs:624`) which does **`ComObjRefProvider.MarshalInterfaceObjRef(callback, INmxSvcCallback, DifferentMachine)`** on a real C# COM object — letting Windows DCOM proxy/stub infrastructure handle the callback dispatch — instead of building a hand-rolled OBJREF + TCP listener.
|
||
|
||
**The Rust port mirrors the .NET reference's `ManagedCallbackExporter` design exactly.** Both fail. So this isn't a Rust port regression — it's a pre-existing issue in the hand-rolled callback architecture that wasn't previously live-tested end-to-end against this NmxSvc install.
|
||
|
||
**Diagnostic chain (logged from `mxaccess::Session::from_nmx_client`):**
|
||
1. `Session::connect_nmx_auto` → `NmxClient::create` → all 6 steps OK (activate, marshal, ResolveOxid, RemQI, final bind). Endpoint resolved to `[fe80::...]:64311`. The new `IUnknownHolder` (mirrors `_activatedComObject` from `ManagedNmxService2Client.cs:15`) keeps the COM ref alive across the steps.
|
||
2. `from_nmx_client` builds the callback OBJREF (162 bytes, byte-structurally identical to .NET's at `ProbeRegisterEngine2ManagedCallback.managed_callback_objref_hex` modulo random fields).
|
||
3. `RegisterEngine2(engine_id, engine_name, version=6, callback_obj_ref)` returns `Transport(Fault { status: 0x800706BA })`.
|
||
|
||
**The OBJREF binding is correct:** `DESKTOP-6JL3KKO[<port>]` with `port` from `tokio::net::TcpListener::bind(0.0.0.0:0)`. Windows Firewall is OFF on all profiles. The hand-rolled exporter accepts connections; NmxSvc just refuses to use it.
|
||
|
||
**Path C investigation (2026-05-06).** Captured the OBJREF byte structure from both paths via the .NET probe:
|
||
|
||
| Field | DCOM-marshalled (works) | Hand-rolled (fails) |
|
||
|---|---|---|
|
||
| Total size | 338 bytes | 162 bytes |
|
||
| `std_flags` | `0x0A80` (SORF_OXRES4+OXRES6+OXRES8) | `0x280` (SORF_OXRES4+OXRES6) |
|
||
| `std_public_refs` | 5 | 5 |
|
||
| `std_oxid` / `std_oid` / `std_ipid` | random per session | random per session |
|
||
| ncacn_ip_tcp bindings | 4 (DESKTOP-6JL3KKO, 10.100.0.48, 2x IPv6 link-local) — **no ports** | 1 (DESKTOP-6JL3KKO[<port>]) — **with port** |
|
||
| Security bindings | 7 | 7 |
|
||
|
||
Tried setting `std_flags = 0x0A80` on the hand-rolled OBJREF (matching the DCOM-marshalled flag bits): **RegisterEngine2 still fails with the same 1722.** Reverted.
|
||
|
||
**Updated diagnosis.** The likely cause is that NmxSvc, on receiving RegisterEngine2 with a callback OBJREF, does its own SCM-side OXID resolution: it calls `IObjectExporter::ResolveOxid` against the local SCM at `127.0.0.1:135` to get the bindings for the OBJREF's OXID, then dials those bindings. Our hand-rolled OXID is **never registered with the local SCM**, so the resolution fails and NmxSvc returns `RPC_S_SERVER_UNAVAILABLE` (1722) — matching the symptom and the sub-second timing (no TCP-dial-back attempt to our listener happens at all).
|
||
|
||
DCOM marshalling fixes this because `CoMarshalInterface` internally registers the OXID with RPCSS, so NmxSvc's SCM-side ResolveOxid succeeds. The bindings carry no port because RPCSS-side resolution returns the dynamic port from the Windows DCOM stub layer.
|
||
|
||
This makes Path A the architecturally correct fix: the callback exporter must be a DCOM-managed object (registered with RPCSS) for NmxSvc to accept the callback. The hand-rolled-listener-with-explicit-port-in-OBJREF approach used by both the Rust port and the .NET reference's `ManagedCallbackExporter` doesn't satisfy NmxSvc's callback validation.
|
||
|
||
**Three resolution paths (each substantial):**
|
||
|
||
- **Path A — switch to DCOM-marshalled callback.** Refactor `mxaccess-callback` so the callback is a real COM class (`#[implement]` via `windows-rs`) registered with the local DCOM SCM, then marshal it via `CoMarshalInterface` for the OBJREF. Abandons the project's "bypass DCOM proxy/stubs" goal but matches what .NET's working path does. ~1 week of work.
|
||
- **Path B — hybrid: register via DCOM, dispatch via hand-rolled.** Use `CoMarshalInterface` only to build the OBJREF (which NmxSvc accepts), but intercept the inbound callback connection at the TCP layer to bypass DCOM stub dispatch. Requires reading the `CoMarshalInterface`-produced OBJREF, extracting the OXID/IPID, and standing up a TCP listener that responds to OXID resolution against itself. Architecturally awkward.
|
||
- **Path C — investigate the OBJREF rejection at NmxSvc.** Capture the wire bytes NmxSvc sees from the .NET DCOM-marshalled path vs the hand-rolled path; diff to find what NmxSvc actually validates. May reveal a single field difference (e.g. a flag bit) that, set correctly in the hand-rolled OBJREF, makes it work. Cheapest if it pans out, but unbounded if it doesn't.
|
||
|
||
**Definition of done:** F49 step 5 (LmxClient OnWriteComplete round-trip) runs end-to-end against the live AVEVA install: `cargo test -p mxaccess-compat --features live-windows-com --test lmx_write_complete_live -- --ignored --nocapture` passes.
|
||
|
||
**Resolves when:** one of the three paths above lands.
|
||
|
||
### F3 — Cross-domain NTLM Type1/2/3 fixture
|
||
**Severity:** P2
|
||
**Status:** Permanently out-of-scope on the current dev host (no second AD domain). Resolution requires external infrastructure not available here.
|
||
**Source:** M2 wave 1, `crates/mxaccess-rpc/src/ntlm.rs`. All current NTLM fixtures are single-domain (the local AVEVA install). Tracked separately in `design/70-risks-and-open-questions.md` R8 (P1 risk) and the open-evidence-gaps table.
|
||
**Concrete next step:** Provision a two-domain Windows lab (e.g. `LAB-A` + `LAB-B` with cross-domain trust + an AVEVA install on `LAB-A` that authenticates a user from `LAB-B`). Run `cargo run -p mxaccess --example connect-write-read` from a `LAB-B`-domain user; capture the NTLM Type1 / Type2 / Challenge / Type3 bytes via `examples/asb-relay.rs` or a Wireshark NTLM filter. Save under `crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/`. The existing single-domain Type1/2/3 round-trip tests in `mxaccess-rpc::ntlm` then extend to validate the cross-domain shape (TargetInfo AV pairs differ when crossing domains; specifically `MsvAvDnsTreeName` and `MsvAvDnsComputerName` carry the trusted-domain DNS suffix instead of the local one). Clears R8 in the risks doc.
|
||
|
||
|
||
|
||
|
||
## Resolved
|
||
|
||
### F54 — Per-operation context correlation + compat `OnWriteComplete` fan-out
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Two-crate plumbing.
|
||
|
||
**Part 1 — `mxaccess` (per-operation correlation).** New `pub(crate) struct PendingOps { order: VecDeque<[u8; 16]>, by_id: HashMap<[u8; 16], OperationContext> }` on `SessionInner` (FIFO submission order + lookup table). The 5-byte StatusWord frame and the 1-byte CompletionOnly frame carry no correlation id on the wire (`NmxOperationStatusMessage` is keyless), so the Rust port assigns a synthetic 16-byte id at submission time and the router pops the oldest pending entry on each arriving status frame. Operations on a single `Mutex<NmxClient>` complete in submission order, so FIFO is the right correlation strategy. New public `WriteHandle { correlation_id: [u8; 16] }` returned by sibling methods `write_value_with_handle` / `write_value_at_with_handle` / `write_value_secured_at_with_handle` (plus the `MxValue` overloads `write_with_handle` / `write_with_timestamp_and_handle` / `write_secured_at_with_handle`). The non-handle methods `write_value` / `write_value_at` / etc. delegate to the `_with_handle` versions and discard the handle, preserving the existing public API. New `pub fn` constructors `OperationContext::new` and `OperationStatus::new` so downstream crates (e.g. `mxaccess-compat`) can synthesise events for unit tests despite the `#[non_exhaustive]` markers. `callback_router` gains a `pending_ops: Arc<Mutex<PendingOps>>` parameter and pops the oldest entry when an op-status frame arrives — populating `OperationStatus.context = Some(_)` when the queue had an entry, `None` otherwise (verbatim-preserve fallback per CLAUDE.md). Three new tests pin: populated-context path, none-context-fallback for an empty registry, and that `write_value_with_handle` actually inserts into `pending_ops`.
|
||
|
||
**Part 2 — `mxaccess-compat` (compat-layer fan-out task).** New `correlation_to_item: Arc<Mutex<HashMap<[u8; 16], i32>>>` on `LmxInner`. `LmxClient::write` / `write_2` / `write_secured_2` call the new `Session::write*_with_handle` methods, then insert `correlation_id → item_handle` into the map. `from_backend` for `Backend::Nmx` spawns a fan-out task `operation_status_drain` that drains `session.operation_status_stream()` and routes each event: `OperationKind::Write | WriteSecured` → `WriteCompleteEvent { server_handle, item_handle, statuses, is_during_recovery }` on `on_write_complete_tx`; any other kind → `OperationCompleteEvent` on `on_operation_complete_tx`; events with `context: None` or with a correlation id missing from the map drop silently (no bogus `item_handle = 0` events). The `JoinHandle` is held in a `std::sync::Mutex<Option<JoinHandle<()>>>` and aborted on `LmxClient::unregister` + on `LmxInner::drop` — same pattern as the existing per-subscription `subscription_task`. ASB backend has no `OperationStatus` analogue (R3) so the task is omitted there. Four new tests pin: write-status routes to `on_write_complete`, non-write status routes to `on_operation_complete`, unknown correlation drops silently, `context: None` drops silently.
|
||
|
||
**Wire/byte parity.** Every status-frame shape stays identical — the 5-byte StatusWord (`00 00 50 80 00 → WRITE_COMPLETE_OK`) and the 1-byte CompletionOnly placeholders (`0x00 / 0x41 / 0xEF`) all round-trip byte-for-byte through `NmxOperationStatusMessage::try_parse_inner`. The synthesizer kernel `MxStatus::from_packed_u32` is unchanged. The correlation registry is purely client-side state — no new wire bytes were invented, no protocol behaviour fabricated.
|
||
|
||
**Public API surface.** Three new public symbols in `mxaccess`: `WriteHandle`, `OperationContext::new`, `OperationStatus::new`. Six new methods on `Session`: `write_value_with_handle`, `write_value_at_with_handle`, `write_value_secured_at_with_handle`, `write_with_handle`, `write_with_timestamp_and_handle`, `write_secured_at_with_handle`. Two new `mxaccess` re-exports: `NmxOperationStatusFormat`, `NmxOperationStatusMessage` (already exposed via `OperationStatus.raw` but the underlying type wasn't re-exported — needed for the compat layer's test synth helper). `mxaccess-compat` public surface unchanged. `cargo public-api` baselines for both crates regenerated under `design/public-api/`.
|
||
|
||
**Verification.** `cargo build --workspace` / `cargo test --workspace` (823 → 830 tests, +7 new) / `cargo clippy --workspace --all-targets -- -D warnings` / `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` all pass. `cargo fmt -p mxaccess -p mxaccess-compat -- --check` clean. Live verification (`LMX_OnWriteComplete` end-to-end against AVEVA) is gated on the maintainer-side bring-up; the structural port is unblocked because the synthesizer + registry are byte-deterministic.
|
||
|
||
### F47 — `Session::unsubscribe` should skip `UnAdvise` for buffered subscriptions
|
||
**Resolved:** 2026-05-06 (commit `1a1830f`). `Session::unsubscribe` now branches on `SubscriptionEntry::mode` (the discriminator F45 added). For `SubscriptionMode::Buffered { ... }`, the `un_advise` wire emission is skipped — the buffered server-side registration is unwound by the engine when the `RegisterReference` handle goes away, so a separate `UnAdvise` is at best a no-op extra frame and at worst could race with the engine's own teardown. Mirrors the .NET reference's `if (!subscription.IsBuffered)` guard at `MxNativeSession.cs:361-381`. The registry-entry probe runs as a separate lock acquisition so the `is_buffered` decision doesn't hold the NMX-client mutex unnecessarily. The `record_unadvise()` metrics counter still fires on every public `unsubscribe` call regardless of mode (consumer-side unsubscribe rate, not wire-frame rate). New unit test `unsubscribe_skips_un_advise_for_buffered_subscription` issues a plain subscribe (recorded as 1 RPC), mutates the registry entry to `SubscriptionMode::Buffered`, calls unsubscribe, and asserts the recorded RPC count stays at 1 (no UnAdvise emitted). The existing `subscribe_populates_registry_unsubscribe_clears_it` test is the plain-branch negative control. Workspace 794 → 795 tests; clippy + rustdoc clean.
|
||
|
||
### F45 — Recovery replay should re-issue `RegisterReference` for buffered subscriptions
|
||
**Resolved:** 2026-05-06 (commit `9b57cf8`). New `pub(crate) enum SubscriptionMode { Plain, Buffered { rounded_interval_ms, item_definition, item_context, item_handle } }` discriminator on `SubscriptionEntry`. `Session::subscribe` (plain path) records `SubscriptionMode::Plain`; `subscribe_buffered_nmx` records `SubscriptionMode::Buffered { ... }` carrying the un-suffixed reference + the rounded interval (so the re-issued buffered registration matches the original cadence). `recover_connection_core` matches on `entry.mode`: plain branch unchanged; buffered branch re-applies `.property(buffer)` via `to_buffered_item_definition` (idempotent), rebuilds the original `NmxReferenceRegistrationMessage` with the saved correlation id + `subscribe = true`, and dispatches `register_reference` (kind=ItemControl, inner command `0x10`) against the replacement transport. Mirrors `MxNativeSession.ReAdviseSubscription` (`MxNativeSession.cs:538-569`). New unit test `recover_connection_replays_buffered_subscription_via_register_reference` synthesises a buffered registry entry, installs a `RebuildFactory` pointing at a recording NMX server, drives `recover_connection`, then asserts the recorded `TransferData` carries inner command `0x10` (NOT `0x1f`) with the `.property(buffer)`-suffixed item_definition + the saved correlation id + subscribe=true. Public API unchanged (`SubscriptionMode` + `SubscriptionEntry` stay `pub(crate)`); `cargo public-api -p mxaccess` baseline unchanged. Workspace 793 → 794 tests; clippy + rustdoc clean. Side-finding spawned **F47** (`Session::unsubscribe` divergence on buffered drop).
|
||
|
||
### F46 — Capture `LmxProxy.dll!CLMXProxyServer.Suspend`/`.Activate` wire emission
|
||
**Resolved:** 2026-05-06 (commit `808fea1`). `analysis/frida/mx-nmx-trace.js` extended with `Interceptor.attach` hooks on `LmxProxy.dll!CLMXProxyServer.Suspend` (RVA `0x13d9c`, `FUN_10013d9c`) and `Activate` (RVA `0x14028`, `FUN_10014028`) — both RVAs identified via `analysis/ghidra/exports/LmxProxy.dll.string-refs.tsv` rows 119 / 122 (same `STRING - Server Handle` xref pattern `AdviseSupervisory` uses). Both go through a shared `hookSuspendActivate(rva, name, eventVerb)` helper plus a new `readMxStatusOut(ptr)` that decodes the `MxStatus*` out-param as 4 × i16 (`Success / Category / DetectedBy / Detail`, matching `src/MxNativeCodec/MxStatus.cs`). Hooks emit `mx.suspend.begin/end` and `mx.activate.begin/end` events for grep-ability. **No `Resume` / `Reactivate` sibling exists** — verified against `analysis/decompiled-mxaccess/ArchestrA/MxAccess/ILMXProxyServer5.cs` (only `Suspend` DispId 1610940418 + `Activate` DispId 1610940419 declared). Re-run procedure documented in the script header (rebuild x86 `MxTraceHarness`, run with `--scenario=suspend-advised --tag=TestChildObject.ScanState` + `--scenario=activate-advised`, save under `captures/NNN-frida-suspend-activate-instrumented/`, grep `mx.suspend.*` / `mx.activate.*` and correlate with `nmx.enter` in the same time window — if no NMX traffic accompanies the hook fires, R5 closes as "client-side only"). R5 in `design/70-risks-and-open-questions.md` updated to point at F46 as the next-step. Live capture run is maintainer-side optional (no AVEVA install attached to the dev box).
|
||
|
||
### F41 — `cargo public-api` baseline
|
||
**Resolved:** 2026-05-06 (commit `9e57bfd`). Baselines for all 9 workspace crates committed under `design/public-api/{crate}.txt`, generated via `cargo +nightly public-api --simplified -p <crate>`. Per-crate sizes: `mxaccess-codec` 2516 lines, `mxaccess-asb` 1258, `mxaccess-rpc` 1273, `mxaccess-asb-nettcp` 708, `mxaccess` 542, `mxaccess-galaxy` 374, `mxaccess-callback` 170, `mxaccess-compat` 123, `mxaccess-nmx` 118. `design/public-api/README.md` documents the update procedure (install nightly + cargo-public-api, regenerate the affected baseline on intentional API changes, commit alongside). `.github/workflows/rust.yml` gains a `public-api` job that runs the same diff against the committed baseline; drift fails CI with a unified diff in the log so the PR author can either revert or update the baseline.
|
||
|
||
### F43 — Release prep: `cargo publish --dry-run` all crates
|
||
**Resolved:** 2026-05-06 (commit `7b15c85`). New `CHANGELOG.md` covers the V1 release notes for all 9 workspace crates, the M0–M6 milestone closeouts, deliberate divergences from the .NET reference (multi-record DataUpdate codec relaxation per F44; buffered single-sample stream per R2), and known limitations (F3 / F45 / F46 / R3 / R4). `cargo publish --dry-run` passes for the leaf crates (`mxaccess-codec`, `mxaccess-rpc`, `mxaccess-asb-nettcp`); dependent crates fail with "no matching package" against crates.io as expected (the registry lookup happens even with `--no-verify`) — those are validated by the build-test-clippy + public-api matrix and will dry-run cleanly after the leaves are actually published. Path deps in each per-crate `Cargo.toml` now carry `version = "0.0.0"` specifiers so cargo can fall back to the version constraint when the path is unavailable post-publish. Documents the dependency-ordered publish sequence in CHANGELOG so the V1 cut can be done in one pass.
|
||
|
||
### F35 — `mxaccess-compat` LMXProxyServer-shaped facade
|
||
**Resolved:** 2026-05-06 (commit `d5aa152`). 18-method `ILMXProxyServer5` surface ported as Rust async fns over `mxaccess::Session` (NMX) and `mxaccess::AsbSession` (ASB). `crates/mxaccess-compat/src/lib.rs` (~1250 lines) exposes a top-level `LmxClient` facade with a `tokio::sync::Mutex<HashMap<i32, ItemRef>>` handle table + `AtomicI32` monotonic counters. Event surface is four `tokio::sync::broadcast` channels surfaced as `EventStream<T>` (a custom `Stream` impl that skips `BroadcastStream::Lagged` errors per Q4's "Streams not COM events" verdict). `Advise` spawns a fan-out task that drains the underlying `Subscription` and routes to either `on_data_change` or `on_buffered_data_change` based on the item's `is_buffered` flag. 25 unit tests cover the handle-table lifecycle (Add → Advise → UnAdvise → Remove with a mock task injected directly into the table — wire-side `Session::subscribe` is wave 2), monotonic handle allocation, `add_item_2` context-prefix combination, `SetBufferedUpdateInterval` rounding (`50 → 100`, `101 → 200`, zero rejection), each of the four event streams, `un_advise` idempotency, and a compile-time dispatch-table check. Methods that don't yet have a corresponding `Session` API (e.g. `WriteSecured`) mirror the upstream `Error::Unsupported` rather than fabricate behaviour. Per R6 verification, `WriteSecured` always takes two user ids — single-user secured writes pass the same id twice. Sub-followups: F45 (recovery replay for buffered subscriptions), R3 (OperationComplete trigger — channel wired but no firing path until a captured byte mapping lands).
|
||
|
||
### F40 — Optional `metrics` feature: counters + histograms
|
||
**Resolved:** 2026-05-06 (commit `ad1cf23`). Optional `metrics` Cargo feature on `mxaccess`. Default build: zero `metrics` dep + zero runtime cost (`cargo tree -p mxaccess | grep metrics` is empty). Behind `--features metrics` (using `metrics 0.24`): counters `mxaccess.session.{writes,reads,advises,unadvises,recovery_attempts,recovery_successes}` (labeled `transport={nmx|asb}`) + ASB counters `mxaccess.asb.{writes,reads}` + histograms `mxaccess.session.{write,read}.latency_seconds` + gauges `mxaccess.session.{connected,registered_items,active_subscriptions}`. New `crates/mxaccess/src/metrics.rs` (275 lines) holds thin `pub(crate) fn` wrappers (one per metric) gated with `#[cfg(feature = "metrics")]`; call sites in `session.rs` + `asb_session.rs` invoke them unconditionally so the feature gate is inside the wrapper, not at the call site. Module-level docs enumerate every emitted name + label dimension + semantic meaning. Includes a `#[cfg(all(test, feature = "metrics"))]` unit test that installs `metrics::with_local_recorder` and asserts counters advance. Deferred: `mxaccess.session.subscribe.first_data_change_seconds` (reserved name; needs `Subscription::poll_next` instrumentation), ASB write/read/publish latency histograms.
|
||
|
||
### F44 — Decode buffered batch + suspend captures (`077, 079-082, 094`)
|
||
**Resolved:** 2026-05-06 (commit `ad1cf23`). Six captures walked: `077-frida-suspend-advised-scanstate`, `079-frida-add-buffered-advise-testint`, `080-frida-buffered-external-write-testint`, `081-frida-write-testint-after-buffered`, `082-frida-add-buffered-plain-advise-testint`, `094-frida-buffered-separate-writer`. Each gets a per-capture summary (call sequence, key wire bytes, verdict) in new `docs/M6-buffered-evidence.md`. **R2 verdict: confirmed silently as "not a real risk"** — single-sample observed across 079/080/082/094. The `OnBufferedDataChange` path delivers one sample per event with a server-side cadence knob, not multi-sample bundles; matches `wwtools/mxaccesscli/docs/api-notes.md:97-100,138-140,154-157`. **R5 trigger conditions documented from capture 077**: `AdviseSupervisory` + `Suspend` pair, 1-second intervals, succeeds on enum attributes (`ScanState`); the `LmxProxy.dll!CLMXProxyServer.Suspend` / `.Activate` wire emission was NOT instrumented in this capture so a residual gap is filed as F46 (re-run with the Frida hook added). `design/70-risks-and-open-questions.md` R2 + R5 status updated accordingly.
|
||
|
||
### F36 — `Session::subscribe_buffered` (NMX) per R2 single-sample-per-event answer
|
||
**Resolved:** 2026-05-06. `Session::subscribe_buffered(reference, BufferedOptions { update_interval_ms })` returns the same `Subscription` (`Stream<Item = Result<DataChange, Error>>`) as plain `subscribe`. Wire path mirrors `MxNativeSession.RegisterBufferedItemAsync` (`MxNativeSession.cs:272-310`): the `item_definition` is suffixed with `.property(buffer)` via `NmxReferenceRegistrationMessage::to_buffered_item_definition`, then a single LMX `RegisterReference` (opcode `0x10`) frame is dispatched with `subscribe = true` — no separate `AdviseSupervisory` is needed (the captures `082-frida-add-buffered-plain-advise-testint` and `079-frida-add-buffered-advise-testint` show exactly one `RegisterReference` between `mx.set-buffered-interval` and the first `OnBufferedDataChange`, and zero `AdviseSupervisory` frames). `BufferedOptions::rounded_update_interval_ms` rounds the requested cadence up to the nearest 100ms per `MxNativeCompatibilityServer.cs:638` (`((updateInterval + 99) / 100) * 100`); the rounded value is held client-side because native MXAccess does not emit a `SetBufferedUpdateInterval` RPC (verified by the captures' `mx.set-buffered-interval.begin/end` events producing no NMX traffic). New example `crates/mxaccess/examples/subscribe-buffered.rs` exercises a 1-second cadence against the live AVEVA install (gated by `MX_LIVE`). New round-trip parity test `crates/mxaccess-codec/tests/buffered_register_reference_parity.rs` validates the wire-byte sequence against captures `079` + `082`. F36 spawns sub-followup F45 (recovery replay must re-issue `RegisterReference` for buffered subscriptions; current `recover_connection_core` replays them via `AdviseSupervisory` and loses the buffered shape on a transport rebuild).
|
||
|
||
### F37 — ASB `subscribe_buffered` capability gate
|
||
**Resolved:** 2026-05-06 (commit `34045c2`). `AsbSession::subscribe_buffered` returns `Error::Unsupported { transport: TransportKind::Asb, operation: ... }` synchronously without touching the wire — ASB has no `SetBufferedUpdateInterval` analogue; the per-monitored-item `MinimalMonitoredItem::sample_interval` is the rate-limit knob instead. The error-construction logic is split into a free fn so the gate's exact shape is unit-testable without spinning up a live authenticator + transport. Workspace 758 → 759 tests; clippy clean.
|
||
|
||
### F38 — Counting-allocator `cargo bench` harness
|
||
**Resolved:** 2026-05-06 (commit `71c69b8`). Hand-rolled `GlobalAlloc` wrapper + atomic counters in `crates/mxaccess-codec/benches/alloc_count.rs`; `cargo bench -p mxaccess-codec` runs the proven matrix (write encode for Int32/Float32/Float64/Boolean/String, `MxReferenceHandle::from_names`, `NmxSubscriptionMessage::parse_inner`) and reports allocs/op + bytes/op + deallocs/op. Baseline numbers committed to `design/M6-bench-baseline.md`. Bench gates on R12 (< 5 allocs/write) — exits with code 1 on violation; current baseline is 1–4 allocs/op across the matrix, well under the target.
|
||
|
||
### F39 — Zero-copy codec pass (per R12)
|
||
**Resolved:** 2026-05-06 (closed via F38 measurements, no code change required). The R12 target (< 5 allocations per write at steady state) is already met across the proven matrix without any zero-copy rewrite — scalar writes are 1–2 allocs/op, String writes 4 allocs/op (5-char string), `MxReferenceHandle::from_names` 2 allocs/op, `NmxSubscriptionMessage::parse_inner` 1 alloc/op. The remaining nice-to-have optimisations (`BytesMut` output buffer to enable downstream zero-copy splits, name-signature cache to elide the two `compute_name_signature` UTF-16LE conversions per `from_names`, session-level scratch pool to drop per-write count from 2 → 1) are documented in `design/M6-bench-baseline.md` as post-V1 work — they don't gate M6 DoD because R12 is already satisfied.
|
||
|
||
### F42 — `cargo doc` cleanup pass
|
||
**Resolved:** 2026-05-06 (commit `e79e289`). All 33 rustdoc warnings across the workspace fixed: unresolved intra-doc links rewritten as fully-qualified `[Type::method]` / `[crate::module::name]` forms or backtick text where no link target exists; bracket text that was being interpreted as link refs (e.g. `body[17]`) escaped to backtick form; private-item references in public docs (`CALLBACK_BROADCAST_CAPACITY`, `recover_connection_core`, `mxvalue_to_writevalue`) reduced to backtick text. `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` exits clean. Workspace 759 tests pass; clippy clean. The optional `#![warn(missing_docs)]` lint is deferred — it would surface hundreds of low-priority public-item gaps that are out of scope for this F-number; it can be re-evaluated in F41 (`cargo public-api`) when the public surface is final.
|
||
|
||
### F18 — M5 plan of attack (ASB transport, parallel-safe sub-streams)
|
||
**Resolved:** 2026-05-06 — all sub-followups F19–F26 closed plus F28 / F29 / F30 / F31 / F32 / F33 / F34 layered on top. M5 is functionally LIVE end-to-end: `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` against the AVEVA install successfully exercises Connect → AuthenticateMe → RegisterItems → Read → CreateSubscription → AddMonitoredItems → Publish (delivers tag value) → DeleteMonitoredItems → DeleteSubscription → UnregisterItems → Disconnect with canonical-XML HMAC signing on every signed op. **Severity:** P0 — milestone driver, blocks ASB consumers + V1 release.
|
||
**Source:** `design/dependencies.md:73-89` + `design/60-roadmap.md:84-91` + `design/70-risks-and-open-questions.md:5-25`.
|
||
|
||
**M5 DoD per `design/60-roadmap.md:91`:**
|
||
1. ✅ `cargo run -p mxaccess --example asb-subscribe` succeeds against the live AVEVA endpoint — Read returns the real tag value, Publish stream delivers monitored values via the F26 stream (`AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }`).
|
||
2. ⚠️ Wire structure matches .NET's request bytes byte-for-byte for AuthenticateMe / Register / AddMonitoredItems (verified via `asb-relay` middleman with the .NET probe routed through ClientVia + the captured `add-monitored-items-request-wire.bin` fixture for F34). Strict byte-identical parity for the response side is not guaranteed because WCF chunks `Bytes8/16/32` records at different boundaries — both forms are functionally equivalent and `collect_asbidata_payloads` concatenates chunks (commit `cf97eab`). Canonical XML for the 13 signed ops is byte-equal to .NET's `XmlSerializer.Serialize` output (F28 fixture-comparison tests).
|
||
3. ⚠️ Type matrix: only Int32 verified live (the captured `TestChildObject.TestInt` tag). Bool / Float / Double / String / DateTime / Duration / arrays not yet exercised against live MxDataProvider — three-type live coverage was the deployable maximum on this dev host (F32 closed via option (b): missing types are Galaxy-provisioning-gated, not codec-gated).
|
||
4. ✅ `cargo build --workspace` + `cargo test --workspace` (758 tests) + `cargo clippy --workspace -- -D warnings` all green.
|
||
|
||
**M5 sub-followup closeout:**
|
||
- ~~F19~~: workspace deps for `aes` / `hmac` / `md-5` / `sha1` / `sha2` / `pbkdf2` / `flate2` / `rand` / `crypto-bigint` / `quick-xml` / `tokio-util`.
|
||
- ~~F20~~ (NMF framing), ~~F21~~ (NBFX node codec), ~~F22~~ (NBFS static dictionary), ~~F23~~ (auth crypto), ~~F24~~ (`AsbVariant` codec), ~~F25~~ (`IASBIDataV2` client end-to-end), ~~F26~~ (`mxaccess::AsbSession` over `AsbTransport` + `Stream<Item = MonitoredItemValue>`).
|
||
- ~~F28~~: canonical-XML HMAC signing for all 13 `ConnectedRequest` shapes (XmlSerializer-byte-equal vs .NET fixtures; legacy NBFX-bytes fallback retired).
|
||
- ~~F29~~: `nbfs.rs` re-aligned to canonical `[MC-NBFS]` / `ServiceModelStringsVersion1` table.
|
||
- ~~F30~~: dict-id resolution post-pass turns `Static(id)` element/attribute names back into their string forms on the read side.
|
||
- ~~F31~~: InvalidConnectionId-on-first-Register-after-AuthenticateMe pattern resolved (cool-down + retry).
|
||
- ~~F32~~: live type-matrix coverage capped at the deployable maximum on this dev host.
|
||
- ~~F33~~: InvalidConnectionId tolerance pattern propagated to all 8 ConnectedRequest response decoders + the F26 stream's publish-loop terminates cleanly on server-side rejection.
|
||
- ~~F34~~: `MonitoredItem` wire format uses DataContract field-suffix names (`activeField` / `bufferedField` / `itemField` / etc.) under prefix `b` bound to the DC namespace — verified live (F26 stream now delivers values).
|
||
|
||
**Cumulative execution log.** F19 + F23 (`ed17c07`); F24 (`7611d9e`); F20 (`9dfd193`); F22 (`43c10a1`); F21 (`5f98558`); F25 step 1 (`25dbd8d`); F25 step 2 (`a2b8989`); F25 step 3 (`c4bf0a0`); F25 step 4 (`1e59249`); F25 step 5 (`9b8133f`); F25 step 6 (`321b796`); F25 step 7 (`1b1ee1e`); F26 step 1 (`8a0f92b`); F26 step 2 (`14bb529`); example rewrite (`c6570dc`); F25 step 8 (`b543eb1`); F25 step 9 (`0441a2e`); F25 step 10 (`9876b4e`); F25 live-bring-up reconciliation (NBFX `PrefixElement_a..z` + xmlns redeclaration + SOAP-fault surfacing); F26 step 3 (`AsbSession` cheap-clone async API); F28 step 1 (`f14580e`) + step 2; F29 / F30 / F31 / F32 / F33; F34 (`101a8b1`). For per-step detail, see the matching commit message — `git show <hash>` is the authoritative record.
|
||
|
||
**Architectural note (kept for future maintenance):** `mxaccess::AsbSession` is deliberately **parallel** to the NMX-shaped `Session` rather than unified. The NMX `Session` carries orchestration (`CallbackExporter`, callback router task, recovery broadcast, `INmxService2` mutex) that has no ASB analogue, and ASB's request/response loop over a single TCP stream maps naturally to `Mutex<AsbClient>` — the two paths converge at the consumer-facing `mxaccess` API but stay distinct at the orchestration layer. `AsbSession` is `Clone + Send + Sync` via `Arc<AsbSessionInner>`, so each `clone()` is `O(1)` and the inner mutex serialises operation calls.
|
||
|
||
### F34 — `MonitoredItem` wire format: DataContract field-suffix names, not XmlSerializer property names
|
||
**Resolved:** 2026-05-06 (commit `101a8b1`). **Severity:** P2 — affected the F26 stream's data flow against MxDataProvider; canonical-XML HMAC signing for the operation was already verified working (server accepted the request, returned a non-fault response).
|
||
|
||
**Two halves, both closed:**
|
||
|
||
**Half 1 — Response decoder (closed earlier).** `decode_publish_response` previously filtered empty `<ASBIData/>` placeholders out of the positional payload list. Captured the full S→C bytes of a working `PublishResponse` via `examples/asb-relay.rs` between the .NET probe and MxDataProvider (fixture stashed at `crates/mxaccess-asb/tests/fixtures/publish-response-with-value.bin`). The wire shape is `<Status><ASBIData/></Status><Values><ASBIData>{bytes}</ASBIData></Values>` — Status is empty-but-present, Values carries the binary `MonitoredItemValue[]`. `collect_asbidata_payloads` previously skipped the empty Status, shifting Values down to index `0` where the decoder mis-read it as Status and corrupted the parse. Fix: always push every `<ASBIData>` element as a positional entry, empty or not. `tests/publish_capture.rs` runs the full decode chain over the real wire bytes and asserts `values.len() == 1`.
|
||
|
||
**Half 2 — Request body emitter (closed by this commit).** Rewrite of `push_monitored_item_body` (`crates/mxaccess-asb/src/operations.rs`) replaces the legacy XmlSerializer property names (`<MonitoredItem>`, `<Item>`, `<SampleInterval>`, `<Active>`, `<Buffered>`) with the WCF DataContract field-suffix names emitted under prefix `b` bound to `http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract`. Children: `<b:MonitoredItem>` with `<b:activeField>`, `<b:activeFieldSpecified>`, `<b:bufferedField>`, `<b:itemField>` (with nested ItemIdentity DC fields `<b:contextNameField>` / `<b:idField>` / `<b:idFieldSpecified>` / `<b:nameField>` / `<b:referenceTypeField>` / `<b:typeField>`), `<b:sampleIntervalField>`, `<b:timeDeadbandField>`, `<b:timeDeadbandFieldSpecified>`, `<b:userDataField>` (Variant), `<b:valueDeadbandField>` (Variant). The `<Items>` wrapper now declares `xmlns:b` + `xmlns:i` (XSI). Wire-byte type encoding matches the captured fixture: `bool` → Bool record; `ulong` → Zero/One/Chars (decimal text via XmlConvert); `ushort` → Zero/One/Int8/Int16/Int32 (smallest-fit binary); `int32` → same. Empty `string?` and null `byte[]?` emit as empty elements (no `<i:nil>` attribute, matching the wire). Field order follows the explicit `[DataMember(Order = N)]` declarations from `AsbContracts.cs:940-965`. The canonical-XML HMAC-signing emitter at `xml_canonical::emit_monitored_item` is unchanged (still XmlSerializer-property names) — F28 fixture-byte-equality holds for all 13 ops.
|
||
|
||
**The dual-format world** (the root insight that drove the fix): ASB requests have *two* element-name conventions on the wire — **HMAC canonical XML** (input to `AsbAuthenticator::Sign`) uses XmlSerializer-derived names (`<Active>`, `<Items>`, `<MonitoredItem>`); **binary NBFX body** (the actual wire request) uses DataContractSerializer-derived names (`<b:activeField>`, `<b:bufferedField>`, etc.). For ops where the body is purely `IAsbCustomSerializableType` arrays (Read, Register, Unregister), no DataContract names appear — every payload is wrapped as `<Items><ASBIData>{bytes}</ASBIData></Items>` (binary fast-path) and our builders were already correct. The DC schema only matters for ops carrying non-`IAsbCustomSerializable` types like `MonitoredItem` and (likely) `WriteValue`.
|
||
|
||
**Captured ground-truth dictionary** (from `tests/fixtures/add-monitored-items-request-wire.bin` — `tests/add_monitored_items_request_capture.rs` decodes it). The .NET WCF binary writer pre-declares 23 strings in the per-message dynamic dictionary including the wrapper / array / namespace strings plus all DC field names: `activeField`, `activeFieldSpecified`, `bufferedField`, `itemField`, `contextNameField`, `idField`, `idFieldSpecified`, `nameField`, `referenceTypeField`, `typeField`, `sampleIntervalField`, `timeDeadbandField`, `timeDeadbandFieldSpecified`, `userDataField`, `lengthField`, `payloadField`, `valueDeadbandField`. The dictionary-id pre-population that .NET's WCF binary writer uses is a perf optimisation; an inline-string emit works for correctness — and that's what our rewrite does.
|
||
|
||
**Verification:**
|
||
1. New unit test `add_monitored_items_body_uses_data_contract_field_names` (asserts every DC field name appears under prefix `b` in `[DataMember(Order = N)]` sequence, with the legacy XmlSerializer names absent).
|
||
2. Live `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` against the AVEVA install: `AddMonitoredItems` returns 1 status item with `error_code=0x0000` (was 0 items previously); `Publish` poll #4 delivers the actual tag value through the F26 stream as `AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }`. Workspace `cargo test` 757 → 758 pass; clippy clean.
|
||
|
||
**Bonus context discovered while debugging F34:**
|
||
- `MinimalMonitoredItem` gained an `active: Option<bool>` field with `with_active(item, interval, active)` constructor. Without `<Active>true</Active>` on the wire (or its DC equivalent `<b:activeField>true</>`+`<b:activeFieldSpecified>true</>`), MxDataProvider treats the subscription as inactive even when AddMonitoredItems "succeeds" — F26 stream then never sees values.
|
||
- `SampleInterval` unit corrected from "100-ns ticks" to **milliseconds** in the example + the `MinimalMonitoredItem.sample_interval` doc — matches `MxAsbDataClient.cs:441`'s `ulong sampleInterval = 1000` default.
|
||
- `result_code = 32` is `AsbErrorCode.PublishComplete` (`AsbResultMapping.cs:37`), informational not fatal — `ToResult:122-129` treats it like `Success`. F26 stream's `publish_loop` narrowed to bail only on `RESULT_CODE_INVALID_CONNECTION_ID = 1`.
|
||
|
||
### F28 — Canonical XML serialiser for `ConnectedRequest` signing (matches `XmlSerializer.Serialize` byte-for-byte)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). All 13 `ConnectedRequest` shapes now sign over byte-identical canonical XML; the legacy NBFX-bytes fallback is gone from every `client::*` op. Hardens the ASB transport against deployments with a non-empty `hashAlgorithm` registry value (where the server's HMAC validation actually runs).
|
||
|
||
**Two-step closure**:
|
||
1. **Step 1 (commit `f14580e`, 2026-05-05)** — landed the 5 `[XmlSerializerFormat]` ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus the per-action `ValidatorWireFormat` selector + DH-params-from-registry + dynamic-dict id management. Live AuthenticateMe + RegisterItems verified end-to-end (commit `9063f10`).
|
||
2. **Step 2 (this commit)** — extended `MxAsbClient.Probe --dump-signed-xml` to emit the 8 remaining shapes (ReadRequest, WriteBasicRequest, PublishWriteCompleteRequest, CreateSubscriptionRequest, DeleteSubscriptionRequest, AddMonitoredItemsRequest, DeleteMonitoredItemsRequest, PublishRequest) against deterministic field values. Saved fixtures at `rust/crates/mxaccess-asb/tests/fixtures/signed-xml/{read,write-basic,publish-write-complete,create-subscription,delete-subscription,add-monitored-items,delete-monitored-items,publish}-request.xml`. Pinned byte sizes 981 / 1497 / 741 / 814 / 793 / 1768 / 1782 / 771. Ported 8 emitters in `mxaccess-asb::xml_canonical`: `emit_read_request_xml`, `emit_write_basic_request_xml`, `emit_publish_write_complete_request_xml`, `emit_create_subscription_request_xml`, `emit_delete_subscription_request_xml`, `emit_add_monitored_items_request_xml`, `emit_delete_monitored_items_request_xml`, `emit_publish_request_xml`. New helpers: `emit_invensys_text` (primitives in the parent ns), `emit_write_value` (`<Values>` wrapper inlining `Value`/`Status`/`Comment`), `emit_monitored_item` (`<Items>` wrapper with `Item`/`SampleInterval`/`ValueDeadband`/`UserData`/`Buffered`), `emit_inline_item_identity` (ItemIdentity as a child of MonitoredItem with shared parent xmlns), `emit_inline_text` / `emit_inline_optional_string` (no-xmlns-redeclaration variants), `emit_idata_variant` (Variant's `Type`/`Length`/`Payload` in the `idata.data` namespace), `emit_iom_default_variant` (default-shape Variant for `ValueDeadband` / `UserData`). New private helper `AsbClient::pre_signing_validator()` consolidates the 8 call-site repetitions of `(connection_id, peek_next_message_number, "", "")`.
|
||
|
||
**Wired into `client::*`**: every `send_signed_envelope[_one_way]` call now passes `Some(&xml)` for `xml_for_signing` — the legacy NBFX-bytes fallback path inside `send_signed_envelope` is unreachable from the standard client. (The path itself stays in place to allow lower-level callers and tests to exercise the fallback.) The 8 ops affected: `read`, `write`, `publish_write_complete`, `delete_monitored_items`, `create_subscription`, `add_monitored_items`, `publish`, `delete_subscription` (plus their `_once` retry-loop variants for the ops that retry on `InvalidConnectionId`).
|
||
|
||
**Verification**: 8 new fixture-comparison tests (each emitter byte-equal vs the .NET fixture on the first try, no iteration). Workspace `mxaccess-asb` 87 → 95 tests; default-feature clippy clean. Live `cargo run -p mxaccess --example asb-subscribe` returns `TestChildObject.TestInt = 99` against AVEVA — proving `Read` (now signed via canonical XML) round-trips end-to-end where it previously used the legacy NBFX-bytes path. The other 7 ops are wire-tested only at fixture-byte-equality so far; live exercise is gated on the F33 follow-on capture for subscribe-flow ops, but the canonical XML produces byte-identical bytes to the .NET reference, so the HMAC will match by construction.
|
||
|
||
**Closes**: M5 DoD bullets 1+2 fully resolved across all 13 `ConnectedRequest` shapes. The `hashAlgorithm`-non-empty deployment shape is no longer latent — any future deployment with a real algorithm should sign correctly without further work.
|
||
|
||
### F16 — Real `Session::recover_connection` reconnect loop (re-bind + re-advise)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Replaces the wave-2 no-op `recover_connection` with the full .NET-equivalent shape (`MxNativeSession.cs:399-474`).
|
||
|
||
Three pieces, all in `crates/mxaccess/src/session.rs`:
|
||
|
||
1. **Subscription registry on `SessionInner`** — new `subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>>` tracks every active advise. `subscribe()` inserts the (`correlation_id` → `SubscriptionEntry { metadata }`) row after a successful `AdviseSupervisory`. `unsubscribe()` removes it on the success path only — failed UnAdvises stay in the registry so the next recovery replays them. The consumer's `Subscription` handle still holds the BroadcastStream; the registry is purely for replay.
|
||
2. **Pluggable `RebuildFactory`** — public typedef `pub type RebuildFactory = Arc<dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient, NmxClientError>> + Send>> + Send + Sync>`. Installed via the new `Session::set_recovery_factory(factory)`; queryable via `Session::has_recovery_factory()`. Kept separate from `connect_nmx` / `connect_nmx_auto` so the existing constructors stay non-breaking — consumers opt in to recovery by calling the setter after-the-fact.
|
||
3. **Real `recover_connection` + `recover_connection_core`** — `recover_connection` is now the retry loop (mirrors `cs:399-440`): for `attempt in 1..=policy.max_attempts`, emit `RecoveryEvent::Started` → call `recover_connection_core` → emit `Recovered` on success (return) or `Failed { will_retry, error }` on failure (sleep `policy.delay`, retry, or bubble the last error after the budget is exhausted). `recover_connection_core` mirrors `cs:442-474`: rebuild NMX via the factory → `RegisterEngine2` with the saved `callback_obj_ref` (the same exporter is reused — no TCP listener restart) → optional `SetHeartbeatSendInterval` → snapshot the registry under the lock, then iterate replaying `AdviseSupervisory(correlation_id)` for each entry → atomically swap `*nmx_lock = replacement` (the old `NmxClient` drops at end of scope, closing its TCP transport).
|
||
|
||
Subscription correlation ids are preserved across the swap, so the consumer's `Subscription` stream continues to receive on its existing broadcast filter without observing the recovery event. The CallbackExporter stays bound across recoveries (no need to re-bind a TCP listener).
|
||
|
||
New error variant `ConfigError::RecoveryNotConfigured` returned when `recover_connection` is called without a factory installed. New public re-export: `RebuildFactory`.
|
||
|
||
R15's "long-lived connection task" was previously listed as a hard prerequisite, but the existing `Mutex<NmxClient>` already serialises concurrent operations during the rebuild — `recover_connection_core` holds the inner mutex during the swap, so concurrent ops just wait. Functionally equivalent to the long-lived-task design.
|
||
|
||
**Tests** (4 new in `mxaccess`):
|
||
- `recover_connection_without_factory_returns_recovery_not_configured` — no factory → `ConfigError::RecoveryNotConfigured`.
|
||
- `recovery_events_supports_multiple_subscribers` (updated) — Arc-shared Started event with a stub-failing factory.
|
||
- `recover_connection_with_always_failing_factory_exhausts_attempts` — pins (Started, Failed)×3 sequence + final `will_retry=false` + bubbled `TransportFailure` error.
|
||
- `subscribe_populates_registry_unsubscribe_clears_it` — subscribe → registry entry; unsubscribe → cleared.
|
||
|
||
Workspace `mxaccess` 65 → 67 tests; default-feature clippy clean. The `connect_nmx_auto`-side auto-population of the factory (capturing the `ntlm_factory` + discovered `(addr, service_ipid)` so consumers don't need to re-author the closure) is a future polish not required to close F16.
|
||
|
||
### F2 — NTLM verify_signature path + constant-time MAC compare (server-to-client direction)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Structural port from `[MS-NLMP]` §3.4.4 — same shape as `sign` but uses the server-to-client (`S→C`) sub-keys derived alongside the client-to-server pair at the end of `create_type3`. The S2C key derivation already existed in `auth.rs` (the `seal_key`/`sign_key` helpers take a `client_mode` flag); F2 just plumbs them into a new `verify_signature(message, signature) -> Result<(), NtlmError>` method on `NtlmClientContext`.
|
||
|
||
`NtlmClientContext` gained four new fields populated during `create_type3`: `server_signing_key`, `server_sealing_key`, `server_sealing_state` (RC4), and `server_sequence` (independent counter). The verify path:
|
||
1. Validates `signature.len() == 16` and the leading version word `0x00000001`.
|
||
2. Reads the trailing 4-byte sequence number and compares against `self.server_sequence` (mismatch ⇒ `InvalidSignature`, no state change).
|
||
3. Computes `expected_mac = HMAC_MD5(server_signing_key, seq || message)[0..8]` then `RC4(server_sealing_state).Transform(expected_mac)`.
|
||
4. Constant-time compares `expected_mac` against wire bytes 4..12 via `subtle::ConstantTimeEq` (timing-oracle safe).
|
||
5. **On success**: commits the advanced cipher state + increments `server_sequence`. **On failure**: re-derives RC4 from `server_sealing_key` and skips past `server_sequence × 8` keystream bytes to restore the pre-verify position — caller can retry with a corrected signature.
|
||
|
||
New dep `subtle = "2"` (workspace-internal to `mxaccess-rpc`) for the constant-time MAC compare. **6 new tests pin every documented edge**: round-trip against `sign` (3-message sequence), corrupted-MAC rejection (with `server_sequence` non-advance assertion), wrong-sequence-number rejection, wrong-version-field rejection, wrong-length rejection, before-authenticate `NotAuthenticated` error. `mxaccess-rpc` 188 → 194 tests.
|
||
|
||
The "Awaiting wire-fixture capture" step listed in the prior status note is **no longer a hard prerequisite** — the algorithm shape is fully defined by `[MS-NLMP]` §3.4.4 and the round-trip tests prove the decoder/encoder pair is internally consistent. A captured `INmxSvcCallback::StatusReceived` frame would still validate byte-by-byte parity vs a real `NmxSvc.exe` server-side signer, but that's a future verification task; the structural port ships unblocked.
|
||
|
||
### F10 — `IObjectExporter::ResolveOxid2` (opnum 4) body codec
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`) per option (b) of the followup's resolve criterion: structural port from `[MS-DCOM]` §3.1.2.5.1.4. New `parse_resolve_oxid2_result` in `crates/mxaccess-rpc/src/object_exporter.rs` mirrors the opnum-0 parser exactly except for the extra `COMVERSION` slot (4 bytes: u16 major + u16 minor) wedged between `authn_hint` and `error_status`. New types: `ComVersion` and `ResolveOxid2Result`. The trailing-fields truncation check tightens from 24 bytes (opnum 0) to 28 bytes (opnum 4) to account for the COMVERSION slot.
|
||
|
||
`referent_id == 0` short-circuits to an empty `bindings` + `ComVersion::default()` + status from the trailing 4 bytes — same shape pattern as the opnum-0 parser. `mxaccess-rpc` 183 → 188 tests (+4 structural tests covering: short-stub error, referent-zero short-circuit, full one-binding round-trip with COMVERSION assertion, truncated-trailing error).
|
||
|
||
No live `ResolveOxid2` capture exists in this tree (the .NET reference doesn't call opnum 4); structural correctness is pinned against `[MS-DCOM]` §3.1.2.5.1.4 verbatim. Future captured frames will validate.
|
||
|
||
### F11 — `IRemUnknown::RemAddRef` and `RemRelease` body codecs
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`) — structural port from `[MS-DCOM]` §3.1.1.5.6. Both opnums share the same `REMINTERFACEREF[]` request shape (per `[MS-DCOM]` §2.2.19: 16-byte IPID + 4-byte cPublicRefs + 4-byte cPrivateRefs per element, prefixed by an `OrpcThis` header + u16 count + 2-byte NDR padding + u32 max_count). New encoders `encode_rem_add_ref_request` and `encode_rem_release_request` (the latter delegates to a shared `encode_remref_array_request` helper since the wire shape is identical between the two ops).
|
||
|
||
Response shape: `OrpcThat(8) + referent_id(4) + max_count(4) + N×4-byte HRESULT + error_code(4)` per the conformant-array convention established by `RemQueryInterface`'s response decoder. `referent_id == 0` short-circuits to an empty `per_ref_hresults` array. New `RemRefResponse` struct + `parse_remref_response` decoder shared between both opnums. New `RemInterfaceRef` struct.
|
||
|
||
4 new structural tests: AddRef request layout pin (88-byte total for a 2-element refs array), Release-vs-AddRef wire-shape equivalence, full HRESULT[] round-trip with two HRESULTs (success + E_FAIL), referent-zero short-circuit. Like F10, the .NET reference doesn't call these opnums; structural correctness is pinned against `[MS-DCOM]` §3.1.1.5.6 verbatim.
|
||
|
||
### F27 — Constant-time DH `mod_exp` (swap `num-bigint` → `crypto-bigint::DynResidue`)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Per the followup's own option (b): added a fixed-width `U2048` DH backend via `crypto-bigint::modular::runtime_mod::DynResidue`. New `auth.rs::constant_time_mod_exp(base, exp, modulus)` wrapper preserves the `BigUint`-in-`BigUint`-out API used by the byte-conversion helpers; the actual square-and-multiply chain runs in Montgomery form against the registry-supplied prime as a `U2048`. Both DH call sites (public-key generation in `AsbAuthenticator::new` at line 179, and shared-secret derivation in `crypto_key` at line 354) swap `BigUint::modpow` for the new wrapper.
|
||
|
||
`crypto-bigint::DynResidueParams::new` requires an odd modulus (Montgomery form's only restriction). DH primes in production are always odd by definition; the only exception is the `CryptoParameters::DEFAULT_PRIME_TEXT` test-fixture default, which ends in `4` (mathematically unsound for DH but kept for parity with the .NET reference's published default constant). For that case the wrapper falls back to the legacy `BigUint::modpow` — same wire bytes either way, so there's no fixture or HMAC-output divergence.
|
||
|
||
**Wire-byte parity verified**:
|
||
- Unit tests: 61 in `mxaccess-asb-nettcp` (was 61) — `auth.rs::deterministic_hmac_matches_dotnet_fixture` is the byte-for-byte ground-truth pin against captured .NET output (passphrase / prime / generator / private-key / remote-pub / message-number / connection-id / IV / consumer-data all pinned to deterministic values; `derive_validator_mac_iv` runs the full DH→PBKDF2→AES-CBC chain and asserts hex equality of every intermediate). Continues to pass after the swap.
|
||
- Live: `cargo run -p mxaccess --example asb-subscribe` — Connect handshake completes with apollo:V2 lifetime + `apollo=true`, proving the server accepted the constant-time-derived public key and the shared-secret-based AuthenticateMe. Tested 2026-05-06 against the local AVEVA install with the captured 768-bit `MX_ASB_DH_PRIME = 1552...7919` (odd; takes the constant-time path).
|
||
|
||
Workspace deps: `crypto-bigint = "0.5"` added to `[workspace.dependencies]` and to `mxaccess-asb-nettcp/Cargo.toml`. `num-bigint` retained for decimal-string parsing + .NET-LE byte conversion (crypto-bigint has neither). Default-feature clippy clean. The "review.md MAJOR finding" originally flagged at `design/30-crate-topology.md:269-274` is now closed.
|
||
|
||
### F33 — Live wire reconciliation for the ASB subscription path
|
||
**Resolved:** 2026-05-06 (commits `218f4c4`, `7a5f251`, `<this commit>`). `MX_ASB_TRACE_REPLY` capture during investigation revealed the live MxDataProvider returns a `Result` wrapper with `<resultCodeField>1</>` + `<successField>false</>` followed by **empty** `<ASBIData/>` payloads when it short-circuits on `InvalidConnectionId` — the same transient race F31 fixed for `RegisterItems`. The original F33 symptoms (`subscription_id = 0` from `CreateSubscriptionResponse`, `MissingField "Status"` from `AddMonitoredItemsResponse`) were both consequences of decoders not tolerating that wrapper shape, NOT a fundamentally different wire format. Three commits propagated the F31 tolerance pattern to every remaining response decoder and surfaced `result_code` / `success` so the F26 stream's publish-loop can detect failures cleanly.
|
||
|
||
1. `218f4c4` — `decode_read_response` + `client::read` retry loop. Added `result_code` / `success` to `ReadResponse`. Live verified: `TestChildObject.TestInt = 99` returned end-to-end where the prior run had bailed with `MissingField "Status"`.
|
||
2. `7a5f251` — same pattern for `decode_create_subscription_response` (returns `subscription_id = 0` sentinel when missing instead of erroring) + `decode_add_monitored_items_response`. Both ops gain F31-style retry loops in `client::create_subscription` / `client::add_monitored_items`.
|
||
3. `<this commit>` — pattern propagated to the remaining five decoders: `decode_publish_response`, `decode_unregister_items_response`, `decode_delete_monitored_items_response`, `decode_write_response`, `decode_publish_write_complete_response`. Shared `extract_result_status(body_tokens)` helper consolidates the per-decoder `find_text_in_named_element` calls. The F26 stream's `publish_loop` (`asb_session.rs::publish_loop`) now terminates the stream with a `ConnectionError::TransportFailure` carrying `"publish returned result_code 0xXX (server-side rejection)"` when `PublishResponse.result_code` is `Some(non_zero)` — preventing silent infinite-spin on `InvalidConnectionId`.
|
||
|
||
Live read still passes after all changes. `mxaccess-asb` 79 → 87 tests (+8 InvalidConnectionId tolerance tests via the shared `synthesise_invalid_connection_id_body` helper). Default-feature clippy clean.
|
||
|
||
The `examples/asb-subscribe.rs` Subscribe demo can be promoted from the current Read-loop form once a fresh live run confirms the active subscribe-flow doesn't surface additional wire-format gaps beyond the InvalidConnectionId race. The "session desync" observed in the original investigation should clear once the retry loops give the subscribe ops time to succeed.
|
||
|
||
### F12 — `NmxClient::create` (auto-resolving COM-activation factory)
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Builds on F6: new `NmxClient::create(ntlm_factory)` constructor in `crates/mxaccess-nmx/src/client.rs`, gated on `cfg(all(windows, feature = "windows-com"))`. New crate-level feature `mxaccess-nmx/windows-com` propagates to `mxaccess-rpc/windows-com`. Mirrors `ManagedNmxService2Client.Create()` (`cs:30-64`) + `ResolveService` (`cs:491-523`) — six steps: (1) `com_objref_provider::marshal_activated_iunknown_objref("NmxSvc.NmxService", MarshalContext::DifferentMachine)` activates the COM class and emits an OBJREF blob; (2) `ComObjRef::parse` extracts `oxid` + `ipid` (the activated server's `IUnknown` IPID); (3) `resolve_oxid_with_managed_ntlm_packet_integrity` against `127.0.0.1:135` (RPCSS endpoint mapper) returns the server's `(host, port)` bindings + `IRemUnknown` IPID; (4) the `ncacn_ip_tcp` non-security binding's `host[port]` text is parsed via the new `parse_bracketed_host_port` helper (mirrors the .NET `ParseBracketedHost` / `ParseBracketedPort` pair, using `rfind` so FQDNs with `.` round-trip — matches `cs:540-561`); (5) a fresh transport binds to `IRemUnknown` and calls `RemQueryInterface(iunknown_ipid, INmxService2_IID, fresh_causality_id, public_refs=5)` — the `RemQiResult` carries the new `INmxService2` IPID; (6) a second fresh transport binds to `INmxService2` via `Self::connect`. The `ntlm_factory: impl FnMut() -> NtlmClientContext` closure is invoked **three times** (one per bind); callers are responsible for fresh contexts each call. New error variants: `NmxClientError::Activation(ProviderError)` (only with `windows-com`) and `NmxClientError::EndpointResolution { reason }` (covers no binding / parse failure / non-zero RemQI HRESULT). 6 offline tests on the host/port parser pin: extracts FQDN host + port, uses `rfind` for the rightmost brackets, rejects missing `[` / missing `]` / non-numeric port / port overflow. 1 live test (`#[ignore]`'d, gated on `MX_LIVE` + the `MX_TEST_*` Setup-LiveProbeEnv env triple) round-trips end-to-end against the AVEVA install — activates `NmxSvc.NmxService`, drives the full chain, asserts the resolved `service_ipid` is non-zero. Live verification: passes. Workspace tests went 17 → 23 in mxaccess-nmx (+6).
|
||
|
||
**Session-level wrapper (same commit):** `mxaccess::Session::connect_nmx_auto(ntlm_factory, options, resolver, recovery)` — gated on the new `mxaccess/windows-com` feature (which propagates to `mxaccess-nmx/windows-com`). Refactored `connect_nmx` to extract the post-NMX-bind orchestration into a private `from_nmx_client` helper; both `connect_nmx` and `connect_nmx_auto` funnel through it so the `CallbackExporter` + router-task + `RegisterEngine2` + heartbeat policy stays in one place. `connect_nmx`'s doc comment updated — the prior "F12 not yet wired" note is gone. With both layers landed, the .NET `MxNativeSession.Open` surface (`cs:127-147`) is reproduced end-to-end on the Rust side: callers no longer need to pre-resolve `(host, port, service_ipid)` by hand on Windows.
|
||
|
||
### F32 — Live type-matrix coverage for `asb-subscribe`
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Closed via option (b) of the followup's own resolve criterion: the four missing types (Float / Double / DateTime / Duration) are gated on Galaxy-side provisioning that's outside the Rust port's scope. The deployed test Galaxy on this host only has `mx_data_type ∈ {1=Bool, 2=Int32, 5=String}` (verified via direct SQL probe of `dbo.dynamic_attribute`); we cannot exercise the missing types without authoring new template attributes in the Aveva console — a manual platform-engineering task, not a Rust port issue. The three-type live verification (Int32 = 99, String = `"mxaccesscli verified 17778523775"`, Bool = 0) at commit `9063f10` therefore satisfies the **type-matrix DoD bullet for what is deployable**. M5 DoD bullet #3 closes ✓ for the deployed shape; if a future deployment provisions the remaining four types, an `asb-typematrix.rs` integration test that loops over all seven types would make a clean follow-on. **Transient `InvalidConnectionId` race** noted in the original block remains as a known characteristic of the live MxDataProvider after many test cycles (settles after a 30-second cool-down); production deployments with a single long-lived session are unlikely to hit it.
|
||
|
||
### F6 — Port `ComObjRefProvider.cs` (OBJREF emitter via Win32 `CoMarshalInterface`)
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). New module `crates/mxaccess-rpc/src/com_objref_provider.rs` (~330 LoC including tests) gated on `cfg(all(windows, feature = "windows-com"))`. Pulls `windows = "0.59"` (features `Win32_Foundation` + `Win32_System_Com` + `Win32_System_Com_Marshal` + `Win32_System_Com_StructuredStorage` + `Win32_System_Memory`) as an optional dep behind the existing `windows-com` feature; default footprint stays slim. Public API mirrors `ComObjRefProvider.cs` 1:1: `MarshalContext` enum (InProcess / Local / DifferentMachine — wraps the `MSHCTX_*` newtype constants), `clsid_from_prog_id(&str) -> Result<GUID, ProviderError>` (wraps `CLSIDFromProgID`), `marshal_activated_iunknown_objref(prog_id, ctx)` (activates via `CoCreateInstance(CLSCTX_INPROC_SERVER | CLSCTX_LOCAL_SERVER | CLSCTX_REMOTE_SERVER)` then marshals), `marshal_iunknown_objref(unknown, ctx)` (uses `IUnknown::IID`), `marshal_interface_objref(unknown, iid, ctx)` (the underlying `CoMarshalInterface` over an HGlobal-backed `IStream`). All `unsafe` is internal to the module — public API exposes only typed Rust values, no raw pointers / HRESULTs / lifetime-bound interface pointers. Each `unsafe` block carries an inline SAFETY comment. `ProviderError` enumerates the four documented failure modes (UnknownProgId, ActivationFailed, MarshalFailed, GlobalLockFailed) plus the apartment-init pre-check (ApartmentInitFailed). Per-thread COM init via `OnceLock<()>` thread-local: lazy `CoInitializeEx(MULTITHREADED)` on first call; `S_FALSE` (already initialised) and `RPC_E_CHANGED_MODE` (thread is STA) treated as success — matches the .NET runtime's tolerant apartment behaviour. 4 offline tests pin: `MarshalContext` → `MSHCTX_*` mapping, `ensure_apartment` idempotence, `clsid_from_prog_id` returns `UnknownProgId` for fake ProgIDs, `marshal_activated_*` short-circuits at the resolution stage. 1 live test (`#[ignore]`'d, gated on `MX_LIVE`) round-trips the real `NmxSvc.NmxService`: activates, marshals, then parses the blob via `ComObjRef::parse` and asserts non-zero OXID + IPID. Live verification: passes against the AVEVA install on this host. Workspace tests went 183 → was 179 in mxaccess-rpc (+4 new). Unblocks F12 (NmxClient::create) — the auto-resolving COM-activation factory can now chain `marshal_activated_iunknown_objref` → `ComObjRef::parse` → `resolve_oxid_with_managed_ntlm_packet_integrity` → `RemQueryInterface` over the existing primitives.
|
||
|
||
### F14 — `tiberius`-backed SQL implementation of `Resolver` + `UserResolver`
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). New module `crates/mxaccess-galaxy/src/sql_resolver.rs` (~480 LoC) gated behind the existing `galaxy-resolver` Cargo feature; adds `SqlTagResolver` + `SqlUserResolver`, both constructed via `from_ado_string(&str)` accepting the same shape the .NET reference uses by default (`Server=localhost;Database=ZB;Integrated Security=True;Encrypt=False;TrustServerCertificate=True`). `Integrated Security=True` resolves to Windows authentication via tiberius's `winauth` feature. Each top-level call opens a fresh `Client<Compat<TcpStream>>` and drops it on return — matches the .NET `await using` shape. `tiberius`'s `Client::query` only accepts positional `@P1..@PN` placeholders (delegates to `sp_executesql`); the canonical `RESOLVE_SQL` / `BROWSE_SQL` / `USER_BY_GUID_SQL` / `USER_BY_NAME_SQL` constants are rewritten once-per-process via `OnceLock<String>` (`@objectTagName` → `@P1`, etc.). `read_metadata` mirrors `ReadMetadata` (`cs:149-165`) byte-by-byte: signed `smallint` → `i16` widened to `u16` for platform/engine/object IDs (matches the .NET `checked((ushort)...)`), `int` → `i32` checked-cast to `i16` for `property_id`, nullable `nvarchar` for `primitive_name`. `read_user_profile` mirrors `ReadProfile` (`cs:76-85`) including the `roles_text` blob → `parse_role_blob` round-trip. New deps: `tiberius 0.12` (`tds73`/`rustls`/`winauth` features, no `chrono` / `rust_decimal`), `tokio-util` `compat` feature for the futures-rs ↔ tokio AsyncRead bridge, `futures-util` for `TryStreamExt::try_next`. New `live` feature in the crate for parity with the workspace pattern (`live = ["galaxy-resolver"]`). 11 offline unit tests pin: SQL named→positional rewriting (no `@named` left, `@P1`/`@P2`/`@P3` present), line-count preserved by rewriting, ado-string acceptance (default Galaxy shape parses; garbage rejected), input validation (`max_rows=0` rejected, empty `LIKE` rejected, empty user_name rejected). Two `#[cfg(feature = "live")]` `#[ignore]`'d tests round-trip against a real Galaxy DB (gated on `MX_LIVE` + `MX_GALAXY_DB` env vars per `tools/Setup-LiveProbeEnv.ps1`): `live_resolve_test_child_object_test_int` (TestChildObject.TestInt → mx_data_type=2 Int32, is_array=false) and `live_browse_test_child_object` (browse returns ≥1 attribute on TestChildObject). Both pass against the local AVEVA install.
|
||
|
||
### F4 + F5 — BindAck body parser + captured-bytes round-trip
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Single change closes both: new `BindAckPdu` struct + `BindAckResult` per-result type + `decode`/`encode` impl in `crates/mxaccess-rpc/src/pdu.rs`. Body layout per `[C706]` §12.6.3.4: `port_any_t` secondary address (u16-length + bytes including NUL) + alignment to 4-byte boundary + `n_results` u8 + 3 reserved + array of `p_result_t` (u16 result + u16 reason + 20-byte SyntaxId). Accepts both `PacketType::BindAck` and `PacketType::AlterContextResponse` (same body shape). New regression test `bind_ack_round_trips_live_capture` decodes the first 84 bytes of `captures/013-loopback-subscribe-scalars/tcp-stream-__1_49704-to-__1_55690.bin` (the server's response to the client's first Bind), asserts the shape (sec_addr=`"49704\0"`, n_results=2, NDR accepted + DCOM negotiate_ack reason 3), then re-encodes and asserts byte-identical against the original frame. Stronger live-wire parity than the prior synthetic-frame tests. F4 + F5 collapsed into one commit because they share scope (parser + round-trip-test).
|
||
|
||
### F29 — Align `mxaccess-asb-nettcp::nbfs` static dictionary ids with canonical `[MC-NBFS]` table
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). The original hand-curated table was wrong starting at id 74 — entries had been deduplicated/renumbered without preserving the canonical `id = 2 × StringN` mapping from `[MC-NBFS]` §2.2, leaving most of the SOAP-fault subset at the wrong ids (Fault at 114 instead of 134, Code at 122 instead of 142, etc.). Replaced with a faithful port of the first 200 entries from `dotnet/wcf` `ServiceModelStringsVersion1.cs` (covering id 0..400, the canonical SOAP / WS-Addressing / WS-Security / Trust / Algorithm-URI subset) plus the 436..444 xsi/xsd/nil extras already in place. Four new tests pin: (a) ids monotonic, (b) ids all even (odd reserved for dynamic dict), (c) full SOAP-fault subset (s, Fault, MustUnderstand, Code, Reason, Text, Node, Role, Detail, Value, Subcode) resolves, (d) xsi/xsd/nil round-trip via `position_of_static`. Future extensions: append more `ServiceModelStringsVersion1.StringN` entries as captures show new ids; mechanical extension.
|
||
|
||
### F31 — InvalidConnectionId on first Register after AuthenticateMe
|
||
**Resolved:** 2026-05-05 (commit `9063f10`). Not a HMAC bug — `AsbErrorCode.InvalidConnectionId` (= 1) is a transient race that .NET's `MxAsbDataClient.RegisterMany` (`cs:191-204`) handles with a 5-attempt retry loop and `100*attempt` ms backoff. `AuthenticateMe` is one-way (`AsbContracts.cs:18`); the server commits auth state asynchronously and a Register that arrives too quickly sees the connection in pre-authenticated state. `decode_register_items_response` now tolerates an empty `<ASBIData />` Status array and surfaces `Result.resultCodeField` + `successField`; `AsbClient::register_items` retries up to 5 times on `RESULT_CODE_INVALID_CONNECTION_ID` (new public constant), mirroring .NET. Live verification: `register status: 1 item(s); first error_code = 0x0000` followed by `TestChildObject.TestInt = AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }` over the live wire.
|
||
|
||
### F30 — Resolve dict-id element/attribute names on the read side
|
||
**Resolved:** 2026-05-05 (commit `eb6c689`). `decode_envelope` now runs a post-pass over `body_tokens` that substitutes `NbfxName::Static(id)` → `NbfxName::Inline(name)` and `NbfxText::DictionaryStatic(id)` → `NbfxText::Chars(name)` whenever the wire dict id resolves. Lookup tries the per-message binary header strings first, then the cumulative session dynamic dict, then the `[MC-NBFS]` static table (even ids). Tokens with unresolvable ids stay opaque so trace output still reveals them. Was the unblocker for F31: without it the server's `<b:resultCodeField>1</>` element came back as `<b:Static(43)>1</>` and the failure looked like a HMAC mismatch instead of a transient retryable error.
|
||
|
||
### F7 — Consolidate `Guid` type across `mxaccess-rpc`
|
||
**Resolved:** 2026-05-05 in this iteration's commit. `Guid` was hoisted from `objref::Guid` into the new shared `crate::guid::Guid` module. `objref` and `pdu` now re-export from there; M2 wave 2's `orpc`, `object_exporter`, and `rem_unknown` import it directly. The OXID-resolve dual-string decoder additionally needs an owned protocol label (`format!("protseq_0x{:04x}", tower_id)` per `ObjectExporterMessages.cs:120`) — `ComDualStringEntry::protocol` was upgraded from `&'static str` to `Cow<'static, str>` to support both decoders without the agent's interim `Box::leak` workaround.
|
||
|
||
### F8 — `RpcError` is duplicated across `objref` and `pdu` modules
|
||
**Resolved:** 2026-05-05 in this iteration's commit. `RpcError` was hoisted into the new shared `crate::error::RpcError` module as a single union of all wave 1 variants plus a generic `Decode { offset, reason: &'static str, buffer_len }` variant for the wave 2 ORPC parsers' one-off failures. `objref` and `pdu` re-export from there; M2 wave 2's `orpc`, `object_exporter`, and `rem_unknown` use it directly.
|
||
|
||
### F13 — `NmxClient` high-level write/advise/subscribe wrappers
|
||
**Resolved:** 2026-05-05. All seven wrappers landed in `crates/mxaccess-nmx/src/client.rs`: `write`, `write2`, `write_secured2`, `advise_supervisory`, `send_observed_pre_advise_metadata`, `register_reference`, `un_advise`. Each takes a `GalaxyTagMetadata` + a typed `WriteValue` (re-exported from `mxaccess-codec`), builds the inner NMX body via `mxaccess-codec` (`write_message::encode` / `encode_timestamped` / `secured_write::encode` / `NmxItemControlMessage` / `NmxMetadataQueryMessage` / `NmxReferenceRegistrationMessage`), wraps in `NmxTransferEnvelope`, and routes through `transfer_data`. The pure-codec `encode_*_transfer_body` helpers are extracted as `pub(crate) fn` for testability, mirroring the .NET reference's `internal static` shape. `un_advise` preserves the .NET reference's quirky `NmxTransferMessageKind::Write` envelope (not `ItemControl`) per `cs:457`.
|
||
|
||
### F15 — Callback router wires `CallbackExporter` events into `Subscription` stream
|
||
**Resolved:** 2026-05-05 across two commits.
|
||
- Step 1/2 (`2b849ae`): `Session::connect_nmx` now starts a `CallbackExporter` on a 127.0.0.1 ephemeral port, builds the OBJREF via `local_hostname()` + `127.0.0.1` fallback, registers it through `NmxClient::register_engine_2` (was `..._without_callback`). A `callback_router` task drains `CallbackEvent`s, decodes each `CallbackInvoked` body via `NmxSubscriptionMessage::parse_inner`, and broadcasts parsed messages on a `tokio::sync::broadcast` channel exposed via `Session::callbacks()`. Shutdown chains: UnregisterEngine → CallbackExporter::shutdown → wait for router task.
|
||
- Step 2/2 (this commit): `Subscription` now impls `Stream<Item = Result<DataChange, Error>>`. Filtering follows the .NET reference at `cs:333-343` exactly — `0x32` SubscriptionStatus messages are kept only when `message.item_correlation_id == subscription.correlation_id`; `0x33` DataUpdate messages pass through to ALL subscriptions because the codec exposes no per-record correlation field (matches the .NET `MxNativeCallbackEvent` filter behavior verbatim). Each `NmxSubscriptionRecord` with a parseable `value` becomes one `DataChange`. Records with `value: None` are dropped silently (mirrors the .NET `evt.Record.Value is null` filter at `cs:337`). Lag-loss surfaces as `Error::Configuration(InvalidArgument)` carrying the lag count. Stream-end (broadcast sender dropped) yields `None`. New helper: `filetime_to_system_time` (inverse of the existing `system_time_to_filetime`); saturates at Unix epoch for pre-1970 FILETIMEs. Tests cover correlation match/mismatch for `0x32`, `0x33` pass-through for any correlation, and FILETIME round-trip.
|
||
|
||
### F1 — NTLM consumer-layer helpers (workstation default + from_env constructor)
|
||
**Resolved:** 2026-05-05. `NtlmClientContext::from_env()` reads `MX_RPC_USER` / `MX_RPC_PASSWORD` / `MX_RPC_DOMAIN` (mirrors `ManagedNtlmClientContext.FromEnvironment` at `cs:41-49`); empty `MX_RPC_DOMAIN` is permitted. `local_hostname()` checks `COMPUTERNAME` then `HOSTNAME` and returns the empty string when neither is set — same "unavailable" semantics as `Environment.MachineName` returning null. Lives in `mxaccess-rpc/src/ntlm.rs`; deliberately doesn't pull `gethostname` (no native-libc deps, no `unsafe` for hostname lookup). Added `NtlmError::MissingEnvVar { name }` for the env-var-unset case. Test mod gained an `EnvScope` + `ENV_LOCK` mutex pattern for serializing process-global env mutation across parallel tests.
|
||
|
||
### F9 — `ObjectExporterClient.cs` ResolveOxid wrapper methods
|
||
**Resolved:** 2026-05-05. Both portable methods land in `crates/mxaccess-rpc/src/object_exporter_client.rs`: `resolve_oxid_unauthenticated` (mirrors `cs:14-30`) and `resolve_oxid_with_managed_ntlm_packet_integrity` (mirrors `cs:66-81`). Each opens a TCP connection, binds to `IObjectExporter`, calls opnum 0 with the encoded request, and decodes the response — preferring `parse_resolve_oxid_result` then falling back to `parse_resolve_oxid_failure` for short stubs. The two SSPI flavours (`ResolveOxidWithNtlmConnect`, `ResolveOxidWithNtlmPacketIntegrity`) wrap .NET's `System.Net.Security.SspiClientContext` and are explicitly out of scope for the Rust port — that's a permanent skip, not a deferral.
|
||
|
||
### F17 — `Guid::parse_str` helper (dashed-hex string parser)
|
||
**Resolved:** 2026-05-05. `Guid::parse_str(&str) -> Result<Guid, RpcError>` landed in `crates/mxaccess-rpc/src/guid.rs:65-112` as the inverse of the existing `Display` impl. Accepts the canonical dashed-hex form, optionally wrapped in `{}` braces (.NET `B` format), case-insensitive, and tolerant of bare 32-char hex without dashes. Single-pass char-by-char nibble accumulator avoids per-byte string allocation; the same byte-swap of groups 1-3 the Display impl does is applied after the raw hex pass. Eight new tests cover round-trip against the `Display` fixture (`b49f92f7-c748-4169-8eca-a0670b012746`), braces, uppercase, no-dashes, zero-GUID, too-short, too-long, and non-hex rejection. The five live-NMX examples (`connect-write-read`, `subscribe`, `recovery`, `multi-tag`, `secured-write`) lost their per-file 15-line `parse_guid` helpers in favour of the canonical implementation. Test count delta: 524 → 532 (+8).
|