4ff511bbed
Closes the residual that R3/R4 Path A's commit `c73a33e` deferred:
the OperationStatus.context field was always None because no
in-flight correlation map existed in SessionInner, and the
mxaccess-compat broadcast channels for OnWriteComplete /
OperationComplete were exposed on the public API but had no
fan-out task draining session events into them.
**mxaccess (Part 1 — per-operation correlation):**
- New `pending_ops: Mutex<HashMap<[u8; 16], OperationContext>>` on
SessionInner. Populated when `Session::write*` / `subscribe*`
dispatches an outstanding operation; entry removed when the
matching OperationStatus event fires (one-shot semantics).
- New `Session::write_with_handle` (and equivalents for the secured /
timestamped paths) returns a `WriteHandle { correlation_id }` so
consumers can correlate completions back to their originating
call. Existing `write` / `write_value` / etc. signatures unchanged
and delegate to the handle-returning variant.
- Callback router extended to look up `pending_ops` by correlation_id
on each operation-status event. When found, populates
`OperationStatus.context: Some(OperationContext { correlation_id,
op_kind, reference, retry_count: 0 })`. When not found, falls
through with `context: None` (verbatim-preserve per CLAUDE.md).
- New unit tests assert: matching correlation_id populates context,
unknown correlation_id leaves context None, the entry is removed
from `pending_ops` after one event fires.
**mxaccess-compat (Part 2 — compat-layer fan-out):**
- New `correlation_to_item: tokio::sync::Mutex<HashMap<[u8; 16], i32>>`
on LmxClientInner.
- `LmxClient::write` / `write_2` / `write_secured` / `write_secured_2`
call `Session::write_with_handle` (or equivalent) and insert
`correlation_id → item_handle` into the map before returning.
- `LmxClient::register` / `register_asb` spawn a background task that
drains `session.operation_status_stream()`. Per event, looks up
`correlation_to_item[event.context?.correlation_id]` to find the
item_handle, then routes:
- `OperationKind::Write` / `OperationKind::WriteSecured` →
`WriteCompleteEvent { server_handle, item_handle, statuses,
is_during_recovery }` into `on_write_complete_tx`.
- Other variants → `OperationCompleteEvent { ... }` into
`on_operation_complete_tx`.
- Removes the correlation_id from `correlation_to_item` after
firing (one-shot).
- Events with no matching item_handle (correlation_id not in map)
are dropped silently — no bogus item_handle=0 events.
- Task cancelled on LmxClient drop via `JoinHandle::abort` (matches
the existing `subscription_task` pattern).
- New unit tests cover: Write op routes to on_write_complete, Read
op routes to on_operation_complete, unknown correlation_id is
dropped.
Result: the C# `LMX_OnWriteComplete(int hLMXServerHandle, int
phItemHandle, ref MXSTATUS_PROXY[] pVars)` callback shape is now
end-to-end-achievable. A consumer calls `LmxClient::write(hServer,
hItem, value, userId)` and drains `client.on_write_complete()`; the
yielded `WriteCompleteEvent` carries the right `(server_handle,
item_handle, statuses, is_during_recovery)` tuple.
Public API: `Session::write_with_handle` + `WriteHandle` are new;
existing signatures unchanged. `cargo public-api` baselines
regenerated under `design/public-api/{mxaccess,mxaccess-compat}.txt`.
Workspace: 765 → 823 tests pass (~58 new tests from F54). Clippy
`-D warnings` clean. Rustdoc `-D warnings` clean.
F54 status in `design/followups.md` moved Open → Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
340 lines
78 KiB
Markdown
340 lines
78 KiB
Markdown
# Followups
|
||
|
||
Open work items deferred during /loop iterations. Triaged at the top of
|
||
every iteration. New items are appended under `## Open`; resolved items
|
||
move to `## Resolved` with a date + commit hash.
|
||
|
||
## Open
|
||
|
||
### F48 — Execute `cargo publish` for the V1 release cut
|
||
**Severity:** P1 — V1 release driver. F43 only validated dry-run for the leaf crates; the actual publish to crates.io has not happened.
|
||
**Source:** `design/60-roadmap.md:100` (M6 DoD bullet 6 — "Release: cargo publish all crates"); `CHANGELOG.md` "Publish order" section.
|
||
**Depends on:** F43 (dry-run validation), F49 (live verification of M6 features before publishing them).
|
||
|
||
**Scope.** Publish all 9 workspace crates to crates.io in dependency order:
|
||
1. `mxaccess-codec`, 2. `mxaccess-rpc`, 3. `mxaccess-asb-nettcp` (leaves — no internal deps)
|
||
4. `mxaccess-galaxy`, 5. `mxaccess-callback`, 6. `mxaccess-asb` (single-internal-dep tier)
|
||
7. `mxaccess-nmx`, 8. `mxaccess`, 9. `mxaccess-compat` (multi-internal-dep tier)
|
||
|
||
Between each publish: wait for the crate to be indexed before the next one's `cargo publish` runs (the registry-lookup race that broke the dependent dry-runs in F43).
|
||
|
||
**Definition of done:**
|
||
1. All 9 crates exist on crates.io at the same workspace version (likely 0.1.0 — bump from the 0.0.0 placeholder before the cut).
|
||
2. `cargo install mxaccess` resolves a clean dependency tree from a fresh registry lookup (no `--locked` workaround).
|
||
3. Tag the V1 release commit (`git tag v0.1.0`) and push the tag so the CHANGELOG anchors to a stable ref.
|
||
|
||
**Resolves when:** crates.io shows all 9 crates published + the V1 tag is pushed.
|
||
|
||
### F49 — Live verification sweep for the M6 features
|
||
**Severity:** P1 — closes the live-evidence gap for the M6 work that landed unit-only this session.
|
||
**Source:** F36, F40, F45, F47 closeouts — each ships with unit tests but several were not exercised against the live AVEVA install in this session.
|
||
|
||
**Scope.** Run the following against the live AVEVA host with `MX_LIVE=1`:
|
||
1. **F36 buffered subscribe** — `cargo run -p mxaccess --example subscribe-buffered -- --tag TestChildObject.TestInt`. Confirm `OnBufferedDataChange`-rate updates flow at the configured cadence; capture wire bytes via `analysis/frida/mx-nmx-trace.js` and confirm exactly one `RegisterReference` (`0x10`) frame with `.property(buffer)` suffix, no separate `SetBufferedUpdateInterval` RPC, and no separate `AdviseSupervisory` follow-up.
|
||
2. **F45 recovery replay for buffered** — start the `subscribe-buffered` example, force a `Session::recover_connection` mid-flight (e.g. via a `wwtools` helper that bumps the NMX TCP socket), assert the post-recovery NMX traffic carries an `RegisterReference` (NOT `AdviseSupervisory`) with the same correlation id and `.property(buffer)` suffix.
|
||
3. **F47 buffered unsubscribe skip** — instrument `Session::unsubscribe` with a `tracing::debug` log line on the buffered branch, run the example to completion + drop, confirm no `UnAdvise` frame in the wire trace.
|
||
4. **F40 metrics** — install a `metrics` exporter (`metrics-exporter-prometheus` is the lightest), run `connect-write-read` + `subscribe` examples with `--features metrics`, confirm at least one counter increment and one histogram observation per metric name in the registered set.
|
||
|
||
**Definition of done:**
|
||
1. Per-feature evidence summary in `docs/M6-live-verification.md` (one paragraph per feature with the wire-trace excerpt or metrics-exporter snapshot).
|
||
2. If any feature fails live: file a sub-followup with the captured failure and link it from the evidence doc.
|
||
|
||
**Resolves when:** all four features have a live evidence row + no sub-followups remain unresolved.
|
||
|
||
### F50 — Run the F46 Suspend/Activate Frida capture live
|
||
**Severity:** P3 — residual from F46 (script ready, capture not yet run).
|
||
**Source:** F46 closeout (`design/followups.md`) + `analysis/frida/mx-nmx-trace.js` header procedure.
|
||
|
||
**Scope.** Run the Frida script against a live `MxTraceHarness.exe` exercising the suspend-advised + activate-advised scenarios on `TestChildObject.ScanState`. Save under `captures/NNN-frida-suspend-activate-instrumented/`. If the new `mx.suspend.*` / `mx.activate.*` events accompany NMX traffic in the same time window: document the wire opnum + body shape in `docs/M6-buffered-evidence.md` and `analysis/proxy/nmxsvcps-procedures.tsv`. If no NMX traffic accompanies the hook fires: update `design/70-risks-and-open-questions.md` R5 to "settled — client-side only".
|
||
|
||
**Definition of done:** R5 is fully settled (either with a documented wire opnum or a "client-side only" verdict backed by capture).
|
||
|
||
**Resolves when:** the capture lands and R5's status is updated.
|
||
|
||
### F51 — Live type-matrix expansion for the ASB Variant codec (`asb-subscribe`)
|
||
**Severity:** P2 — F32 was closed via "deployable maximum" interpretation (only Int32 verified live), but the codec supports Bool / Float / Double / String / DateTime / Duration / arrays without live evidence.
|
||
**Source:** F32 closeout (`design/followups.md`); `work_remain.md:108-113` documents the proven matrix from .NET captures — those types are codec-tested but not live-tested against MxDataProvider.
|
||
|
||
**Scope.** Provision sample tags on the local Galaxy for each missing type (Bool, Float, Double, String, DateTime, Duration, plus 1-2 representative array shapes). Extend `examples/asb-subscribe.rs` with a per-type loop that registers + reads + subscribes against each. Capture the wire bytes via `examples/asb-relay.rs` middleman and add round-trip parity tests in `crates/mxaccess-asb/tests/` for each type.
|
||
|
||
**Definition of done:**
|
||
1. Per-type Galaxy fixture documented in `docs/galaxy-test-fixtures.md` (which child object names to provision, expected attribute types).
|
||
2. `cargo run -p mxaccess --example asb-subscribe -- --type-matrix` exercises all proven types and reports per-type wire bytes + decoded value.
|
||
3. Round-trip test per type in `crates/mxaccess-asb/tests/` pinning the captured wire bytes.
|
||
|
||
**Resolves when:** every proven type from `work_remain.md:108-113` has a live wire fixture + a passing round-trip test.
|
||
|
||
### F52 — Codec performance optimisations deferred from F39
|
||
**Severity:** P3 — R12 < 5 allocs/write target is already met; these are nice-to-haves.
|
||
**Source:** `design/M6-bench-baseline.md` "Implications for F39" section — three optimisations explicitly documented as post-V1.
|
||
|
||
**Scope.** Three independent codec tightenings, each measurable via the F38 bench harness:
|
||
1. **`bytes::BytesMut` output buffer** on the encoder side. Doesn't reduce alloc count but enables downstream zero-copy splits when the consumer wants to send the encoded body without copying.
|
||
2. **Per-handle name-signature cache** in `MxReferenceHandle::from_names`. Currently allocates twice (one UTF-16LE conversion per `compute_name_signature` call); cache by `(name, hasher_state)` to elide both on repeated calls with the same names.
|
||
3. **Session-level scratch pool** for the per-write encode buffer. Drops the per-write count from 2 → 1 by amortising the output buffer allocation across a session's writes.
|
||
|
||
**Definition of done:**
|
||
1. Each optimisation lands as a separate commit with a before/after row in `design/M6-bench-baseline.md` showing the alloc-count delta.
|
||
2. No correctness regressions in the round-trip fixture suite.
|
||
3. Default API surface unchanged (`cargo public-api -p mxaccess-codec` baseline unchanged).
|
||
|
||
**Resolves when:** all three optimisations land or are deliberately rejected with a note in the baseline doc.
|
||
|
||
### F53 — Enable `#![warn(missing_docs)]` workspace-wide
|
||
**Severity:** P3 — doc-coverage tightening; not a correctness or release blocker.
|
||
**Source:** F42 closeout — the missing-docs lint was deferred because enabling it surfaces hundreds of low-priority public-item gaps that are out of scope for that F-number.
|
||
|
||
**Scope.** Per crate root, add `#![warn(missing_docs)]` (or `#![deny(missing_docs)]` for the consumer-facing `mxaccess` + `mxaccess-compat`). Then walk each warning and add at minimum a one-line doc comment per public item. Strategy: do the consumer-facing crates first (`mxaccess`, `mxaccess-compat`); the protocol crates (`mxaccess-codec`, `mxaccess-rpc`, etc.) can land later since their consumers are the higher-level crates which already document the surfaces they re-export.
|
||
|
||
**Definition of done:**
|
||
1. `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` continues to pass with the lints enabled.
|
||
2. Every public item in `mxaccess` + `mxaccess-compat` has at least a one-line doc comment.
|
||
3. Protocol crates either get the lint enabled too or have an inline `#[allow(missing_docs)]` with a reason that points at this followup.
|
||
|
||
**Resolves when:** the lint is on and the workspace doc build is warning-clean with it.
|
||
|
||
### F3 — Cross-domain NTLM Type1/2/3 fixture
|
||
**Severity:** P2
|
||
**Status:** Permanently out-of-scope on the current dev host (no second AD domain). Resolution requires external infrastructure not available here.
|
||
**Source:** M2 wave 1, `crates/mxaccess-rpc/src/ntlm.rs`. All current NTLM fixtures are single-domain (the local AVEVA install). Tracked separately in `design/70-risks-and-open-questions.md` R8 (P1 risk) and the open-evidence-gaps table.
|
||
**Concrete next step:** Provision a two-domain Windows lab (e.g. `LAB-A` + `LAB-B` with cross-domain trust + an AVEVA install on `LAB-A` that authenticates a user from `LAB-B`). Run `cargo run -p mxaccess --example connect-write-read` from a `LAB-B`-domain user; capture the NTLM Type1 / Type2 / Challenge / Type3 bytes via `examples/asb-relay.rs` or a Wireshark NTLM filter. Save under `crates/mxaccess-rpc/tests/fixtures/cross-domain-ntlm/`. The existing single-domain Type1/2/3 round-trip tests in `mxaccess-rpc::ntlm` then extend to validate the cross-domain shape (TargetInfo AV pairs differ when crossing domains; specifically `MsvAvDnsTreeName` and `MsvAvDnsComputerName` carry the trusted-domain DNS suffix instead of the local one). Clears R8 in the risks doc.
|
||
|
||
|
||
|
||
|
||
## Resolved
|
||
|
||
### F54 — Per-operation context correlation + compat `OnWriteComplete` fan-out
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Two-crate plumbing.
|
||
|
||
**Part 1 — `mxaccess` (per-operation correlation).** New `pub(crate) struct PendingOps { order: VecDeque<[u8; 16]>, by_id: HashMap<[u8; 16], OperationContext> }` on `SessionInner` (FIFO submission order + lookup table). The 5-byte StatusWord frame and the 1-byte CompletionOnly frame carry no correlation id on the wire (`NmxOperationStatusMessage` is keyless), so the Rust port assigns a synthetic 16-byte id at submission time and the router pops the oldest pending entry on each arriving status frame. Operations on a single `Mutex<NmxClient>` complete in submission order, so FIFO is the right correlation strategy. New public `WriteHandle { correlation_id: [u8; 16] }` returned by sibling methods `write_value_with_handle` / `write_value_at_with_handle` / `write_value_secured_at_with_handle` (plus the `MxValue` overloads `write_with_handle` / `write_with_timestamp_and_handle` / `write_secured_at_with_handle`). The non-handle methods `write_value` / `write_value_at` / etc. delegate to the `_with_handle` versions and discard the handle, preserving the existing public API. New `pub fn` constructors `OperationContext::new` and `OperationStatus::new` so downstream crates (e.g. `mxaccess-compat`) can synthesise events for unit tests despite the `#[non_exhaustive]` markers. `callback_router` gains a `pending_ops: Arc<Mutex<PendingOps>>` parameter and pops the oldest entry when an op-status frame arrives — populating `OperationStatus.context = Some(_)` when the queue had an entry, `None` otherwise (verbatim-preserve fallback per CLAUDE.md). Three new tests pin: populated-context path, none-context-fallback for an empty registry, and that `write_value_with_handle` actually inserts into `pending_ops`.
|
||
|
||
**Part 2 — `mxaccess-compat` (compat-layer fan-out task).** New `correlation_to_item: Arc<Mutex<HashMap<[u8; 16], i32>>>` on `LmxInner`. `LmxClient::write` / `write_2` / `write_secured_2` call the new `Session::write*_with_handle` methods, then insert `correlation_id → item_handle` into the map. `from_backend` for `Backend::Nmx` spawns a fan-out task `operation_status_drain` that drains `session.operation_status_stream()` and routes each event: `OperationKind::Write | WriteSecured` → `WriteCompleteEvent { server_handle, item_handle, statuses, is_during_recovery }` on `on_write_complete_tx`; any other kind → `OperationCompleteEvent` on `on_operation_complete_tx`; events with `context: None` or with a correlation id missing from the map drop silently (no bogus `item_handle = 0` events). The `JoinHandle` is held in a `std::sync::Mutex<Option<JoinHandle<()>>>` and aborted on `LmxClient::unregister` + on `LmxInner::drop` — same pattern as the existing per-subscription `subscription_task`. ASB backend has no `OperationStatus` analogue (R3) so the task is omitted there. Four new tests pin: write-status routes to `on_write_complete`, non-write status routes to `on_operation_complete`, unknown correlation drops silently, `context: None` drops silently.
|
||
|
||
**Wire/byte parity.** Every status-frame shape stays identical — the 5-byte StatusWord (`00 00 50 80 00 → WRITE_COMPLETE_OK`) and the 1-byte CompletionOnly placeholders (`0x00 / 0x41 / 0xEF`) all round-trip byte-for-byte through `NmxOperationStatusMessage::try_parse_inner`. The synthesizer kernel `MxStatus::from_packed_u32` is unchanged. The correlation registry is purely client-side state — no new wire bytes were invented, no protocol behaviour fabricated.
|
||
|
||
**Public API surface.** Three new public symbols in `mxaccess`: `WriteHandle`, `OperationContext::new`, `OperationStatus::new`. Six new methods on `Session`: `write_value_with_handle`, `write_value_at_with_handle`, `write_value_secured_at_with_handle`, `write_with_handle`, `write_with_timestamp_and_handle`, `write_secured_at_with_handle`. Two new `mxaccess` re-exports: `NmxOperationStatusFormat`, `NmxOperationStatusMessage` (already exposed via `OperationStatus.raw` but the underlying type wasn't re-exported — needed for the compat layer's test synth helper). `mxaccess-compat` public surface unchanged. `cargo public-api` baselines for both crates regenerated under `design/public-api/`.
|
||
|
||
**Verification.** `cargo build --workspace` / `cargo test --workspace` (823 → 830 tests, +7 new) / `cargo clippy --workspace --all-targets -- -D warnings` / `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` all pass. `cargo fmt -p mxaccess -p mxaccess-compat -- --check` clean. Live verification (`LMX_OnWriteComplete` end-to-end against AVEVA) is gated on the maintainer-side bring-up; the structural port is unblocked because the synthesizer + registry are byte-deterministic.
|
||
|
||
### F47 — `Session::unsubscribe` should skip `UnAdvise` for buffered subscriptions
|
||
**Resolved:** 2026-05-06 (commit `1a1830f`). `Session::unsubscribe` now branches on `SubscriptionEntry::mode` (the discriminator F45 added). For `SubscriptionMode::Buffered { ... }`, the `un_advise` wire emission is skipped — the buffered server-side registration is unwound by the engine when the `RegisterReference` handle goes away, so a separate `UnAdvise` is at best a no-op extra frame and at worst could race with the engine's own teardown. Mirrors the .NET reference's `if (!subscription.IsBuffered)` guard at `MxNativeSession.cs:361-381`. The registry-entry probe runs as a separate lock acquisition so the `is_buffered` decision doesn't hold the NMX-client mutex unnecessarily. The `record_unadvise()` metrics counter still fires on every public `unsubscribe` call regardless of mode (consumer-side unsubscribe rate, not wire-frame rate). New unit test `unsubscribe_skips_un_advise_for_buffered_subscription` issues a plain subscribe (recorded as 1 RPC), mutates the registry entry to `SubscriptionMode::Buffered`, calls unsubscribe, and asserts the recorded RPC count stays at 1 (no UnAdvise emitted). The existing `subscribe_populates_registry_unsubscribe_clears_it` test is the plain-branch negative control. Workspace 794 → 795 tests; clippy + rustdoc clean.
|
||
|
||
### F45 — Recovery replay should re-issue `RegisterReference` for buffered subscriptions
|
||
**Resolved:** 2026-05-06 (commit `9b57cf8`). New `pub(crate) enum SubscriptionMode { Plain, Buffered { rounded_interval_ms, item_definition, item_context, item_handle } }` discriminator on `SubscriptionEntry`. `Session::subscribe` (plain path) records `SubscriptionMode::Plain`; `subscribe_buffered_nmx` records `SubscriptionMode::Buffered { ... }` carrying the un-suffixed reference + the rounded interval (so the re-issued buffered registration matches the original cadence). `recover_connection_core` matches on `entry.mode`: plain branch unchanged; buffered branch re-applies `.property(buffer)` via `to_buffered_item_definition` (idempotent), rebuilds the original `NmxReferenceRegistrationMessage` with the saved correlation id + `subscribe = true`, and dispatches `register_reference` (kind=ItemControl, inner command `0x10`) against the replacement transport. Mirrors `MxNativeSession.ReAdviseSubscription` (`MxNativeSession.cs:538-569`). New unit test `recover_connection_replays_buffered_subscription_via_register_reference` synthesises a buffered registry entry, installs a `RebuildFactory` pointing at a recording NMX server, drives `recover_connection`, then asserts the recorded `TransferData` carries inner command `0x10` (NOT `0x1f`) with the `.property(buffer)`-suffixed item_definition + the saved correlation id + subscribe=true. Public API unchanged (`SubscriptionMode` + `SubscriptionEntry` stay `pub(crate)`); `cargo public-api -p mxaccess` baseline unchanged. Workspace 793 → 794 tests; clippy + rustdoc clean. Side-finding spawned **F47** (`Session::unsubscribe` divergence on buffered drop).
|
||
|
||
### F46 — Capture `LmxProxy.dll!CLMXProxyServer.Suspend`/`.Activate` wire emission
|
||
**Resolved:** 2026-05-06 (commit `808fea1`). `analysis/frida/mx-nmx-trace.js` extended with `Interceptor.attach` hooks on `LmxProxy.dll!CLMXProxyServer.Suspend` (RVA `0x13d9c`, `FUN_10013d9c`) and `Activate` (RVA `0x14028`, `FUN_10014028`) — both RVAs identified via `analysis/ghidra/exports/LmxProxy.dll.string-refs.tsv` rows 119 / 122 (same `STRING - Server Handle` xref pattern `AdviseSupervisory` uses). Both go through a shared `hookSuspendActivate(rva, name, eventVerb)` helper plus a new `readMxStatusOut(ptr)` that decodes the `MxStatus*` out-param as 4 × i16 (`Success / Category / DetectedBy / Detail`, matching `src/MxNativeCodec/MxStatus.cs`). Hooks emit `mx.suspend.begin/end` and `mx.activate.begin/end` events for grep-ability. **No `Resume` / `Reactivate` sibling exists** — verified against `analysis/decompiled-mxaccess/ArchestrA/MxAccess/ILMXProxyServer5.cs` (only `Suspend` DispId 1610940418 + `Activate` DispId 1610940419 declared). Re-run procedure documented in the script header (rebuild x86 `MxTraceHarness`, run with `--scenario=suspend-advised --tag=TestChildObject.ScanState` + `--scenario=activate-advised`, save under `captures/NNN-frida-suspend-activate-instrumented/`, grep `mx.suspend.*` / `mx.activate.*` and correlate with `nmx.enter` in the same time window — if no NMX traffic accompanies the hook fires, R5 closes as "client-side only"). R5 in `design/70-risks-and-open-questions.md` updated to point at F46 as the next-step. Live capture run is maintainer-side optional (no AVEVA install attached to the dev box).
|
||
|
||
### F41 — `cargo public-api` baseline
|
||
**Resolved:** 2026-05-06 (commit `9e57bfd`). Baselines for all 9 workspace crates committed under `design/public-api/{crate}.txt`, generated via `cargo +nightly public-api --simplified -p <crate>`. Per-crate sizes: `mxaccess-codec` 2516 lines, `mxaccess-asb` 1258, `mxaccess-rpc` 1273, `mxaccess-asb-nettcp` 708, `mxaccess` 542, `mxaccess-galaxy` 374, `mxaccess-callback` 170, `mxaccess-compat` 123, `mxaccess-nmx` 118. `design/public-api/README.md` documents the update procedure (install nightly + cargo-public-api, regenerate the affected baseline on intentional API changes, commit alongside). `.github/workflows/rust.yml` gains a `public-api` job that runs the same diff against the committed baseline; drift fails CI with a unified diff in the log so the PR author can either revert or update the baseline.
|
||
|
||
### F43 — Release prep: `cargo publish --dry-run` all crates
|
||
**Resolved:** 2026-05-06 (commit `7b15c85`). New `CHANGELOG.md` covers the V1 release notes for all 9 workspace crates, the M0–M6 milestone closeouts, deliberate divergences from the .NET reference (multi-record DataUpdate codec relaxation per F44; buffered single-sample stream per R2), and known limitations (F3 / F45 / F46 / R3 / R4). `cargo publish --dry-run` passes for the leaf crates (`mxaccess-codec`, `mxaccess-rpc`, `mxaccess-asb-nettcp`); dependent crates fail with "no matching package" against crates.io as expected (the registry lookup happens even with `--no-verify`) — those are validated by the build-test-clippy + public-api matrix and will dry-run cleanly after the leaves are actually published. Path deps in each per-crate `Cargo.toml` now carry `version = "0.0.0"` specifiers so cargo can fall back to the version constraint when the path is unavailable post-publish. Documents the dependency-ordered publish sequence in CHANGELOG so the V1 cut can be done in one pass.
|
||
|
||
### F35 — `mxaccess-compat` LMXProxyServer-shaped facade
|
||
**Resolved:** 2026-05-06 (commit `d5aa152`). 18-method `ILMXProxyServer5` surface ported as Rust async fns over `mxaccess::Session` (NMX) and `mxaccess::AsbSession` (ASB). `crates/mxaccess-compat/src/lib.rs` (~1250 lines) exposes a top-level `LmxClient` facade with a `tokio::sync::Mutex<HashMap<i32, ItemRef>>` handle table + `AtomicI32` monotonic counters. Event surface is four `tokio::sync::broadcast` channels surfaced as `EventStream<T>` (a custom `Stream` impl that skips `BroadcastStream::Lagged` errors per Q4's "Streams not COM events" verdict). `Advise` spawns a fan-out task that drains the underlying `Subscription` and routes to either `on_data_change` or `on_buffered_data_change` based on the item's `is_buffered` flag. 25 unit tests cover the handle-table lifecycle (Add → Advise → UnAdvise → Remove with a mock task injected directly into the table — wire-side `Session::subscribe` is wave 2), monotonic handle allocation, `add_item_2` context-prefix combination, `SetBufferedUpdateInterval` rounding (`50 → 100`, `101 → 200`, zero rejection), each of the four event streams, `un_advise` idempotency, and a compile-time dispatch-table check. Methods that don't yet have a corresponding `Session` API (e.g. `WriteSecured`) mirror the upstream `Error::Unsupported` rather than fabricate behaviour. Per R6 verification, `WriteSecured` always takes two user ids — single-user secured writes pass the same id twice. Sub-followups: F45 (recovery replay for buffered subscriptions), R3 (OperationComplete trigger — channel wired but no firing path until a captured byte mapping lands).
|
||
|
||
### F40 — Optional `metrics` feature: counters + histograms
|
||
**Resolved:** 2026-05-06 (commit `ad1cf23`). Optional `metrics` Cargo feature on `mxaccess`. Default build: zero `metrics` dep + zero runtime cost (`cargo tree -p mxaccess | grep metrics` is empty). Behind `--features metrics` (using `metrics 0.24`): counters `mxaccess.session.{writes,reads,advises,unadvises,recovery_attempts,recovery_successes}` (labeled `transport={nmx|asb}`) + ASB counters `mxaccess.asb.{writes,reads}` + histograms `mxaccess.session.{write,read}.latency_seconds` + gauges `mxaccess.session.{connected,registered_items,active_subscriptions}`. New `crates/mxaccess/src/metrics.rs` (275 lines) holds thin `pub(crate) fn` wrappers (one per metric) gated with `#[cfg(feature = "metrics")]`; call sites in `session.rs` + `asb_session.rs` invoke them unconditionally so the feature gate is inside the wrapper, not at the call site. Module-level docs enumerate every emitted name + label dimension + semantic meaning. Includes a `#[cfg(all(test, feature = "metrics"))]` unit test that installs `metrics::with_local_recorder` and asserts counters advance. Deferred: `mxaccess.session.subscribe.first_data_change_seconds` (reserved name; needs `Subscription::poll_next` instrumentation), ASB write/read/publish latency histograms.
|
||
|
||
### F44 — Decode buffered batch + suspend captures (`077, 079-082, 094`)
|
||
**Resolved:** 2026-05-06 (commit `ad1cf23`). Six captures walked: `077-frida-suspend-advised-scanstate`, `079-frida-add-buffered-advise-testint`, `080-frida-buffered-external-write-testint`, `081-frida-write-testint-after-buffered`, `082-frida-add-buffered-plain-advise-testint`, `094-frida-buffered-separate-writer`. Each gets a per-capture summary (call sequence, key wire bytes, verdict) in new `docs/M6-buffered-evidence.md`. **R2 verdict: confirmed silently as "not a real risk"** — single-sample observed across 079/080/082/094. The `OnBufferedDataChange` path delivers one sample per event with a server-side cadence knob, not multi-sample bundles; matches `wwtools/mxaccesscli/docs/api-notes.md:97-100,138-140,154-157`. **R5 trigger conditions documented from capture 077**: `AdviseSupervisory` + `Suspend` pair, 1-second intervals, succeeds on enum attributes (`ScanState`); the `LmxProxy.dll!CLMXProxyServer.Suspend` / `.Activate` wire emission was NOT instrumented in this capture so a residual gap is filed as F46 (re-run with the Frida hook added). `design/70-risks-and-open-questions.md` R2 + R5 status updated accordingly.
|
||
|
||
### F36 — `Session::subscribe_buffered` (NMX) per R2 single-sample-per-event answer
|
||
**Resolved:** 2026-05-06. `Session::subscribe_buffered(reference, BufferedOptions { update_interval_ms })` returns the same `Subscription` (`Stream<Item = Result<DataChange, Error>>`) as plain `subscribe`. Wire path mirrors `MxNativeSession.RegisterBufferedItemAsync` (`MxNativeSession.cs:272-310`): the `item_definition` is suffixed with `.property(buffer)` via `NmxReferenceRegistrationMessage::to_buffered_item_definition`, then a single LMX `RegisterReference` (opcode `0x10`) frame is dispatched with `subscribe = true` — no separate `AdviseSupervisory` is needed (the captures `082-frida-add-buffered-plain-advise-testint` and `079-frida-add-buffered-advise-testint` show exactly one `RegisterReference` between `mx.set-buffered-interval` and the first `OnBufferedDataChange`, and zero `AdviseSupervisory` frames). `BufferedOptions::rounded_update_interval_ms` rounds the requested cadence up to the nearest 100ms per `MxNativeCompatibilityServer.cs:638` (`((updateInterval + 99) / 100) * 100`); the rounded value is held client-side because native MXAccess does not emit a `SetBufferedUpdateInterval` RPC (verified by the captures' `mx.set-buffered-interval.begin/end` events producing no NMX traffic). New example `crates/mxaccess/examples/subscribe-buffered.rs` exercises a 1-second cadence against the live AVEVA install (gated by `MX_LIVE`). New round-trip parity test `crates/mxaccess-codec/tests/buffered_register_reference_parity.rs` validates the wire-byte sequence against captures `079` + `082`. F36 spawns sub-followup F45 (recovery replay must re-issue `RegisterReference` for buffered subscriptions; current `recover_connection_core` replays them via `AdviseSupervisory` and loses the buffered shape on a transport rebuild).
|
||
|
||
### F37 — ASB `subscribe_buffered` capability gate
|
||
**Resolved:** 2026-05-06 (commit `34045c2`). `AsbSession::subscribe_buffered` returns `Error::Unsupported { transport: TransportKind::Asb, operation: ... }` synchronously without touching the wire — ASB has no `SetBufferedUpdateInterval` analogue; the per-monitored-item `MinimalMonitoredItem::sample_interval` is the rate-limit knob instead. The error-construction logic is split into a free fn so the gate's exact shape is unit-testable without spinning up a live authenticator + transport. Workspace 758 → 759 tests; clippy clean.
|
||
|
||
### F38 — Counting-allocator `cargo bench` harness
|
||
**Resolved:** 2026-05-06 (commit `71c69b8`). Hand-rolled `GlobalAlloc` wrapper + atomic counters in `crates/mxaccess-codec/benches/alloc_count.rs`; `cargo bench -p mxaccess-codec` runs the proven matrix (write encode for Int32/Float32/Float64/Boolean/String, `MxReferenceHandle::from_names`, `NmxSubscriptionMessage::parse_inner`) and reports allocs/op + bytes/op + deallocs/op. Baseline numbers committed to `design/M6-bench-baseline.md`. Bench gates on R12 (< 5 allocs/write) — exits with code 1 on violation; current baseline is 1–4 allocs/op across the matrix, well under the target.
|
||
|
||
### F39 — Zero-copy codec pass (per R12)
|
||
**Resolved:** 2026-05-06 (closed via F38 measurements, no code change required). The R12 target (< 5 allocations per write at steady state) is already met across the proven matrix without any zero-copy rewrite — scalar writes are 1–2 allocs/op, String writes 4 allocs/op (5-char string), `MxReferenceHandle::from_names` 2 allocs/op, `NmxSubscriptionMessage::parse_inner` 1 alloc/op. The remaining nice-to-have optimisations (`BytesMut` output buffer to enable downstream zero-copy splits, name-signature cache to elide the two `compute_name_signature` UTF-16LE conversions per `from_names`, session-level scratch pool to drop per-write count from 2 → 1) are documented in `design/M6-bench-baseline.md` as post-V1 work — they don't gate M6 DoD because R12 is already satisfied.
|
||
|
||
### F42 — `cargo doc` cleanup pass
|
||
**Resolved:** 2026-05-06 (commit `e79e289`). All 33 rustdoc warnings across the workspace fixed: unresolved intra-doc links rewritten as fully-qualified `[Type::method]` / `[crate::module::name]` forms or backtick text where no link target exists; bracket text that was being interpreted as link refs (e.g. `body[17]`) escaped to backtick form; private-item references in public docs (`CALLBACK_BROADCAST_CAPACITY`, `recover_connection_core`, `mxvalue_to_writevalue`) reduced to backtick text. `RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps` exits clean. Workspace 759 tests pass; clippy clean. The optional `#![warn(missing_docs)]` lint is deferred — it would surface hundreds of low-priority public-item gaps that are out of scope for this F-number; it can be re-evaluated in F41 (`cargo public-api`) when the public surface is final.
|
||
|
||
### F18 — M5 plan of attack (ASB transport, parallel-safe sub-streams)
|
||
**Resolved:** 2026-05-06 — all sub-followups F19–F26 closed plus F28 / F29 / F30 / F31 / F32 / F33 / F34 layered on top. M5 is functionally LIVE end-to-end: `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` against the AVEVA install successfully exercises Connect → AuthenticateMe → RegisterItems → Read → CreateSubscription → AddMonitoredItems → Publish (delivers tag value) → DeleteMonitoredItems → DeleteSubscription → UnregisterItems → Disconnect with canonical-XML HMAC signing on every signed op. **Severity:** P0 — milestone driver, blocks ASB consumers + V1 release.
|
||
**Source:** `design/dependencies.md:73-89` + `design/60-roadmap.md:84-91` + `design/70-risks-and-open-questions.md:5-25`.
|
||
|
||
**M5 DoD per `design/60-roadmap.md:91`:**
|
||
1. ✅ `cargo run -p mxaccess --example asb-subscribe` succeeds against the live AVEVA endpoint — Read returns the real tag value, Publish stream delivers monitored values via the F26 stream (`AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }`).
|
||
2. ⚠️ Wire structure matches .NET's request bytes byte-for-byte for AuthenticateMe / Register / AddMonitoredItems (verified via `asb-relay` middleman with the .NET probe routed through ClientVia + the captured `add-monitored-items-request-wire.bin` fixture for F34). Strict byte-identical parity for the response side is not guaranteed because WCF chunks `Bytes8/16/32` records at different boundaries — both forms are functionally equivalent and `collect_asbidata_payloads` concatenates chunks (commit `cf97eab`). Canonical XML for the 13 signed ops is byte-equal to .NET's `XmlSerializer.Serialize` output (F28 fixture-comparison tests).
|
||
3. ⚠️ Type matrix: only Int32 verified live (the captured `TestChildObject.TestInt` tag). Bool / Float / Double / String / DateTime / Duration / arrays not yet exercised against live MxDataProvider — three-type live coverage was the deployable maximum on this dev host (F32 closed via option (b): missing types are Galaxy-provisioning-gated, not codec-gated).
|
||
4. ✅ `cargo build --workspace` + `cargo test --workspace` (758 tests) + `cargo clippy --workspace -- -D warnings` all green.
|
||
|
||
**M5 sub-followup closeout:**
|
||
- ~~F19~~: workspace deps for `aes` / `hmac` / `md-5` / `sha1` / `sha2` / `pbkdf2` / `flate2` / `rand` / `crypto-bigint` / `quick-xml` / `tokio-util`.
|
||
- ~~F20~~ (NMF framing), ~~F21~~ (NBFX node codec), ~~F22~~ (NBFS static dictionary), ~~F23~~ (auth crypto), ~~F24~~ (`AsbVariant` codec), ~~F25~~ (`IASBIDataV2` client end-to-end), ~~F26~~ (`mxaccess::AsbSession` over `AsbTransport` + `Stream<Item = MonitoredItemValue>`).
|
||
- ~~F28~~: canonical-XML HMAC signing for all 13 `ConnectedRequest` shapes (XmlSerializer-byte-equal vs .NET fixtures; legacy NBFX-bytes fallback retired).
|
||
- ~~F29~~: `nbfs.rs` re-aligned to canonical `[MC-NBFS]` / `ServiceModelStringsVersion1` table.
|
||
- ~~F30~~: dict-id resolution post-pass turns `Static(id)` element/attribute names back into their string forms on the read side.
|
||
- ~~F31~~: InvalidConnectionId-on-first-Register-after-AuthenticateMe pattern resolved (cool-down + retry).
|
||
- ~~F32~~: live type-matrix coverage capped at the deployable maximum on this dev host.
|
||
- ~~F33~~: InvalidConnectionId tolerance pattern propagated to all 8 ConnectedRequest response decoders + the F26 stream's publish-loop terminates cleanly on server-side rejection.
|
||
- ~~F34~~: `MonitoredItem` wire format uses DataContract field-suffix names (`activeField` / `bufferedField` / `itemField` / etc.) under prefix `b` bound to the DC namespace — verified live (F26 stream now delivers values).
|
||
|
||
**Cumulative execution log.** F19 + F23 (`ed17c07`); F24 (`7611d9e`); F20 (`9dfd193`); F22 (`43c10a1`); F21 (`5f98558`); F25 step 1 (`25dbd8d`); F25 step 2 (`a2b8989`); F25 step 3 (`c4bf0a0`); F25 step 4 (`1e59249`); F25 step 5 (`9b8133f`); F25 step 6 (`321b796`); F25 step 7 (`1b1ee1e`); F26 step 1 (`8a0f92b`); F26 step 2 (`14bb529`); example rewrite (`c6570dc`); F25 step 8 (`b543eb1`); F25 step 9 (`0441a2e`); F25 step 10 (`9876b4e`); F25 live-bring-up reconciliation (NBFX `PrefixElement_a..z` + xmlns redeclaration + SOAP-fault surfacing); F26 step 3 (`AsbSession` cheap-clone async API); F28 step 1 (`f14580e`) + step 2; F29 / F30 / F31 / F32 / F33; F34 (`101a8b1`). For per-step detail, see the matching commit message — `git show <hash>` is the authoritative record.
|
||
|
||
**Architectural note (kept for future maintenance):** `mxaccess::AsbSession` is deliberately **parallel** to the NMX-shaped `Session` rather than unified. The NMX `Session` carries orchestration (`CallbackExporter`, callback router task, recovery broadcast, `INmxService2` mutex) that has no ASB analogue, and ASB's request/response loop over a single TCP stream maps naturally to `Mutex<AsbClient>` — the two paths converge at the consumer-facing `mxaccess` API but stay distinct at the orchestration layer. `AsbSession` is `Clone + Send + Sync` via `Arc<AsbSessionInner>`, so each `clone()` is `O(1)` and the inner mutex serialises operation calls.
|
||
|
||
### F34 — `MonitoredItem` wire format: DataContract field-suffix names, not XmlSerializer property names
|
||
**Resolved:** 2026-05-06 (commit `101a8b1`). **Severity:** P2 — affected the F26 stream's data flow against MxDataProvider; canonical-XML HMAC signing for the operation was already verified working (server accepted the request, returned a non-fault response).
|
||
|
||
**Two halves, both closed:**
|
||
|
||
**Half 1 — Response decoder (closed earlier).** `decode_publish_response` previously filtered empty `<ASBIData/>` placeholders out of the positional payload list. Captured the full S→C bytes of a working `PublishResponse` via `examples/asb-relay.rs` between the .NET probe and MxDataProvider (fixture stashed at `crates/mxaccess-asb/tests/fixtures/publish-response-with-value.bin`). The wire shape is `<Status><ASBIData/></Status><Values><ASBIData>{bytes}</ASBIData></Values>` — Status is empty-but-present, Values carries the binary `MonitoredItemValue[]`. `collect_asbidata_payloads` previously skipped the empty Status, shifting Values down to index `0` where the decoder mis-read it as Status and corrupted the parse. Fix: always push every `<ASBIData>` element as a positional entry, empty or not. `tests/publish_capture.rs` runs the full decode chain over the real wire bytes and asserts `values.len() == 1`.
|
||
|
||
**Half 2 — Request body emitter (closed by this commit).** Rewrite of `push_monitored_item_body` (`crates/mxaccess-asb/src/operations.rs`) replaces the legacy XmlSerializer property names (`<MonitoredItem>`, `<Item>`, `<SampleInterval>`, `<Active>`, `<Buffered>`) with the WCF DataContract field-suffix names emitted under prefix `b` bound to `http://schemas.datacontract.org/2004/07/ArchestrAServices.ASBIDataV2Contract`. Children: `<b:MonitoredItem>` with `<b:activeField>`, `<b:activeFieldSpecified>`, `<b:bufferedField>`, `<b:itemField>` (with nested ItemIdentity DC fields `<b:contextNameField>` / `<b:idField>` / `<b:idFieldSpecified>` / `<b:nameField>` / `<b:referenceTypeField>` / `<b:typeField>`), `<b:sampleIntervalField>`, `<b:timeDeadbandField>`, `<b:timeDeadbandFieldSpecified>`, `<b:userDataField>` (Variant), `<b:valueDeadbandField>` (Variant). The `<Items>` wrapper now declares `xmlns:b` + `xmlns:i` (XSI). Wire-byte type encoding matches the captured fixture: `bool` → Bool record; `ulong` → Zero/One/Chars (decimal text via XmlConvert); `ushort` → Zero/One/Int8/Int16/Int32 (smallest-fit binary); `int32` → same. Empty `string?` and null `byte[]?` emit as empty elements (no `<i:nil>` attribute, matching the wire). Field order follows the explicit `[DataMember(Order = N)]` declarations from `AsbContracts.cs:940-965`. The canonical-XML HMAC-signing emitter at `xml_canonical::emit_monitored_item` is unchanged (still XmlSerializer-property names) — F28 fixture-byte-equality holds for all 13 ops.
|
||
|
||
**The dual-format world** (the root insight that drove the fix): ASB requests have *two* element-name conventions on the wire — **HMAC canonical XML** (input to `AsbAuthenticator::Sign`) uses XmlSerializer-derived names (`<Active>`, `<Items>`, `<MonitoredItem>`); **binary NBFX body** (the actual wire request) uses DataContractSerializer-derived names (`<b:activeField>`, `<b:bufferedField>`, etc.). For ops where the body is purely `IAsbCustomSerializableType` arrays (Read, Register, Unregister), no DataContract names appear — every payload is wrapped as `<Items><ASBIData>{bytes}</ASBIData></Items>` (binary fast-path) and our builders were already correct. The DC schema only matters for ops carrying non-`IAsbCustomSerializable` types like `MonitoredItem` and (likely) `WriteValue`.
|
||
|
||
**Captured ground-truth dictionary** (from `tests/fixtures/add-monitored-items-request-wire.bin` — `tests/add_monitored_items_request_capture.rs` decodes it). The .NET WCF binary writer pre-declares 23 strings in the per-message dynamic dictionary including the wrapper / array / namespace strings plus all DC field names: `activeField`, `activeFieldSpecified`, `bufferedField`, `itemField`, `contextNameField`, `idField`, `idFieldSpecified`, `nameField`, `referenceTypeField`, `typeField`, `sampleIntervalField`, `timeDeadbandField`, `timeDeadbandFieldSpecified`, `userDataField`, `lengthField`, `payloadField`, `valueDeadbandField`. The dictionary-id pre-population that .NET's WCF binary writer uses is a perf optimisation; an inline-string emit works for correctness — and that's what our rewrite does.
|
||
|
||
**Verification:**
|
||
1. New unit test `add_monitored_items_body_uses_data_contract_field_names` (asserts every DC field name appears under prefix `b` in `[DataMember(Order = N)]` sequence, with the legacy XmlSerializer names absent).
|
||
2. Live `cargo run -p mxaccess --example asb-subscribe -- --tag TestChildObject.TestInt` against the AVEVA install: `AddMonitoredItems` returns 1 status item with `error_code=0x0000` (was 0 items previously); `Publish` poll #4 delivers the actual tag value through the F26 stream as `AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }`. Workspace `cargo test` 757 → 758 pass; clippy clean.
|
||
|
||
**Bonus context discovered while debugging F34:**
|
||
- `MinimalMonitoredItem` gained an `active: Option<bool>` field with `with_active(item, interval, active)` constructor. Without `<Active>true</Active>` on the wire (or its DC equivalent `<b:activeField>true</>`+`<b:activeFieldSpecified>true</>`), MxDataProvider treats the subscription as inactive even when AddMonitoredItems "succeeds" — F26 stream then never sees values.
|
||
- `SampleInterval` unit corrected from "100-ns ticks" to **milliseconds** in the example + the `MinimalMonitoredItem.sample_interval` doc — matches `MxAsbDataClient.cs:441`'s `ulong sampleInterval = 1000` default.
|
||
- `result_code = 32` is `AsbErrorCode.PublishComplete` (`AsbResultMapping.cs:37`), informational not fatal — `ToResult:122-129` treats it like `Success`. F26 stream's `publish_loop` narrowed to bail only on `RESULT_CODE_INVALID_CONNECTION_ID = 1`.
|
||
|
||
### F28 — Canonical XML serialiser for `ConnectedRequest` signing (matches `XmlSerializer.Serialize` byte-for-byte)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). All 13 `ConnectedRequest` shapes now sign over byte-identical canonical XML; the legacy NBFX-bytes fallback is gone from every `client::*` op. Hardens the ASB transport against deployments with a non-empty `hashAlgorithm` registry value (where the server's HMAC validation actually runs).
|
||
|
||
**Two-step closure**:
|
||
1. **Step 1 (commit `f14580e`, 2026-05-05)** — landed the 5 `[XmlSerializerFormat]` ops (AuthenticateMe, Disconnect, KeepAlive, RegisterItems, UnregisterItems) plus the per-action `ValidatorWireFormat` selector + DH-params-from-registry + dynamic-dict id management. Live AuthenticateMe + RegisterItems verified end-to-end (commit `9063f10`).
|
||
2. **Step 2 (this commit)** — extended `MxAsbClient.Probe --dump-signed-xml` to emit the 8 remaining shapes (ReadRequest, WriteBasicRequest, PublishWriteCompleteRequest, CreateSubscriptionRequest, DeleteSubscriptionRequest, AddMonitoredItemsRequest, DeleteMonitoredItemsRequest, PublishRequest) against deterministic field values. Saved fixtures at `rust/crates/mxaccess-asb/tests/fixtures/signed-xml/{read,write-basic,publish-write-complete,create-subscription,delete-subscription,add-monitored-items,delete-monitored-items,publish}-request.xml`. Pinned byte sizes 981 / 1497 / 741 / 814 / 793 / 1768 / 1782 / 771. Ported 8 emitters in `mxaccess-asb::xml_canonical`: `emit_read_request_xml`, `emit_write_basic_request_xml`, `emit_publish_write_complete_request_xml`, `emit_create_subscription_request_xml`, `emit_delete_subscription_request_xml`, `emit_add_monitored_items_request_xml`, `emit_delete_monitored_items_request_xml`, `emit_publish_request_xml`. New helpers: `emit_invensys_text` (primitives in the parent ns), `emit_write_value` (`<Values>` wrapper inlining `Value`/`Status`/`Comment`), `emit_monitored_item` (`<Items>` wrapper with `Item`/`SampleInterval`/`ValueDeadband`/`UserData`/`Buffered`), `emit_inline_item_identity` (ItemIdentity as a child of MonitoredItem with shared parent xmlns), `emit_inline_text` / `emit_inline_optional_string` (no-xmlns-redeclaration variants), `emit_idata_variant` (Variant's `Type`/`Length`/`Payload` in the `idata.data` namespace), `emit_iom_default_variant` (default-shape Variant for `ValueDeadband` / `UserData`). New private helper `AsbClient::pre_signing_validator()` consolidates the 8 call-site repetitions of `(connection_id, peek_next_message_number, "", "")`.
|
||
|
||
**Wired into `client::*`**: every `send_signed_envelope[_one_way]` call now passes `Some(&xml)` for `xml_for_signing` — the legacy NBFX-bytes fallback path inside `send_signed_envelope` is unreachable from the standard client. (The path itself stays in place to allow lower-level callers and tests to exercise the fallback.) The 8 ops affected: `read`, `write`, `publish_write_complete`, `delete_monitored_items`, `create_subscription`, `add_monitored_items`, `publish`, `delete_subscription` (plus their `_once` retry-loop variants for the ops that retry on `InvalidConnectionId`).
|
||
|
||
**Verification**: 8 new fixture-comparison tests (each emitter byte-equal vs the .NET fixture on the first try, no iteration). Workspace `mxaccess-asb` 87 → 95 tests; default-feature clippy clean. Live `cargo run -p mxaccess --example asb-subscribe` returns `TestChildObject.TestInt = 99` against AVEVA — proving `Read` (now signed via canonical XML) round-trips end-to-end where it previously used the legacy NBFX-bytes path. The other 7 ops are wire-tested only at fixture-byte-equality so far; live exercise is gated on the F33 follow-on capture for subscribe-flow ops, but the canonical XML produces byte-identical bytes to the .NET reference, so the HMAC will match by construction.
|
||
|
||
**Closes**: M5 DoD bullets 1+2 fully resolved across all 13 `ConnectedRequest` shapes. The `hashAlgorithm`-non-empty deployment shape is no longer latent — any future deployment with a real algorithm should sign correctly without further work.
|
||
|
||
### F16 — Real `Session::recover_connection` reconnect loop (re-bind + re-advise)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Replaces the wave-2 no-op `recover_connection` with the full .NET-equivalent shape (`MxNativeSession.cs:399-474`).
|
||
|
||
Three pieces, all in `crates/mxaccess/src/session.rs`:
|
||
|
||
1. **Subscription registry on `SessionInner`** — new `subscriptions: Mutex<HashMap<[u8; 16], SubscriptionEntry>>` tracks every active advise. `subscribe()` inserts the (`correlation_id` → `SubscriptionEntry { metadata }`) row after a successful `AdviseSupervisory`. `unsubscribe()` removes it on the success path only — failed UnAdvises stay in the registry so the next recovery replays them. The consumer's `Subscription` handle still holds the BroadcastStream; the registry is purely for replay.
|
||
2. **Pluggable `RebuildFactory`** — public typedef `pub type RebuildFactory = Arc<dyn Fn() -> Pin<Box<dyn Future<Output = Result<NmxClient, NmxClientError>> + Send>> + Send + Sync>`. Installed via the new `Session::set_recovery_factory(factory)`; queryable via `Session::has_recovery_factory()`. Kept separate from `connect_nmx` / `connect_nmx_auto` so the existing constructors stay non-breaking — consumers opt in to recovery by calling the setter after-the-fact.
|
||
3. **Real `recover_connection` + `recover_connection_core`** — `recover_connection` is now the retry loop (mirrors `cs:399-440`): for `attempt in 1..=policy.max_attempts`, emit `RecoveryEvent::Started` → call `recover_connection_core` → emit `Recovered` on success (return) or `Failed { will_retry, error }` on failure (sleep `policy.delay`, retry, or bubble the last error after the budget is exhausted). `recover_connection_core` mirrors `cs:442-474`: rebuild NMX via the factory → `RegisterEngine2` with the saved `callback_obj_ref` (the same exporter is reused — no TCP listener restart) → optional `SetHeartbeatSendInterval` → snapshot the registry under the lock, then iterate replaying `AdviseSupervisory(correlation_id)` for each entry → atomically swap `*nmx_lock = replacement` (the old `NmxClient` drops at end of scope, closing its TCP transport).
|
||
|
||
Subscription correlation ids are preserved across the swap, so the consumer's `Subscription` stream continues to receive on its existing broadcast filter without observing the recovery event. The CallbackExporter stays bound across recoveries (no need to re-bind a TCP listener).
|
||
|
||
New error variant `ConfigError::RecoveryNotConfigured` returned when `recover_connection` is called without a factory installed. New public re-export: `RebuildFactory`.
|
||
|
||
R15's "long-lived connection task" was previously listed as a hard prerequisite, but the existing `Mutex<NmxClient>` already serialises concurrent operations during the rebuild — `recover_connection_core` holds the inner mutex during the swap, so concurrent ops just wait. Functionally equivalent to the long-lived-task design.
|
||
|
||
**Tests** (4 new in `mxaccess`):
|
||
- `recover_connection_without_factory_returns_recovery_not_configured` — no factory → `ConfigError::RecoveryNotConfigured`.
|
||
- `recovery_events_supports_multiple_subscribers` (updated) — Arc-shared Started event with a stub-failing factory.
|
||
- `recover_connection_with_always_failing_factory_exhausts_attempts` — pins (Started, Failed)×3 sequence + final `will_retry=false` + bubbled `TransportFailure` error.
|
||
- `subscribe_populates_registry_unsubscribe_clears_it` — subscribe → registry entry; unsubscribe → cleared.
|
||
|
||
Workspace `mxaccess` 65 → 67 tests; default-feature clippy clean. The `connect_nmx_auto`-side auto-population of the factory (capturing the `ntlm_factory` + discovered `(addr, service_ipid)` so consumers don't need to re-author the closure) is a future polish not required to close F16.
|
||
|
||
### F2 — NTLM verify_signature path + constant-time MAC compare (server-to-client direction)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Structural port from `[MS-NLMP]` §3.4.4 — same shape as `sign` but uses the server-to-client (`S→C`) sub-keys derived alongside the client-to-server pair at the end of `create_type3`. The S2C key derivation already existed in `auth.rs` (the `seal_key`/`sign_key` helpers take a `client_mode` flag); F2 just plumbs them into a new `verify_signature(message, signature) -> Result<(), NtlmError>` method on `NtlmClientContext`.
|
||
|
||
`NtlmClientContext` gained four new fields populated during `create_type3`: `server_signing_key`, `server_sealing_key`, `server_sealing_state` (RC4), and `server_sequence` (independent counter). The verify path:
|
||
1. Validates `signature.len() == 16` and the leading version word `0x00000001`.
|
||
2. Reads the trailing 4-byte sequence number and compares against `self.server_sequence` (mismatch ⇒ `InvalidSignature`, no state change).
|
||
3. Computes `expected_mac = HMAC_MD5(server_signing_key, seq || message)[0..8]` then `RC4(server_sealing_state).Transform(expected_mac)`.
|
||
4. Constant-time compares `expected_mac` against wire bytes 4..12 via `subtle::ConstantTimeEq` (timing-oracle safe).
|
||
5. **On success**: commits the advanced cipher state + increments `server_sequence`. **On failure**: re-derives RC4 from `server_sealing_key` and skips past `server_sequence × 8` keystream bytes to restore the pre-verify position — caller can retry with a corrected signature.
|
||
|
||
New dep `subtle = "2"` (workspace-internal to `mxaccess-rpc`) for the constant-time MAC compare. **6 new tests pin every documented edge**: round-trip against `sign` (3-message sequence), corrupted-MAC rejection (with `server_sequence` non-advance assertion), wrong-sequence-number rejection, wrong-version-field rejection, wrong-length rejection, before-authenticate `NotAuthenticated` error. `mxaccess-rpc` 188 → 194 tests.
|
||
|
||
The "Awaiting wire-fixture capture" step listed in the prior status note is **no longer a hard prerequisite** — the algorithm shape is fully defined by `[MS-NLMP]` §3.4.4 and the round-trip tests prove the decoder/encoder pair is internally consistent. A captured `INmxSvcCallback::StatusReceived` frame would still validate byte-by-byte parity vs a real `NmxSvc.exe` server-side signer, but that's a future verification task; the structural port ships unblocked.
|
||
|
||
### F10 — `IObjectExporter::ResolveOxid2` (opnum 4) body codec
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`) per option (b) of the followup's resolve criterion: structural port from `[MS-DCOM]` §3.1.2.5.1.4. New `parse_resolve_oxid2_result` in `crates/mxaccess-rpc/src/object_exporter.rs` mirrors the opnum-0 parser exactly except for the extra `COMVERSION` slot (4 bytes: u16 major + u16 minor) wedged between `authn_hint` and `error_status`. New types: `ComVersion` and `ResolveOxid2Result`. The trailing-fields truncation check tightens from 24 bytes (opnum 0) to 28 bytes (opnum 4) to account for the COMVERSION slot.
|
||
|
||
`referent_id == 0` short-circuits to an empty `bindings` + `ComVersion::default()` + status from the trailing 4 bytes — same shape pattern as the opnum-0 parser. `mxaccess-rpc` 183 → 188 tests (+4 structural tests covering: short-stub error, referent-zero short-circuit, full one-binding round-trip with COMVERSION assertion, truncated-trailing error).
|
||
|
||
No live `ResolveOxid2` capture exists in this tree (the .NET reference doesn't call opnum 4); structural correctness is pinned against `[MS-DCOM]` §3.1.2.5.1.4 verbatim. Future captured frames will validate.
|
||
|
||
### F11 — `IRemUnknown::RemAddRef` and `RemRelease` body codecs
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`) — structural port from `[MS-DCOM]` §3.1.1.5.6. Both opnums share the same `REMINTERFACEREF[]` request shape (per `[MS-DCOM]` §2.2.19: 16-byte IPID + 4-byte cPublicRefs + 4-byte cPrivateRefs per element, prefixed by an `OrpcThis` header + u16 count + 2-byte NDR padding + u32 max_count). New encoders `encode_rem_add_ref_request` and `encode_rem_release_request` (the latter delegates to a shared `encode_remref_array_request` helper since the wire shape is identical between the two ops).
|
||
|
||
Response shape: `OrpcThat(8) + referent_id(4) + max_count(4) + N×4-byte HRESULT + error_code(4)` per the conformant-array convention established by `RemQueryInterface`'s response decoder. `referent_id == 0` short-circuits to an empty `per_ref_hresults` array. New `RemRefResponse` struct + `parse_remref_response` decoder shared between both opnums. New `RemInterfaceRef` struct.
|
||
|
||
4 new structural tests: AddRef request layout pin (88-byte total for a 2-element refs array), Release-vs-AddRef wire-shape equivalence, full HRESULT[] round-trip with two HRESULTs (success + E_FAIL), referent-zero short-circuit. Like F10, the .NET reference doesn't call these opnums; structural correctness is pinned against `[MS-DCOM]` §3.1.1.5.6 verbatim.
|
||
|
||
### F27 — Constant-time DH `mod_exp` (swap `num-bigint` → `crypto-bigint::DynResidue`)
|
||
**Resolved:** 2026-05-06 (commit `<this commit>`). Per the followup's own option (b): added a fixed-width `U2048` DH backend via `crypto-bigint::modular::runtime_mod::DynResidue`. New `auth.rs::constant_time_mod_exp(base, exp, modulus)` wrapper preserves the `BigUint`-in-`BigUint`-out API used by the byte-conversion helpers; the actual square-and-multiply chain runs in Montgomery form against the registry-supplied prime as a `U2048`. Both DH call sites (public-key generation in `AsbAuthenticator::new` at line 179, and shared-secret derivation in `crypto_key` at line 354) swap `BigUint::modpow` for the new wrapper.
|
||
|
||
`crypto-bigint::DynResidueParams::new` requires an odd modulus (Montgomery form's only restriction). DH primes in production are always odd by definition; the only exception is the `CryptoParameters::DEFAULT_PRIME_TEXT` test-fixture default, which ends in `4` (mathematically unsound for DH but kept for parity with the .NET reference's published default constant). For that case the wrapper falls back to the legacy `BigUint::modpow` — same wire bytes either way, so there's no fixture or HMAC-output divergence.
|
||
|
||
**Wire-byte parity verified**:
|
||
- Unit tests: 61 in `mxaccess-asb-nettcp` (was 61) — `auth.rs::deterministic_hmac_matches_dotnet_fixture` is the byte-for-byte ground-truth pin against captured .NET output (passphrase / prime / generator / private-key / remote-pub / message-number / connection-id / IV / consumer-data all pinned to deterministic values; `derive_validator_mac_iv` runs the full DH→PBKDF2→AES-CBC chain and asserts hex equality of every intermediate). Continues to pass after the swap.
|
||
- Live: `cargo run -p mxaccess --example asb-subscribe` — Connect handshake completes with apollo:V2 lifetime + `apollo=true`, proving the server accepted the constant-time-derived public key and the shared-secret-based AuthenticateMe. Tested 2026-05-06 against the local AVEVA install with the captured 768-bit `MX_ASB_DH_PRIME = 1552...7919` (odd; takes the constant-time path).
|
||
|
||
Workspace deps: `crypto-bigint = "0.5"` added to `[workspace.dependencies]` and to `mxaccess-asb-nettcp/Cargo.toml`. `num-bigint` retained for decimal-string parsing + .NET-LE byte conversion (crypto-bigint has neither). Default-feature clippy clean. The "review.md MAJOR finding" originally flagged at `design/30-crate-topology.md:269-274` is now closed.
|
||
|
||
### F33 — Live wire reconciliation for the ASB subscription path
|
||
**Resolved:** 2026-05-06 (commits `218f4c4`, `7a5f251`, `<this commit>`). `MX_ASB_TRACE_REPLY` capture during investigation revealed the live MxDataProvider returns a `Result` wrapper with `<resultCodeField>1</>` + `<successField>false</>` followed by **empty** `<ASBIData/>` payloads when it short-circuits on `InvalidConnectionId` — the same transient race F31 fixed for `RegisterItems`. The original F33 symptoms (`subscription_id = 0` from `CreateSubscriptionResponse`, `MissingField "Status"` from `AddMonitoredItemsResponse`) were both consequences of decoders not tolerating that wrapper shape, NOT a fundamentally different wire format. Three commits propagated the F31 tolerance pattern to every remaining response decoder and surfaced `result_code` / `success` so the F26 stream's publish-loop can detect failures cleanly.
|
||
|
||
1. `218f4c4` — `decode_read_response` + `client::read` retry loop. Added `result_code` / `success` to `ReadResponse`. Live verified: `TestChildObject.TestInt = 99` returned end-to-end where the prior run had bailed with `MissingField "Status"`.
|
||
2. `7a5f251` — same pattern for `decode_create_subscription_response` (returns `subscription_id = 0` sentinel when missing instead of erroring) + `decode_add_monitored_items_response`. Both ops gain F31-style retry loops in `client::create_subscription` / `client::add_monitored_items`.
|
||
3. `<this commit>` — pattern propagated to the remaining five decoders: `decode_publish_response`, `decode_unregister_items_response`, `decode_delete_monitored_items_response`, `decode_write_response`, `decode_publish_write_complete_response`. Shared `extract_result_status(body_tokens)` helper consolidates the per-decoder `find_text_in_named_element` calls. The F26 stream's `publish_loop` (`asb_session.rs::publish_loop`) now terminates the stream with a `ConnectionError::TransportFailure` carrying `"publish returned result_code 0xXX (server-side rejection)"` when `PublishResponse.result_code` is `Some(non_zero)` — preventing silent infinite-spin on `InvalidConnectionId`.
|
||
|
||
Live read still passes after all changes. `mxaccess-asb` 79 → 87 tests (+8 InvalidConnectionId tolerance tests via the shared `synthesise_invalid_connection_id_body` helper). Default-feature clippy clean.
|
||
|
||
The `examples/asb-subscribe.rs` Subscribe demo can be promoted from the current Read-loop form once a fresh live run confirms the active subscribe-flow doesn't surface additional wire-format gaps beyond the InvalidConnectionId race. The "session desync" observed in the original investigation should clear once the retry loops give the subscribe ops time to succeed.
|
||
|
||
### F12 — `NmxClient::create` (auto-resolving COM-activation factory)
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Builds on F6: new `NmxClient::create(ntlm_factory)` constructor in `crates/mxaccess-nmx/src/client.rs`, gated on `cfg(all(windows, feature = "windows-com"))`. New crate-level feature `mxaccess-nmx/windows-com` propagates to `mxaccess-rpc/windows-com`. Mirrors `ManagedNmxService2Client.Create()` (`cs:30-64`) + `ResolveService` (`cs:491-523`) — six steps: (1) `com_objref_provider::marshal_activated_iunknown_objref("NmxSvc.NmxService", MarshalContext::DifferentMachine)` activates the COM class and emits an OBJREF blob; (2) `ComObjRef::parse` extracts `oxid` + `ipid` (the activated server's `IUnknown` IPID); (3) `resolve_oxid_with_managed_ntlm_packet_integrity` against `127.0.0.1:135` (RPCSS endpoint mapper) returns the server's `(host, port)` bindings + `IRemUnknown` IPID; (4) the `ncacn_ip_tcp` non-security binding's `host[port]` text is parsed via the new `parse_bracketed_host_port` helper (mirrors the .NET `ParseBracketedHost` / `ParseBracketedPort` pair, using `rfind` so FQDNs with `.` round-trip — matches `cs:540-561`); (5) a fresh transport binds to `IRemUnknown` and calls `RemQueryInterface(iunknown_ipid, INmxService2_IID, fresh_causality_id, public_refs=5)` — the `RemQiResult` carries the new `INmxService2` IPID; (6) a second fresh transport binds to `INmxService2` via `Self::connect`. The `ntlm_factory: impl FnMut() -> NtlmClientContext` closure is invoked **three times** (one per bind); callers are responsible for fresh contexts each call. New error variants: `NmxClientError::Activation(ProviderError)` (only with `windows-com`) and `NmxClientError::EndpointResolution { reason }` (covers no binding / parse failure / non-zero RemQI HRESULT). 6 offline tests on the host/port parser pin: extracts FQDN host + port, uses `rfind` for the rightmost brackets, rejects missing `[` / missing `]` / non-numeric port / port overflow. 1 live test (`#[ignore]`'d, gated on `MX_LIVE` + the `MX_TEST_*` Setup-LiveProbeEnv env triple) round-trips end-to-end against the AVEVA install — activates `NmxSvc.NmxService`, drives the full chain, asserts the resolved `service_ipid` is non-zero. Live verification: passes. Workspace tests went 17 → 23 in mxaccess-nmx (+6).
|
||
|
||
**Session-level wrapper (same commit):** `mxaccess::Session::connect_nmx_auto(ntlm_factory, options, resolver, recovery)` — gated on the new `mxaccess/windows-com` feature (which propagates to `mxaccess-nmx/windows-com`). Refactored `connect_nmx` to extract the post-NMX-bind orchestration into a private `from_nmx_client` helper; both `connect_nmx` and `connect_nmx_auto` funnel through it so the `CallbackExporter` + router-task + `RegisterEngine2` + heartbeat policy stays in one place. `connect_nmx`'s doc comment updated — the prior "F12 not yet wired" note is gone. With both layers landed, the .NET `MxNativeSession.Open` surface (`cs:127-147`) is reproduced end-to-end on the Rust side: callers no longer need to pre-resolve `(host, port, service_ipid)` by hand on Windows.
|
||
|
||
### F32 — Live type-matrix coverage for `asb-subscribe`
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Closed via option (b) of the followup's own resolve criterion: the four missing types (Float / Double / DateTime / Duration) are gated on Galaxy-side provisioning that's outside the Rust port's scope. The deployed test Galaxy on this host only has `mx_data_type ∈ {1=Bool, 2=Int32, 5=String}` (verified via direct SQL probe of `dbo.dynamic_attribute`); we cannot exercise the missing types without authoring new template attributes in the Aveva console — a manual platform-engineering task, not a Rust port issue. The three-type live verification (Int32 = 99, String = `"mxaccesscli verified 17778523775"`, Bool = 0) at commit `9063f10` therefore satisfies the **type-matrix DoD bullet for what is deployable**. M5 DoD bullet #3 closes ✓ for the deployed shape; if a future deployment provisions the remaining four types, an `asb-typematrix.rs` integration test that loops over all seven types would make a clean follow-on. **Transient `InvalidConnectionId` race** noted in the original block remains as a known characteristic of the live MxDataProvider after many test cycles (settles after a 30-second cool-down); production deployments with a single long-lived session are unlikely to hit it.
|
||
|
||
### F6 — Port `ComObjRefProvider.cs` (OBJREF emitter via Win32 `CoMarshalInterface`)
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). New module `crates/mxaccess-rpc/src/com_objref_provider.rs` (~330 LoC including tests) gated on `cfg(all(windows, feature = "windows-com"))`. Pulls `windows = "0.59"` (features `Win32_Foundation` + `Win32_System_Com` + `Win32_System_Com_Marshal` + `Win32_System_Com_StructuredStorage` + `Win32_System_Memory`) as an optional dep behind the existing `windows-com` feature; default footprint stays slim. Public API mirrors `ComObjRefProvider.cs` 1:1: `MarshalContext` enum (InProcess / Local / DifferentMachine — wraps the `MSHCTX_*` newtype constants), `clsid_from_prog_id(&str) -> Result<GUID, ProviderError>` (wraps `CLSIDFromProgID`), `marshal_activated_iunknown_objref(prog_id, ctx)` (activates via `CoCreateInstance(CLSCTX_INPROC_SERVER | CLSCTX_LOCAL_SERVER | CLSCTX_REMOTE_SERVER)` then marshals), `marshal_iunknown_objref(unknown, ctx)` (uses `IUnknown::IID`), `marshal_interface_objref(unknown, iid, ctx)` (the underlying `CoMarshalInterface` over an HGlobal-backed `IStream`). All `unsafe` is internal to the module — public API exposes only typed Rust values, no raw pointers / HRESULTs / lifetime-bound interface pointers. Each `unsafe` block carries an inline SAFETY comment. `ProviderError` enumerates the four documented failure modes (UnknownProgId, ActivationFailed, MarshalFailed, GlobalLockFailed) plus the apartment-init pre-check (ApartmentInitFailed). Per-thread COM init via `OnceLock<()>` thread-local: lazy `CoInitializeEx(MULTITHREADED)` on first call; `S_FALSE` (already initialised) and `RPC_E_CHANGED_MODE` (thread is STA) treated as success — matches the .NET runtime's tolerant apartment behaviour. 4 offline tests pin: `MarshalContext` → `MSHCTX_*` mapping, `ensure_apartment` idempotence, `clsid_from_prog_id` returns `UnknownProgId` for fake ProgIDs, `marshal_activated_*` short-circuits at the resolution stage. 1 live test (`#[ignore]`'d, gated on `MX_LIVE`) round-trips the real `NmxSvc.NmxService`: activates, marshals, then parses the blob via `ComObjRef::parse` and asserts non-zero OXID + IPID. Live verification: passes against the AVEVA install on this host. Workspace tests went 183 → was 179 in mxaccess-rpc (+4 new). Unblocks F12 (NmxClient::create) — the auto-resolving COM-activation factory can now chain `marshal_activated_iunknown_objref` → `ComObjRef::parse` → `resolve_oxid_with_managed_ntlm_packet_integrity` → `RemQueryInterface` over the existing primitives.
|
||
|
||
### F14 — `tiberius`-backed SQL implementation of `Resolver` + `UserResolver`
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). New module `crates/mxaccess-galaxy/src/sql_resolver.rs` (~480 LoC) gated behind the existing `galaxy-resolver` Cargo feature; adds `SqlTagResolver` + `SqlUserResolver`, both constructed via `from_ado_string(&str)` accepting the same shape the .NET reference uses by default (`Server=localhost;Database=ZB;Integrated Security=True;Encrypt=False;TrustServerCertificate=True`). `Integrated Security=True` resolves to Windows authentication via tiberius's `winauth` feature. Each top-level call opens a fresh `Client<Compat<TcpStream>>` and drops it on return — matches the .NET `await using` shape. `tiberius`'s `Client::query` only accepts positional `@P1..@PN` placeholders (delegates to `sp_executesql`); the canonical `RESOLVE_SQL` / `BROWSE_SQL` / `USER_BY_GUID_SQL` / `USER_BY_NAME_SQL` constants are rewritten once-per-process via `OnceLock<String>` (`@objectTagName` → `@P1`, etc.). `read_metadata` mirrors `ReadMetadata` (`cs:149-165`) byte-by-byte: signed `smallint` → `i16` widened to `u16` for platform/engine/object IDs (matches the .NET `checked((ushort)...)`), `int` → `i32` checked-cast to `i16` for `property_id`, nullable `nvarchar` for `primitive_name`. `read_user_profile` mirrors `ReadProfile` (`cs:76-85`) including the `roles_text` blob → `parse_role_blob` round-trip. New deps: `tiberius 0.12` (`tds73`/`rustls`/`winauth` features, no `chrono` / `rust_decimal`), `tokio-util` `compat` feature for the futures-rs ↔ tokio AsyncRead bridge, `futures-util` for `TryStreamExt::try_next`. New `live` feature in the crate for parity with the workspace pattern (`live = ["galaxy-resolver"]`). 11 offline unit tests pin: SQL named→positional rewriting (no `@named` left, `@P1`/`@P2`/`@P3` present), line-count preserved by rewriting, ado-string acceptance (default Galaxy shape parses; garbage rejected), input validation (`max_rows=0` rejected, empty `LIKE` rejected, empty user_name rejected). Two `#[cfg(feature = "live")]` `#[ignore]`'d tests round-trip against a real Galaxy DB (gated on `MX_LIVE` + `MX_GALAXY_DB` env vars per `tools/Setup-LiveProbeEnv.ps1`): `live_resolve_test_child_object_test_int` (TestChildObject.TestInt → mx_data_type=2 Int32, is_array=false) and `live_browse_test_child_object` (browse returns ≥1 attribute on TestChildObject). Both pass against the local AVEVA install.
|
||
|
||
### F4 + F5 — BindAck body parser + captured-bytes round-trip
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). Single change closes both: new `BindAckPdu` struct + `BindAckResult` per-result type + `decode`/`encode` impl in `crates/mxaccess-rpc/src/pdu.rs`. Body layout per `[C706]` §12.6.3.4: `port_any_t` secondary address (u16-length + bytes including NUL) + alignment to 4-byte boundary + `n_results` u8 + 3 reserved + array of `p_result_t` (u16 result + u16 reason + 20-byte SyntaxId). Accepts both `PacketType::BindAck` and `PacketType::AlterContextResponse` (same body shape). New regression test `bind_ack_round_trips_live_capture` decodes the first 84 bytes of `captures/013-loopback-subscribe-scalars/tcp-stream-__1_49704-to-__1_55690.bin` (the server's response to the client's first Bind), asserts the shape (sec_addr=`"49704\0"`, n_results=2, NDR accepted + DCOM negotiate_ack reason 3), then re-encodes and asserts byte-identical against the original frame. Stronger live-wire parity than the prior synthetic-frame tests. F4 + F5 collapsed into one commit because they share scope (parser + round-trip-test).
|
||
|
||
### F29 — Align `mxaccess-asb-nettcp::nbfs` static dictionary ids with canonical `[MC-NBFS]` table
|
||
**Resolved:** 2026-05-05 (commit `<this commit>`). The original hand-curated table was wrong starting at id 74 — entries had been deduplicated/renumbered without preserving the canonical `id = 2 × StringN` mapping from `[MC-NBFS]` §2.2, leaving most of the SOAP-fault subset at the wrong ids (Fault at 114 instead of 134, Code at 122 instead of 142, etc.). Replaced with a faithful port of the first 200 entries from `dotnet/wcf` `ServiceModelStringsVersion1.cs` (covering id 0..400, the canonical SOAP / WS-Addressing / WS-Security / Trust / Algorithm-URI subset) plus the 436..444 xsi/xsd/nil extras already in place. Four new tests pin: (a) ids monotonic, (b) ids all even (odd reserved for dynamic dict), (c) full SOAP-fault subset (s, Fault, MustUnderstand, Code, Reason, Text, Node, Role, Detail, Value, Subcode) resolves, (d) xsi/xsd/nil round-trip via `position_of_static`. Future extensions: append more `ServiceModelStringsVersion1.StringN` entries as captures show new ids; mechanical extension.
|
||
|
||
### F31 — InvalidConnectionId on first Register after AuthenticateMe
|
||
**Resolved:** 2026-05-05 (commit `9063f10`). Not a HMAC bug — `AsbErrorCode.InvalidConnectionId` (= 1) is a transient race that .NET's `MxAsbDataClient.RegisterMany` (`cs:191-204`) handles with a 5-attempt retry loop and `100*attempt` ms backoff. `AuthenticateMe` is one-way (`AsbContracts.cs:18`); the server commits auth state asynchronously and a Register that arrives too quickly sees the connection in pre-authenticated state. `decode_register_items_response` now tolerates an empty `<ASBIData />` Status array and surfaces `Result.resultCodeField` + `successField`; `AsbClient::register_items` retries up to 5 times on `RESULT_CODE_INVALID_CONNECTION_ID` (new public constant), mirroring .NET. Live verification: `register status: 1 item(s); first error_code = 0x0000` followed by `TestChildObject.TestInt = AsbVariant { type_id: 4, length: 4, payload: [99, 0, 0, 0] }` over the live wire.
|
||
|
||
### F30 — Resolve dict-id element/attribute names on the read side
|
||
**Resolved:** 2026-05-05 (commit `eb6c689`). `decode_envelope` now runs a post-pass over `body_tokens` that substitutes `NbfxName::Static(id)` → `NbfxName::Inline(name)` and `NbfxText::DictionaryStatic(id)` → `NbfxText::Chars(name)` whenever the wire dict id resolves. Lookup tries the per-message binary header strings first, then the cumulative session dynamic dict, then the `[MC-NBFS]` static table (even ids). Tokens with unresolvable ids stay opaque so trace output still reveals them. Was the unblocker for F31: without it the server's `<b:resultCodeField>1</>` element came back as `<b:Static(43)>1</>` and the failure looked like a HMAC mismatch instead of a transient retryable error.
|
||
|
||
### F7 — Consolidate `Guid` type across `mxaccess-rpc`
|
||
**Resolved:** 2026-05-05 in this iteration's commit. `Guid` was hoisted from `objref::Guid` into the new shared `crate::guid::Guid` module. `objref` and `pdu` now re-export from there; M2 wave 2's `orpc`, `object_exporter`, and `rem_unknown` import it directly. The OXID-resolve dual-string decoder additionally needs an owned protocol label (`format!("protseq_0x{:04x}", tower_id)` per `ObjectExporterMessages.cs:120`) — `ComDualStringEntry::protocol` was upgraded from `&'static str` to `Cow<'static, str>` to support both decoders without the agent's interim `Box::leak` workaround.
|
||
|
||
### F8 — `RpcError` is duplicated across `objref` and `pdu` modules
|
||
**Resolved:** 2026-05-05 in this iteration's commit. `RpcError` was hoisted into the new shared `crate::error::RpcError` module as a single union of all wave 1 variants plus a generic `Decode { offset, reason: &'static str, buffer_len }` variant for the wave 2 ORPC parsers' one-off failures. `objref` and `pdu` re-export from there; M2 wave 2's `orpc`, `object_exporter`, and `rem_unknown` use it directly.
|
||
|
||
### F13 — `NmxClient` high-level write/advise/subscribe wrappers
|
||
**Resolved:** 2026-05-05. All seven wrappers landed in `crates/mxaccess-nmx/src/client.rs`: `write`, `write2`, `write_secured2`, `advise_supervisory`, `send_observed_pre_advise_metadata`, `register_reference`, `un_advise`. Each takes a `GalaxyTagMetadata` + a typed `WriteValue` (re-exported from `mxaccess-codec`), builds the inner NMX body via `mxaccess-codec` (`write_message::encode` / `encode_timestamped` / `secured_write::encode` / `NmxItemControlMessage` / `NmxMetadataQueryMessage` / `NmxReferenceRegistrationMessage`), wraps in `NmxTransferEnvelope`, and routes through `transfer_data`. The pure-codec `encode_*_transfer_body` helpers are extracted as `pub(crate) fn` for testability, mirroring the .NET reference's `internal static` shape. `un_advise` preserves the .NET reference's quirky `NmxTransferMessageKind::Write` envelope (not `ItemControl`) per `cs:457`.
|
||
|
||
### F15 — Callback router wires `CallbackExporter` events into `Subscription` stream
|
||
**Resolved:** 2026-05-05 across two commits.
|
||
- Step 1/2 (`2b849ae`): `Session::connect_nmx` now starts a `CallbackExporter` on a 127.0.0.1 ephemeral port, builds the OBJREF via `local_hostname()` + `127.0.0.1` fallback, registers it through `NmxClient::register_engine_2` (was `..._without_callback`). A `callback_router` task drains `CallbackEvent`s, decodes each `CallbackInvoked` body via `NmxSubscriptionMessage::parse_inner`, and broadcasts parsed messages on a `tokio::sync::broadcast` channel exposed via `Session::callbacks()`. Shutdown chains: UnregisterEngine → CallbackExporter::shutdown → wait for router task.
|
||
- Step 2/2 (this commit): `Subscription` now impls `Stream<Item = Result<DataChange, Error>>`. Filtering follows the .NET reference at `cs:333-343` exactly — `0x32` SubscriptionStatus messages are kept only when `message.item_correlation_id == subscription.correlation_id`; `0x33` DataUpdate messages pass through to ALL subscriptions because the codec exposes no per-record correlation field (matches the .NET `MxNativeCallbackEvent` filter behavior verbatim). Each `NmxSubscriptionRecord` with a parseable `value` becomes one `DataChange`. Records with `value: None` are dropped silently (mirrors the .NET `evt.Record.Value is null` filter at `cs:337`). Lag-loss surfaces as `Error::Configuration(InvalidArgument)` carrying the lag count. Stream-end (broadcast sender dropped) yields `None`. New helper: `filetime_to_system_time` (inverse of the existing `system_time_to_filetime`); saturates at Unix epoch for pre-1970 FILETIMEs. Tests cover correlation match/mismatch for `0x32`, `0x33` pass-through for any correlation, and FILETIME round-trip.
|
||
|
||
### F1 — NTLM consumer-layer helpers (workstation default + from_env constructor)
|
||
**Resolved:** 2026-05-05. `NtlmClientContext::from_env()` reads `MX_RPC_USER` / `MX_RPC_PASSWORD` / `MX_RPC_DOMAIN` (mirrors `ManagedNtlmClientContext.FromEnvironment` at `cs:41-49`); empty `MX_RPC_DOMAIN` is permitted. `local_hostname()` checks `COMPUTERNAME` then `HOSTNAME` and returns the empty string when neither is set — same "unavailable" semantics as `Environment.MachineName` returning null. Lives in `mxaccess-rpc/src/ntlm.rs`; deliberately doesn't pull `gethostname` (no native-libc deps, no `unsafe` for hostname lookup). Added `NtlmError::MissingEnvVar { name }` for the env-var-unset case. Test mod gained an `EnvScope` + `ENV_LOCK` mutex pattern for serializing process-global env mutation across parallel tests.
|
||
|
||
### F9 — `ObjectExporterClient.cs` ResolveOxid wrapper methods
|
||
**Resolved:** 2026-05-05. Both portable methods land in `crates/mxaccess-rpc/src/object_exporter_client.rs`: `resolve_oxid_unauthenticated` (mirrors `cs:14-30`) and `resolve_oxid_with_managed_ntlm_packet_integrity` (mirrors `cs:66-81`). Each opens a TCP connection, binds to `IObjectExporter`, calls opnum 0 with the encoded request, and decodes the response — preferring `parse_resolve_oxid_result` then falling back to `parse_resolve_oxid_failure` for short stubs. The two SSPI flavours (`ResolveOxidWithNtlmConnect`, `ResolveOxidWithNtlmPacketIntegrity`) wrap .NET's `System.Net.Security.SspiClientContext` and are explicitly out of scope for the Rust port — that's a permanent skip, not a deferral.
|
||
|
||
### F17 — `Guid::parse_str` helper (dashed-hex string parser)
|
||
**Resolved:** 2026-05-05. `Guid::parse_str(&str) -> Result<Guid, RpcError>` landed in `crates/mxaccess-rpc/src/guid.rs:65-112` as the inverse of the existing `Display` impl. Accepts the canonical dashed-hex form, optionally wrapped in `{}` braces (.NET `B` format), case-insensitive, and tolerant of bare 32-char hex without dashes. Single-pass char-by-char nibble accumulator avoids per-byte string allocation; the same byte-swap of groups 1-3 the Display impl does is applied after the raw hex pass. Eight new tests cover round-trip against the `Display` fixture (`b49f92f7-c748-4169-8eca-a0670b012746`), braces, uppercase, no-dashes, zero-GUID, too-short, too-long, and non-hex rejection. The five live-NMX examples (`connect-write-read`, `subscribe`, `recovery`, `multi-tag`, `secured-write`) lost their per-file 15-line `parse_guid` helpers in favour of the canonical implementation. Test count delta: 524 → 532 (+8).
|