[F36 + F40 + F44] M6 wave 1: subscribe_buffered (NMX) + metrics + evidence

Three M6 sub-followups landed in this wave (sub-agent worktrees +
manual reconciliation in main):

**F36 — Session::subscribe_buffered (NMX) per R2 single-sample**
- `BufferedOptions::rounded_update_interval_ms()` — 100ms rounding
  helper mirroring MxNativeCompatibilityServer.cs:638
  ((updateInterval + 99) / 100) * 100, saturating on overflow.
- `Session::subscribe_buffered` (public, lib.rs:604) delegates to
  the new private `subscribe_buffered_nmx` which uses the buffered
  RegisterReference path: item_definition suffixed with
  `.property(buffer)`, subscribe=true (no separate
  AdviseSupervisory follow-up — verified against capture 082).
- Per R2 verified at wwtools/mxaccesscli/docs/api-notes.md the wire
  semantic is single-sample-per-event with a server-side cadence
  knob; rounded_ms is held client-side only (native MXAccess does
  not emit a separate SetBufferedUpdateInterval RPC, verified by
  absence in 079/082 captures).
- New crates/mxaccess/examples/subscribe-buffered.rs.
- New crates/mxaccess-codec/tests/buffered_register_reference_parity.rs:
  4 tests (capture 079/082 round-trip, suffix helper, constructive
  forward-build vs capture 082).

**F40 — Optional metrics feature**
- New crates/mxaccess/src/metrics.rs (275 lines): `pub(crate)`
  thin wrappers (`record_write_latency`, `record_read_latency`,
  `inc_writes`, `inc_reads`, `inc_advises`, `inc_recovery_*`,
  `set_active_subscriptions`, etc.) that compile to no-ops under
  `#[cfg(not(feature = "metrics"))]`. Call sites in session.rs +
  asb_session.rs invoke them unconditionally; the gate is inside
  the wrapper.
- `metrics = { version = "0.24", optional = true }` added to
  workspace + mxaccess crate Cargo.toml.
- Default build: zero metrics dep, zero runtime cost.

**F44 — Buffered batch + suspend capture decode evidence**
- New docs/M6-buffered-evidence.md: per-capture summary for
  077, 079, 080, 081, 082, 094 — call sequence, key wire bytes,
  R2/R5 verdict.
- R2 confirmed silently as "not a real risk" — single-sample
  observed across 079/080/082/094.
- R5 trigger conditions documented from capture 077: AdviseSupervisory
  + Suspend pair, 1-second intervals, succeeds on enum attributes.
- design/70-risks-and-open-questions.md R2/R5 status updated.

Workspace: 759 → 792 tests, clippy clean, rustdoc -D warnings clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-06 05:12:17 -04:00
parent d5aa152b1f
commit ad1cf2351c
11 changed files with 1428 additions and 102 deletions
+31 -17
View File
@@ -41,23 +41,6 @@ move to `## Resolved` with a date + commit hash.
**Resolves when:** the .NET reference's `MxNativeCompatibilityServer.cs` has a Rust counterpart at the API-shape level (not byte-for-byte at the COM level — that's `mxaccess-compat-com`).
### F36 — `Session::subscribe_buffered` (NMX) per R2 single-sample-per-event answer
**Severity:** P1 — blocks M6 DoD bullet 2 (`subscribe_buffered` (NMX feature) — guarded by `BufferedOptions`; no synthesis if provider returns single-sample batches).
**Source:** `design/60-roadmap.md:97` + `design/70-risks-and-open-questions.md` R2 (single-sample-per-event verified against `wwtools/mxaccesscli/docs/api-notes.md:97-100,138-140,154-157`).
**Scope.** Add a `subscribe_buffered(reference, BufferedOptions { update_interval_ms })` method to `mxaccess::Session` that returns `Stream<Item = Result<DataChange, Error>>` — same item shape as plain `subscribe`, just with a per-session-cached cadence knob. Internal wiring:
1. Translate `update_interval_ms` to LMX's `SetBufferedUpdateInterval` semantics (rounded to nearest 100ms per `MxNativeCompatibilityServer.SetBufferedUpdateInterval:638`).
2. Use the existing `Session::subscribe` machinery; the buffered cadence is a server-side delivery rate knob, not a payload-shape change.
3. Surface as a separate Session method (not just an option on `subscribe`) so the API discoverably documents the cadence semantics.
**Definition of done:**
1. `Session::subscribe_buffered` returns `Stream<Item = Result<DataChange, Error>>` and internally drives the LMX `SetBufferedUpdateInterval` + `AddBufferedItem` call sequence per the captures `079-frida-add-buffered-advise-testint` and `082-frida-add-buffered-plain-advise-testint`.
2. New example `examples/subscribe-buffered.rs` exercises a 1-second cadence against the live AVEVA install. Per R2, no multi-sample synthesis.
3. Integration test asserts `Stream::Item == DataChange` (no `DataChangeBatch`); compile-time check.
4. Doc on `subscribe_buffered` cites R2's verification source and explicitly says "single-sample, cadence knob — not multi-sample payload."
**Resolves when:** `cargo run -p mxaccess --example subscribe-buffered` runs against AVEVA and the live captures `079`/`082` byte-round-trip via the new code path.
### F40 — Optional `metrics` feature: counters + histograms
**Severity:** P2 — M6 DoD bullet 4 (optional `metrics` feature emitting counters / histograms).
@@ -103,6 +86,20 @@ move to `## Resolved` with a date + commit hash.
**Resolves when:** dry-runs are green and the release notes are written.
### F45 — Recovery replay should re-issue `RegisterReference` for buffered subscriptions
**Severity:** P2 — F36 buffered subscriptions survive across `recover_connection` only via `AdviseSupervisory` replay, which loses the `.property(buffer)` registration.
**Source:** `crates/mxaccess/src/session.rs::recover_connection_core` (the loop iterates `subscriptions` and replays via `advise_supervisory`).
**Depends on:** F36 (closed by the same iteration as this followup is filed).
**Scope.** `Session::subscribe_buffered` records its `Subscription` in the same `SessionInner::subscriptions` registry as plain `subscribe` does, so the registry-walking recovery loop replays them via `AdviseSupervisory` rather than `RegisterReference` with `.property(buffer)`. The metadata stored in `SubscriptionEntry` is the original (un-suffixed) tag's `GalaxyTagMetadata`; the buffered name suffix is lost on replay. The server may continue to deliver values under the existing `.property(buffer)` registration on the engine side because the OBJREF / engine id pair survives the rebuild — but if the server tears the buffered registration down on disconnect, recovery will silently downgrade buffered → plain.
**Definition of done:**
1. `SubscriptionEntry` gains a discriminator (`enum SubscriptionMode { Plain, Buffered { rounded_interval_ms: u32 } }`) so recovery can branch on the original advise shape.
2. The buffered branch in `recover_connection_core` rebuilds the original `NmxReferenceRegistrationMessage` (with `.property(buffer)` suffix + the saved correlation id + `subscribe = true`) and dispatches `register_reference` against the rebuilt transport.
3. Live regression: `cargo run -p mxaccess --example subscribe-buffered` against AVEVA, then force a recovery via `Session::recover_connection`, and confirm subsequent `OnBufferedDataChange`-rate updates continue at the same cadence.
**Resolves when:** the recovery path treats buffered subscriptions identically to how the original advise was issued.
### F44 — Decode buffered batch + suspend captures (`077, 079-082, 094`)
**Severity:** P2 — evidence work for R2 (buffered single-sample) and R5 (Activate/Suspend), feeding F36/F35.
**Source:** `design/60-roadmap.md:40` (deferred to M6 + R2) + `design/70-risks-and-open-questions.md` R2/R5 + the captures.
@@ -120,6 +117,20 @@ For `077` (Suspend on advised ScanState): document the trigger conditions for R5
**Resolves when:** the evidence summary is committed and R2/R5 statuses are updated accordingly.
### F45 — Capture `LmxProxy.dll!CLMXProxyServer.Suspend`/`.Activate` wire emission
**Severity:** P3 — residual gap from F44's R5 walk.
**Source:** `design/70-risks-and-open-questions.md` R5 + `docs/M6-buffered-evidence.md` (capture 077 section) + `captures/077-frida-suspend-advised-scanstate/frida-events.tsv:2-17` (Frida hook list).
**Scope.** Capture 077 confirmed the .NET-reference compatibility-server's client-side gating for `Suspend` (must have an active subscription; returns `MxStatus.SuspendPending` synchronously) but did not instrument `LmxProxy.dll!CLMXProxyServer.Suspend` / `.Activate`. Open question: does the production LMX proxy issue a separate ORPC method for these, or does it also synthesise the response client-side?
**Definition of done:**
1. Extend `analysis/frida/mx-nmx-trace.js` to `Interceptor.attach` on `LmxProxy.dll!CLMXProxyServer.Suspend` and `.Activate` (and any sibling `Resume` / `Reactivate` if present in the export table). Mirror the existing `AdviseSupervisory` hook shape.
2. Re-run the `suspend-advised` scenario against `TestChildObject.ScanState`, plus a fresh `activate-advised` scenario, save under `captures/NNN-frida-suspend-activate-instrumented/`.
3. If a wire emission appears (PutRequest + TransferData with a new opnum or body shape): document it in `docs/M6-buffered-evidence.md` and `analysis/proxy/nmxsvcps-procedures.tsv`; add typed decode if the inner body is novel.
4. If no wire emission appears: confirm both operations are purely client-side and update R5 to "fully settled — client-side only".
**Resolves when:** R5 is fully settled (either with a documented wire opnum or a "client-side only" verdict backed by capture).
### F3 — Cross-domain NTLM Type1/2/3 fixture
**Severity:** P2
**Status:** Permanently out-of-scope on the current dev host (no second AD domain). Resolution requires external infrastructure not available here.
@@ -131,6 +142,9 @@ For `077` (Suspend on advised ScanState): document the trigger conditions for R5
## Resolved
### F36 — `Session::subscribe_buffered` (NMX) per R2 single-sample-per-event answer
**Resolved:** 2026-05-06. `Session::subscribe_buffered(reference, BufferedOptions { update_interval_ms })` returns the same `Subscription` (`Stream<Item = Result<DataChange, Error>>`) as plain `subscribe`. Wire path mirrors `MxNativeSession.RegisterBufferedItemAsync` (`MxNativeSession.cs:272-310`): the `item_definition` is suffixed with `.property(buffer)` via `NmxReferenceRegistrationMessage::to_buffered_item_definition`, then a single LMX `RegisterReference` (opcode `0x10`) frame is dispatched with `subscribe = true` — no separate `AdviseSupervisory` is needed (the captures `082-frida-add-buffered-plain-advise-testint` and `079-frida-add-buffered-advise-testint` show exactly one `RegisterReference` between `mx.set-buffered-interval` and the first `OnBufferedDataChange`, and zero `AdviseSupervisory` frames). `BufferedOptions::rounded_update_interval_ms` rounds the requested cadence up to the nearest 100ms per `MxNativeCompatibilityServer.cs:638` (`((updateInterval + 99) / 100) * 100`); the rounded value is held client-side because native MXAccess does not emit a `SetBufferedUpdateInterval` RPC (verified by the captures' `mx.set-buffered-interval.begin/end` events producing no NMX traffic). New example `crates/mxaccess/examples/subscribe-buffered.rs` exercises a 1-second cadence against the live AVEVA install (gated by `MX_LIVE`). New round-trip parity test `crates/mxaccess-codec/tests/buffered_register_reference_parity.rs` validates the wire-byte sequence against captures `079` + `082`. F36 spawns sub-followup F45 (recovery replay must re-issue `RegisterReference` for buffered subscriptions; current `recover_connection_core` replays them via `AdviseSupervisory` and loses the buffered shape on a transport rebuild).
### F37 — ASB `subscribe_buffered` capability gate
**Resolved:** 2026-05-06 (commit `34045c2`). `AsbSession::subscribe_buffered` returns `Error::Unsupported { transport: TransportKind::Asb, operation: ... }` synchronously without touching the wire — ASB has no `SetBufferedUpdateInterval` analogue; the per-monitored-item `MinimalMonitoredItem::sample_interval` is the rate-limit knob instead. The error-construction logic is split into a free fn so the gate's exact shape is unit-testable without spinning up a live authenticator + transport. Workspace 758 → 759 tests; clippy clean.