mxaccess/docs/M6-buffered-evidence.md
Joseph Doherty 9ed4700eb4 docs: audit pass — fix stale F-number references
Walked all 18 docs/*.md for stale followup references and outdated
TODO markers. Two real fixes:

docs/M6-buffered-evidence.md:
- Three references to "F45" for the LMX-proxy Suspend/Activate
  Frida instrumentation were stale. That work was actually filed
  as F46 when the followups list got renumbered (F45 was reassigned
  to "Recovery replay should re-issue RegisterReference for
  buffered subscriptions"). F46 landed in commit 808fea1, and the
  follow-up live capture landed as F50 in commit 349e217.
- Updated all three references to point at F46 + F50 + the
  resolution evidence in docs/F50-suspend-activate-evidence.md.
- Renamed the "Sub-followup filed: F45" section to
  "Sub-followup F46 — RESOLVED 2026-05-06" with the verdict from
  the live capture.

docs/M6-live-verification.md:
- "Open work" section listed F50 as a residual gap. F50 closed
  2026-05-06 per docs/F50-suspend-activate-evidence.md. Updated
  to "None. F49 sweep complete; F50 closed".

Other docs scanned, no real staleness:
- Capture-Run-2026-04-25.md, Current-Sprint-State.md,
  DotNet10-Native-Library-Plan.md — historical snapshot docs,
  intentionally pinned to their dates.
- ASB-Native-Integration-Decision.md, MxNativeSession-API.md,
  NMX-COM-Contracts.md, MXAccess-* — describe the .NET reference's
  state; "not yet" wording reflects the .NET planning context, not
  the Rust port's current state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 04:32:28 -04:00


# M6 buffered evidence — captures `077, 079-082, 094`
**F44 evidence walk** for risks **R2** (buffered single-sample DataChange) and
**R5** (Activate/Suspend trigger conditions).

This document decodes each of the six F44-scope captures under
`captures/`, summarises the LMX call sequence + matching wire bytes, and
records the verdict for R2/R5. Source-of-truth references throughout:
- `src/MxNativeCodec/NmxSubscriptionMessage.cs`: the `0x32`/`0x33` callback
  decoder (`ParseDataUpdate` hard-throws on `recordCount != 1`).
- `src/MxNativeClient/MxNativeCompatibilityServer.cs`: the `Suspend`/`Activate`
  facade behaviour, `AddBufferedItem`, `SetBufferedUpdateInterval`.
- `wwtools/mxaccesscli/docs/api-notes.md:97-100,138-140,154-157`: the
  production CLI documentation that originally framed R2 as "single-sample".
- `analysis/proxy/nmxsvcps-procedures.tsv`: decoded MIDL procedures.

Each capture provides a `harness.log` (high-level `MxNativeSession`-shape call
trace via `MxTraceHarness`) and a `frida-events.tsv` (Frida-instrumented
`LmxProxy.dll` + `Lmx.dll` + `NmxAdptr.dll` hooks). The `frida-events.tsv`
columns include the raw 1st-arg / 2nd-arg pointers and `hex` (the raw bytes at
the dumped address). Wire bytes referenced below are extracted from the `hex`
column with the line number cited per capture.
> **Capture wrapping note.** `CNmxAdapter.ProcessDataReceived` reports a
> `(size, ptr)` tuple to Frida; the hex column is the bytes at `ptr` for
> `size` bytes. Each frame begins with a 4-byte outer length prefix
> (`size_le`), followed by the 46-byte `NmxTransferEnvelope` (version + inner_length
> + reserved + message_kind + galaxy/platform/engine ids + protocol_marker
> `01 02 00 00` + timeout), followed by the inner body. The inner body for
> `0x32` SubscriptionStatus / `0x33` DataUpdate frames is what the
> [`NmxSubscriptionMessage::parse_inner`](../rust/crates/mxaccess-codec/src/subscription_message.rs)
> codec consumes. References to "inner offset N" below mean N bytes from the
> first byte of the inner body (i.e. the `0x32`/`0x33` opcode is at inner
> offset 0).
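
As a rough illustration of the layout described in the note above, the following Python sketch splits one dumped frame into its length prefix, selected envelope fields, and inner body. It is not the Rust codec: `split_frame` and `build_demo_frame` are hypothetical names, the fixed offsets (version at envelope offset 0, `inner_length` at offset 2, protocol marker and timeout in the last 8 bytes) are read off the annotated capture-094 bytes later in this document, and the envelope bytes in between are treated as opaque.

```python
import struct

OUTER_PREFIX_LEN = 4   # little-endian length prefix
ENVELOPE_LEN = 46      # NmxTransferEnvelope size per the note above


def split_frame(frame: bytes) -> dict:
    """Split a dumped frame into prefix, selected envelope fields, and inner body."""
    header_len = OUTER_PREFIX_LEN + ENVELOPE_LEN
    if len(frame) < header_len:
        raise ValueError("frame is shorter than prefix + envelope")
    (outer_len,) = struct.unpack_from("<I", frame, 0)
    envelope = frame[OUTER_PREFIX_LEN:header_len]
    # Assumed offsets: version (u16) at 0, inner_length (u32) at 2.
    version, inner_length = struct.unpack_from("<HI", envelope, 0)
    # Assumed offsets: protocol marker and timeout are the last two u32 fields.
    marker, timeout_ms = struct.unpack_from("<II", envelope, ENVELOPE_LEN - 8)
    inner = frame[header_len:]
    return {
        "outer_len": outer_len,
        "version": version,
        "inner_length": inner_length,
        "protocol_marker": marker,
        "timeout_ms": timeout_ms,
        "inner": inner,
        # True when the dump holds fewer inner bytes than the envelope declares.
        "truncated": len(inner) < inner_length,
    }


def build_demo_frame(inner: bytes) -> bytes:
    """Build a synthetic frame for the demo; the 32 middle bytes are zero filler."""
    envelope = (
        struct.pack("<HI", 1, len(inner))
        + bytes(32)
        + struct.pack("<II", 0x0201, 30000)
    )
    # Assumption for the demo only: the prefix counts the bytes that follow it.
    return struct.pack("<I", len(envelope) + len(inner)) + envelope + inner


demo = split_frame(build_demo_frame(bytes.fromhex("33 01 00 01 00 00 00")))
print(hex(demo["inner"][0]), demo["inner_length"], demo["truncated"])
```

The `truncated` flag is the useful part for trace work: it lets a reader tell a short Frida dump apart from a short wire frame before attempting to decode records.
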
## 077 — Suspend on advised ScanState (R5 evidence)
**Scenario.** `MxTraceHarness --scenario=suspend-advised --tag=TestChildObject.ScanState`
runs `Register → AddItem(TestChildObject.ScanState) → AdviseSupervisory →
Suspend → Unadvise → RemoveItem → Unregister`. The harness logs `Suspend`
returning `MxStatus { Success: -1, Category: MxCategoryPending, Source:
MxSourceRequestingLmx, Detail: 0 }` (`harness.log:9`).

**Frida hook coverage.** This capture's hooks (`frida-events.tsv:2-17`)
instrument `Write.variantA/B`, `WriteSecured.variantA/B`,
`AdviseSupervisory`, plus `Lmx.dll` reference + `NmxAdptr` hooks — but **not**
`Suspend`/`Activate` themselves on `LmxProxy.dll`. Suspend was therefore
exercised, but its parameter shape is invisible to this capture; only the
fact-of-success and the surrounding flow are recorded.

**LMX call sequence (from `harness.log`).**
```
mx.register.begin / .end             # SessionHandle = 1
mx.additem.begin / .end              # ItemHandle = 1
mx.advise-supervisory.begin / .end   # AdviseSupervisory(1, 1, ...) = 0x0
mx.suspend.begin / .end              # status = MxStatus.SuspendPending
                                     #   (Success:-1, MxCategoryPending,
                                     #   MxSourceRequestingLmx, Detail:0)
... 3-second hold ...
mx.unadvise.begin / .end             # Unadvise(1)
mx.removeitem.begin / .end           # RemoveItem(1)
mx.unregister.begin / .end           # Unregister
```
**Wire bytes — register/advise.** `frida-events.tsv:30-44` shows the
`AdviseSupervisory` call (`call.enter ... ecx=0xaff15c args=[0x5e28ff0, 0x1,
0x1, 0x57579f0, 0x74794704]`) returning `0x0`, followed by a paired
`PutRequest` carrying the 17-byte item-control envelope `1f 01 00 [...
op-id 16 ...] 05 00 36 d7 02 00 69 00 0a 00 47 92 00 00 03 00 00 00`. The
returned ProcessDataReceived frame at line 50 carries the inner status
`32 01 00 01 00 00 00 [...] 03 00 00 00 c0 00 ...` (single-record
SubscriptionStatus, recordCount=1).

**R5 verdict / trigger conditions.** `Suspend` was successfully invoked on a
**previously-advised supervisory item** (the harness does
`AdviseSupervisory` immediately before `Suspend`). The compatibility-layer
`Suspend` returns synchronously with `MxStatus.SuspendPending` (per
`src/MxNativeClient/MxNativeCompatibilityServer.cs:554-569`: the .NET
reference accepts the call iff `item.Subscription is not null`, otherwise it
throws `ArgumentException("Suspend requires an advised item handle")`).

**Concrete observed trigger conditions:**
1. The target `ItemHandle` must have an active subscription (i.e. `Advise`
or `AdviseSupervisory` already succeeded). 077 establishes this via
`AdviseSupervisory(itemHandle=1)` 1ms before the `Suspend` call.
2. The session must be alive and the item present — a stale handle is
rejected at the compatibility-server layer (`GetItemLocked` throws on
missing items).
3. The .NET reference does **not** issue any `Suspend`-specific wire
message. The status is synthesised client-side
(`MxNativeCompatibilityServer.cs:568`: `status = MxStatus.SuspendPending`)
and the underlying NMX subscription continues to deliver callbacks.
Consequently no `0x32`/`0x33` frame in 077's TCP capture corresponds to
the suspend; the capture has nothing to falsify.
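
The client-side gating in conditions 1 to 3 can be sketched as follows. This is an illustrative Python model of the .NET reference behaviour only, not the reference code itself: `CompatServerSketch`, `Item`, and the method names are hypothetical, and the exception types are stand-ins (the source states only that `GetItemLocked` throws on missing items and that an unadvised item raises `ArgumentException`).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MxStatus:
    success: int
    category: str
    source: str
    detail: int


# Status logged at harness.log:9 for the Suspend call in capture 077.
SUSPEND_PENDING = MxStatus(-1, "MxCategoryPending", "MxSourceRequestingLmx", 0)


@dataclass
class Item:
    tag: str
    subscription: Optional[object] = None  # set once an advise call succeeds


class CompatServerSketch:
    """Hypothetical model of the client-side Suspend gating."""

    def __init__(self) -> None:
        self._items: Dict[int, Item] = {}

    def add_item(self, handle: int, tag: str) -> None:
        self._items[handle] = Item(tag)

    def advise_supervisory(self, handle: int) -> None:
        self._get_item(handle).subscription = object()

    def _get_item(self, handle: int) -> Item:
        # Condition 2: a stale or unknown handle is rejected.
        if handle not in self._items:
            raise KeyError(f"unknown item handle {handle}")
        return self._items[handle]

    def suspend(self, handle: int) -> MxStatus:
        item = self._get_item(handle)
        # Condition 1: the item must already be advised.
        if item.subscription is None:
            raise ValueError("Suspend requires an advised item handle")
        # Condition 3: no wire message; the status is synthesised locally.
        return SUSPEND_PENDING


server = CompatServerSketch()
server.add_item(1, "TestChildObject.ScanState")
server.advise_supervisory(1)
print(server.suspend(1))
```
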
**R5 boundary** (was unproven at the time of this evidence walk; see "Sub-followup F46 — RESOLVED" below). Whether the production `LmxProxy` stack issues a separate ORPC method for `Suspend` (e.g. an `ILMXProxyServer5` opnum) or also synthesises it client-side could not be answered from 077 because the Frida script did not hook `LmxProxy.dll!CLMXProxyServer.Suspend`. The follow-up Frida hook (F46) and live capture (F50) both landed 2026-05-06 and settled R5 as "Suspend is server-side NMX opcode `0x2D`; Activate is client-side only".
## 079 — Buffered + supervisory advise
**Scenario.** `MxTraceHarness --scenario=add-buffered-advise --tag=TestInt
--context=TestChildObject --buffered-update-interval=1000 --duration=5`. The
harness sequence is `Register → SetBufferedUpdateInterval(1000) →
AddBufferedItem(TestInt, TestChildObject) → AdviseSupervisory → ... 5s ...
→ Unadvise → RemoveItem → Unregister`.

**Wire activity.** Only the static metadata fetch
(`DevPlatform.GR.TimeOfLastDeploy` / `TimeOfLastConfigChange`) and the
supervisory advise reply (`32 01 00 01 00 00 00 ...`,
`frida-events.tsv:40-42`) appear in the trace. **No `0x33` DataUpdate frame
fires** during the 5-second hold — the buffered tag did not change value, so
no buffered emission was triggered. The `frida-events.tsv` ends at the
supervisory-advise reply; the cleanup messages are not visible.

**R2 verdict.** No multi-sample evidence in this capture. Consistent with
single-sample interpretation (no buffered DataUpdate was emitted, so we have
no contradicting bytes). **Inconclusive in isolation but consistent with
single-sample.**
## 080 — Buffered + external write
**Scenario.** Same buffered-advise setup as 079, plus an in-process
"writer" sub-flow that calls `AddItem2 → AdviseSupervisory → Write` against
the same tag while the buffered subscription is live. Two values are written
sequentially (126, 127) at 1.8s spacing.

**Wire activity.** Each external write produces a complete sequence:
`AddItem2` envelope (`10 01 00 ...`), supervisory advise reply, write
envelope (`37 01 00 ...` for Write.variantA), and a corresponding `0x33`
DataUpdate notifying the buffered subscription of the new value. Specifically
`frida-events.tsv:40` carries `0x32` SubscriptionStatus after the buffered
AdviseSupervisory; subsequent ProcessDataReceived frames after each write
deliver `0x33` DataUpdate with `record_count = 1` (Int32 wire kind, value
matching the 4-byte `89 00 00 00`-style payload in the writer's TransferData
body).

**R2 verdict.** All three observed `0x33` DataUpdate frames in 080 carry
`record_count = 1` (`grep -c "33 01 00 01"` returns 1, plus there are no
`33 01 00 02+` matches). Consistent with single-sample. **Verdict:
single-sample (consistent with R2 framing).**
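
A plain `grep` for `33 01 00 01` can match those bytes anywhere in a hex dump, including inside the envelope or a value payload. A slightly stricter check reads `record_count` at its fixed position instead. The sketch below is a hypothetical Python helper, not part of the repository tooling; it assumes the caller has already extracted the `hex` column values as strings and that each dump starts with the 4-byte prefix and 46-byte envelope described in the capture wrapping note.

```python
INNER_OFFSET = 4 + 46   # outer length prefix + NmxTransferEnvelope


def data_update_record_count(hex_dump: str):
    """Return record_count for a 0x33 DataUpdate frame, or None for other frames."""
    frame = bytes.fromhex(hex_dump)
    inner = frame[INNER_OFFSET:]
    # opcode (1 byte) + version (2 bytes) + record_count (4 bytes, little-endian)
    if len(inner) < 7 or inner[0] != 0x33:
        return None
    return int.from_bytes(inner[3:7], "little", signed=True)


def multi_record_updates(hex_dumps):
    """List (index, record_count) for every DataUpdate whose count is not 1."""
    hits = []
    for index, hex_dump in enumerate(hex_dumps):
        count = data_update_record_count(hex_dump)
        if count is not None and count != 1:
            hits.append((index, count))
    return hits


# Synthetic dumps: 50 zero bytes stand in for the prefix and envelope.
header = "00" * INNER_OFFSET
dumps = [
    header + "32 01 00 02 00 00 00",   # SubscriptionStatus, ignored
    header + "33 01 00 01 00 00 00",   # single-record DataUpdate
    header + "33 01 00 02 00 00 00",   # multi-record DataUpdate
]
print(multi_record_updates(dumps))   # prints [(2, 2)]
```

An empty result over a capture's dumps is the structural equivalent of "no `33 01 00 02+` matches", without the risk of false positives from bytes elsewhere in the frame.
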
## 081 — Plain write to advised tag (post-buffered baseline)
**Scenario.** Plain `--scenario=write` exercising
`Register → AddItem(TestChildObject.TestInt) → AdviseSupervisory → Write(132)
→ Unadvise → RemoveItem → Unregister`. No buffered surface. Included as
F44's "plain-write reference baseline" against which the buffered captures
should be compared.

**Wire activity.** `frida-events.tsv:73` carries the post-write
`0x33` DataUpdate with `record_count = 1`, value bytes `84 00 00 00`
(132). One `32 01 00 02 00 00 00` SubscriptionStatus appears (the
AdviseSupervisory reply in two records — one ack record, one initial-value
record). One `33 01 00 01 00 00 00` DataUpdate fires after the write. No
multi-sample DataUpdate.

**R2 verdict.** Plain (non-buffered) advise produces single-sample DataUpdate.
Consistent with the documented LMX shape. **Verdict: single-sample.**
## 082 — Buffered + plain (non-supervisory) advise
**Scenario.** Identical to 079 except using `Advise` (non-supervisory)
instead of `AdviseSupervisory`. 8-second hold, no external write.

**Wire activity.** Symmetrical to 079: the static metadata fetch and a
single `32 01 00 02 00 00 00` SubscriptionStatus (the advise reply with
two record entries — first the establish-ack, second the initial value).
No `0x33` DataUpdate fires (no value change during the hold).

**R2 verdict.** Inconclusive in isolation; consistent with single-sample.
The `record_count = 2` in the `0x32` SubscriptionStatus is **not** R2
evidence — `0x32` always supports multi-record per `NmxSubscriptionMessage.cs:101`,
and the codec already loops over `recordCount`. R2 is specifically about
`0x33` DataUpdate.
## 094 — Buffered + separate-session writer **(R2 contradiction)**
**Scenario.** Like 080 but the "writer" runs in a **separate** registered
session (`Register/AddItem/AdviseSupervisory/Write/Unadvise/Unregister`)
while the original session holds the buffered subscription. Two values are
written (136, 137) at 3s spacing.

**Wire activity.** The high-water-mark of activity in this capture is the
post-write `0x33` DataUpdate at `frida-events.tsv:145` (`2026-04-25T21:40:34.222Z`,
~120ms after `Write.variantA` of value 137 from the second writer session).
The full hex (107 bytes) breaks down as:
```
6b 00 00 00                                      # outer length prefix = 107
01 00 3d 00 00 00 00 00 00 00 b6 89 05 00        # transfer envelope: version=1,
01 00 00 00 01 00 00 00 02 00 00 00              #   inner_length=0x3d=61,
01 00 00 00 01 00 00 00 fb 7f 00 00              #   reserved+kind+ids+
01 02 00 00 30 75 00 00                          #   protocol_marker=0x0201,
                                                 #   timeout=30000ms
33 01 00                                         # opcode=0x33 DataUpdate, version=1
02 00 00 00                                      # record_count = 2  ← contradicts R2
93 8a 8d 18 49 1d 13 47 86 c1 e2 1d 4f d7 ca 8d  # operation_id GUID
03 00 00 00                                      # record 1: status = 3
c0 00                                            #   quality = 0xC0 (Good)
90 11 9d 25 fc d4 dc 01                          #   filetime = 0x01dcd4fc259d1190
02                                               #   wire_kind = 0x02 (Int32)
89 00 00 00                                      #   value = 137 (= 0x89)
04 00 00 00                                      # record 2: status = 4
c0 00                                            #   quality = 0xC0
90 11 9d 25 fc d4 dc 01                          #   filetime (same as rec 1)
02                                               #   wire_kind = 0x02 (Int32)
                                                 #   value: TRUNCATED (see note)
```
The arithmetic ties out: `inner_length = 23 (preamble) + 19 (record 1) + 19
(record 2) = 61` matches the envelope's `inner_length` field exactly. The
trace reported `candidate_size = 107` but the envelope demands 111 bytes
total — Frida dumped 4 bytes shy of the actual buffer, so record 2's 4-byte
Int32 value did not make it into the TSV. The envelope's `inner_length` is
the source of truth for the structural verdict; the missing value bytes are a
trace artefact, not a wire artefact.
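
The length arithmetic can be re-checked mechanically. The Python sketch below re-types the 107 dumped bytes from the breakdown above and recomputes the sizes; the variable names are illustrative, and the offsets are the ones annotated in the breakdown.

```python
FRAME_094 = bytes.fromhex(
    "6b 00 00 00 "
    "01 00 3d 00 00 00 00 00 00 00 b6 89 05 00 "
    "01 00 00 00 01 00 00 00 02 00 00 00 "
    "01 00 00 00 01 00 00 00 fb 7f 00 00 "
    "01 02 00 00 30 75 00 00 "
    "33 01 00 02 00 00 00 "
    "93 8a 8d 18 49 1d 13 47 86 c1 e2 1d 4f d7 ca 8d "
    "03 00 00 00 c0 00 90 11 9d 25 fc d4 dc 01 02 89 00 00 00 "
    "04 00 00 00 c0 00 90 11 9d 25 fc d4 dc 01 02"
)

PREFIX_LEN, ENVELOPE_LEN = 4, 46
PREAMBLE_LEN, INT32_RECORD_LEN = 23, 19

outer_len = int.from_bytes(FRAME_094[0:4], "little")
# inner_length sits at envelope offset 2, i.e. frame offset 6.
inner_length = int.from_bytes(FRAME_094[6:10], "little")
dumped_inner = len(FRAME_094) - PREFIX_LEN - ENVELOPE_LEN
expected_total = PREFIX_LEN + ENVELOPE_LEN + inner_length
missing = expected_total - len(FRAME_094)

print("dumped bytes:        ", len(FRAME_094))
print("outer length prefix: ", outer_len)
print("declared inner bytes:", inner_length)
print("dumped inner bytes:  ", dumped_inner)
print("expected total bytes:", expected_total)
print("missing bytes:       ", missing)
```

One observation from these numbers: `46 + 61 = 107` equals the outer prefix, which is consistent with the prefix counting only the bytes that follow it while the dump started at the prefix itself. That reading is an inference from the arithmetic, not something the capture states.
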
**R2 verdict — CONTRADICTED.** A `0x33` DataUpdate body with
`record_count = 2` was observed in production-stack tracing, against a
buffered subscription (`AddBufferedItem` + `SetBufferedUpdateInterval(1000)`)
when an out-of-band writer triggered a value change. The .NET reference's
`NmxSubscriptionMessage.ParseDataUpdate` would hard-throw
`ArgumentException("...currently supports one record per body")` here
(`src/MxNativeCodec/NmxSubscriptionMessage.cs:71-74`).

R2's previous "single-sample-per-event" framing (derived from the production
CLI docs in `wwtools/mxaccesscli/docs/api-notes.md:138-140`) held for the
typical case where a single supervisory advise drives a single buffered
flush. **It does not hold when two write events accumulate within one
buffered window.** In 094, the buffered subscription's 1000ms tick collated
two distinct writes (status field carries sequence numbers 3 and 4), and
NMX delivered both in one `0x33` body.

The wwtools api-notes were not wrong about the **shape** of
`OnBufferedDataChange` — that event still carries one value per fired event.
The misalignment is upstream of the public event: the wire-level `0x33` body
can carry multiple records, which the .NET reference's hard-throw masked.
## Codec change shipped with F44
Per F44 DoD step 2 ("if a multi-sample body is observed, surface a typed
`DataChangeBatch` decode path"):
- [`subscription_message::parse_data_update`](../rust/crates/mxaccess-codec/src/subscription_message.rs)
was relaxed to loop over `record_count` (mirroring
`parse_subscription_status`). The pre-existing `records: Vec<NmxSubscriptionRecord>`
field on `NmxSubscriptionMessage` already accommodated multi-record
bodies; only the entrypoint hard-error needed to be retired. `record_count
<= 0` is still rejected explicitly.
- The .NET reference is **not** being changed here (it remains the
executable spec; the divergence is documented inline). Per
`design/70-risks-and-open-questions.md` R13, the soft-error path the Rust
port previously took for multi-record DataUpdate is no longer needed —
the codec now accepts the case directly.
- Two new tests cover the paths:
  - `data_update_multi_record_round_trip`: synthesised two-record body
    based on capture 094's per-record fields, asserts both records decode
    cleanly with their respective values.
  - `data_update_capture_094_truncated_record_errors`: feeds the
    verbatim-from-trace 57-byte inner body and asserts record 2's
    truncated value surfaces as `value = None` (codec preserves "unknown"
    bytes rather than fabricating).
- Fixtures under
[`crates/mxaccess-codec/tests/fixtures/m6-buffered/`](../rust/crates/mxaccess-codec/tests/fixtures/m6-buffered/)
carry the verbatim inner-body bytes of capture 094 lines 48 and 145 for
reproducibility.
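
To make the relaxed decode path concrete, here is a minimal Python sketch of a loop over `record_count`, applied to the 57-byte inner body of capture 094 as broken down earlier. It is not the Rust `parse_data_update` implementation: the function name and dictionary keys are hypothetical, only the Int32 wire kind (`0x02`) is handled, and the record layout (status, quality, FILETIME, wire kind, value) is taken from the annotated bytes in this document.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
PREAMBLE_LEN = 23      # opcode(1) + version(2) + record_count(4) + operation_id(16)
RECORD_HEAD_LEN = 15   # status(4) + quality(2) + filetime(8) + wire_kind(1)
WIRE_KIND_INT32 = 0x02

INNER_094 = bytes.fromhex(
    "33 01 00 02 00 00 00 "
    "93 8a 8d 18 49 1d 13 47 86 c1 e2 1d 4f d7 ca 8d "
    "03 00 00 00 c0 00 90 11 9d 25 fc d4 dc 01 02 89 00 00 00 "
    "04 00 00 00 c0 00 90 11 9d 25 fc d4 dc 01 02"
)


def parse_data_update(inner: bytes) -> list:
    """Decode every record of a 0x33 body; a truncated value becomes None."""
    if len(inner) < PREAMBLE_LEN or inner[0] != 0x33:
        raise ValueError("not a 0x33 DataUpdate body")
    (record_count,) = struct.unpack_from("<i", inner, 3)
    if record_count <= 0:
        raise ValueError("record_count must be positive")
    records = []
    offset = PREAMBLE_LEN
    for _ in range(record_count):
        if len(inner) - offset < RECORD_HEAD_LEN:
            raise ValueError("truncated record header")
        status, quality, filetime, wire_kind = struct.unpack_from("<IHQB", inner, offset)
        offset += RECORD_HEAD_LEN
        if wire_kind != WIRE_KIND_INT32:
            raise ValueError(f"wire kind {wire_kind:#04x} not handled in this sketch")
        value = None
        if len(inner) - offset >= 4:
            (value,) = struct.unpack_from("<i", inner, offset)
            offset += 4
        records.append({
            "status": status,
            "quality": quality,
            # FILETIME counts 100 ns ticks since 1601-01-01 UTC.
            "timestamp": FILETIME_EPOCH + timedelta(microseconds=filetime // 10),
            "value": value,
        })
    return records


for record in parse_data_update(INNER_094):
    print(record["status"], hex(record["quality"]),
          record["timestamp"].isoformat(), record["value"])
```

Both records carry the FILETIME `0x01dcd4fc259d1190`, which decodes to 2026-04-25T21:40:34.217Z, a few milliseconds before the trace timestamp recorded at `frida-events.tsv:145`. Record 2 decodes with `value = None` because the dump ends before its 4 value bytes.
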
## Sub-followup F46 — RESOLVED 2026-05-06
A residual gap remained at the LMX-proxy boundary: capture 077 did not instrument `LmxProxy.dll!CLMXProxyServer.Suspend` / `.Activate`, so it could not say whether the production stack issued a dedicated ORPC opnum for these operations or also synthesised them client-side.

This was filed as **F46** in `design/followups.md` (earlier drafts of this doc used "F45", a number that was reassigned to a different concern, recovery replay for buffered subscriptions, when the followups list was renumbered). F46 landed in commit `808fea1` (Frida hooks added to `analysis/frida/mx-nmx-trace.js`) and the follow-up live capture landed as F50 in commit `349e217`. Verdict, per `docs/F50-suspend-activate-evidence.md`:
- **Suspend** is server-side: emits NMX `PutRequest` with command `0x2D` ~140 ms after the LMX-proxy entry, body shape `2d 01 00 + correlation_id + 22 bytes` (same family as `0x1F` AdviseSupervisory).
- **Activate** against a non-suspended item is client-side only: no wire traffic, returns Success synchronously.

R5 in `design/70-risks-and-open-questions.md` is now settled. The R5 trigger conditions documented above (subscription must exist) are still accurate for the client-side gating; the wire-side opnum + body shape is the new evidence F50 added.
## Consolidated R2 / R5 status
- **R2 verdict — CONTRADICTED then re-settled by codec change.** Capture 094
produced a `0x33` DataUpdate with `record_count = 2`; the codec now
decodes multi-record bodies (see *Codec change shipped with F44* above).
Future regressions are guarded by the new round-trip tests. Status moves
from "P3 likely-not-a-real-risk" to "settled per option (b) with codec
change landed under F44".
- **R5 trigger conditions — observed + wire shape settled.** From capture 077: `Suspend` succeeds (returning `MxStatus.SuspendPending`) when invoked on an item handle whose subscription is alive (i.e. immediately following a successful `Advise`/`AdviseSupervisory`). The compatibility server synthesises the status client-side; no dedicated wire frame is observed in the F44 captures. The remaining unknown — does `LmxProxy.dll` itself issue a Suspend/Activate ORPC method? — was answered by F46 (Frida hooks landed 2026-05-06) + F50 (live capture under `captures/123-frida-suspend-advised-instrumented/` and `captures/124-frida-activate-advised-instrumented/`). Verdict: **Suspend** wires NMX opcode `0x2D` (server-side); **Activate** against a non-suspended item is client-side only. R5 closed.