mxaccess/docs/M6-buffered-evidence.md
Joseph Doherty 9ed4700eb4 docs: audit pass — fix stale F-number references
Walked all 18 docs/*.md for stale followup references and outdated
TODO markers. Two real fixes:

docs/M6-buffered-evidence.md:
- Three references to "F45" for the LMX-proxy Suspend/Activate
  Frida instrumentation were stale. That work was actually filed
  as F46 when the followups list got renumbered (F45 was reassigned
  to "Recovery replay should re-issue RegisterReference for
  buffered subscriptions"). F46 landed in commit 808fea1, and the
  follow-up live capture landed as F50 in commit 349e217.
- Updated all three references to point at F46 + F50 + the
  resolution evidence in docs/F50-suspend-activate-evidence.md.
- Renamed the "Sub-followup filed: F45" section to
  "Sub-followup F46 — RESOLVED 2026-05-06" with the verdict from
  the live capture.

docs/M6-live-verification.md:
- "Open work" section listed F50 as a residual gap. F50 closed
  2026-05-06 per docs/F50-suspend-activate-evidence.md. Updated
  to "None. F49 sweep complete; F50 closed".

Other docs scanned, no real staleness:
- Capture-Run-2026-04-25.md, Current-Sprint-State.md,
  DotNet10-Native-Library-Plan.md — historical snapshot docs,
  intentionally pinned to their dates.
- ASB-Native-Integration-Decision.md, MxNativeSession-API.md,
  NMX-COM-Contracts.md, MXAccess-* — describe the .NET reference's
  state; "not yet" wording reflects the .NET planning context, not
  the Rust port's current state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 04:32:28 -04:00


M6 buffered evidence — captures 077, 079-082, 094

F44 evidence walk for risks R2 (buffered single-sample DataChange) and R5 (Activate/Suspend trigger conditions).

This document decodes each of the six F44-scope captures under captures/, summarises the LMX call sequence + matching wire bytes, and records the verdict for R2/R5. Source-of-truth references throughout:

  • src/MxNativeCodec/NmxSubscriptionMessage.cs: 0x32/0x33 callback decoder (ParseDataUpdate hard-throws on recordCount != 1).
  • src/MxNativeClient/MxNativeCompatibilityServer.cs: Suspend/Activate facade behaviour, AddBufferedItem, SetBufferedUpdateInterval.
  • wwtools/mxaccesscli/docs/api-notes.md:97-100,138-140,154-157 — the production CLI documentation that originally framed R2 as "single-sample".
  • analysis/proxy/nmxsvcps-procedures.tsv — decoded MIDL procedures.

Each capture provides a harness.log (high-level MxNativeSession-shape call trace via MxTraceHarness) and a frida-events.tsv (Frida-instrumented LmxProxy.dll + Lmx.dll + NmxAdptr.dll hooks). The frida-events.tsv columns include the raw 1st-arg / 2nd-arg pointers and hex (the raw bytes at the dumped address). Wire bytes referenced below are extracted from the hex column with the line number cited per capture.

Capture wrapping note. CNmxAdapter.ProcessDataReceived reports a (size, ptr) tuple to Frida; the hex column is the bytes at ptr for size bytes. Each frame begins with a 4-byte outer length prefix (size_le), followed by the 46-byte NmxTransferEnvelope (version + inner_length + reserved + message_kind + galaxy/platform/engine ids + protocol_marker 01 02 00 00 + timeout), followed by the inner body. The inner body for 0x32 SubscriptionStatus / 0x33 DataUpdate frames is what the NmxSubscriptionMessage::parse_inner codec consumes. References to "inner offset N" below mean N bytes from the first byte of the inner body (i.e. the 0x32/0x33 opcode is at inner offset 0).
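The wrapping described in the note can be sketched as a small splitter. This is a minimal illustration, not code from the repository: `FrameView` and `split_frame` are hypothetical names, and the only layout facts it relies on are the 4-byte little-endian prefix and the 46-byte envelope stated above.

```rust
/// Sizes taken from the capture wrapping note; everything else here is illustrative.
const OUTER_PREFIX_LEN: usize = 4;
const ENVELOPE_LEN: usize = 46;

/// Borrowed view of one dumped frame (hypothetical type, not part of the codec crate).
#[derive(Debug, PartialEq)]
pub struct FrameView<'a> {
    /// Value of the 4-byte little-endian outer length prefix.
    pub outer_length: u32,
    /// The 46 envelope bytes, left undecoded.
    pub envelope: &'a [u8],
    /// Everything after the envelope; "inner offset 0" is inner[0].
    pub inner: &'a [u8],
}

/// Splits the bytes of one hex-column dump into prefix, envelope, and inner body.
/// Returns None when the dump is too short to hold the prefix plus the envelope.
pub fn split_frame(dump: &[u8]) -> Option<FrameView<'_>> {
    if dump.len() < OUTER_PREFIX_LEN + ENVELOPE_LEN {
        return None;
    }
    let outer_length = u32::from_le_bytes([dump[0], dump[1], dump[2], dump[3]]);
    let (envelope, inner) = dump[OUTER_PREFIX_LEN..].split_at(ENVELOPE_LEN);
    Some(FrameView { outer_length, envelope, inner })
}
```

The inner slice is deliberately not length-checked against the envelope, since a Frida dump may be shorter than the real buffer.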

077 — Suspend on advised ScanState (R5 evidence)

Scenario. MxTraceHarness --scenario=suspend-advised --tag=TestChildObject.ScanState runs Register → AddItem(TestChildObject.ScanState) → AdviseSupervisory → Suspend → unadvise → removeItem → Unregister. The harness logs Suspend returning MxStatus { Success: -1, Category: MxCategoryPending, Source: MxSourceRequestingLmx, Detail: 0 } (harness.log:9).

Frida hook coverage. This capture's hooks (frida-events.tsv:2-17) instrument Write.variantA/B, WriteSecured.variantA/B, AdviseSupervisory, plus Lmx.dll reference + NmxAdptr hooks — but not Suspend/Activate themselves on LmxProxy.dll. Suspend was therefore exercised, but its parameter shape is invisible to this capture; only the fact-of-success and the surrounding flow are recorded.

LMX call sequence (from harness.log).

mx.register.begin / .end                        # SessionHandle = 1
mx.additem.begin / .end                         # ItemHandle = 1
mx.advise-supervisory.begin / .end              # AdviseSupervisory(1, 1, ...) = 0x0
mx.suspend.begin / .end                         # status = MxStatus.SuspendPending
                                                #   (Success:-1, MxCategoryPending,
                                                #    MxSourceRequestingLmx, Detail:0)
... 3-second hold ...
mx.unadvise.begin / .end                        # Unadvise(1)
mx.removeitem.begin / .end                      # RemoveItem(1)
mx.unregister.begin / .end                      # Unregister

Wire bytes — register/advise. frida-events.tsv:30-44 shows the AdviseSupervisory call (call.enter ... ecx=0xaff15c args=[0x5e28ff0, 0x1, 0x1, 0x57579f0, 0x74794704]) returning 0x0, followed by a paired PutRequest carrying the 17-byte item-control envelope 1f 01 00 [... op-id 16 ...] 05 00 36 d7 02 00 69 00 0a 00 47 92 00 00 03 00 00 00. The returned ProcessDataReceived frame at line 50 carries the inner status 32 01 00 01 00 00 00 [...] 03 00 00 00 c0 00 ... (single-record SubscriptionStatus, recordCount=1).

R5 verdict / trigger conditions. Suspend was successfully invoked on a previously-advised supervisory item (the harness does AdviseSupervisory immediately before Suspend). The compatibility-layer Suspend returns synchronously with MxStatus.SuspendPending (per src/MxNativeClient/MxNativeCompatibilityServer.cs:554-569: the .NET reference accepts the call iff item.Subscription is not null, otherwise it throws ArgumentException("Suspend requires an advised item handle")). Concrete observed trigger conditions:

  1. The target ItemHandle must have an active subscription (i.e. Advise or AdviseSupervisory already succeeded). 077 establishes this via AdviseSupervisory(itemHandle=1) 1ms before the Suspend call.
  2. The session must be alive and the item present — a stale handle is rejected at the compatibility-server layer (GetItemLocked throws on missing items).
  3. The .NET reference does not issue any Suspend-specific wire message. The status is synthesised client-side (MxNativeCompatibilityServer.cs:568: status = MxStatus.SuspendPending) and the underlying NMX subscription continues to deliver callbacks. Consequently no 0x32/0x33 frame in 077's TCP capture corresponds to the suspend; the capture has nothing to falsify.
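The client-side gating in conditions 1 to 3 can be modelled in a few lines. The sketch below is an assumption-laden illustration in Rust, not the .NET reference or the port's actual API: `Session`, `SuspendOutcome`, and `SuspendError` are hypothetical names, and the per-item `bool` stands in for "item.Subscription is not null".

```rust
use std::collections::HashMap;

/// Stand-in for the synthesised MxStatus.SuspendPending result (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SuspendOutcome {
    SuspendPending,
}

/// Rejections corresponding to conditions 1 and 2 (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum SuspendError {
    /// The handle is not present in the session (stale or never added).
    UnknownItem(i32),
    /// The item exists but has no active subscription.
    NotAdvised(i32),
}

/// Toy session: item handle -> "an advise has succeeded for this item".
#[derive(Default)]
pub struct Session {
    items: HashMap<i32, bool>,
}

impl Session {
    pub fn add_item(&mut self, handle: i32) {
        self.items.insert(handle, false);
    }

    pub fn advise(&mut self, handle: i32) -> Result<(), SuspendError> {
        match self.items.get_mut(&handle) {
            Some(advised) => {
                *advised = true;
                Ok(())
            }
            None => Err(SuspendError::UnknownItem(handle)),
        }
    }

    /// Purely client-side check: no wire message is modelled here.
    pub fn suspend(&self, handle: i32) -> Result<SuspendOutcome, SuspendError> {
        match self.items.get(&handle).copied() {
            None => Err(SuspendError::UnknownItem(handle)),
            Some(false) => Err(SuspendError::NotAdvised(handle)),
            Some(true) => Ok(SuspendOutcome::SuspendPending),
        }
    }
}
```

Calling `suspend` after `add_item` alone is rejected, and after `advise` it returns the pending status, which matches the order of calls capture 077 exercises.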

R5 boundary (was unproven at the time of this evidence walk; see "Sub-followup F46 — RESOLVED" below). Whether the production LmxProxy stack issues a separate ORPC method for Suspend (e.g. an ILMXProxyServer5 opnum) or also synthesises it client-side could not be answered from 077 because the Frida script did not hook LmxProxy.dll!CLMXProxyServer.Suspend. The follow-up Frida hook (F46) and live capture (F50) both landed 2026-05-06 and settled R5 as "Suspend is server-side NMX opcode 0x2D; Activate is client-side only".

079 — Buffered + supervisory advise

Scenario. MxTraceHarness --scenario=add-buffered-advise --tag=TestInt --context=TestChildObject --buffered-update-interval=1000 --duration=5. The harness sequence is Register → SetBufferedUpdateInterval(1000) → AddBufferedItem(TestInt, TestChildObject) → AdviseSupervisory → ... 5s ... → Unadvise → RemoveItem → Unregister.

Wire activity. Only the static metadata fetch (DevPlatform.GR.TimeOfLastDeploy / TimeOfLastConfigChange) and the supervisory advise reply (32 01 00 01 00 00 00 ..., frida-events.tsv:40-42) appear in the trace. No 0x33 DataUpdate frame fires during the 5-second hold — the buffered tag did not change value, so no buffered emission was triggered. The frida-events.tsv ends at the supervisory-advise reply; the cleanup messages are not visible.

R2 verdict. Inconclusive in isolation, but consistent with single-sample: no buffered DataUpdate was emitted during the hold, so the capture holds neither multi-sample evidence nor contradicting bytes.

080 — Buffered + external write

Scenario. Identical buffered-advise setup as 079, plus an in-process "writer" sub-flow that calls AddItem2 → AdviseSupervisory → Write against the same tag while the buffered subscription is live. Two values are written sequentially (126, 127) at 1.8s spacing.

Wire activity. Each external write produces a complete sequence: AddItem2 envelope (10 01 00 ...), supervisory advise reply, write envelope (37 01 00 ... for Write.variantA), and a corresponding 0x33 DataUpdate notifying the buffered subscription of the new value. Specifically frida-events.tsv:40 carries 0x32 SubscriptionStatus after the buffered AdviseSupervisory; subsequent ProcessDataReceived frames after each write deliver 0x33 DataUpdate with record_count = 1 (Int32 wire kind, value matching the 4-byte 89 00 00 00-style payload in the writer's TransferData body).

R2 verdict. All three observed 0x33 DataUpdate frames in 080 carry record_count = 1 (grep -c "33 01 00 01" returns 1, and there are no 33 01 00 02+ matches). Verdict: single-sample, consistent with the R2 framing.
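The same check can be made on decoded bytes rather than on hex text. The helper below is a hedged sketch: the function names are hypothetical, and the offsets (opcode at inner offset 0, a 2-byte version, then a 4-byte little-endian record_count at inner offset 3) are read off the byte patterns quoted in this document rather than taken from the codec source.

```rust
/// Reads record_count from a 0x33 DataUpdate inner body.
/// Returns None if the body is too short or does not start with opcode 0x33.
pub fn data_update_record_count(inner: &[u8]) -> Option<i32> {
    if inner.len() < 7 || inner[0] != 0x33 {
        return None;
    }
    Some(i32::from_le_bytes([inner[3], inner[4], inner[5], inner[6]]))
}

/// True only for a 0x33 body that announces more than one record,
/// which is the case that would contradict the single-sample framing of R2.
pub fn is_multi_record(inner: &[u8]) -> bool {
    matches!(data_update_record_count(inner), Some(n) if n > 1)
}
```

Unlike a text grep, this ignores 0x32 SubscriptionStatus bodies, whose record_count is allowed to exceed 1.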

081 — Plain write to advised tag (post-buffered baseline)

Scenario. Plain --scenario=write exercising Register → AddItem(TestChildObject.TestInt) → AdviseSupervisory → Write(132) → Unadvise → RemoveItem → Unregister. No buffered surface. Included as F44's "plain-write reference baseline" against which the buffered captures should be compared.

Wire activity. frida-events.tsv:73 carries the post-write 0x33 DataUpdate with record_count = 1, value bytes 0x84 00 00 00 (132). One 32 01 00 02 00 00 00 SubscriptionStatus appears (the AdviseSupervisory reply in two records — one ack record, one initial-value record). One 33 01 00 01 00 00 00 DataUpdate fires after the write. No multi-sample DataUpdate.

R2 verdict. Plain (non-buffered) advise produces single-sample DataUpdate. Consistent with the documented LMX shape. Verdict: single-sample.

082 — Buffered + plain (non-supervisory) advise

Scenario. Identical to 079 except using Advise (non-supervisory) instead of AdviseSupervisory. 8-second hold, no external write.

Wire activity. Symmetrical to 079: the static metadata fetch and a single 32 01 00 02 00 00 00 SubscriptionStatus (the advise reply with two record entries: first the establish-ack, second the initial value). No 0x33 DataUpdate fires (no value change during the hold).

R2 verdict. Inconclusive in isolation; consistent with single-sample. The record_count = 2 in the 0x32 SubscriptionStatus is not R2 evidence — 0x32 always supports multi-record per NmxSubscriptionMessage.cs:101, and the codec already loops over recordCount. R2 is specifically about 0x33 DataUpdate.

094 — Buffered + separate-session writer (R2 contradiction)

Scenario. Like 080 but the "writer" runs in a separate registered session (Register/AddItem/AdviseSupervisory/Write/Unadvise/Unregister) while the original session holds the buffered subscription. Two values are written (136, 137) at 3s spacing.

Wire activity. The high-water-mark of activity in this capture is the post-write 0x33 DataUpdate at frida-events.tsv:145 (2026-04-25T21:40:34.222Z, ~120ms after Write.variantA of value 137 from the second writer session).

The full hex (107 bytes) breaks down as:

6b 00 00 00                                     # outer length prefix = 107
01 00 3d 00 00 00 00 00 00 00 b6 89 05 00       # transfer envelope: version=1,
  01 00 00 00 01 00 00 00 02 00 00 00           #   inner_length=0x3d=61,
  01 00 00 00 01 00 00 00 fb 7f 00 00           #   reserved+kind+ids+
  01 02 00 00 30 75 00 00                       #   protocol_marker=0x0201,
                                                #   timeout=30000ms
33 01 00                                        # opcode=0x33 DataUpdate, version=1
02 00 00 00                                     # record_count = 2  ← contradicts R2
93 8a 8d 18 49 1d 13 47 86 c1 e2 1d 4f d7 ca 8d # operation_id GUID

03 00 00 00                                     # record 1: status = 3
c0 00                                           #   quality = 0xC0 (Good)
90 11 9d 25 fc d4 dc 01                         #   filetime = 0x01dcd4fc259d1190
02                                              #   wire_kind = 0x02 (Int32)
89 00 00 00                                     #   value = 137 (= 0x89)

04 00 00 00                                     # record 2: status = 4
c0 00                                           #   quality = 0xC0
90 11 9d 25 fc d4 dc 01                         #   filetime (same as rec 1)
02                                              #   wire_kind = 0x02 (Int32)
                                                #   value: TRUNCATED — see note

The arithmetic ties out: 23 (preamble) + 19 (record 1) + 19 (record 2) = 61, which matches the envelope's inner_length field exactly. The trace reported candidate_size = 107, but the envelope demands 111 bytes total (4-byte prefix + 46-byte envelope + 61-byte inner body). Frida dumped 4 bytes shy of the actual buffer, so record 2's 4-byte Int32 value did not make it into the TSV. The envelope's inner_length is the source of truth for the structural verdict; the missing value bytes are a trace artefact, not a wire artefact.
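The size bookkeeping can be written down as a small calculation. This is only a worked restatement of the numbers in the breakdown above; the constant and function names are hypothetical, and the 19-byte record size applies to Int32 records as seen in capture 094.

```rust
const OUTER_PREFIX: usize = 4; // outer length prefix
const ENVELOPE: usize = 46; // NmxTransferEnvelope
const PREAMBLE: usize = 3 + 4 + 16; // opcode + version, record_count, operation_id
const INT32_RECORD: usize = 4 + 2 + 8 + 1 + 4; // status, quality, filetime, wire_kind, value

/// Inner body length for a DataUpdate carrying `record_count` Int32 records.
pub fn expected_inner_len(record_count: usize) -> usize {
    PREAMBLE + record_count * INT32_RECORD
}

/// Full frame length, including the outer prefix and the envelope.
pub fn expected_frame_len(record_count: usize) -> usize {
    OUTER_PREFIX + ENVELOPE + expected_inner_len(record_count)
}

/// Number of bytes a dump of `dumped` bytes is short of the full frame.
pub fn missing_bytes(dumped: usize, record_count: usize) -> usize {
    expected_frame_len(record_count).saturating_sub(dumped)
}
```

For two records this gives an inner length of 61 and a frame length of 111, so a 107-byte dump is 4 bytes short, exactly the size of one Int32 value.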

R2 verdict — CONTRADICTED. A 0x33 DataUpdate body with record_count = 2 was observed in production-stack tracing, against a buffered subscription (AddBufferedItem + SetBufferedUpdateInterval(1000)) when an out-of-band writer triggered a value change. The .NET reference's NmxSubscriptionMessage.ParseDataUpdate would hard-throw ArgumentException("...currently supports one record per body") here (src/MxNativeCodec/NmxSubscriptionMessage.cs:71-74).

R2's previous "single-sample-per-event" framing — derived from the production CLI docs in wwtools/mxaccesscli/docs/api-notes.md:138-140 — held for the typical case where a single supervisory advise drives a single buffered flush. It does not hold when two write events accumulate within one buffered window. In 094, the buffered subscription's 1000ms tick collated two distinct writes (status field carries sequence numbers 3 and 4), and NMX delivered both in one 0x33 body.

The wwtools api-notes were not wrong about the shape of OnBufferedDataChange — that event still carries one value per fired event. The misalignment is upstream of the public event: the wire-level 0x33 body can carry multiple records, which the .NET reference's hard-throw masked.

Codec change shipped with F44

Per F44 DoD step 2 ("if a multi-sample body is observed, surface a typed DataChangeBatch decode path"):

  • subscription_message::parse_data_update was relaxed to loop over record_count (mirroring parse_subscription_status). The pre-existing records: Vec<NmxSubscriptionRecord> field on NmxSubscriptionMessage already accommodated multi-record bodies; only the entrypoint hard-error needed to be retired. record_count <= 0 is still rejected explicitly.
  • The .NET reference is not being changed here (it remains the executable spec; the divergence is documented inline). Per design/70-risks-and-open-questions.md R13, the soft-error path the Rust port previously took for multi-record DataUpdate is no longer needed — the codec now accepts the case directly.
  • Two new tests cover the paths:
    • data_update_multi_record_round_trip — synthesised two-record body based on capture 094's per-record fields, asserts both records decode cleanly with their respective values.
    • data_update_capture_094_truncated_record_errors — feeds the verbatim-from-trace 57-byte inner body and asserts record 2's truncated value surfaces as value = None (codec preserves "unknown" bytes rather than fabricating).
  • Fixtures under crates/mxaccess-codec/tests/fixtures/m6-buffered/ carry the verbatim inner-body bytes of capture 094 lines 48 and 145 for reproducibility.
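To make the relaxed decode path concrete, here is a minimal sketch of a looped DataUpdate parser. It is not the code in crates/mxaccess-codec: `Record`, `DecodeError`, and `parse_data_update_sketch` are hypothetical names, the field layout is the one read off capture 094's dump, and only wire_kind 0x02 (Int32) is handled. A value cut off at the end of the body is reported as `None` instead of being fabricated.

```rust
#[derive(Debug, PartialEq)]
pub struct Record {
    pub status: u32,
    pub quality: u16,
    pub filetime: u64,
    pub wire_kind: u8,
    /// None when the body ends before the 4-byte Int32 value.
    pub value: Option<i32>,
}

#[derive(Debug, PartialEq)]
pub enum DecodeError {
    NotDataUpdate,
    BadRecordCount(i32),
    Truncated,
    UnsupportedWireKind(u8),
}

const PREAMBLE_LEN: usize = 23; // opcode + version (3), record_count (4), operation_id (16)
const RECORD_HEAD_LEN: usize = 15; // status (4), quality (2), filetime (8), wire_kind (1)

fn le_u32(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

fn le_u64(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&b[..8]);
    u64::from_le_bytes(a)
}

/// Decodes every record announced by record_count instead of insisting on exactly one.
pub fn parse_data_update_sketch(inner: &[u8]) -> Result<Vec<Record>, DecodeError> {
    if inner.len() < PREAMBLE_LEN {
        return Err(DecodeError::Truncated);
    }
    if inner[0] != 0x33 {
        return Err(DecodeError::NotDataUpdate);
    }
    let count = le_u32(&inner[3..7]) as i32;
    if count <= 0 {
        return Err(DecodeError::BadRecordCount(count));
    }
    let mut records = Vec::new();
    let mut pos = PREAMBLE_LEN;
    for _ in 0..count {
        // The fixed-size record head must be fully present.
        let head = inner
            .get(pos..pos + RECORD_HEAD_LEN)
            .ok_or(DecodeError::Truncated)?;
        let wire_kind = head[14];
        if wire_kind != 0x02 {
            return Err(DecodeError::UnsupportedWireKind(wire_kind));
        }
        pos += RECORD_HEAD_LEN;
        // A missing value is preserved as "unknown" rather than guessed.
        let value = inner.get(pos..pos + 4).map(|b| le_u32(b) as i32);
        pos += 4;
        records.push(Record {
            status: le_u32(&head[0..4]),
            quality: u16::from_le_bytes([head[4], head[5]]),
            filetime: le_u64(&head[6..14]),
            wire_kind,
            value,
        });
    }
    Ok(records)
}
```

Fed the 57-byte inner body from capture 094, this sketch yields two records, the first with value 137 and the second with no value. A truncated value in any record other than the last would surface as `Truncated` on the next record head, which is a design choice of this sketch, not a statement about the shipped codec.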

Sub-followup F46 — RESOLVED 2026-05-06

A residual gap remained at the LMX-proxy boundary: capture 077 did not instrument LmxProxy.dll!CLMXProxyServer.Suspend / .Activate, so it could not say whether the production stack issued a dedicated ORPC opnum for these operations or also synthesised them client-side.

This was filed as F46 in design/followups.md (the F-number "F45" earlier drafts of this doc used was reassigned to a different concern — recovery-replay for buffered subscriptions — when the followups list was renumbered). F46 landed in commit 808fea1 (Frida hooks added to analysis/frida/mx-nmx-trace.js) and the live capture ran in commit 349e217 as F50. Verdict, per docs/F50-suspend-activate-evidence.md:

  • Suspend is server-side: emits NMX PutRequest with command 0x2D ~140 ms after the LMX-proxy entry, body shape 2d 01 00 + correlation_id + 22 bytes (same family as 0x1F AdviseSupervisory).
  • Activate against a non-suspended item is client-side only — no wire traffic, returns Success synchronously.

R5 in design/70-risks-and-open-questions.md is now settled. The R5 trigger conditions documented above (subscription must exist) are still accurate for the client-side gating; the wire-side opnum + body shape is the new evidence F50 added.

Consolidated R2 / R5 status

  • R2 verdict — CONTRADICTED then re-settled by codec change. Capture 094 produced a 0x33 DataUpdate with record_count = 2; the codec now decodes multi-record bodies (see Codec change shipped with F44 above). Future regressions are guarded by the new round-trip tests. Status moves from "P3 likely-not-a-real-risk" to "settled per option (b) with codec change landed under F44".
  • R5 trigger conditions — observed + wire shape settled. From capture 077: Suspend succeeds (returning MxStatus.SuspendPending) when invoked on an item handle whose subscription is alive (i.e. immediately following a successful Advise/AdviseSupervisory). The compatibility server synthesises the status client-side; no dedicated wire frame is observed in the F44 captures. The remaining unknown — does LmxProxy.dll itself issue a Suspend/Activate ORPC method? — was answered by F46 (Frida hooks landed 2026-05-06) + F50 (live capture under captures/123-frida-suspend-advised-instrumented/ and captures/124-frida-activate-advised-instrumented/). Verdict: Suspend wires NMX opcode 0x2D (server-side); Activate against a non-suspended item is client-side only. R5 closed.