[R3/R4] Path-A trace: synthesizer is in Lmx.dll's NMX-frame decoder

Five-stage Ghidra headless decompile traces the byte-to-MXSTATUS_PROXY
synthesis path end-to-end across LmxProxy.dll and Lmx.dll. New evidence
files committed alongside R3/R4 verdict update:

- analysis/ghidra/exports/LmxProxy.dll.fire-event-xrefs.md
- analysis/ghidra/exports/LmxProxy.dll.status-synthesis-decompile.md
- analysis/ghidra/exports/LmxProxy.dll.mxstatus-safearray-decompile.md
- analysis/ghidra/exports/Lmx.dll.set-attribute-result-decompile.md

Layer-by-layer findings (bytes flow inward; synthesis flows outward):

1. `Lmx.aaDCT` at 0x10178fc0 is `SysAllocString(L"Lmx.aaDCT")` — a
   tracing category BSTR, not a table.
2. `MXSTATUS_PROXY` is a 16-byte marshalled struct (4 × i16 padded
   to i32 boundaries with Pack=4) — the OUTPUT of synthesis, not a
   lookup entry.
3. `LmxProxy.dll` Fire_* event handlers receive already-populated
   `MXSTATUS_PROXY[]` and forward through ATL dispatch — no synthesis.
4. `LmxProxy.dll` Fire_* CALLERS (FUN_1001657f / FUN_10016b50 /
   FUN_10016d4b) call FUN_10003f60(out_safearray, in_status_ptr,
   count=1) which is a VERBATIM memcpy of an existing 14-byte buffer
   into the SAFEARRAY — no transformation.
5. `Lmx.dll`'s `PreboundReference::OnSetAttributeResult` (FUN_10114a90)
   receives an already-populated `short *param_7` status buffer. Log
   line confirms the layout: `<success %d category %d detectedBy %d
   detail %d>`. Dispatches on typed values — synthesis is upstream of
   this function too.

The synthesizer is the NMX-frame decoder in Lmx.dll that calls
OnSetAttributeResult / OnGetAttributeResult / equivalent
OperationComplete handler. The decoder takes raw NMX bytes plus
operation context (item handle, engine state, retry state,
correlation id) and computes the populated MXSTATUS_PROXY. There is
NO static lookup table — synthesis is per-message contextual.

Two viable paths to typed promotion (both substantial; neither a
small codec patch):

- Path A: port the synthesizer. ~1-2 weeks. Requires extending the
  Rust session to track per-operation context (handles, retries,
  correlation ids). Out of V1 scope.
- Path B: empirical capture pairs. ~30 min × 6-10 scenarios. Output
  is a (byte, context → status) mapping that approximates without
  re-implementing. Risk: mapping is only valid for captured contexts.

R3/R4 stay settled at verbatim-preserve. The .NET reference does
the same for the same reason: the synthesizer is too context-
dependent to mirror without porting the entire operation-tracking
state machine.

Reopen criteria sharpened: either (a) a consumer files a concrete
use case for typed promotion of a specific byte+context combination
(Path B's empirical capture for that one combination is the cheapest
answer); or (b) a major-version bump justifies the state-machine
port (Path A).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-06 06:33:02 -04:00
parent 4dfc0cee65
commit 460c61df43
5 changed files with 1157 additions and 7 deletions
+22 -7
@@ -44,17 +44,32 @@ The `OnBufferedDataChange` **public event shape** the wwtools api-notes describe
**Severity: P1** (was a blocker; now settled per option: verbatim preserve is the canonical native behaviour)
**Status (2026-05-06): SETTLED.** Ghidra headless decompile + string-ref walk across `Lmx.dll`, `LmxProxy.dll`, `NmxAdptr.dll`, `NmxSvc.exe`, and `NmxSvcps.dll` (logs at `analysis/ghidra/exports/Lmx.dll.aadct-decompile.md` + `analysis/ghidra/exports/LmxProxy.dll.completion-status-decompile.md`) confirms there is **no static byte→status lookup table** to extract. Specifically:
**Status (2026-05-06): SETTLED.** Five-stage Ghidra headless decompile traced the byte-to-`MXSTATUS_PROXY` synthesis path end-to-end across `Lmx.dll` and `LmxProxy.dll`. Logs:
- `analysis/ghidra/exports/Lmx.dll.aadct-decompile.md` — `aaDCT` symbol
- `analysis/ghidra/exports/LmxProxy.dll.completion-status-decompile.md` — Fire_* event handlers
- `analysis/ghidra/exports/LmxProxy.dll.fire-event-xrefs.md` — xrefs to Fire_*
- `analysis/ghidra/exports/LmxProxy.dll.status-synthesis-decompile.md` — Fire_* callers (`FUN_1001657f` / `FUN_10016b50` / `FUN_10016d4b`)
- `analysis/ghidra/exports/LmxProxy.dll.mxstatus-safearray-decompile.md` — `FUN_10003f60` (the SafeArray creator)
- `analysis/ghidra/exports/Lmx.dll.set-attribute-result-decompile.md` — `PreboundReference::OnSetAttributeResult` (`FUN_10114a90`)
- The `Lmx.aaDCT` symbol referenced at `0x10178fc0` is a `SysAllocString(L"Lmx.aaDCT")` call into a global BSTR — a logging category name, not a status-mapping table. Decompiled function body is a textbook static initializer, no array / lookup logic.
- `MXSTATUS_PROXY` (`analysis/decompiled-interop/Interop.Lmx/Interop/Lmx/MXSTATUS_PROXY.cs`) is a 4-field struct (`success: i16`, `category: MxStatusCategory`, `detectedBy: MxStatusSource`, `detail: i16`), used as the marshalled COM event payload — not a static array of pre-mapped statuses.
- The `Fire_OnDataChange` / `Fire_OnWriteComplete` / `Fire_OperationComplete` / `Fire_OnBufferedDataChange` event-firing functions in `LmxProxy.dll` (RVAs `0x15f72`, `0x1611f`, `0x16271`, `0x163c0`) receive **already-populated** `MXSTATUS_PROXY[]` arrays — the byte-to-struct synthesis happens upstream in the proxy's NMX-callback ingestion code, not via a table lookup. The synthesis is per-call computation from operation state (engine ids, item handles, retry counters), not a static byte→status promotion.
Findings, layer by layer (the wire bytes flow inward; the synthesis flows outward):
This means the .NET reference's verbatim-preservation strategy IS the canonical native behaviour: there is no table to mirror because the native code computes the `MXSTATUS_PROXY` from operation context per-event, not from a lookup. The 1-byte completion frames `0x00`, `0x41`, `0xEF` etc. are intermediate NMX-internal signaling that the proxy synthesizes contextual status from; the only frame with a proven typed promotion is the 5-byte status-word `00 00 50 80 00` → `MxStatus.WriteCompleteOk`.
1. **`Lmx.aaDCT`** at `0x10178fc0` is a `SysAllocString(L"Lmx.aaDCT")` into a global BSTR — a tracing category name, not a status-mapping table. No array / lookup logic.
2. **`MXSTATUS_PROXY`** (16 bytes, Pack=4) is a 4-field marshalled struct: `success: i16` at offset 0, `category: i16` at offset 4, `detectedBy: i16` at offset 8, `detail: i16` at offset 12. It is the *output* of synthesis, not a lookup-table entry.
3. **`LmxProxy.dll` Fire_* event handlers** (`FUN_10015f72`, `FUN_1001611f`, `FUN_10016271`, `FUN_100163c0`) take an *already-populated* `MXSTATUS_PROXY[]` and forward it through ATL connection-point dispatch. No synthesis here.
4. **`LmxProxy.dll` Fire_* callers** (`FUN_1001657f` for OnDataChange / OnBufferedDataChange, `FUN_10016b50` for OnWriteComplete, `FUN_10016d4b` for OperationComplete) call **`FUN_10003f60(out_safearray, in_status_ptr, count=1)`** which creates the SafeArray. `FUN_10003f60` is **a verbatim memcpy** of an existing 14-byte buffer into the SAFEARRAY data — no transformation. Source confirms: bytes flow `*local_8 = *param_2; *(local_8+2) = *(param_2+2); *(local_8+4) = *(param_2+4); local_8[6] = param_2[6]`.
5. **`Lmx.dll` `PreboundReference::OnSetAttributeResult`** (`FUN_10114a90`) — the CALLER of step 4's path — receives an already-populated `short *param_7` status buffer. Its log line confirms the layout: `swprintf_s(L"<success %d category %d detectedBy %d detail %d>", (i16)*p, *(p+2), *(p+4), p[6])`. Its dispatch logic checks typed values (`*local_b6c == -1`, `*(int *)(local_b6c + 2) == 3`) — synthesis is upstream of THIS function too.
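The layout described in items 2 and 4 can be sketched in Rust as follows. This is a minimal sketch, not the codec's actual type: the names `MxStatusProxy`, `FIELD_OFFSETS`, `from_le_bytes` and `copy_status_words`, the explicit padding fields, and the little-endian byte order are all assumptions made for illustration. One reading that reconciles the "16-byte struct" with the "14-byte buffer" is that the four `i16` fields end at byte 14 and the last two bytes are trailing padding; the sketch encodes that reading.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical Rust mirror of the marshalled status struct.
/// Each i16 field is followed by explicit padding so the next field
/// starts on a 4-byte boundary, matching offsets 0, 4, 8 and 12.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct MxStatusProxy {
    pub success: i16,
    _pad0: u16,
    pub category: i16,
    _pad1: u16,
    pub detected_by: i16,
    _pad2: u16,
    pub detail: i16,
    _pad3: u16,
}

// Compile-time check that the mirror is 16 bytes in total.
const _: () = assert!(size_of::<MxStatusProxy>() == 16);

/// Byte offsets of the four fields, in declaration order.
pub const FIELD_OFFSETS: [usize; 4] = [
    offset_of!(MxStatusProxy, success),
    offset_of!(MxStatusProxy, category),
    offset_of!(MxStatusProxy, detected_by),
    offset_of!(MxStatusProxy, detail),
];

impl MxStatusProxy {
    /// Reads the four fields from a 16-byte buffer.
    /// Little-endian order is an assumption of this sketch.
    pub fn from_le_bytes(buf: &[u8; 16]) -> Self {
        let field = |off: usize| i16::from_le_bytes([buf[off], buf[off + 1]]);
        Self {
            success: field(0),
            category: field(4),
            detected_by: field(8),
            detail: field(12),
            ..Default::default()
        }
    }
}

/// Copies the i16 words at indices 0, 2, 4 and 6, the same indices the
/// decompiled body of `FUN_10003f60` touches. Padding words are left alone.
pub fn copy_status_words(dst: &mut [i16; 8], src: &[i16; 8]) {
    for i in [0, 2, 4, 6] {
        dst[i] = src[i];
    }
}
```

The copy moves values without inspecting them, which is the property the finding relies on: nothing at this layer can turn a raw byte into a typed status.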
**Current best answer:** unchanged — `Session::operation_status_events()` exposes `Stream<Item = RawOperationStatus>` carrying frame bytes. Promote to a typed `WriteCompleted` only on the proven `00 00 50 80 00` 5-byte pattern. Other bytes stay raw as `MxStatus { Success: 0, Category: Unknown, DetectedBy: Unknown, Detail: byte }`. The Rust codec mirrors `src/MxNativeCodec/NmxOperationStatusMessage.cs:TryParseInner`.
**The synthesizer is the NMX-frame decoder in `Lmx.dll`** that calls `OnSetAttributeResult` / `OnGetAttributeResult` / equivalent OperationComplete handler. That decoder takes raw NMX bytes (e.g. 1-byte `0x00`/`0x41`/`0xEF` completion frames or 5-byte `00 00 50 80 00`-style status words) plus operation context (which item, which engine, retry state, last-write-correlation-id) and computes the populated `MXSTATUS_PROXY`. **There is no static lookup table** — the synthesis is per-message contextual.
**Reopen when:** a live capture surfaces a 1-byte completion frame whose surrounding context (e.g. observed `MXSTATUS_PROXY` struct fired through the .NET probe alongside the byte) lets us back-derive a context-aware promotion. Since the native code synthesises the struct from operation state rather than a table, the promotion logic would itself need to be context-aware — i.e. the codec would need access to the originating operation's context, which is upstream of the bytes themselves. Until then, verbatim preservation is correct by construction.
**Why this means R3/R4 stay at "verbatim preserve" canonically.** Two viable paths exist if a future consumer demands typed promotion (neither is a small Rust patch):
- **Path A — port the synthesizer.** Find every NMX-decoder callsite of `OnSetAttributeResult`/`OnGetAttributeResult`/`OperationComplete` in `Lmx.dll` (next xref ring beyond `FUN_10114a90`); decompile each; reverse-engineer the per-decoder synthesis logic; port to Rust. The synthesis depends on operation-tracking state (item handles, retry counters, correlation ids) the Rust codec does not currently track — so the port is more than a codec change; it's a session-state-machine extension. Estimate: ~1-2 weeks of focused work.
- **Path B — empirical capture pairs.** Run the .NET probe in scenarios that produce each completion byte; capture the (input NMX bytes, observed `MXSTATUS_PROXY`) pair via Frida hooks on `LmxProxy.dll!FUN_10003f60`; build an empirical (byte + context → status) mapping. ~30 min × 6-10 scenarios. Output: a mapping table that approximates the synthesizer without re-implementing it. Risk: the mapping is only valid for the captured operation contexts; new contexts may produce statuses outside the table.
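The shape of a Path B mapping, and the risk noted for it, can be sketched in Rust. Everything here is hypothetical: `OpContext`, `ObservedStatus` and `EmpiricalMap` are illustrative names, the context variants are placeholders for whatever the captures end up distinguishing, and no entry below comes from a real capture. The point of the sketch is that a lookup outside the captured (bytes, context) pairs returns `None` instead of a guessed status.

```rust
use std::collections::HashMap;

/// Placeholder operation contexts; the real set would come from the captures.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum OpContext {
    Write,
    Read,
    Subscribe,
}

/// The four status fields as observed at the hook point.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ObservedStatus {
    pub success: i16,
    pub category: i16,
    pub detected_by: i16,
    pub detail: i16,
}

/// Empirical (frame bytes + context) -> status table built from capture pairs.
#[derive(Debug, Default)]
pub struct EmpiricalMap {
    pairs: HashMap<(Vec<u8>, OpContext), ObservedStatus>,
}

impl EmpiricalMap {
    /// Records one captured pair. A later capture for the same key overwrites it.
    pub fn record(&mut self, frame: &[u8], ctx: OpContext, status: ObservedStatus) {
        self.pairs.insert((frame.to_vec(), ctx), status);
    }

    /// Returns the observed status only for contexts that were actually captured.
    pub fn lookup(&self, frame: &[u8], ctx: OpContext) -> Option<ObservedStatus> {
        self.pairs.get(&(frame.to_vec(), ctx)).copied()
    }
}
```

A caller that gets `None` would fall back to verbatim preservation, so the table can only add typed statuses for captured cases and never invents one.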
**Current best answer:** unchanged — `Session::operation_status_events()` exposes `Stream<Item = RawOperationStatus>` carrying frame bytes. Promote to a typed `WriteCompleted` only on the proven `00 00 50 80 00` 5-byte pattern. Other bytes stay raw as `MxStatus { Success: 0, Category: Unknown, DetectedBy: Unknown, Detail: byte }`. The Rust codec mirrors `src/MxNativeCodec/NmxOperationStatusMessage.cs:TryParseInner`. The .NET reference does the same, for the same reason: the synthesizer is too context-dependent to mirror without porting the entire operation-tracking state machine, and that exceeds V1 scope.
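The promotion rule in this paragraph can be sketched in Rust. This is an illustration under stated assumptions, not the codec's code: `OperationStatus`, `WRITE_COMPLETE_OK` and `classify` are hypothetical names, and the raw case is modelled as the untouched frame bytes instead of the `MxStatus` shape quoted above.

```rust
/// Hypothetical stand-in for the typed/raw split described in the text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OperationStatus {
    /// The one frame with a proven typed promotion.
    WriteCompleted,
    /// Every other frame, preserved verbatim.
    Raw(Vec<u8>),
}

/// The 5-byte status word `00 00 50 80 00`.
pub const WRITE_COMPLETE_OK: [u8; 5] = [0x00, 0x00, 0x50, 0x80, 0x00];

/// Promotes only an exact match of the 5-byte pattern; nothing else is interpreted.
pub fn classify(frame: &[u8]) -> OperationStatus {
    if frame == &WRITE_COMPLETE_OK[..] {
        OperationStatus::WriteCompleted
    } else {
        OperationStatus::Raw(frame.to_vec())
    }
}
```

Matching on the whole frame, not on a prefix or a single byte, keeps the 1-byte completion frames (`0x00`, `0x41`, `0xEF`) and any truncated status word in the raw path.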
**Reopen when:** either (a) a consumer files a concrete use case for typed promotion of a specific byte+context combination — at which point Path B's empirical capture for that one combination is the cheapest answer; or (b) a major-version bump justifies the operation-tracking state-machine port (Path A). Until then, verbatim preservation is correct by construction.
### R4 — Completion-only byte mapping **(settled 2026-05-06 — collapses into R3's resolution)**