[M5] mxaccess-asb: WCF binary message header (action+to dict pre-pop)

Adds the binary header block that WCF prepends to SizedEnvelope
payloads. Reverse-engineered from .NET probe wire bytes captured via
asb-relay.

Wire form (per the .NET capture analysis in the previous commit):
```
[outer length, multibyte-int31]
  [string-1 length, multibyte-int31] [UTF-8 bytes]   ← dict id 1 (action)
  [string-2 length, multibyte-int31] [UTF-8 bytes]   ← dict id 3 (to)
[NBFX <s:Envelope>...]
```
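
The layout above can be sketched in a few lines. This is a minimal illustration, not the code in this commit: `write_mb_int31` and `sketch_binary_header` are hypothetical names, and the multibyte-int31 encoding is assumed to be 7 data bits per byte, least significant group first, with the high bit marking continuation.

```rust
/// Append `value` as a multibyte-int31 (assumed encoding: 7 data bits per
/// byte, low group first, high bit set on every byte except the last).
fn write_mb_int31(out: &mut Vec<u8>, mut value: u32) {
    debug_assert!(value <= i32::MAX as u32);
    loop {
        let byte = (value & 0x7F) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            break;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Build `[outer length][len][bytes][len][bytes]...` for `strings`.
/// Sketch only: lengths are cast without the overflow checks a real
/// encoder would need.
fn sketch_binary_header(strings: &[&str]) -> Vec<u8> {
    let mut inner = Vec::new();
    for s in strings {
        write_mb_int31(&mut inner, s.len() as u32);
        inner.extend_from_slice(s.as_bytes());
    }
    let mut out = Vec::new();
    write_mb_int31(&mut out, inner.len() as u32);
    out.extend_from_slice(&inner);
    out
}
```

The outer length counts only the inner string block, so a decoder can find the first NBFX byte by skipping the length prefix plus that many bytes.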

Inside the NBFX envelope, `<a:Action>` and `<a:To>` reference the
pre-pop strings via `DictionaryText 0xAA {odd-id}` instead of inlining
their values. The header strings get assigned odd dict ids
(1, 3, 5, ...); even ids stay reserved for the [MC-NBFS] static dict.
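
As a rough sketch of the id mapping described above (the helper names are hypothetical and not part of this commit), wire id `2N+1` corresponds to the N-th header string:

```rust
/// Map an odd wire dict id to its position in the header string list.
/// Even ids belong to the static dictionary, so they yield `None`.
fn header_index_for_wire_id(id: u32) -> Option<usize> {
    if id % 2 == 1 {
        Some(((id - 1) / 2) as usize)
    } else {
        None
    }
}

/// Inverse mapping: the N-th header string gets wire id 2N+1.
fn wire_id_for_header_index(index: u32) -> u32 {
    index * 2 + 1
}
```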

Encode side:
* `encode_envelope` now emits the header [action, to] before the NBFX
  bytes. When `to_uri` is None the To slot is an empty string; setting
  it through caller-supplied `with_to(uri)` is the supported path.
* AsbClient's `send_envelope` and `send_envelope_one_way` auto-fill
  `to_uri` from `self.via_uri` when not set.
* New private `encode_binary_header(strings)` helper.

Decode side:
* New `parse_binary_header_prefix(input)` detects and parses the header
  heuristically: the header is accepted only if the byte at the offset
  implied by the outer length is a plausible NBFX element record
  (0x40-0x77).
* New `resolve_with_header(text, dynamic, header)` resolves
  `DictionaryText` with odd id by indexing into header.strings; even
  ids fall through to static-dict lookup as before.
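
The detection heuristic can be sketched standalone as follows. This is an illustration under assumptions, not the commit's implementation: `read_mb_int31` and `sketch_parse_header` are hypothetical names, the multibyte-int31 format is assumed to be 7 bits per byte with a continuation bit, and the check that a string stays inside the header block is an extra guard added here.

```rust
/// Read a multibyte-int31 (assumed: 7 data bits per byte, low group
/// first, high bit = continuation). Returns `None` on truncation or
/// on a value above i32::MAX.
fn read_mb_int31(input: &[u8], pos: &mut usize) -> Option<usize> {
    let mut value: u64 = 0;
    for shift in (0..35).step_by(7) {
        let byte = *input.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            return (value <= i32::MAX as u64).then_some(value as usize);
        }
    }
    None // more than five bytes: not a valid int31
}

/// Returns the header strings and the offset where NBFX starts, or
/// `None` when the input does not look like it carries a header.
fn sketch_parse_header(input: &[u8]) -> Option<(Vec<String>, usize)> {
    let mut pos = 0;
    let outer_len = read_mb_int31(input, &mut pos)?;
    let header_end = pos.checked_add(outer_len)?;
    // Heuristic: the byte right after the block must look like an
    // NBFX element record.
    let first_nbfx = *input.get(header_end)?;
    if !(0x40..=0x77).contains(&first_nbfx) {
        return None;
    }
    let mut strings = Vec::new();
    while pos < header_end {
        let len = read_mb_int31(input, &mut pos)?;
        let end = pos.checked_add(len)?;
        if end > header_end {
            return None; // string would run past the header block
        }
        let bytes = input.get(pos..end)?;
        strings.push(std::str::from_utf8(bytes).ok()?.to_owned());
        pos = end;
    }
    Some((strings, header_end))
}
```

Because the check is only a heuristic, a headerless payload whose first byte happens to decode to a length that lands on a byte in 0x40-0x77 would still be misread as a header.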

All 72 tests pass. The round trip envelope → bytes → envelope recovers
the action through the new dict-id resolution path.

Live status: this commit gets us further, but the connect SOAP
envelope still gets a TCP RST from SMSvcHost. The remaining delta vs
the .NET capture is structural NBFX optimisation: .NET uses the
single-letter-prefix short-form records (0x44-0x5D
PrefixDictionaryElement_<a-z>, 0x5E-0x77 PrefixElement_<a-z>, 0x0C-0x25
PrefixDictionaryAttribute_<a-z>, 0x0B DictionaryXmlnsAttribute) while
our F21 encoder always uses the long forms (0x43 prefix-string +
name-dict-id, etc.). The two forms are logically equivalent, but WCF's
parser is likely strict about which form it accepts.
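
For illustration, the short-form record byte for a single-letter prefix could be computed as below. This is a sketch, not the planned F21 change: the function names are hypothetical, and it assumes the [MC-NBFX] layout in which PrefixDictionaryElement_A is 0x44 and PrefixDictionaryAttribute_A is 0x0C, each followed by one record per lowercase letter.

```rust
/// Record byte for PrefixDictionaryElement_<letter>, or `None` when the
/// prefix is not exactly one lowercase ASCII letter (long form needed).
fn prefix_dictionary_element_record(prefix: &str) -> Option<u8> {
    match prefix.as_bytes() {
        [c @ b'a'..=b'z'] => Some(0x44 + (*c - b'a')),
        _ => None,
    }
}

/// Record byte for PrefixDictionaryAttribute_<letter>, same rule.
fn prefix_dictionary_attribute_record(prefix: &str) -> Option<u8> {
    match prefix.as_bytes() {
        [c @ b'a'..=b'z'] => Some(0x0C + (*c - b'a')),
        _ => None,
    }
}
```

Under these assumptions an `s:` element whose name is a dictionary string would start with 0x56, and an `s:` attribute with 0x1E.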

The next iteration will add short-form encoding to F21 for
single-letter prefixes (s:, a:, h:, i:), which cover every namespace
prefix in our envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-05 15:40:59 -04:00
parent d4ee5f3a18
commit 2867310817
4 changed files with 265 additions and 90 deletions
+16 -2
@@ -152,7 +152,16 @@ impl<T: AsyncRead + AsyncWrite + Unpin + Send> AsbClient<T> {
return Err(ClientError::AlreadyClosed);
}
let payload = encode_envelope(envelope, &mut self.write_dictionary)?;
// Default the WS-Addressing To header to the same URL we put
// in the NMF Via record. WCF dispatches by To-URL match
// against the registered service URL; an empty / wrong To
// produces an AddressFilterMismatch fault.
let envelope = if envelope.to_uri.is_some() {
envelope.clone()
} else {
envelope.clone().with_to(self.via_uri.clone())
};
let payload = encode_envelope(&envelope, &mut self.write_dictionary)?;
let mut framed = Vec::new();
NmfRecord::SizedEnvelope(payload).encode_into(&mut framed)?;
self.stream.write_all(&framed).await?;
@@ -276,7 +285,12 @@ impl<T: AsyncRead + AsyncWrite + Unpin + Send> AsbClient<T> {
if self.closed {
return Err(ClientError::AlreadyClosed);
}
let payload = encode_envelope(envelope, &mut self.write_dictionary)?;
let envelope = if envelope.to_uri.is_some() {
envelope.clone()
} else {
envelope.clone().with_to(self.via_uri.clone())
};
let payload = encode_envelope(&envelope, &mut self.write_dictionary)?;
let mut framed = Vec::new();
NmfRecord::SizedEnvelope(payload).encode_into(&mut framed)?;
self.stream.write_all(&framed).await?;
+245 -85
@@ -169,108 +169,131 @@ impl SoapEnvelope {
}
/// Encode a SOAP envelope to NBFX bytes. Returns the byte buffer + the
/// dynamic dictionary state at end-of-encode (the F25 client threads
/// that through subsequent envelopes for compression).
/// dynamic dictionary state at end-of-encode.
///
/// Wire shape:
/// ```xml
/// <s:Envelope> (dict 4 / 2)
/// <s:Header> (dict 8)
/// <a:Action s:mustUnderstand="1">…</a:Action> (dict 10)
/// [<h:ConnectionValidator …/>] (asb headers ns)
/// </s:Header>
/// <s:Body> (dict 14)
/// {body_tokens}
/// </s:Body>
/// </s:Envelope>
/// ```
/// **Wire shape (WCF binary message format)**:
///
/// 1. **Binary header block** prepended to the NBFX envelope. WCF
/// uses this to pre-populate the per-session dynamic dictionary
/// with strings that appear inside the envelope. Each string gets
/// an odd dictionary id (`1, 3, 5, ...` — even ids are reserved
/// for the static [MC-NBFS] dictionary).
///
/// ```text
/// [outer length, multibyte-int31]
/// [action length, multibyte-int31] [action UTF-8 bytes] ← dict id 1
/// [to length, multibyte-int31] [to UTF-8 bytes] ← dict id 3
/// ```
///
/// 2. **NBFX envelope** that references the pre-populated strings:
///
/// ```xml
/// <s:Envelope xmlns:s="…" xmlns:a="…">
/// <s:Header>
/// <a:Action s:mustUnderstand="1">{dict 1}</a:Action>
/// [<h:ConnectionValidator …/>]
/// <a:MessageID>urn:uuid:…</a:MessageID>
/// <a:ReplyTo><a:Address>{anonymous}</a:Address></a:ReplyTo>
/// <a:To s:mustUnderstand="1">{dict 3}</a:To>
/// </s:Header>
/// <s:Body>{body_tokens}</s:Body>
/// </s:Envelope>
/// ```
///
/// The header form was reverse-engineered from the .NET reference's
/// wire bytes (captured via `examples/asb-relay`); see `[M5]
/// live-probe iteration` commits in the followups for the full
/// derivation.
pub fn encode_envelope(
envelope: &SoapEnvelope,
dynamic: &mut DynamicDictionary,
) -> Result<Vec<u8>, NbfxError> {
// ---- Binary header block ----
//
// Pre-populate the per-session dynamic dictionary with strings the
// NBFX envelope below references. Each string gets an odd dict id
// (`1, 3, 5, ...` — even ids are reserved for [MC-NBFS] static).
// We always include action + to even when `to_uri` is None: in
// that case we use an empty string for the To slot. WCF's default
// service dispatcher requires both populated for net.tcp.
let to_uri = envelope.to_uri.clone().unwrap_or_default();
let action_dict_id: u32 = 1;
let to_dict_id: u32 = 3;
let header_strings = [envelope.action.as_str(), to_uri.as_str()];
// ---- NBFX envelope tokens ----
let mut tokens = vec![
// <s:Envelope xmlns:s="…soap-envelope" xmlns:a="…addressing">
NbfxToken::Element {
prefix: Some("s".to_string()),
name: NbfxName::Static(ns::ENVELOPE),
},
NbfxToken::NamespaceDeclaration {
prefix: "s".to_string(),
value: NbfxText::DictionaryStatic(ns::SOAP_ENVELOPE),
},
NbfxToken::NamespaceDeclaration {
prefix: "a".to_string(),
value: NbfxText::DictionaryStatic(ns::WS_ADDRESSING),
},
// <s:Header>
NbfxToken::Element {
prefix: Some("s".to_string()),
name: NbfxName::Static(ns::HEADER),
},
// <a:Action s:mustUnderstand="1">{dict id 1}</a:Action>
NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(ns::ACTION),
},
NbfxToken::Attribute {
prefix: Some("s".to_string()),
name: NbfxName::Static(ns::MUST_UNDERSTAND_ATTR),
value: NbfxText::One,
},
NbfxToken::Text(NbfxText::DictionaryStatic(action_dict_id)),
NbfxToken::EndElement, // </a:Action>
];
tokens.push(NbfxToken::NamespaceDeclaration {
prefix: "s".to_string(),
value: NbfxText::DictionaryStatic(ns::SOAP_ENVELOPE),
});
tokens.push(NbfxToken::NamespaceDeclaration {
prefix: "a".to_string(),
value: NbfxText::DictionaryStatic(ns::WS_ADDRESSING),
});
// <s:Header>
tokens.push(NbfxToken::Element {
prefix: Some("s".to_string()),
name: NbfxName::Static(ns::HEADER),
});
// <h:ConnectionValidator …/> (when present, comes before
// MessageID/ReplyTo per the .NET dump's element order)
if let Some(v) = &envelope.validator {
encode_validator(&mut tokens, v, dynamic);
}
// <a:Action s:mustUnderstand="1">{action}</a:Action>
// <a:MessageID>urn:uuid:{uuid}</a:MessageID>
let message_id = format!("urn:uuid:{}", make_random_uuid_v4());
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(ns::ACTION),
name: NbfxName::Static(26),
});
tokens.push(NbfxToken::Text(NbfxText::Chars(message_id)));
tokens.push(NbfxToken::EndElement); // </a:MessageID>
// <a:ReplyTo><a:Address>{anonymous}</a:Address></a:ReplyTo>
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(44),
});
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(42),
});
tokens.push(NbfxToken::Text(NbfxText::DictionaryStatic(20)));
tokens.push(NbfxToken::EndElement); // </a:Address>
tokens.push(NbfxToken::EndElement); // </a:ReplyTo>
// <a:To s:mustUnderstand="1">{dict id 3}</a:To>
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(12),
});
tokens.push(NbfxToken::Attribute {
prefix: Some("s".to_string()),
name: NbfxName::Static(ns::MUST_UNDERSTAND_ATTR),
value: NbfxText::One,
});
tokens.push(NbfxToken::Text(NbfxText::Chars(envelope.action.clone())));
tokens.push(NbfxToken::EndElement); // </a:Action>
// <h:ConnectionValidator …/> (WCF dump shows this comes BEFORE
// MessageID/ReplyTo when present)
if let Some(v) = &envelope.validator {
encode_validator(&mut tokens, v, dynamic);
}
// <a:MessageID>urn:uuid:{uuid}</a:MessageID>
// WCF's default binding requires MessageID for two-way operations.
// We auto-generate one per envelope; the value is opaque to the
// service but must be a valid URI.
let message_id = format!("urn:uuid:{}", make_random_uuid_v4());
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(26), // "MessageID"
});
tokens.push(NbfxToken::Text(NbfxText::Chars(message_id)));
tokens.push(NbfxToken::EndElement); // </a:MessageID>
// <a:ReplyTo>
// <a:Address>http://www.w3.org/2005/08/addressing/anonymous</a:Address>
// </a:ReplyTo>
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(44), // "ReplyTo"
});
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(42), // "Address"
});
tokens.push(NbfxToken::Text(NbfxText::DictionaryStatic(20))); // anonymous
tokens.push(NbfxToken::EndElement); // </a:Address>
tokens.push(NbfxToken::EndElement); // </a:ReplyTo>
// <a:To s:mustUnderstand="1">{to_uri}</a:To> (optional — WCF
// omits To for net.tcp request/response by default)
if let Some(to) = &envelope.to_uri {
tokens.push(NbfxToken::Element {
prefix: Some("a".to_string()),
name: NbfxName::Static(12), // "To"
});
tokens.push(NbfxToken::Attribute {
prefix: Some("s".to_string()),
name: NbfxName::Static(ns::MUST_UNDERSTAND_ATTR),
value: NbfxText::One,
});
tokens.push(NbfxToken::Text(NbfxText::Chars(to.clone())));
tokens.push(NbfxToken::EndElement);
}
tokens.push(NbfxToken::Text(NbfxText::DictionaryStatic(to_dict_id)));
tokens.push(NbfxToken::EndElement); // </a:To>
tokens.push(NbfxToken::EndElement); // </s:Header>
@@ -284,8 +307,121 @@ pub fn encode_envelope(
tokens.push(NbfxToken::EndElement); // </s:Envelope>
let mut out = Vec::with_capacity(estimate_envelope_size(envelope));
encode_tokens(&tokens, dynamic, &mut out)?;
// ---- Assemble output: binary header + NBFX envelope ----
let mut nbfx_bytes = Vec::with_capacity(estimate_envelope_size(envelope));
encode_tokens(&tokens, dynamic, &mut nbfx_bytes)?;
let header_bytes = encode_binary_header(&header_strings)?;
let mut out = Vec::with_capacity(header_bytes.len() + nbfx_bytes.len());
out.extend_from_slice(&header_bytes);
out.extend_from_slice(&nbfx_bytes);
Ok(out)
}
/// Encode the WCF binary message header that prepends the NBFX envelope.
/// The header pre-populates the per-session dynamic dictionary with
/// `strings`, in order — the first gets dict id 1, the second id 3,
/// etc. (odd ids only; evens are reserved for static `[MC-NBFS]`).
///
/// Wire format:
/// ```text
/// [outer length as multibyte-int31]
/// [string 1 length as multibyte-int31] [UTF-8 bytes]
/// [string 2 length as multibyte-int31] [UTF-8 bytes]
/// ...
/// ```
/// Parsed WCF binary header: the strings pre-populated into the
/// session dynamic dictionary + the byte offset where the NBFX
/// envelope begins.
struct ParsedBinaryHeader {
/// Pre-pop strings in declaration order. Wire ids: index 0 → 1,
/// index 1 → 3, index 2 → 5, etc. (odd numbers; even reserved
/// for `[MC-NBFS]` static dict).
strings: Vec<String>,
nbfx_start: usize,
}
/// Detect + decode the WCF binary header block at the start of a SOAP
/// envelope payload. Returns `None` if no header is present (e.g. the
/// peer didn't emit one).
///
/// Heuristic: read a multibyte-int31 length L from the start. If the
/// byte immediately after the L-byte block (offset P+L, where P is the
/// size of the length prefix) is a plausible NBFX element record byte
/// (`0x40`-`0x77`), treat the first P+L bytes as the header. Walk the
/// inner block as a sequence of length-prefixed UTF-8 strings.
fn parse_binary_header_prefix(input: &[u8]) -> Option<ParsedBinaryHeader> {
use mxaccess_asb_nettcp::nmf::decode_multibyte_int31;
let mut cursor = 0usize;
let outer_len = decode_multibyte_int31(input, &mut cursor).ok()?;
let outer_len = usize::try_from(outer_len).ok()?;
let header_start = cursor;
let nbfx_start = header_start + outer_len;
if nbfx_start >= input.len() {
return None;
}
let first_nbfx = *input.get(nbfx_start)?;
if !(0x40..=0x77).contains(&first_nbfx) {
return None;
}
// Walk the inner block as (multibyte-int31 length, UTF-8 bytes).
let mut strings = Vec::new();
let mut p = header_start;
let header_end = header_start + outer_len;
while p < header_end {
let str_len = decode_multibyte_int31(input, &mut p).ok()?;
let str_len = usize::try_from(str_len).ok()?;
let bytes = input.get(p..p + str_len)?;
let s = std::str::from_utf8(bytes).ok()?;
strings.push(s.to_string());
p += str_len;
}
Some(ParsedBinaryHeader {
strings,
nbfx_start,
})
}
/// Resolve an NBFX text token using static dict + dynamic dict + the
/// binary-header pre-pop strings (odd dict ids).
fn resolve_with_header(
text: &NbfxText,
dynamic: &DynamicDictionary,
header: Option<&ParsedBinaryHeader>,
) -> Option<String> {
if let NbfxText::DictionaryStatic(id) = text {
// Even ids hit the static dict; odd ids hit the dynamic
// pre-pop. Wire id 2N+1 → header.strings[N].
if id % 2 == 1 {
if let Some(h) = header {
let idx = (*id as usize - 1) / 2;
if let Some(s) = h.strings.get(idx) {
return Some(s.clone());
}
}
}
}
text.resolve(dynamic)
}
fn encode_binary_header(strings: &[&str]) -> Result<Vec<u8>, NbfxError> {
use mxaccess_asb_nettcp::nmf::encode_multibyte_int31;
let mut inner = Vec::new();
for s in strings {
let len = i32::try_from(s.len()).map_err(|_| NbfxError::PayloadTooLarge {
len: s.len(),
max: i32::MAX as u64,
})?;
encode_multibyte_int31(&mut inner, len).map_err(|_| NbfxError::IntOverflow)?;
inner.extend_from_slice(s.as_bytes());
}
let inner_len = i32::try_from(inner.len()).map_err(|_| NbfxError::PayloadTooLarge {
len: inner.len(),
max: i32::MAX as u64,
})?;
let mut out = Vec::new();
encode_multibyte_int31(&mut out, inner_len).map_err(|_| NbfxError::IntOverflow)?;
out.extend_from_slice(&inner);
Ok(out)
}
@@ -306,7 +442,31 @@ pub fn decode_envelope(
input: &[u8],
dynamic: &mut DynamicDictionary,
) -> Result<DecodedEnvelope, EnvelopeError> {
let (tokens, _consumed) = decode_tokens(input, dynamic)?;
// Strip + decode the WCF binary header block (action+to pre-pop)
// if present. The header strings get assigned odd dict ids
// (1, 3, 5, ...); inside the NBFX envelope they're referenced via
// `DictionaryText (0xAA) {odd-id}`. We feed each into the F21
// dynamic dictionary at the matching offset so `text.resolve()`
// returns the right string.
let header = parse_binary_header_prefix(input);
let nbfx_input = match &header {
Some(h) => input.get(h.nbfx_start..).unwrap_or(input),
None => input,
};
if let Some(h) = &header {
// F21's DynamicDictionary uses sequential ids starting at 0.
// Wire ids for dynamic strings are odd (1, 3, 5, ...).
// Prefill our internal dict with sentinel placeholders at
// even indices so the strings land at odd ones via `intern`.
for (i, s) in h.strings.iter().enumerate() {
// Slot 0 → wire id 1, slot 1 → wire id 3, etc. Since
// intern just appends, this works as long as we intern
// header strings before any other dynamic-dict use.
let _ = i; // silence unused-var if we change scheme later
dynamic.intern(s);
}
}
let (tokens, _consumed) = decode_tokens(nbfx_input, dynamic)?;
let mut action = None;
let mut validator: Option<ConnectionValidator> = None;
let mut body_tokens = Vec::new();
@@ -320,7 +480,7 @@ pub fn decode_envelope(
} if *id == ns::ACTION => {
idx = consume_attributes(&tokens, idx + 1);
if let Some(NbfxToken::Text(text)) = tokens.get(idx) {
action = text.resolve(dynamic);
action = resolve_with_header(text, dynamic, header.as_ref());
idx += 1;
}
idx = skip_until_end(&tokens, idx);
@@ -48,8 +48,9 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
let connection_id = [0xAAu8; 16];
let public_key = vec![0xBBu8; 32];
let body = build_connect_request_body(connection_id, &public_key);
let envelope = SoapEnvelope::new(actions::CONNECT).with_body_tokens(body);
let _ = via.clone(); // keep $via in scope for the eprintln above
let envelope = SoapEnvelope::new(actions::CONNECT)
.with_to(&via)
.with_body_tokens(body);
let mut dynamic = DynamicDictionary::new();
let payload = encode_envelope(&envelope, &mut dynamic)?;
eprintln!("envelope NBFX bytes: {}", payload.len());
+1 -1
@@ -43,8 +43,8 @@
//! connects to us, but the URL inside the preamble routes correctly
//! at SMSvcHost).
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};