mxaccess

Author	SHA1	Message	Date
Joseph Doherty	ceeaeefa71	[F52.3] mxaccess-codec: caller-supplied scratch buffer for write encoder Adds `write_message::encode_into_bytes_mut` (and the timestamped variant) which writes the encoded body into a caller-supplied `BytesMut`. The buffer is cleared and resized in place each call; once it has grown to the largest body the session will produce, it allocates nothing further. A session that holds a single `BytesMut` and reuses it across writes: - Int32 / Float32 / Float64: 2 → 1 allocs/op (only the `encode_scalar_value` scratch `Vec<u8>` remains) - Boolean: 1 → 0 allocs/op (no per-value scratch — the literal payload is a stack `[u8; 4]`) Bench delta in `design/M6-bench-baseline.md` § F52.3. The `encode_scalar_value` Vec is the remaining 1 alloc/op for fixed-width scalars; eliminating it would require inlining the LE-bytes write into the body slice (left for a follow-up since the F52 spec only asks for 2 → 1). Resolves F52 (all three optimisations landed: `4e76b44` F52.1, `a0fa5be` F52.2, this commit F52.3). Existing `encode` / `encode_to_bytes_mut` public surface unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:53:07 -04:00
Joseph Doherty	a0fa5bedfd	[F52.2] mxaccess-codec: thread-local name-signature cache Adds a thread-local `HashMap<String, u16>` cache inside `compute_name_signature`. Repeated calls with the same name (the hot path inside `MxReferenceHandle::from_names`) skip the `to_lowercase` allocation and the CRC-16/IBM walk entirely. Bounded at 1024 entries per thread; on overflow the cache is cleared rather than evicted LRU — any sane workload re-fills only the names it actively uses. `MxReferenceHandle::from_names` drops from 2 → 0 allocs/op once warm (bench delta in `design/M6-bench-baseline.md` § F52.2). Cold-path behaviour is unchanged: first call with a new name still pays the `to_lowercase` + cache-key `String` allocations. Two new tests pin the cache: cache-hit returns the same value as cold-compute, and cache overflow doesn't break correctness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:50:07 -04:00
Joseph Doherty	4e76b44391	[F52.1] mxaccess-codec: BytesMut output buffer for write encoder Adds `write_message::encode_to_bytes_mut` (and the timestamped variant) returning a freshly-allocated `BytesMut`. Allocation count is identical to `encode` (2 allocs/op for fixed-width scalars); the benefit is downstream — consumers can `BytesMut::split_to` / `freeze` and forward the body bytes to a wire-level sink without an intermediate copy. The body builders (`encode_boolean` / `encode_fixed` / `encode_variable` / `encode_array`) were refactored to fill a pre-sized `&mut [u8]` rather than each allocating their own `Vec<u8>`. The dispatcher computes the body size up front via small `*_body_size` helpers and resizes the destination buffer (Vec or BytesMut) once. This is also the prerequisite refactor for F52.3. Bench delta in `design/M6-bench-baseline.md` § F52.1; existing `encode` row unchanged at 2 allocs/op. All 265 round-trip tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:46:02 -04:00
Joseph Doherty	71c69b80c6	[F38] mxaccess-codec: counting-allocator bench harness + R12 baseline Hand-rolled GlobalAlloc wrapper around System that tracks allocs + bytes + deallocs via two atomics. Each scenario runs 10k iterations after a 1k warm-up; output is a markdown table with allocs/op, bytes/op, deallocs/op. Why hand-rolled (not dhat/criterion): R12 gates on a single number ("< 5 allocs/write"). dhat is heap-profiling-oriented (call-stack attribution, JSON snapshots); criterion measures wall-clock latency which is reported-but-not-gated per 60-roadmap.md:104. A 50-line GlobalAlloc + atomic counters is the simplest thing that answers the gate. Run: `cargo bench -p mxaccess-codec` Baseline numbers (release, Windows x64): - Bool write: 1.00 allocs/op - Int32 write: 2.00 allocs/op - Float32 write: 2.00 allocs/op - Float64 write: 2.00 allocs/op - String write: 4.00 allocs/op (5-char string) - Handle from_names: 2.00 allocs/op - DataUpdate decode: 1.00 alloc/op R12's < 5 allocs/write target is already met across the proven matrix without any zero-copy work. The bench gates on this — any write_message::encode scenario at >= 5 allocs/op exits the harness with code 1. Companion: `design/M6-bench-baseline.md` documents the numbers, explains the per-scenario breakdown, and tightens F39's scope from "hit the target" to "nice-to-have optimisations" (BytesMut output buffer, name-signature cache, session-level scratch pool). Workspace: 759 tests still pass; clippy --benches clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 04:45:33 -04:00
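
The F38 message describes a hand-rolled `GlobalAlloc` wrapper around `System` with atomic counters, run for a warm-up phase and then a measured phase. A minimal sketch of that idea follows, using only the standard library. Every name in it (`CountingAlloc`, `ALLOCS`, `BYTES`, `allocs_per_op`) is hypothetical and chosen for illustration; this is not the harness from the commit, and it leaves out dealloc counting and the markdown output.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical counters; the real harness may track different quantities.
static ALLOCS: AtomicU64 = AtomicU64::new(0);
static BYTES: AtomicU64 = AtomicU64::new(0);

/// Forwards every request to the system allocator and counts it on the way.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        BYTES.fetch_add(layout.size() as u64, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `op` for `warmup` unmeasured iterations, then `iters` measured ones,
/// and returns the average number of allocations per measured iteration.
fn allocs_per_op<F: FnMut()>(warmup: u32, iters: u32, mut op: F) -> f64 {
    for _ in 0..warmup {
        op();
    }
    let before = ALLOCS.load(Ordering::Relaxed);
    for _ in 0..iters {
        op();
    }
    let after = ALLOCS.load(Ordering::Relaxed);
    (after - before) as f64 / f64::from(iters)
}

fn main() {
    // Stand-in workload: one heap allocation per iteration.
    let per_op = allocs_per_op(1_000, 10_000, || {
        black_box(Vec::<u8>::with_capacity(16));
    });
    println!("allocs/op: {per_op:.2}");
    println!("total bytes requested so far: {}", BYTES.load(Ordering::Relaxed));
}
```

Because the counters are process-wide, allocations made on other threads during the measured loop are counted too, so a gate built on this approach is most reliable when the bench runs single-threaded.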

4 Commits
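
The F52.2 message describes a thread-local `HashMap<String, u16>` cache, bounded at 1024 entries and cleared wholesale on overflow. The sketch below illustrates that pattern under stated assumptions: `slow_signature` is a placeholder hash and is not the CRC-16/IBM routine the commit refers to, the cache is assumed to be keyed on the name exactly as passed, and the names `cached_signature`, `SIGNATURE_CACHE`, and `cache_len` are hypothetical.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Bound taken from the commit message.
const MAX_ENTRIES: usize = 1024;

thread_local! {
    // One cache per thread, so no locking is needed.
    static SIGNATURE_CACHE: RefCell<HashMap<String, u16>> =
        RefCell::new(HashMap::new());
}

/// Placeholder for the expensive computation. NOT the real CRC-16/IBM walk;
/// it only mimics the shape (lowercase the name, then fold over the bytes).
fn slow_signature(name: &str) -> u16 {
    name.to_lowercase()
        .bytes()
        .fold(0u16, |acc, b| acc.wrapping_mul(31).wrapping_add(u16::from(b)))
}

/// Returns the cached signature for `name`, computing and storing it on a miss.
fn cached_signature(name: &str) -> u16 {
    SIGNATURE_CACHE.with(|cell| {
        let mut cache = cell.borrow_mut();
        // Hit: lookup by &str, no allocation and no recomputation.
        if let Some(&sig) = cache.get(name) {
            return sig;
        }
        // Miss: pay for the computation and the key allocation once.
        let sig = slow_signature(name);
        if cache.len() >= MAX_ENTRIES {
            // Clear everything instead of tracking LRU order.
            cache.clear();
        }
        cache.insert(name.to_owned(), sig);
        sig
    })
}

/// Number of entries cached on the current thread.
fn cache_len() -> usize {
    SIGNATURE_CACHE.with(|cell| cell.borrow().len())
}

fn main() {
    let cold = cached_signature("Tank01.Level");
    let warm = cached_signature("Tank01.Level");
    println!("cold == warm: {}", cold == warm);
    println!("entries on this thread: {}", cache_len());
}
```

Clearing on overflow keeps the hit path to a single hash lookup with no bookkeeping; the cost is a burst of recomputation right after a clear, which matters only for workloads that cycle through more than 1024 distinct names.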