docs: record parser hot-path allocation strategy
@@ -1,8 +1,46 @@
# Go vs .NET NATS Server — Benchmark Comparison

Benchmark run: 2026-03-13. Both servers running on the same machine, tested with identical NATS.Client.Core workloads. Test parallelization disabled to avoid resource contention. Best-of-3 runs reported.

Benchmark run: 2026-03-13 10:06 AM America/Indiana/Indianapolis. The latest refresh used the benchmark project README command (`dotnet test tests/NATS.Server.Benchmark.Tests --filter "Category=Benchmark" -v normal --logger "console;verbosity=detailed"`) and completed successfully as a `.NET`-only run. The Go/.NET comparison tables below remain the last Go-capable comparison baseline.

**Environment:** Apple M4, .NET 10, Go nats-server (latest from `golang/nats-server/`).

**Environment:** Apple M4, .NET SDK 10.0.101, README benchmark command run in the benchmark project's default `Debug` configuration, Go toolchain installed but the current full-suite run emitted only `.NET` result blocks.

---

## Latest README Run (.NET only)

The current refresh came from `/tmp/bench-output.txt` using the benchmark project README workflow. Because the run did not emit any Go comparison blocks, the values below are the latest `.NET`-only numbers from that run, and the historical Go/.NET comparison tables are preserved below instead of being overwritten with mixed-source ratios.

### Core and JetStream

| Benchmark | .NET msg/s | .NET MB/s | Notes |
|-----------|------------|-----------|-------|
| Single Publisher (16B) | 1,392,442 | 21.2 | README full-suite run |
| Single Publisher (128B) | 1,491,226 | 182.0 | README full-suite run |
| PubSub 1:1 (16B) | 717,731 | 11.0 | README full-suite run |
| PubSub 1:1 (16KB) | 28,450 | 444.5 | README full-suite run |
| Fan-Out 1:4 (128B) | 1,451,748 | 177.2 | README full-suite run |
| Multi 4Px4S (128B) | 244,878 | 29.9 | README full-suite run |
| Request-Reply Single (128B) | 6,840 | 0.8 | P50 142.5 us, P99 203.9 us |
| Request-Reply 10Cx2S (16B) | 22,844 | 0.3 | P50 421.1 us, P99 602.1 us |
| JS Sync Publish (16B Memory) | 12,619 | 0.2 | README full-suite run |
| JS Async Publish (128B File) | 46,631 | 5.7 | README full-suite run |
| JS Ordered Consumer (128B) | 108,057 | 13.2 | README full-suite run |
| JS Durable Fetch (128B) | 490,090 | 59.8 | README full-suite run |
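The MB/s column appears to be derived from the message rate and the payload size alone. The sketch below, written in Go purely for illustration, reproduces a few table values under two assumptions that the document does not state: that "MB" means 2^20 bytes, and that only payload bytes are counted (for the parser PING row, the 6-byte `PING\r\n` frame). The function name `mibPerSec` is hypothetical and is not part of the benchmark project.

```go
package main

import "fmt"

// mibPerSec converts a message rate and a per-message byte count into MiB/s.
// Assumption: the tables treat 1 MB as 2^20 bytes and count payload bytes only.
func mibPerSec(msgsPerSec float64, bytesPerMsg int) float64 {
	return msgsPerSec * float64(bytesPerMsg) / (1 << 20)
}

func main() {
	// Rates are copied from the tables in this document.
	fmt.Printf("Single Publisher (16B): %.1f\n", mibPerSec(1392442, 16))
	fmt.Printf("PubSub 1:1 (16KB):      %.1f\n", mibPerSec(28450, 16*1024))
	// Assumption: a PING frame is the 6 bytes "PING\r\n".
	fmt.Printf("Parser PING:            %.1f\n", mibPerSec(5756370, 6))
}
```

Under these assumptions the three calls round to 21.2, 444.5, and 32.9, matching the reported figures, which suggests that protocol framing overhead is not included in the throughput column.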

### Parser Microbenchmarks

| Benchmark | Ops/s | MB/s | Alloc |
|-----------|-------|------|-------|
| Parser PING | 5,756,370 | 32.9 | 0.0 B/op |
| Parser PUB | 2,537,973 | 96.8 | 40.0 B/op |
| Parser HPUB | 2,298,811 | 122.8 | 40.0 B/op |
| Parser PUB split payload | 2,049,535 | 78.2 | 176.0 B/op |

### Current Run Highlights

1. The parser microbenchmarks show the hot path is already at zero allocation for `PING`, with contiguous `PUB` and `HPUB` still paying a small fixed cost for retained field copies.
2. Split-payload `PUB` remains meaningfully more allocation-heavy than contiguous `PUB` because the parser must preserve unread payload state across reads and then materialize contiguous memory at the current client boundary.
3. The README-driven suite was a `.NET`-only refresh, so the comparative Go/.NET ratios below should still be treated as the last Go-capable baseline rather than current same-run ratios.
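To make the second highlight concrete, here is a minimal sketch of why a payload split across reads costs more than a contiguous one. It is written in Go for illustration only and is not the .NET server's parser: `payloadReader`, `begin`, and `feed` are hypothetical names, and the sketch covers only payload bytes, not the retained field copies behind the fixed 40 B/op. When the whole payload sits in one read buffer, it can be returned as a slice of that buffer with no copy. When it does not, the bytes seen so far must be retained in a separate buffer until the rest arrives.

```go
package main

import "fmt"

// payloadReader is a hypothetical helper that tracks one in-flight payload.
type payloadReader struct {
	need    int    // payload bytes still expected
	pending []byte // bytes retained from earlier reads; nil when nothing is buffered
}

// begin starts a new payload of n bytes (n would come from the PUB size field).
func (p *payloadReader) begin(n int) {
	p.need = n
	p.pending = nil
}

// feed consumes payload bytes from one read buffer. It returns the complete
// payload once available, how many bytes of buf it consumed, and whether the
// payload is complete.
func (p *payloadReader) feed(buf []byte) (payload []byte, consumed int, done bool) {
	if p.pending == nil && len(buf) >= p.need {
		// Contiguous case: the payload is a view into buf, no allocation.
		n := p.need
		p.need = 0
		return buf[:n], n, true
	}
	if p.pending == nil {
		// Split case: allocate once, sized for the whole payload.
		p.pending = make([]byte, 0, p.need)
	}
	n := len(buf)
	if n > p.need {
		n = p.need
	}
	p.pending = append(p.pending, buf[:n]...) // copy, because buf will be reused
	p.need -= n
	if p.need == 0 {
		out := p.pending
		p.pending = nil
		return out, n, true
	}
	return nil, n, false
}

func main() {
	var p payloadReader

	// One read holds the whole payload plus the trailing CRLF.
	p.begin(5)
	got, n, done := p.feed([]byte("hello\r\n"))
	fmt.Println(string(got), n, done)

	// The same payload arriving in two reads.
	p.begin(5)
	_, n, done = p.feed([]byte("he"))
	fmt.Println(n, done)
	got, n, done = p.feed([]byte("llo\r\n"))
	fmt.Println(string(got), n, done)
}
```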
---