natsdotnet/benchmarks.md
Joseph Doherty 37575dc41c feat: add benchmark test project for Go vs .NET server comparison
Side-by-side performance benchmarks using NATS.Client.Core against both
servers on ephemeral ports. Includes core pub/sub, request/reply latency,
and JetStream throughput tests with comparison output and
benchmarks_comparison.md results. Also fixes timestamp flakiness in
StoreInterfaceTests by using explicit timestamps.
2026-03-13 01:23:31 -04:00


# NATS Go Server — Reference Benchmark Numbers

Typical throughput and latency figures for the Go NATS server, collected from official documentation and community benchmarks. These serve as performance targets for the .NET port.

## Test Environment

The official NATS docs benchmarks were run on an Apple M4 (10 cores: 4P + 6E) with 16 GB RAM. Numbers will vary by hardware — treat these as order-of-magnitude targets, not exact goals.


## Core NATS — Pub/Sub Throughput

All figures use the nats bench tool with default 16-byte messages unless noted.
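These runs can be reproduced against a local server with the `nats bench` CLI. A minimal sketch — the subject name is arbitrary, and the flag spellings follow older natscli releases (recent versions restructure `nats bench` into subcommands), so check `nats bench --help` on your build:

```shell
# Start a throwaway local server (assumes nats-server is on PATH).
nats-server -p 4222 &

# Publish-only: one publisher, 16-byte messages, no subscribers.
nats bench bench.test --pub 1 --msgs 10000000 --size 16

# 1:1 pub/sub: one publisher and one subscriber measured together.
nats bench bench.test --pub 1 --sub 1 --msgs 1000000 --size 16
```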

### Single Publisher (no subscribers)

| Messages | Payload | Throughput                       | Latency |
|----------|---------|----------------------------------|---------|
| 1M       | 16 B    | 14,786,683 msgs/sec (~226 MiB/s) | 0.07 µs |

### Publisher + Subscriber (1:1)

| Messages | Payload | Throughput (each side)          | Latency |
|----------|---------|---------------------------------|---------|
| 1M       | 16 B    | ~4,927,000 msgs/sec (~75 MiB/s) | 0.20 µs |
| 100K     | 16 KB   | ~228,000 msgs/sec (~3.5 GiB/s)  | 4.3 µs  |

### Fan-Out (1 Publisher : N Subscribers)

| Subscribers | Payload | Per-Subscriber Rate | Aggregate                       | Latency |
|-------------|---------|---------------------|---------------------------------|---------|
| 4           | 128 B   | ~1,010,000 msgs/sec | 4,015,923 msgs/sec (~490 MiB/s) | ~1.0 µs |

### Multi-Publisher / Multi-Subscriber (N:M)

| Config  | Payload | Pub Aggregate                   | Sub Aggregate                   | Pub Latency | Sub Latency |
|---------|---------|---------------------------------|---------------------------------|-------------|-------------|
| 4P × 4S | 128 B   | 1,080,144 msgs/sec (~132 MiB/s) | 4,323,201 msgs/sec (~528 MiB/s) | 3.7 µs      | 0.93 µs     |
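The fan-out and N:M scenarios map directly onto the publisher and subscriber counts passed to `nats bench`. A sketch, again using the older single-command syntax (verify flag names against your CLI version):

```shell
# Fan-out: 1 publisher to 4 subscribers, 128-byte payloads.
nats bench bench.test --pub 1 --sub 4 --msgs 1000000 --size 128

# N:M: 4 publishers and 4 subscribers (the 4P x 4S row).
nats bench bench.test --pub 4 --sub 4 --msgs 1000000 --size 128
```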

## Core NATS — Request/Reply Latency

| Config                 | Payload | Throughput       | Avg Latency |
|------------------------|---------|------------------|-------------|
| 1 client, 1 service    | 128 B   | 19,659 msgs/sec  | 50.9 µs     |
| 50 clients, 2 services | 16 B    | 132,438 msgs/sec | ~370 µs     |
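Request/reply numbers come from pairing responder processes with requesters. In older natscli this was the `--reply`/`--request` flag pair — a sketch under that assumption, run in two terminals:

```shell
# Terminal 1: one service replying to requests on the bench subject.
nats bench bench.svc --sub 1 --reply

# Terminal 2: one client issuing 128-byte requests and timing replies.
nats bench bench.svc --pub 1 --request --msgs 100000 --size 128
```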

## Core NATS — Tail Latency Under Load

From the Brave New Geek latency benchmarks (loopback, request/reply):

| Payload | Rate                     | p99.99   | p99.9999 | Notes              |
|---------|--------------------------|----------|----------|--------------------|
| 256 B   | 3,000 req/s              | sub-ms   | sub-ms   | Minimal load       |
| 1 KB    | 3,000 req/s              | sub-ms   | ~1.2 ms  |                    |
| 5 KB    | 2,000 req/s              | sub-ms   | ~1.2 ms  |                    |
| 1 KB    | 20,000 req/s (25 conns)  | elevated | ~90 ms   | Concurrent load    |
| 1 MB    | 100 req/s                |          | ~214 ms  | Large payload tail |

A protocol parser optimization improved 5 KB latencies by ~30% and 1 MB latencies by ~90% up to p90.


## JetStream — Publication

| Mode              | Payload | Storage | Throughput                     | Latency |
|-------------------|---------|---------|--------------------------------|---------|
| Synchronous       | 16 B    | Memory  | 35,734 msgs/sec (~558 KiB/s)   | 28.0 µs |
| Batch (1000 msgs) | 16 B    | Memory  | 627,430 msgs/sec (~9.6 MiB/s)  | 1.6 µs  |
| Async             | 128 B   | File    | 403,828 msgs/sec (~49 MiB/s)   | 2.5 µs  |
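A sketch of the three publication modes using the older natscli JetStream flags (`--js`, `--syncpub`, `--pubbatch`, `--storage`); these names are from memory of that flag set and may differ in current releases:

```shell
# Synchronous publish (one ack round-trip per message) to a memory stream.
nats bench bench.js --js --pub 1 --msgs 100000 --size 16 --storage memory --syncpub

# Async publish with acks awaited in batches of 1000.
nats bench bench.js --js --pub 1 --msgs 1000000 --size 16 --storage memory --pubbatch 1000

# Async publish to a file-backed stream, 128-byte payloads.
nats bench bench.js --js --pub 1 --msgs 1000000 --size 128 --storage file
```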

## JetStream — Consumption

| Mode                            | Clients | Throughput                      | Latency |
|---------------------------------|---------|---------------------------------|---------|
| Ordered ephemeral consumer      | 1       | 1,201,540 msgs/sec (~147 MiB/s) | 0.83 µs |
| Durable consumer (callback)     | 4       | 290,438 msgs/sec (~36 MiB/s)    | 13.7 µs |
| Durable consumer fetch (no ack) | 2       | 1,128,932 msgs/sec (~138 MiB/s) | 1.76 µs |
| Direct sync get                 | 1       | 33,244 msgs/sec (~4.1 MiB/s)    | 30.1 µs |
| Batched get                     | 2       | 1,000,898 msgs/sec (~122 MiB/s) |         |
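Consumption benchmarks replay a pre-filled stream. A minimal sketch under the same older-natscli flag assumptions (`--sub` for an ordered push consume, `--pull`/`--pullbatch` for pull-consumer fetches):

```shell
# Fill the stream once, then drain it with a single ordered consumer.
nats bench bench.js --js --pub 1 --msgs 1000000 --size 16
nats bench bench.js --js --sub 1 --msgs 1000000

# Two clients fetching from a pull consumer in batches of 100.
nats bench bench.js --js --sub 2 --pull --pullbatch 100 --msgs 1000000
```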

## JetStream — Key-Value Store

| Operation             | Clients | Payload | Throughput                     | Latency |
|-----------------------|---------|---------|--------------------------------|---------|
| Sync put              | 1       | 128 B   | 30,067 msgs/sec (~3.7 MiB/s)   | 33.3 µs |
| Get (randomized keys) | 16      | 128 B   | 102,844 msgs/sec (~13 MiB/s)   | ~153 µs |

## Resource Usage

| Scenario                                   | Footprint              |
|--------------------------------------------|------------------------|
| Core NATS at 2M msgs/sec (1 pub + 1 sub)   | ~11 MB RSS             |
| JetStream production (recommended minimum) | 4 CPU cores, 8 GiB RAM |

## Sources