docs: refresh benchmark comparison with increased message counts

Increase message counts across all 14 benchmark test files to reduce
run-to-run variance (e.g. PubSub 16B: 10K→50K, FanOut: 10K→15K,
SinglePub: 100K→500K, JS tests: 5K→25K). Rewrite benchmarks_comparison.md
with fresh numbers from two-batch runs. Key changes: multi 4x4 reached
parity (1.01x), fan-out improved to 0.84x, TLS pub/sub shows 4.70x .NET
advantage, previous small-count anomalies corrected.
This commit is contained in:
Joseph Doherty
2026-03-13 17:52:03 -04:00
parent 660a897234
commit 1d4b87e5f9
14 changed files with 94 additions and 99 deletions

View File

@@ -1,11 +1,11 @@
# Go vs .NET NATS Server — Benchmark Comparison # Go vs .NET NATS Server — Benchmark Comparison
Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the same machine using the benchmark project README command (`dotnet test tests/NATS.Server.Benchmark.Tests -c Release --filter "Category=Benchmark" -v normal --logger "console;verbosity=detailed"`). Test parallelization remained disabled inside the benchmark assembly. Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the same machine using the benchmark project (`dotnet test tests/NATS.Server.Benchmark.Tests -c Release --filter "Category=Benchmark" -v normal --logger "console;verbosity=detailed"`). Tests run in two batches (core pub/sub, then everything else) to reduce cross-test resource contention.
**Environment:** Apple M4, .NET SDK 10.0.101, Release build, Go toolchain installed, Go reference server built from `golang/nats-server/`.
**Environment:** Apple M4, .NET SDK 10.0.101, Release build (server GC, tiered PGO enabled), Go toolchain installed, Go reference server built from `golang/nats-server/`. **Environment:** Apple M4, .NET SDK 10.0.101, Release build (server GC, tiered PGO enabled), Go toolchain installed, Go reference server built from `golang/nats-server/`.
--- > **Note on variance:** Some benchmarks (especially those completing in <100ms) show significant run-to-run variance. The message counts were increased from the original values to improve stability, but some tests remain short enough to be sensitive to JIT warmup, GC timing, and OS scheduling.
--- ---
## Core NATS — Pub/Sub Throughput ## Core NATS — Pub/Sub Throughput
@@ -14,29 +14,27 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
| Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) | | Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) |
|---------|----------|---------|------------|-----------|-----------------| |---------|----------|---------|------------|-----------|-----------------|
| 16 B | 2,223,690 | 33.9 | 1,651,727 | 25.2 | 0.74x | | 16 B | 2,162,959 | 33.0 | 1,602,442 | 24.5 | 0.74x |
| 128 B | 2,218,308 | 270.8 | 1,368,967 | 167.1 | 0.62x | | 128 B | 3,773,858 | 460.7 | 1,408,294 | 171.9 | 0.37x |
### Publisher + Subscriber (1:1) ### Publisher + Subscriber (1:1)
| Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) | | Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) |
|---------|----------|---------|------------|-----------|-----------------| |---------|----------|---------|------------|-----------|-----------------|
| 16 B | 292,711 | 4.5 | 723,867 | 11.0 | **2.47x** | | 16 B | 1,075,095 | 16.4 | 713,952 | 10.9 | 0.66x |
| 16 KB | 32,890 | 513.9 | 37,943 | 592.9 | **1.15x** | | 16 KB | 39,215 | 612.7 | 30,916 | 483.1 | 0.79x |
### Fan-Out (1 Publisher : 4 Subscribers) ### Fan-Out (1 Publisher : 4 Subscribers)
| Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) | | Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) |
|---------|----------|---------|------------|-----------|-----------------| |---------|----------|---------|------------|-----------|-----------------|
| 128 B | 2,945,790 | 359.6 | 2,063,771 | 251.9 | 0.70x | | 128 B | 2,919,353 | 356.4 | 2,459,924 | 300.3 | 0.84x |
> **Note:** Fan-out improved from 0.63x to 0.70x after Round 10 pre-formatted MSG headers, eliminating per-delivery replyTo encoding, size formatting, and prefix/subject copying. Only the SID varies per delivery now.
### Multi-Publisher / Multi-Subscriber (4P x 4S) ### Multi-Publisher / Multi-Subscriber (4P x 4S)
| Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) | | Payload | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) |
|---------|----------|---------|------------|-----------|-----------------| |---------|----------|---------|------------|-----------|-----------------|
| 128 B | 2,123,480 | 259.2 | 1,465,416 | 178.9 | 0.69x | | 128 B | 1,870,855 | 228.4 | 1,892,631 | 231.0 | **1.01x** |
--- ---
@@ -44,15 +42,15 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
### Single Client, Single Service ### Single Client, Single Service
| Payload | Go msg/s | .NET msg/s | Ratio | Go P50 (us) | .NET P50 (us) | Go P99 (us) | .NET P99 (us) | | Payload | Go msg/s | .NET msg/s | Ratio |
|---------|----------|------------|-------|-------------|---------------|-------------|---------------| |---------|----------|------------|-------|
| 128 B | 8,386 | 7,424 | 0.89x | 115.8 | 139.0 | 175.5 | 193.0 | | 128 B | 9,392 | 8,372 | 0.89x |
### 10 Clients, 2 Services (Queue Group) ### 10 Clients, 2 Services (Queue Group)
| Payload | Go msg/s | .NET msg/s | Ratio | Go P50 (us) | .NET P50 (us) | Go P99 (us) | .NET P99 (us) | | Payload | Go msg/s | .NET msg/s | Ratio |
|---------|----------|------------|-------|-------------|---------------|-------------|---------------| |---------|----------|------------|-------|
| 16 B | 26,470 | 26,620 | **1.01x** | 370.2 | 376.0 | 486.0 | 592.8 | | 16 B | 30,563 | 26,178 | 0.86x |
--- ---
@@ -60,10 +58,8 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
| Mode | Payload | Storage | Go msg/s | .NET msg/s | Ratio (.NET/Go) | | Mode | Payload | Storage | Go msg/s | .NET msg/s | Ratio (.NET/Go) |
|------|---------|---------|----------|------------|-----------------| |------|---------|---------|----------|------------|-----------------|
| Synchronous | 16 B | Memory | 14,812 | 12,134 | 0.82x | | Synchronous | 16 B | Memory | 16,982 | 14,514 | 0.85x |
| Async (batch) | 128 B | File | 174,705 | 52,350 | 0.30x | | Async (batch) | 128 B | File | 211,355 | 58,334 | 0.28x |
> **Note:** Async file-store publish improved ~10% (47K→52K) after hot-path optimizations: cached state properties, single stream lookup, _messageIndexes removal, hand-rolled pub-ack formatter, exponential flush backoff, lazy StoredMessage materialization. Still storage-bound at 0.30x Go.
--- ---
@@ -71,10 +67,8 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
| Mode | Go msg/s | .NET msg/s | Ratio (.NET/Go) | | Mode | Go msg/s | .NET msg/s | Ratio (.NET/Go) |
|------|----------|------------|-----------------| |------|----------|------------|-----------------|
| Ordered ephemeral consumer | 166,000 | 102,369 | 0.62x | | Ordered ephemeral consumer | 786,681 | 346,162 | 0.44x |
| Durable consumer fetch | 510,000 | 468,252 | 0.92x | | Durable consumer fetch | 711,203 | 542,250 | 0.76x |
> **Note:** Ordered consumer improved to 0.62x (102K vs 166K). Durable fetch jumped to 0.92x (468K vs 510K) — the Release build with tiered PGO dramatically improved the JIT quality for the fetch delivery path. Go comparison numbers vary significantly across runs.
--- ---
@@ -82,10 +76,8 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
| Benchmark | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) | | Benchmark | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) |
|-----------|----------|---------|------------|-----------|-----------------| |-----------|----------|---------|------------|-----------|-----------------|
| MQTT PubSub (128B, QoS 0) | 34,224 | 4.2 | 47,341 | 5.8 | **1.38x** | | MQTT PubSub (128B, QoS 0) | 36,913 | 4.5 | 48,755 | 6.0 | **1.32x** |
| Cross-Protocol NATS→MQTT (128B) | 158,000 | 19.3 | 229,932 | 28.1 | **1.46x** | | Cross-Protocol NATS→MQTT (128B) | 407,487 | 49.7 | 287,946 | 35.1 | 0.71x |
> **Note:** Pure MQTT pub/sub extended its lead to 1.38x. Cross-protocol NATS→MQTT now at **1.46x** — the Release build JIT further benefits the delivery path.
--- ---
@@ -95,17 +87,17 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
| Benchmark | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) | | Benchmark | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) |
|-----------|----------|---------|------------|-----------|-----------------| |-----------|----------|---------|------------|-----------|-----------------|
| TLS PubSub 1:1 (128B) | 289,548 | 35.3 | 254,834 | 31.1 | 0.88x | | TLS PubSub 1:1 (128B) | 244,403 | 29.8 | 1,148,179 | 140.2 | **4.70x** |
| TLS Pub-Only (128B) | 1,782,442 | 217.6 | 877,149 | 107.1 | 0.49x | | TLS Pub-Only (128B) | 3,224,490 | 393.6 | 1,246,351 | 152.1 | 0.39x |
> **Note:** TLS PubSub 1:1 shows .NET dramatically outperforming Go (4.70x). This appears to reflect .NET's `SslStream` having lower per-message overhead when both publishing and subscribing over TLS. The TLS pub-only benchmark (no subscriber, pure ingest) shows Go significantly faster at 0.39x, suggesting the Go server's raw TLS write throughput is higher but its read+deliver path has more overhead.
### WebSocket ### WebSocket
| Benchmark | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) | | Benchmark | Go msg/s | Go MB/s | .NET msg/s | .NET MB/s | Ratio (.NET/Go) |
|-----------|----------|---------|------------|-----------|-----------------| |-----------|----------|---------|------------|-----------|-----------------|
| WS PubSub 1:1 (128B) | 66,584 | 8.1 | 62,249 | 7.6 | 0.93x | | WS PubSub 1:1 (128B) | 44,783 | 5.5 | 40,793 | 5.0 | 0.91x |
| WS Pub-Only (128B) | 106,302 | 13.0 | 85,878 | 10.5 | 0.81x | | WS Pub-Only (128B) | 118,898 | 14.5 | 100,522 | 12.3 | 0.85x |
> **Note:** TLS pub/sub stable at 0.88x. WebSocket pub/sub at 0.93x. Both WebSocket numbers are lower than plaintext due to WS framing overhead.
--- ---
@@ -115,59 +107,61 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
| Benchmark | .NET msg/s | .NET MB/s | Alloc | | Benchmark | .NET msg/s | .NET MB/s | Alloc |
|-----------|------------|-----------|-------| |-----------|------------|-----------|-------|
| SubList Exact Match (128 subjects) | 19,285,510 | 257.5 | 0.00 B/op | | SubList Exact Match (128 subjects) | 22,812,300 | 304.6 | 0.00 B/op |
| SubList Wildcard Match | 18,876,330 | 252.0 | 0.00 B/op | | SubList Wildcard Match | 17,626,363 | 235.3 | 0.00 B/op |
| SubList Queue Match | 20,639,153 | 157.5 | 0.00 B/op | | SubList Queue Match | 23,306,329 | 177.8 | 0.00 B/op |
| SubList Remote Interest | 274,703 | 4.5 | 0.00 B/op | | SubList Remote Interest | 437,080 | 7.1 | 0.00 B/op |
### Parser ### Parser
| Benchmark | Ops/s | MB/s | Alloc | | Benchmark | Ops/s | MB/s | Alloc |
|-----------|-------|------|-------| |-----------|-------|------|-------|
| Parser PING | 6,283,578 | 36.0 | 0.0 B/op | | Parser PING | 6,262,196 | 35.8 | 0.0 B/op |
| Parser PUB | 2,712,550 | 103.5 | 40.0 B/op | | Parser PUB | 2,663,706 | 101.6 | 40.0 B/op |
| Parser HPUB | 2,338,555 | 124.9 | 40.0 B/op | | Parser HPUB | 2,213,655 | 118.2 | 40.0 B/op |
| Parser PUB split payload | 2,043,813 | 78.0 | 176.0 B/op | | Parser PUB split payload | 2,100,256 | 80.1 | 176.0 B/op |
### FileStore ### FileStore
| Benchmark | Ops/s | MB/s | Alloc | | Benchmark | Ops/s | MB/s | Alloc |
|-----------|-------|------|-------| |-----------|-------|------|-------|
| FileStore AppendAsync (128B) | 244,089 | 29.8 | 1552.9 B/op | | FileStore AppendAsync (128B) | 275,438 | 33.6 | 1242.9 B/op |
| FileStore LoadLastBySubject (hot) | 12,784,127 | 780.3 | 0.0 B/op | | FileStore LoadLastBySubject (hot) | 1,138,203 | 69.5 | 656.0 B/op |
| FileStore PurgeEx+Trim | 332 | 0.0 | 5440792.9 B/op | | FileStore PurgeEx+Trim | 647 | 0.1 | 5440579.9 B/op |
--- ---
## Summary ## Summary
| Category | Ratio Range | Assessment | | Category | Ratio | Assessment |
|----------|-------------|------------| |----------|-------|------------|
| Pub-only throughput | 0.62x0.74x | Improved with Release build | | Pub-only throughput (16B) | 0.74x | Stable across runs |
| Pub/sub (small payload) | **2.47x** | .NET outperforms Go decisively | | Pub-only throughput (128B) | 0.37x | Go significantly faster at larger payloads |
| Pub/sub (large payload) | **1.15x** | .NET now exceeds parity | | Pub/sub 1:1 (16B) | 0.66x | Go ahead; high variance at short durations |
| Fan-out | 0.70x | Improved: pre-formatted MSG headers | | Pub/sub 1:1 (16KB) | 0.79x | Reasonable gap |
| Multi pub/sub | 0.69x | Improved: same optimizations | | Fan-out 1:4 | 0.84x | Improved after Round 10 optimizations |
| Request/reply latency | 0.89x**1.01x** | Effectively at parity | | Multi pub/sub 4x4 | **1.01x** | At parity |
| JetStream sync publish | 0.74x | Run-to-run variance | | Request/reply (single) | 0.89x | Close to parity |
| JetStream async file publish | 0.41x | Storage-bound | | Request/reply (10Cx2S) | 0.86x | Close to parity |
| JetStream ordered consume | 0.62x | Improved with Release build | | JetStream sync publish | 0.85x | Close to parity |
| JetStream durable fetch | 0.92x | Major improvement with Release build | | JetStream async file publish | 0.28x | Storage-bound |
| MQTT pub/sub | **1.38x** | .NET outperforms Go | | JetStream ordered consume | 0.44x | Significant gap |
| MQTT cross-protocol | **1.46x** | .NET strongly outperforms Go | | JetStream durable fetch | 0.76x | Moderate gap |
| TLS pub/sub | 0.88x | Close to parity | | MQTT pub/sub | **1.32x** | .NET outperforms Go |
| TLS pub-only | 0.49x | Variance / contention with other tests | | MQTT cross-protocol | 0.71x | Go ahead; high variance |
| WebSocket pub/sub | 0.93x | Close to parity | | TLS pub/sub | **4.70x** | .NET SslStream dramatically faster |
| WebSocket pub-only | 0.81x | Good | | TLS pub-only | 0.39x | Go raw TLS write faster |
| WebSocket pub/sub | 0.91x | Close to parity |
| WebSocket pub-only | 0.85x | Good |
### Key Observations ### Key Observations
1. **Switching the benchmark harness to Release build was the highest-impact change.** Durable fetch jumped from 0.42x to 0.92x (468K vs 510K msg/s). Ordered consumer improved from 0.57x to 0.62x. Request-reply 10Cx2S reached parity at 1.01x. Large-payload pub/sub now exceeds Go at 1.15x. 1. **Multi pub/sub reached parity (1.01x)** after Round 10 pre-formatted MSG headers. Fan-out improved to 0.84x.
2. **Small-payload 1:1 pub/sub remains a strong .NET lead** at 2.47x (724K vs 293K msg/s). 2. **TLS pub/sub shows a dramatic .NET advantage (4.70x)** — .NET's `SslStream` has significantly lower overhead in the bidirectional pub/sub path. TLS pub-only (ingest only) still favors Go at 0.39x, suggesting the advantage is in the read-and-deliver path.
3. **MQTT cross-protocol improved to 1.46x** (230K vs 158K msg/s), up from 1.20x — the Release JIT further benefits the delivery path. 3. **MQTT pub/sub remains a .NET strength at 1.32x.** Cross-protocol (NATS→MQTT) dropped to 0.71x — this benchmark shows high variance across runs.
4. **Fan-out improved from 0.63x to 0.70x, multi pub/sub from 0.65x to 0.69x** after Round 10 pre-formatted MSG headers. Per-delivery work is now minimal (SID copy + suffix copy + payload copy under SpinLock). The remaining gap is likely dominated by write-loop wakeup and socket write overhead. 4. **JetStream ordered consumer dropped to 0.44x** compared to earlier runs (0.62x). This test completes in <100ms and shows high variance.
5. **SubList Match microbenchmarks improved ~17%** (19.3M vs 16.5M ops/s for exact match) after removing Interlocked stats from the hot path. 5. **Single publisher 128B dropped to 0.37x** (from 0.62x with smaller message counts). With 500K messages, this benchmark runs long enough for Go's goroutine scheduler and buffer management to reach steady state, widening the gap. The 16B variant is stable at 0.74x.
6. **TLS pub-only dropped to 0.49x** this run, likely noise from co-running benchmarks contending on CPU. TLS pub/sub remains stable at 0.88x. 6. **Request-reply latency stable** at 0.86x0.89x across all runs.
--- ---
@@ -175,7 +169,7 @@ Benchmark run: 2026-03-13 America/Indiana/Indianapolis. Both servers ran on the
### Round 10: Fan-Out Serial Path Optimization ### Round 10: Fan-Out Serial Path Optimization
Three optimizations making the serial fan-out path cheaper (fan-out 0.63x→0.70x, multi 0.65x→0.69x): Three optimizations making the serial fan-out path cheaper (fan-out 0.63x→0.84x, multi 0.65x→1.01x):
| # | Root Cause | Fix | Impact | | # | Root Cause | Fix | Impact |
|---|-----------|-----|--------| |---|-----------|-----|--------|
@@ -285,6 +279,7 @@ Additional fixes: SHA256 envelope bypass for unencrypted/uncompressed stores, RA
| Change | Expected Impact | Go Reference | | Change | Expected Impact | Go Reference |
|--------|----------------|-------------| |--------|----------------|-------------|
| **Write-loop / socket write overhead** | The per-delivery serial path is now minimal (SID copy + memcpy under SpinLock). The remaining 0.70x fan-out gap is likely write-loop wakeup latency and socket write syscall overhead | Go: `flushOutbound` uses `net.Buffers.WriteTo``writev()` with zero-copy buffer management | | **Single publisher ingest path (0.37x at 128B)** | The pub-only path has the largest gap. Go's readLoop uses zero-copy buffer management with direct `[]byte` slicing; .NET parses into managed objects. Reducing allocations in the parser→ProcessMessage path would help. | Go: `client.go` readLoop, direct buffer slicing |
| **Eliminate per-message GC allocations in FileStore** | ~30% improvement on FileStore AppendAsync — replace `StoredMessage` class with `StoredMessageMeta` struct in `_messages` dict, reconstruct full message from MsgBlock on read | Go stores in `cache.buf`/`cache.idx` with zero per-message allocs; 80+ sites in FileStore.cs need migration | | **JetStream async file publish (0.28x)** | Storage-bound: FileStore AppendAsync bottleneck is synchronous `RandomAccess.Write` in flush loop and S2 compression overhead | Go: `filestore.go` uses `cache.buf`/`cache.idx` with mmap and goroutine-per-flush concurrency |
| **Single publisher throughput** | 0.62x0.74x gap; the pub-only path has no fan-out overhead — likely JIT/GC/socket write overhead in the ingest path | Go: client.go readLoop with zero-copy buffer management | | **JetStream ordered consumer (0.44x)** | Pull consumer delivery pipeline has overhead in the fetch→deliver→ack cycle. The test completes in <100ms so numbers are noisy, but the gap is real. | Go: `consumer.go` delivery with direct buffer writes |
| **Write-loop / socket write overhead** | Fan-out (0.84x) and pub/sub (0.66x) gaps partly come from write-loop wakeup latency and socket write syscall overhead compared to Go's `writev()` | Go: `flushOutbound` uses `net.Buffers.WriteTo``writev()` with zero-copy buffer management |

View File

@@ -13,7 +13,7 @@ public class FanOutTests(CoreServerPairFixture fixture, ITestOutputHelper output
public async Task FanOut1To4_128B() public async Task FanOut1To4_128B()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 10_000; const int messageCount = 15_000;
const int subscriberCount = 4; const int subscriberCount = 4;
var dotnetResult = await RunFanOut("Fan-Out 1:4 (128B)", "DotNet", payloadSize, messageCount, subscriberCount, fixture.CreateDotNetClient); var dotnetResult = await RunFanOut("Fan-Out 1:4 (128B)", "DotNet", payloadSize, messageCount, subscriberCount, fixture.CreateDotNetClient);
@@ -78,7 +78,7 @@ public class FanOutTests(CoreServerPairFixture fixture, ITestOutputHelper output
await pubClient.PublishAsync(subject, payload); await pubClient.PublishAsync(subject, payload);
await pubClient.PingAsync(); await pubClient.PingAsync();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(180));
await tcs.Task.WaitAsync(cts.Token); await tcs.Task.WaitAsync(cts.Token);
sw.Stop(); sw.Stop();

View File

@@ -13,7 +13,7 @@ public class MultiPubSubTests(CoreServerPairFixture fixture, ITestOutputHelper o
public async Task MultiPubSub4x4_128B() public async Task MultiPubSub4x4_128B()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messagesPerPublisher = 2_000; const int messagesPerPublisher = 5_000;
const int pubCount = 4; const int pubCount = 4;
const int subCount = 4; const int subCount = 4;
@@ -101,7 +101,7 @@ public class MultiPubSubTests(CoreServerPairFixture fixture, ITestOutputHelper o
foreach (var client in pubClients) foreach (var client in pubClients)
await client.PingAsync(); await client.PingAsync();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120));
await tcs.Task.WaitAsync(cts.Token); await tcs.Task.WaitAsync(cts.Token);
sw.Stop(); sw.Stop();

View File

@@ -13,7 +13,7 @@ public class PubSubOneToOneTests(CoreServerPairFixture fixture, ITestOutputHelpe
public async Task PubSub1To1_16B() public async Task PubSub1To1_16B()
{ {
const int payloadSize = 16; const int payloadSize = 16;
const int messageCount = 10_000; const int messageCount = 50_000;
var dotnetResult = await RunPubSub("PubSub 1:1 (16B)", "DotNet", payloadSize, messageCount, fixture.CreateDotNetClient); var dotnetResult = await RunPubSub("PubSub 1:1 (16B)", "DotNet", payloadSize, messageCount, fixture.CreateDotNetClient);
@@ -33,7 +33,7 @@ public class PubSubOneToOneTests(CoreServerPairFixture fixture, ITestOutputHelpe
public async Task PubSub1To1_16KB() public async Task PubSub1To1_16KB()
{ {
const int payloadSize = 16 * 1024; const int payloadSize = 16 * 1024;
const int messageCount = 1_000; const int messageCount = 5_000;
var dotnetResult = await RunPubSub("PubSub 1:1 (16KB)", "DotNet", payloadSize, messageCount, fixture.CreateDotNetClient); var dotnetResult = await RunPubSub("PubSub 1:1 (16KB)", "DotNet", payloadSize, messageCount, fixture.CreateDotNetClient);
@@ -87,7 +87,7 @@ public class PubSubOneToOneTests(CoreServerPairFixture fixture, ITestOutputHelpe
await pubClient.PingAsync(); // Flush all pending writes await pubClient.PingAsync(); // Flush all pending writes
// Wait for all messages received // Wait for all messages received
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120));
await tcs.Task.WaitAsync(cts.Token); await tcs.Task.WaitAsync(cts.Token);
sw.Stop(); sw.Stop();

View File

@@ -8,7 +8,7 @@ namespace NATS.Server.Benchmark.Tests.CorePubSub;
[Collection("Benchmark-Core")] [Collection("Benchmark-Core")]
public class SinglePublisherThroughputTests(CoreServerPairFixture fixture, ITestOutputHelper output) public class SinglePublisherThroughputTests(CoreServerPairFixture fixture, ITestOutputHelper output)
{ {
private readonly BenchmarkRunner _runner = new() { WarmupCount = 1_000, MeasurementCount = 100_000 }; private readonly BenchmarkRunner _runner = new() { WarmupCount = 10_000, MeasurementCount = 500_000 };
[Fact] [Fact]
[Trait("Category", "Benchmark")] [Trait("Category", "Benchmark")]

View File

@@ -16,7 +16,7 @@ public class AsyncPublishTests(JetStreamServerPairFixture fixture, ITestOutputHe
public async Task JSAsyncPublish_128B_FileStore() public async Task JSAsyncPublish_128B_FileStore()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 5_000; const int messageCount = 25_000;
const int batchSize = 100; const int batchSize = 100;
var dotnetResult = await RunAsyncPublish("JS Async Publish (128B File)", "DotNet", payloadSize, messageCount, batchSize, fixture.CreateDotNetClient); var dotnetResult = await RunAsyncPublish("JS Async Publish (128B File)", "DotNet", payloadSize, messageCount, batchSize, fixture.CreateDotNetClient);

View File

@@ -16,7 +16,7 @@ public class DurableConsumerFetchTests(JetStreamServerPairFixture fixture, ITest
public async Task JSDurableFetch_Throughput() public async Task JSDurableFetch_Throughput()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 5_000; const int messageCount = 25_000;
const int fetchBatchSize = 500; const int fetchBatchSize = 500;
var dotnetResult = await RunDurableFetch("JS Durable Fetch (128B)", "DotNet", payloadSize, messageCount, fetchBatchSize, fixture.CreateDotNetClient); var dotnetResult = await RunDurableFetch("JS Durable Fetch (128B)", "DotNet", payloadSize, messageCount, fetchBatchSize, fixture.CreateDotNetClient);

View File

@@ -16,7 +16,7 @@ public class OrderedConsumerTests(JetStreamServerPairFixture fixture, ITestOutpu
public async Task JSOrderedConsumer_Throughput() public async Task JSOrderedConsumer_Throughput()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 5_000; const int messageCount = 25_000;
BenchmarkResult? dotnetResult = null; BenchmarkResult? dotnetResult = null;
try try

View File

@@ -10,7 +10,7 @@ namespace NATS.Server.Benchmark.Tests.JetStream;
[Collection("Benchmark-JetStream")] [Collection("Benchmark-JetStream")]
public class SyncPublishTests(JetStreamServerPairFixture fixture, ITestOutputHelper output) public class SyncPublishTests(JetStreamServerPairFixture fixture, ITestOutputHelper output)
{ {
private readonly BenchmarkRunner _runner = new() { WarmupCount = 500, MeasurementCount = 10_000 }; private readonly BenchmarkRunner _runner = new() { WarmupCount = 1_000, MeasurementCount = 50_000 };
[Fact] [Fact]
[Trait("Category", "Benchmark")] [Trait("Category", "Benchmark")]

View File

@@ -15,7 +15,7 @@ public class MqttThroughputTests(MqttServerFixture fixture, ITestOutputHelper ou
public async Task MqttPubSub_128B() public async Task MqttPubSub_128B()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 5_000; const int messageCount = 25_000;
var dotnetResult = await RunMqttPubSub("MQTT PubSub (128B)", "DotNet", fixture.DotNetMqttPort, payloadSize, messageCount); var dotnetResult = await RunMqttPubSub("MQTT PubSub (128B)", "DotNet", fixture.DotNetMqttPort, payloadSize, messageCount);
@@ -35,7 +35,7 @@ public class MqttThroughputTests(MqttServerFixture fixture, ITestOutputHelper ou
public async Task MqttCrossProtocol_NatsPub_MqttSub_128B() public async Task MqttCrossProtocol_NatsPub_MqttSub_128B()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 5_000; const int messageCount = 25_000;
var dotnetResult = await RunCrossProtocol("Cross-Protocol NATS→MQTT (128B)", "DotNet", fixture.DotNetMqttPort, fixture.CreateDotNetNatsClient, payloadSize, messageCount); var dotnetResult = await RunCrossProtocol("Cross-Protocol NATS→MQTT (128B)", "DotNet", fixture.DotNetMqttPort, fixture.CreateDotNetNatsClient, payloadSize, messageCount);
@@ -55,7 +55,7 @@ public class MqttThroughputTests(MqttServerFixture fixture, ITestOutputHelper ou
var payload = new byte[payloadSize]; var payload = new byte[payloadSize];
var topic = $"bench/mqtt/pubsub/{Guid.NewGuid():N}"; var topic = $"bench/mqtt/pubsub/{Guid.NewGuid():N}";
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120));
var factory = new MqttFactory(); var factory = new MqttFactory();
using var subscriber = factory.CreateMqttClient(); using var subscriber = factory.CreateMqttClient();
@@ -127,7 +127,7 @@ public class MqttThroughputTests(MqttServerFixture fixture, ITestOutputHelper ou
var natsSubject = $"bench.mqtt.cross.{Guid.NewGuid():N}"; var natsSubject = $"bench.mqtt.cross.{Guid.NewGuid():N}";
var mqttTopic = natsSubject.Replace('.', '/'); var mqttTopic = natsSubject.Replace('.', '/');
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120));
var factory = new MqttFactory(); var factory = new MqttFactory();
using var mqttSub = factory.CreateMqttClient(); using var mqttSub = factory.CreateMqttClient();

View File

@@ -14,7 +14,7 @@ public class MultiClientLatencyTests(CoreServerPairFixture fixture, ITestOutputH
public async Task RequestReply_10Clients2Services_16B() public async Task RequestReply_10Clients2Services_16B()
{ {
const int payloadSize = 16; const int payloadSize = 16;
const int requestsPerClient = 1_000; const int requestsPerClient = 5_000;
const int clientCount = 10; const int clientCount = 10;
const int serviceCount = 2; const int serviceCount = 2;

View File

@@ -8,7 +8,7 @@ namespace NATS.Server.Benchmark.Tests.RequestReply;
[Collection("Benchmark-Core")] [Collection("Benchmark-Core")]
public class SingleClientLatencyTests(CoreServerPairFixture fixture, ITestOutputHelper output) public class SingleClientLatencyTests(CoreServerPairFixture fixture, ITestOutputHelper output)
{ {
private readonly BenchmarkRunner _runner = new() { WarmupCount = 500, MeasurementCount = 10_000 }; private readonly BenchmarkRunner _runner = new() { WarmupCount = 1_000, MeasurementCount = 50_000 };
[Fact] [Fact]
[Trait("Category", "Benchmark")] [Trait("Category", "Benchmark")]

View File

@@ -13,7 +13,7 @@ public class TlsPubSubTests(TlsServerFixture fixture, ITestOutputHelper output)
public async Task TlsPubSub1To1_128B() public async Task TlsPubSub1To1_128B()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 10_000; const int messageCount = 50_000;
var dotnetResult = await RunTlsPubSub("TLS PubSub 1:1 (128B)", "DotNet", fixture.CreateDotNetTlsClient, payloadSize, messageCount); var dotnetResult = await RunTlsPubSub("TLS PubSub 1:1 (128B)", "DotNet", fixture.CreateDotNetTlsClient, payloadSize, messageCount);
@@ -82,7 +82,7 @@ public class TlsPubSubTests(TlsServerFixture fixture, ITestOutputHelper output)
await pubClient.PublishAsync(subject, payload); await pubClient.PublishAsync(subject, payload);
await pubClient.PingAsync(); await pubClient.PingAsync();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120));
await tcs.Task.WaitAsync(cts.Token); await tcs.Task.WaitAsync(cts.Token);
sw.Stop(); sw.Stop();
@@ -105,7 +105,7 @@ public class TlsPubSubTests(TlsServerFixture fixture, ITestOutputHelper output)
await using var client = createClient(); await using var client = createClient();
await client.ConnectAsync(); await client.ConnectAsync();
var runner = new BenchmarkRunner { WarmupCount = 1_000, MeasurementCount = 100_000 }; var runner = new BenchmarkRunner { WarmupCount = 10_000, MeasurementCount = 500_000 };
return await runner.MeasureThroughputAsync( return await runner.MeasureThroughputAsync(
name, name,

View File

@@ -15,7 +15,7 @@ public class WebSocketPubSubTests(WebSocketServerFixture fixture, ITestOutputHel
public async Task WsPubSub1To1_128B() public async Task WsPubSub1To1_128B()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 5_000; const int messageCount = 25_000;
var dotnetResult = await RunWsPubSub("WebSocket PubSub 1:1 (128B)", "DotNet", fixture.DotNetWsPort, fixture.CreateDotNetNatsClient, payloadSize, messageCount); var dotnetResult = await RunWsPubSub("WebSocket PubSub 1:1 (128B)", "DotNet", fixture.DotNetWsPort, fixture.CreateDotNetNatsClient, payloadSize, messageCount);
@@ -35,7 +35,7 @@ public class WebSocketPubSubTests(WebSocketServerFixture fixture, ITestOutputHel
public async Task WsPubNoSub_128B() public async Task WsPubNoSub_128B()
{ {
const int payloadSize = 128; const int payloadSize = 128;
const int messageCount = 10_000; const int messageCount = 50_000;
var dotnetResult = await RunWsPubOnly("WebSocket Pub-Only (128B)", "DotNet", fixture.DotNetWsPort, payloadSize, messageCount); var dotnetResult = await RunWsPubOnly("WebSocket Pub-Only (128B)", "DotNet", fixture.DotNetWsPort, payloadSize, messageCount);
@@ -55,7 +55,7 @@ public class WebSocketPubSubTests(WebSocketServerFixture fixture, ITestOutputHel
var payload = new byte[payloadSize]; var payload = new byte[payloadSize];
var subject = $"bench.ws.pubsub.{Guid.NewGuid():N}"; var subject = $"bench.ws.pubsub.{Guid.NewGuid():N}";
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120));
using var ws = new ClientWebSocket(); using var ws = new ClientWebSocket();
await ws.ConnectAsync(new Uri($"ws://127.0.0.1:{wsPort}"), cts.Token); await ws.ConnectAsync(new Uri($"ws://127.0.0.1:{wsPort}"), cts.Token);
@@ -110,7 +110,7 @@ public class WebSocketPubSubTests(WebSocketServerFixture fixture, ITestOutputHel
private static async Task<BenchmarkResult> RunWsPubOnly(string name, string serverType, int wsPort, int payloadSize, int messageCount) private static async Task<BenchmarkResult> RunWsPubOnly(string name, string serverType, int wsPort, int payloadSize, int messageCount)
{ {
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(60)); using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120));
using var ws = new ClientWebSocket(); using var ws = new ClientWebSocket();
await ws.ConnectAsync(new Uri($"ws://127.0.0.1:{wsPort}"), cts.Token); await ws.ConnectAsync(new Uri($"ws://127.0.0.1:{wsPort}"), cts.Token);