Cross-language ReadBulk stress benchmark
Adds a bench-read-bulk subcommand to every client CLI (.NET, Go, Rust,
Python, Java) and a PowerShell driver that runs all five concurrently
against the deployed gateway and prints a side-by-side comparison.
Each CLI's bench:
- Opens its own session, registers, subscribes to bulk-size tags so the
worker's MxAccessValueCache populates from real OnDataChange events.
- Runs a warmup-seconds-long pre-loop with identical calls so JIT /
connection-pool / first-call overhead is amortised before the
measurement window.
- Runs ReadBulk in a tight in-process loop for duration-seconds with
per-call high-resolution latency capture (Stopwatch in .NET,
time.Now in Go, std::time::Instant in Rust, time.perf_counter in
Python, System.nanoTime in Java).
- Unsubscribes + closes the session, then emits one JSON object with
the shared schema: { language, durationMs, totalCalls, successfulCalls,
failedCalls, totalReadResults, cachedReadResults, callsPerSecond,
latencyMs: { p50, p95, p99, max, mean } }.
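As a rough illustration of the steps above, the following Python sketch runs a warm-up loop, times each call with time.perf_counter, and emits one object in the shared schema. It is a sketch under stated assumptions, not the code of any of the five CLIs: `run_bench`, `percentile`, and the `read_bulk` callable are hypothetical names, the nearest-rank percentile convention is an assumption, and the real clients also open a session and subscribe before the loop.

```python
import json
import math
import time


def percentile(sorted_ms, q):
    """Nearest-rank percentile of an ascending list (an assumed convention)."""
    if not sorted_ms:
        return 0.0
    rank = max(1, math.ceil(q / 100.0 * len(sorted_ms)))
    return sorted_ms[rank - 1]


def run_bench(read_bulk, duration_seconds, warmup_seconds=0.0, language="python"):
    """Time read_bulk() in a tight loop.

    read_bulk is a hypothetical callable returning (total_results, cached_results).
    """
    # Warm-up: identical calls whose timings are discarded.
    warmup_deadline = time.perf_counter() + warmup_seconds
    while time.perf_counter() < warmup_deadline:
        try:
            read_bulk()
        except Exception:
            pass

    latencies_ms = []
    ok = failed = total_results = cached_results = 0
    started = time.perf_counter()
    deadline = started + duration_seconds
    while time.perf_counter() < deadline:
        t0 = time.perf_counter()
        try:
            total, cached = read_bulk()
            ok += 1
            total_results += total
            cached_results += cached
        except Exception:
            failed += 1
        # Failed calls are timed as well, so every call contributes a sample.
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - started

    latencies_ms.sort()
    calls = ok + failed
    return {
        "language": language,
        "durationMs": elapsed * 1000.0,
        "totalCalls": calls,
        "successfulCalls": ok,
        "failedCalls": failed,
        "totalReadResults": total_results,
        "cachedReadResults": cached_results,
        "callsPerSecond": calls / elapsed if elapsed > 0 else 0.0,
        "latencyMs": {
            "p50": percentile(latencies_ms, 50),
            "p95": percentile(latencies_ms, 95),
            "p99": percentile(latencies_ms, 99),
            "max": latencies_ms[-1] if latencies_ms else 0.0,
            "mean": sum(latencies_ms) / calls if calls else 0.0,
        },
    }


if __name__ == "__main__":
    # Stand-in for a real ReadBulk call: six results, all served from cache.
    stats = run_bench(lambda: (6, 6), duration_seconds=0.2, warmup_seconds=0.05)
    print(json.dumps(stats))
```

Emitting the stats as the final stdout line keeps the output easy to pick out from any build-tool noise printed before it.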
The PS driver (scripts/bench-read-bulk.ps1) launches one detached process
per client, waits for all to finish, parses the trailing JSON object from
each stdout, prints a comparison table, and persists the combined report
under artifacts/bench/. Quoting around Java's `gradle --args="..."` is
handled by writing a one-shot .bat that cmd.exe runs; the .NET CLI's
per-call gRPC timeout is auto-scaled to (Duration + Warmup + 30s) so the
channel-wide timeout doesn't cancel the bench mid-loop.
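Two of the driver's steps are small enough to sketch in Python: recovering the trailing JSON object from stdout that may begin with build-tool log noise, and scaling the call timeout. The function names are hypothetical, and the brace counting assumes no literal `{` or `}` characters inside JSON string values. The 60-second floor comes from the driver script, not from the summary above.

```python
import json


def extract_trailing_json(text):
    """Return the last brace-balanced JSON object in text, or None."""
    end = text.rfind("}")
    if end < 0:
        return None
    depth = 0
    start = -1
    # Walk backwards from the final '}' until its matching '{' is found,
    # so nested objects such as latencyMs are kept inside the slice.
    for i in range(end, -1, -1):
        ch = text[i]
        if ch == "}":
            depth += 1
        elif ch == "{":
            depth -= 1
            if depth == 0:
                start = i
                break
    if start < 0:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None


def scaled_call_timeout_seconds(duration_seconds, warmup_seconds):
    """Timeout that outlasts the whole bench: at least 60s, else run time plus 30s slack."""
    return max(60, duration_seconds + warmup_seconds + 30)


if __name__ == "__main__":
    noisy = 'Building...\n{"language": "go", "latencyMs": {"p50": 3.93}}\n'
    print(extract_trailing_json(noisy))
    print(scaled_call_timeout_seconds(30, 3))
```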
Live 30-second steady-state run against the deployed gateway, all five
clients hitting the same six TestMachine_001..006.TestChangingInt tags:
  client   calls/sec   cached/total   p50 ms   p95 ms   p99 ms   max ms
  dotnet   171.78      30924/30924    3.84     14.06    40.41    542.48
  go       175.46      31590/31590    3.93     13.52    41.26    243.00
  rust     123.26      22188/22188    5.52     15.78    48.11    544.41
  python   145.79      26244/26244    4.86     14.85    41.65    645.84
  java     181.12      32604/32604    3.80     10.59    33.37    344.27
143,550 ReadBulk results across all five clients during the 30s window;
100% were was_cached = true (the worker's cache fast-path never fell
through to the snapshot lifecycle). Aggregate read throughput ~800
calls/sec against five concurrent sessions sharing the same cached tags.
A second variant with bulk-size 20 sustained the same per-client call
rate while delivering 3.3x more values per call (~37,000 cached reads/sec
aggregate across the five concurrent sessions), confirming the linear
per-tag cache lookup inside one call is not a bottleneck at this scale.
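The aggregate figures for the bulk-size 6 run can be re-derived from the per-client table; the short Python check below does only that arithmetic, using the table values verbatim. The variable names are illustrative.

```python
# (calls/sec, total read results) per client, copied from the table above.
rows = {
    "dotnet": (171.78, 30924),
    "go": (175.46, 31590),
    "rust": (123.26, 22188),
    "python": (145.79, 26244),
    "java": (181.12, 32604),
}

total_results = sum(results for _, results in rows.values())
aggregate_calls_per_sec = sum(rate for rate, _ in rows.values())
values_per_call_ratio = 20 / 6  # bulk-size 20 versus bulk-size 6

print(total_results)                      # 143550
print(round(aggregate_calls_per_sec, 2))  # 797.41, i.e. roughly 800 calls/sec
print(round(values_per_call_ratio, 1))    # 3.3
```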
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,379 @@
<#
.SYNOPSIS
    Cross-language ReadBulk stress benchmark driver.

.DESCRIPTION
    Launches the bench-read-bulk subcommand of every client CLI (.NET, Go, Rust,
    Python, Java) concurrently against a running gateway and worker. Each client
    opens its own session, subscribes to -BulkSize tags so the worker's per-session
    MxAccessValueCache populates from real OnDataChange events, then hammers
    ReadBulk in a tight in-process loop for -DurationSeconds with per-call
    high-resolution latency capture. Each emits a single JSON stats object on
    stdout; this script collates the five into a comparison table.

    The gateway and worker are assumed to be running at -Endpoint with the API
    key in $env:<ApiKeyEnv>.

.PARAMETER Clients
    Which clients to run. Defaults to all five.

.PARAMETER Endpoint
    gRPC endpoint of the gateway. Default localhost:5120.

.PARAMETER ApiKeyEnv
    Environment variable holding the API key. Default MXGATEWAY_API_KEY.

.PARAMETER DurationSeconds
    Steady-state measurement window per client.

.PARAMETER WarmupSeconds
    Warm-up window per client (calls during this window are discarded).

.PARAMETER BulkSize
    Number of tags per ReadBulk call.

.PARAMETER TagStart
    First machine number per client. Each client uses a contiguous range starting
    here, so machine ranges do not overlap when -DistinctTags is set.

.PARAMETER TagPrefix
    Tag prefix (machine number is appended as %03d).

.PARAMETER TagAttribute
    Attribute appended to each tag.

.PARAMETER TimeoutMs
    Timeout in milliseconds forwarded to each client's timeout-ms option.
    Default 1500.

.PARAMETER DistinctTags
    When set, each client uses its own slice of tags (clients[i] starts at
    TagStart + i * BulkSize). When unset (default), all clients hit the same
    tags to maximise contention on the worker's value cache.

.PARAMETER ReportPath
    Where to persist the combined report. Defaults to artifacts/bench/...
#>
[CmdletBinding()]
param(
    [string[]]$Clients = @("dotnet", "go", "rust", "python", "java"),
    [string]$Endpoint = "localhost:5120",
    [string]$ApiKeyEnv = "MXGATEWAY_API_KEY",
    [int]$DurationSeconds = 30,
    [int]$WarmupSeconds = 3,
    [int]$BulkSize = 6,
    [int]$TagStart = 1,
    [string]$TagPrefix = "TestMachine_",
    [string]$TagAttribute = "TestChangingInt",
    [int]$TimeoutMs = 1500,
    [switch]$DistinctTags,
    [string]$ReportPath
)

Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"

$repoRoot = Resolve-Path (Join-Path $PSScriptRoot "..")
$validClients = @("dotnet", "go", "rust", "python", "java")
foreach ($c in $Clients) {
    if ($validClients -notcontains $c) {
        throw "Unsupported client '$c'. Valid: $($validClients -join ', ')."
    }
}

if ([string]::IsNullOrWhiteSpace($ReportPath)) {
    $timestamp = Get-Date -Format "yyyyMMdd-HHmmss"
    $ReportPath = Join-Path $repoRoot "artifacts/bench/bench-read-bulk-$timestamp.json"
}
$reportDir = Split-Path -Parent $ReportPath
if (-not (Test-Path $reportDir)) {
    New-Item -ItemType Directory -Path $reportDir -Force | Out-Null
}

$apiKeyValue = (Get-Item -Path "Env:$ApiKeyEnv" -ErrorAction SilentlyContinue).Value
if ([string]::IsNullOrWhiteSpace($apiKeyValue)) {
    throw "The API key environment variable '$ApiKeyEnv' is not set. Define it before running the bench."
}

# Temp dir for per-client stdout/stderr capture + (Java only) a one-shot
# wrapper .bat that handles cmd.exe's quoting rules for `gradle --args="..."`.
$tmpDir = Join-Path ([System.IO.Path]::GetTempPath()) "mxgw-bench-$([guid]::NewGuid())"
New-Item -ItemType Directory -Path $tmpDir -Force | Out-Null

function ConvertTo-HttpEndpoint {
    param([string]$Value)
    if ($Value -match '^https?://') { return $Value }
    return "http://$Value"
}

function ConvertTo-HostEndpoint {
    param([string]$Value)
    return ($Value -replace '^https?://', '')
}

# Build the per-client command array. Each client gets its own tag range when
# -DistinctTags is set so the workers race against distinct cache slices.
function Get-ClientCommand {
    param(
        [string]$Client,
        [int]$ClientIndex
    )

    $effectiveTagStart = if ($DistinctTags) { $TagStart + ($ClientIndex * $BulkSize) } else { $TagStart }
    $httpEndpoint = ConvertTo-HttpEndpoint -Value $Endpoint
    $hostEndpoint = ConvertTo-HostEndpoint -Value $Endpoint
    $clientName = "mxgw-$Client-bench"

    # Per-call gRPC timeout must exceed (DurationSeconds + WarmupSeconds + slack);
    # otherwise the channel-wide timeout cancels the bench mid-loop.
    $callTimeoutSeconds = [int]([Math]::Max(60, $DurationSeconds + $WarmupSeconds + 30))

    switch ($Client) {
        "dotnet" {
            $cliArgs = @(
                "run", "--project", "clients/dotnet/MxGateway.Client.Cli", "--no-build", "--",
                "bench-read-bulk",
                "--endpoint", $httpEndpoint,
                "--api-key-env", $ApiKeyEnv,
                "--timeout", "${callTimeoutSeconds}s",
                "--client-name", $clientName,
                "--duration-seconds", "$DurationSeconds",
                "--warmup-seconds", "$WarmupSeconds",
                "--bulk-size", "$BulkSize",
                "--tag-start", "$effectiveTagStart",
                "--tag-prefix", $TagPrefix,
                "--tag-attribute", $TagAttribute,
                "--timeout-ms", "$TimeoutMs",
                "--json"
            )
            return [pscustomobject]@{ file = "dotnet"; args = $cliArgs; cwd = $repoRoot }
        }
        "go" {
            $cliArgs = @(
                "run", "./cmd/mxgw-go", "bench-read-bulk",
                "-endpoint", $hostEndpoint,
                "-api-key-env", $ApiKeyEnv,
                "-plaintext",
                "-json",
                "-client-name", $clientName,
                "-duration-seconds", "$DurationSeconds",
                "-warmup-seconds", "$WarmupSeconds",
                "-bulk-size", "$BulkSize",
                "-tag-start", "$effectiveTagStart",
                "-tag-prefix", $TagPrefix,
                "-tag-attribute", $TagAttribute,
                "-timeout-ms", "$TimeoutMs"
            )
            return [pscustomobject]@{ file = "go"; args = $cliArgs; cwd = (Join-Path $repoRoot "clients/go") }
        }
        "rust" {
            $cliArgs = @(
                "run", "--quiet", "-p", "mxgw-cli", "--",
                "bench-read-bulk",
                "--endpoint", $httpEndpoint,
                "--api-key-env", $ApiKeyEnv,
                "--client-name", $clientName,
                "--duration-seconds", "$DurationSeconds",
                "--warmup-seconds", "$WarmupSeconds",
                "--bulk-size", "$BulkSize",
                "--tag-start", "$effectiveTagStart",
                "--tag-prefix", $TagPrefix,
                "--tag-attribute", $TagAttribute,
                "--timeout-ms", "$TimeoutMs",
                "--json"
            )
            return [pscustomobject]@{ file = "cargo"; args = $cliArgs; cwd = (Join-Path $repoRoot "clients/rust") }
        }
        "python" {
            $cliArgs = @(
                "-m", "mxgateway_cli", "bench-read-bulk",
                "--endpoint", $hostEndpoint,
                "--api-key-env", $ApiKeyEnv,
                "--plaintext",
                "--client-name", $clientName,
                "--duration-seconds", "$DurationSeconds",
                "--warmup-seconds", "$WarmupSeconds",
                "--bulk-size", "$BulkSize",
                "--tag-start", "$effectiveTagStart",
                "--tag-prefix", $TagPrefix,
                "--tag-attribute", $TagAttribute,
                "--timeout-ms", "$TimeoutMs",
                "--json"
            )
            # Resolve the interpreter from PATH instead of a machine-specific install path.
            $python = (Get-Command "python.exe", "python", "python3" -ErrorAction SilentlyContinue | Select-Object -First 1)
            if ($null -eq $python) { throw "python not on PATH; required for the Python bench." }
            return [pscustomobject]@{ file = $python.Source; args = $cliArgs; cwd = (Join-Path $repoRoot "clients/python"); pythonpath = (Join-Path $repoRoot "clients/python/src") }
        }
        "java" {
            $inner = @(
                "bench-read-bulk",
                "--endpoint", $hostEndpoint,
                "--api-key-env", $ApiKeyEnv,
                "--plaintext",
                "--json",
                "--client-name", $clientName,
                "--duration-seconds", "$DurationSeconds",
                "--warmup-seconds", "$WarmupSeconds",
                "--bulk-size", "$BulkSize",
                "--tag-start", "$effectiveTagStart",
                "--tag-prefix", $TagPrefix,
                "--tag-attribute", $TagAttribute,
                "--timeout-ms", "$TimeoutMs"
            )
            $gradle = (Get-Command "gradle.bat", "gradle.cmd", "gradle.exe", "gradle" -ErrorAction SilentlyContinue | Select-Object -First 1)
            if ($null -eq $gradle) { throw "gradle not on PATH; required for the Java bench." }
            # Start-Process with ArgumentList mangles the `--args="..."` quoting
            # cmd.exe needs to keep the whole bench-args expression as a single
            # gradle argument. Workaround: write a one-shot .bat that contains
            # the literal gradle command line and invoke that batch via cmd.
            $batPath = Join-Path $tmpDir "java-bench.bat"
            $batContent = '@echo off' + "`r`n" +
                '"' + $gradle.Source + '" --quiet :mxgateway-cli:run "--args=' + ($inner -join ' ') + '"' + "`r`n"
            Set-Content -Path $batPath -Value $batContent -Encoding ASCII
            return [pscustomobject]@{ file = "cmd.exe"; args = @("/c", $batPath); cwd = (Join-Path $repoRoot "clients/java") }
        }
    }
}

# Start one detached process per client and wait for all. Stdout (the JSON
# stats line) is captured to a per-client tmp file; stderr is captured too in
# case a bench crashed.
$jobs = @()

Write-Host "Launching $($Clients.Count) concurrent benches against $Endpoint (duration=$($DurationSeconds)s, warmup=$($WarmupSeconds)s, bulkSize=$BulkSize, distinctTags=$([bool]$DistinctTags))"

for ($i = 0; $i -lt $Clients.Count; $i++) {
    $client = $Clients[$i]
    $cmd = Get-ClientCommand -Client $client -ClientIndex $i
    $stdoutPath = Join-Path $tmpDir "$client.out"
    $stderrPath = Join-Path $tmpDir "$client.err"
    $startArgs = @{
        FilePath = $cmd.file
        ArgumentList = $cmd.args
        WorkingDirectory = $cmd.cwd
        RedirectStandardOutput = $stdoutPath
        RedirectStandardError = $stderrPath
        NoNewWindow = $true
        PassThru = $true
    }
    if ($cmd.PSObject.Properties['pythonpath']) {
        # Python needs PYTHONPATH so the editable mxgateway_cli module resolves.
        $env:PYTHONPATH = $cmd.pythonpath
    }
    $process = Start-Process @startArgs
    $jobs += [pscustomobject]@{ client = $client; process = $process; stdoutPath = $stdoutPath; stderrPath = $stderrPath }
    Write-Host "  [$client] pid=$($process.Id)"
}

foreach ($job in $jobs) {
    $job.process.WaitForExit()
}

# Parse one JSON line per client. The line is typically the last
# `{`-prefixed line in stdout (gradle, dotnet run, cargo run can emit log
# noise before it).
function Get-JsonStats {
    param([string]$Path)
    if (-not (Test-Path $Path)) { return $null }
    $content = Get-Content -Path $Path -Raw
    if ([string]::IsNullOrWhiteSpace($content)) { return $null }

    # Scan from the LAST top-level `{` (the bench JSON is the final structured
    # output line; earlier text may be log noise from `dotnet run` / `cargo
    # run` / `gradle :run`). Walk forward counting braces to locate the
    # matching `}` so nested objects like `latencyMs` don't confuse the parser.
    $startIndex = -1
    $depth = 0
    for ($i = $content.Length - 1; $i -ge 0; $i--) {
        $ch = $content[$i]
        if ($ch -eq '}') { $depth++ }
        elseif ($ch -eq '{') {
            $depth--
            if ($depth -eq 0) { $startIndex = $i; break }
        }
    }
    if ($startIndex -lt 0) { return $null }

    $endIndex = -1
    $depth = 0
    for ($i = $startIndex; $i -lt $content.Length; $i++) {
        $ch = $content[$i]
        if ($ch -eq '{') { $depth++ }
        elseif ($ch -eq '}') {
            $depth--
            if ($depth -eq 0) { $endIndex = $i; break }
        }
    }
    if ($endIndex -lt 0) { return $null }

    $json = $content.Substring($startIndex, $endIndex - $startIndex + 1)
    try { return $json | ConvertFrom-Json }
    catch { return $null }
}

$results = @()
foreach ($job in $jobs) {
    $stats = Get-JsonStats -Path $job.stdoutPath
    if ($null -eq $stats) {
        $stderr = if (Test-Path $job.stderrPath) { (Get-Content -Path $job.stderrPath -Raw) } else { "" }
        # if/else instead of the ternary operator so the script also parses on Windows PowerShell 5.1.
        $stderrPreview = if ([string]::IsNullOrWhiteSpace($stderr)) { '(empty)' } else { $stderr.Substring(0, [Math]::Min(300, $stderr.Length)) }
        Write-Warning "[$($job.client)] no JSON stats parsed; exit=$($job.process.ExitCode); stderr=$stderrPreview"
        $results += [pscustomobject]@{ client = $job.client; exitCode = $job.process.ExitCode; stats = $null; stderr = $stderr }
    } else {
        $results += [pscustomobject]@{ client = $job.client; exitCode = $job.process.ExitCode; stats = $stats; stderr = $null }
    }
}

# Pretty-print a side-by-side table.
$rows = foreach ($r in $results) {
    if ($null -eq $r.stats) {
        [pscustomobject]@{
            client = $r.client
            "calls/sec" = "ERR"
            "total" = "-"
            "ok" = "-"
            "fail" = "-"
            "cached/total" = "-"
            "p50 ms" = "-"
            "p95 ms" = "-"
            "p99 ms" = "-"
            "max ms" = "-"
            "mean ms" = "-"
        }
    } else {
        $s = $r.stats
        [pscustomobject]@{
            client = $s.language
            "calls/sec" = $s.callsPerSecond
            "total" = $s.totalCalls
            "ok" = $s.successfulCalls
            "fail" = $s.failedCalls
            "cached/total" = "$($s.cachedReadResults)/$($s.totalReadResults)"
            "p50 ms" = $s.latencyMs.p50
            "p95 ms" = $s.latencyMs.p95
            "p99 ms" = $s.latencyMs.p99
            "max ms" = $s.latencyMs.max
            "mean ms" = $s.latencyMs.mean
        }
    }
}

$rows | Format-Table -AutoSize | Out-Host

$report = [pscustomobject]@{
    schemaVersion = 1
    endpoint = $Endpoint
    apiKeyEnv = $ApiKeyEnv
    durationSeconds = $DurationSeconds
    warmupSeconds = $WarmupSeconds
    bulkSize = $BulkSize
    distinctTags = [bool]$DistinctTags
    tagPrefix = $TagPrefix
    tagAttribute = $TagAttribute
    startedAt = (Get-Date).ToUniversalTime().ToString("o")
    clients = $results | ForEach-Object {
        [ordered]@{
            client = $_.client
            exitCode = $_.exitCode
            stats = $_.stats
        }
    }
}
$report | ConvertTo-Json -Depth 12 | Set-Content -Path $ReportPath -Encoding UTF8
Write-Host "Combined report written to: $ReportPath"

Remove-Item -Path $tmpDir -Recurse -Force -ErrorAction SilentlyContinue