Second re-review pass at commit a020350 caught 48 new findings — including
one High-severity regression I introduced in the prior sweep — and fixed
them all in one parallel wave.
High (1)
- Client.Python-018: prior sweep set `license = "Proprietary"` in
pyproject.toml. setuptools >= 77 enforces PEP 639 and rejects the
string (it must be a valid SPDX expression), so `pip wheel .` and
`pip install -e .` both fail before any source compiles. Tests
still pass because pytest bypasses the build backend via
`pythonpath`. Dropped the invalid license string, kept the
`License :: Other/Proprietary License` classifier, and added
`tests/test_packaging.py` so a future regression of the same shape
is caught in CI.
Mediums (6)
- Worker-023: `HeartbeatStuckCeiling` (default 75s = 5x HeartbeatGrace)
on WorkerPipeSessionOptions bounds the in-flight-command watchdog
suppression so a truly stuck COM call still triggers StaHung
instead of permanently defeating the watchdog.
- Client.Rust-018: reverted Rust's `latencyMs` split so the
cross-language bench comparison is apples-to-apples again;
`failureLatencyMs` kept as Rust-only enrichment.
- Client.Java-021: applied Client.Java-002's terminal-state
serialisation pattern to DeployEventStream so close() arriving
after queue-overflow can't erase the overflow exception.
- IntegrationTests-017: teardown-parity test now uses a two-window
stability check after UnAdvise instead of strict equality against
the pre-UnAdvise count (which raced against in-flight events).
- IntegrationTests-019: new RecordingTestOutputHelper wraps every
log sink the WriteSecured live test owns (worker stdout/stderr,
gateway logs, direct WriteLine) so the credential is proven
absent from the full output buffer, not just the diagnostic
message.
- Tests-020: added MxAccessGatewayServiceConstraintTests coverage
for the previously-uncovered Write2Bulk and WriteSecured2Bulk
arms of WriteBulkConstraintPlan.SetPayload.
Lows (41 — highlights)
- Server: Galaxy glob cache eviction is race-free (Server-024);
GalaxyRepositoryGrpcService takes IGalaxyRepository (Server-025);
AlarmsOptions validated at startup (Server-026); Authorization.md
Constraint Enforcement snippet/prose enumerate the bulk write/read
family (Server-027); bulk-read-commands and bulk-write-commands
capability tokens added to OpenSession (Server-029);
NotWiredAlarmRpcDispatcher XML doc and missing scope-resolver and
state-machine tests cleaned up (023, 028).
- Worker: AlarmCommandHandler now invokes the same STA-affinity
guard the poll path uses, at every command entry (Worker-024);
RunAsync null-checks the runtime-session factory result
(Worker-025).
- Worker.Tests: shared LiveMxAccessOptInVariableName lives on
GatewayContractInfo (Worker.Tests-025); MxAccessSession.CreateForTesting
rejects production sinks (Worker.Tests-026); FakeRuntimeSession's
CancelCommandReturnValue serialised under lock (Worker.Tests-027);
Probes namespace lifted to MxGateway.Worker.Tests.Probes
(Worker.Tests-029); cancel-envelope sequence numbers monotonised
(Worker.Tests-030); docs/GatewayTesting.md gains a "Dev-rig Probes"
section (Worker.Tests-028).
- Tests: ManualTimeProvider consolidated into one TestSupport/ copy
(Tests-021); SessionManagerBulkTests adds a mid-flight cancellation
test backed by a TaskCompletionSource fake (Tests-022); companion
FakeWorkerProcess.WaitForExitAsync no longer fakes its exit signal
(Tests-023); constraint plan reply-count divergence pinned
(Tests-024).
- IntegrationTests: TryGetSession chain carries [MaybeNullWhen(false)]
end-to-end (IntegrationTests-018); abnormal-exit keyword set
tightened to pipe-disconnected/end-of-stream and the test now
asserts streamTask.IsFaulted (020, 021).
- Client.Dotnet: bench commands added to isLongRunning so the
default 30s wall-clock budget doesn't kill them (015);
BenchStreamEventsAsync observes the inner stream task on every
exit path (016).
- Client.Go: parseValue wraps strconv errors with flag context and
%w (017); bench loops honour ctx.Done() (018); galaxy-watch parses
RFC3339Nano with fractional seconds (019); runStreamEvents installs
signal.NotifyContext like runGalaxyWatch (020); five new CLI-level
table-driven tests cover the bulk/bench subcommands (021).
- Client.Java: toCompletable Javadoc rewritten to match the actual
cancellation contract Client.Java-015 established (022); stream-events
text path uses Long.toUnsignedString for worker_sequence (023);
bench-read-bulk no longer pollutes success-latency histogram with
failure durations (024); --shutdown-timeout CLI option propagates
through to ClientOptions (025); seven new MxGatewayCliTests cover
the bulk and bench commands (026).
- Client.Python: mxgateway_cli ships its own py.typed marker (019);
wheel-build smoke test added under tests/test_packaging.py (020);
README documents the Galaxy CLI parity gap explicitly (021).
- Client.Rust: RustClientDesign.md signatures match session.rs and
document the AsRef<str> read_bulk genericism (019);
next_correlation_id re-exported at the crate root, with a
property-style doc contract and an explicit disclaimer that the
literal textual format is not part of the contract (020).
- Contracts: BulkWriteResult comment names the actual
IConstraintEnforcer mechanism instead of "tag-allowlist filter"
(014); BulkReadResult gains explicit per-arm payload-population
documentation for the success vs failure cases (015).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -221,9 +221,13 @@ public static class MxGatewayClientCli
|
||||
private static CancellationTokenSource CreateCancellation(CliArguments arguments, string command)
|
||||
{
|
||||
var cancellation = new CancellationTokenSource();
|
||||
// Long-running streaming commands run until Ctrl+C / cancellation by default;
|
||||
// a caller-supplied --timeout still applies if present.
|
||||
bool isLongRunning = command is "galaxy-watch";
|
||||
// Long-running streaming / bench commands run until they finish (or Ctrl+C)
|
||||
// by default; a caller-supplied --timeout still applies if present. The
|
||||
// bench commands default to --duration-seconds=30 --warmup-seconds=3 plus
|
||||
// a per-session stagger, which already exceeds the default 30 s wall-clock
|
||||
// budget, so applying that budget would cancel them mid-window and emit a
|
||||
// zero-throughput JSON payload (see Client.Dotnet-015).
|
||||
bool isLongRunning = command is "galaxy-watch" or "bench-read-bulk" or "bench-stream-events";
|
||||
string? rawTimeout = arguments.GetOptional("timeout");
|
||||
if (isLongRunning && string.IsNullOrWhiteSpace(rawTimeout))
|
||||
{
|
||||
@@ -968,11 +972,25 @@ public static class MxGatewayClientCli
|
||||
}
|
||||
}, streamCts.Token);
|
||||
|
||||
await Task.Delay(steadyEnd - warmupStart, cancellationToken).ConfigureAwait(false);
|
||||
streamCts.Cancel();
|
||||
try { await streamTask.ConfigureAwait(false); }
|
||||
catch (OperationCanceledException) { }
|
||||
catch (Grpc.Core.RpcException ex) when (ex.StatusCode is Grpc.Core.StatusCode.Cancelled) { }
|
||||
// The inner streamTask MUST be observed on every path — including when
|
||||
// the outer cancellationToken cancels during the Task.Delay below — or
|
||||
// its fault surfaces as a TaskScheduler.UnobservedTaskException after
|
||||
// GC. Use try/finally so the cancel + await pair always runs (see
|
||||
// Client.Dotnet-016). RpcException(Cancelled) never reaches here in
|
||||
// production because GrpcMxGatewayClientTransport.StreamEventsAsync
|
||||
// routes through RpcExceptionMapper.Map, which returns OCE for
|
||||
// StatusCode.Cancelled.
|
||||
try
|
||||
{
|
||||
await Task.Delay(steadyEnd - warmupStart, cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
finally
|
||||
{
|
||||
streamCts.Cancel();
|
||||
try { await streamTask.ConfigureAwait(false); }
|
||||
catch (OperationCanceledException) { }
|
||||
catch (MxGatewayException) { }
|
||||
}
|
||||
|
||||
try
|
||||
{
|
||||
|
||||
@@ -186,7 +186,8 @@ The CLI exposes the same RPC via `galaxy-watch`:
|
||||
```powershell
|
||||
go run ./cmd/mxgw-go galaxy-watch -plaintext
|
||||
go run ./cmd/mxgw-go galaxy-watch -plaintext -json
|
||||
go run ./cmd/mxgw-go galaxy-watch -plaintext -last-seen-deploy-time 2026-04-28T10:00:00Z
|
||||
go run ./cmd/mxgw-go galaxy-watch -plaintext -last-seen-deploy-time 2026-04-28T10:00:00Z # whole-second RFC 3339
|
||||
go run ./cmd/mxgw-go galaxy-watch -plaintext -last-seen-deploy-time 2026-04-28T10:00:00.123Z # fractional seconds also accepted
|
||||
go run ./cmd/mxgw-go galaxy-watch -plaintext -limit 5
|
||||
```
|
||||
|
||||
|
||||
@@ -589,14 +589,19 @@ func runBenchReadBulk(ctx context.Context, args []string, stdout, stderr io.Writ
|
||||
}()
|
||||
|
||||
// Warm-up: drive identical calls so any first-call JIT / connection-pool
|
||||
// setup is amortised before the measurement window opens.
|
||||
// setup is amortised before the measurement window opens. Honor ctx so
|
||||
// Ctrl+C or a parent-cancel (e.g. the cross-language bench driver killing
|
||||
// the child early) exits promptly rather than spinning failing calls until
|
||||
// the wall-clock deadline.
|
||||
warmupDeadline := time.Now().Add(time.Duration(*warmupSeconds) * time.Second)
|
||||
timeout := time.Duration(*timeoutMs) * time.Millisecond
|
||||
for time.Now().Before(warmupDeadline) {
|
||||
for time.Now().Before(warmupDeadline) && ctx.Err() == nil {
|
||||
_, _ = session.ReadBulk(ctx, serverHandle, tags, timeout)
|
||||
}
|
||||
|
||||
// Steady state: per-call latency captured via time.Now() deltas.
|
||||
// Steady state: per-call latency captured via time.Now() deltas. Same ctx
|
||||
// guard as warm-up; on cancel we stop the loop and report the truncated
|
||||
// window faithfully.
|
||||
latenciesMs := make([]float64, 0, 65536)
|
||||
var totalReadResults int64
|
||||
var cachedReadResults int64
|
||||
@@ -604,7 +609,7 @@ func runBenchReadBulk(ctx context.Context, args []string, stdout, stderr io.Writ
|
||||
steadyStart := time.Now()
|
||||
steadyDeadline := steadyStart.Add(time.Duration(*durationSeconds) * time.Second)
|
||||
|
||||
for time.Now().Before(steadyDeadline) {
|
||||
for time.Now().Before(steadyDeadline) && ctx.Err() == nil {
|
||||
callStart := time.Now()
|
||||
results, err := session.ReadBulk(ctx, serverHandle, tags, timeout)
|
||||
elapsed := time.Since(callStart)
|
||||
@@ -772,8 +777,15 @@ func runStreamEvents(ctx context.Context, args []string, stdout, stderr io.Write
|
||||
}
|
||||
defer client.Close()
|
||||
|
||||
// Mirror runGalaxyWatch so Ctrl+C on a long-running stream-events command
|
||||
// cancels the gRPC stream cleanly (the gateway sees codes.Canceled rather
|
||||
// than a torn TCP connection) and the deferred subscription.Close() /
|
||||
// client.Close() actually run.
|
||||
signalCtx, stopSignals := signal.NotifyContext(ctx, os.Interrupt, syscall.SIGTERM)
|
||||
defer stopSignals()
|
||||
|
||||
session := mxgateway.NewSessionForID(client, *sessionID)
|
||||
streamCtx, cancelStream := context.WithCancel(ctx)
|
||||
streamCtx, cancelStream := context.WithCancel(signalCtx)
|
||||
defer cancelStream()
|
||||
subscription, err := session.SubscribeEventsAfter(streamCtx, *after)
|
||||
if err != nil {
|
||||
@@ -956,31 +968,31 @@ func parseValue(valueType, valueText string) (*mxgateway.MxValue, error) {
|
||||
case "bool":
|
||||
value, err := strconv.ParseBool(valueText)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("invalid -value for -type %s: %q: %w", valueType, valueText, err)
|
||||
}
|
||||
return mxgateway.BoolValue(value), nil
|
||||
case "int32":
|
||||
value, err := strconv.ParseInt(valueText, 10, 32)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("invalid -value for -type %s: %q: %w", valueType, valueText, err)
|
||||
}
|
||||
return mxgateway.Int32Value(int32(value)), nil
|
||||
case "int64":
|
||||
value, err := strconv.ParseInt(valueText, 10, 64)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("invalid -value for -type %s: %q: %w", valueType, valueText, err)
|
||||
}
|
||||
return mxgateway.Int64Value(value), nil
|
||||
case "float":
|
||||
value, err := strconv.ParseFloat(valueText, 32)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("invalid -value for -type %s: %q: %w", valueType, valueText, err)
|
||||
}
|
||||
return mxgateway.FloatValue(float32(value)), nil
|
||||
case "double":
|
||||
value, err := strconv.ParseFloat(valueText, 64)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("invalid -value for -type %s: %q: %w", valueType, valueText, err)
|
||||
}
|
||||
return mxgateway.DoubleValue(value), nil
|
||||
case "string":
|
||||
@@ -1201,7 +1213,7 @@ func runGalaxyWatch(ctx context.Context, args []string, stdout, stderr io.Writer
|
||||
flags.SetOutput(stderr)
|
||||
common := bindCommonFlags(flags)
|
||||
jsonOutput := flags.Bool("json", false, "write JSON output")
|
||||
lastSeen := flags.String("last-seen-deploy-time", "", "RFC3339 timestamp; when set, suppresses the bootstrap event")
|
||||
lastSeen := flags.String("last-seen-deploy-time", "", "RFC 3339 timestamp (with optional fractional seconds); when set, suppresses the bootstrap event")
|
||||
limit := flags.Int("limit", 0, "maximum events to read; 0 means unbounded (Ctrl+C to stop)")
|
||||
|
||||
if err := flags.Parse(args); err != nil {
|
||||
@@ -1210,7 +1222,11 @@ func runGalaxyWatch(ctx context.Context, args []string, stdout, stderr io.Writer
|
||||
|
||||
var lastSeenPtr *time.Time
|
||||
if *lastSeen != "" {
|
||||
parsed, err := time.Parse(time.RFC3339, *lastSeen)
|
||||
// Use RFC3339Nano so values copy-pasted from galaxy-watch -json output
|
||||
// (which formatDeployEvent emits with fractional seconds) round-trip;
|
||||
// RFC3339Nano also accepts whole-second values, so the layout switch is
|
||||
// strictly broader than the previous time.RFC3339 parse.
|
||||
parsed, err := time.Parse(time.RFC3339Nano, *lastSeen)
|
||||
if err != nil {
|
||||
return fmt.Errorf("invalid -last-seen-deploy-time: %w", err)
|
||||
}
|
||||
|
||||
@@ -3,6 +3,7 @@ package main
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
@@ -85,3 +86,166 @@ func TestParseInt32ListReturnsErrorOnMalformedToken(t *testing.T) {
|
||||
t.Fatalf("parseInt32List() error = %q, want it to name the bad token", err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
// TestParseValueWrapsStrconvErrorWithFlagContext pins Client.Go-017: each
|
||||
// typed branch of parseValue wraps the bare strconv error with `%w` and names
|
||||
// the offending flag and value, so the CLI surface is consistent with
|
||||
// parseInt32List ("invalid item handle %q: %w") and parseRfc3339Timestamp
|
||||
// ("invalid RFC 3339 timestamp %q: %w").
|
||||
func TestParseValueWrapsStrconvErrorWithFlagContext(t *testing.T) {
|
||||
cases := []struct {
|
||||
valueType string
|
||||
valueText string
|
||||
}{
|
||||
{"bool", "notabool"},
|
||||
{"int32", "foo"},
|
||||
{"int64", "foo"},
|
||||
{"float", "notafloat"},
|
||||
{"double", "notadouble"},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.valueType, func(t *testing.T) {
|
||||
_, err := parseValue(tc.valueType, tc.valueText)
|
||||
if err == nil {
|
||||
t.Fatalf("parseValue(%q, %q) error = nil, want a parse error", tc.valueType, tc.valueText)
|
||||
}
|
||||
msg := err.Error()
|
||||
if !strings.Contains(msg, "-value") {
|
||||
t.Fatalf("parseValue() error = %q, want it to name the -value flag", msg)
|
||||
}
|
||||
if !strings.Contains(msg, tc.valueType) {
|
||||
t.Fatalf("parseValue() error = %q, want it to name the type %q", msg, tc.valueType)
|
||||
}
|
||||
if !strings.Contains(msg, tc.valueText) {
|
||||
t.Fatalf("parseValue() error = %q, want it to name the bad token %q", msg, tc.valueText)
|
||||
}
|
||||
// errors.Unwrap must reach the underlying strconv error so callers
|
||||
// can still errors.Is/As against strconv.ErrSyntax if they care.
|
||||
if errors.Unwrap(err) == nil {
|
||||
t.Fatalf("parseValue() returned unwrapped error %q, want a %%w wrap", msg)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunWriteBulkVariantGatesSecuredFlags pins the Client.Go-015 fix at the
|
||||
// CLI surface: secured-only flags (-current-user-id, -verifier-user-id) must
|
||||
// not be registered on the non-secured variants, and -user-id must not be
|
||||
// registered on the secured variants. The flag package rejects an unknown
|
||||
// flag with "flag provided but not defined", which a future refactor that
|
||||
// re-broadens flag registration would silently undo without this test.
|
||||
func TestRunWriteBulkVariantGatesSecuredFlags(t *testing.T) {
|
||||
cases := []struct {
|
||||
name string
|
||||
command string
|
||||
flag string
|
||||
}{
|
||||
{"write-bulk rejects -current-user-id", "write-bulk", "-current-user-id"},
|
||||
{"write-bulk rejects -verifier-user-id", "write-bulk", "-verifier-user-id"},
|
||||
{"write2-bulk rejects -current-user-id", "write2-bulk", "-current-user-id"},
|
||||
{"write-secured-bulk rejects -user-id", "write-secured-bulk", "-user-id"},
|
||||
{"write-secured2-bulk rejects -user-id", "write-secured2-bulk", "-user-id"},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
var stdout, stderr bytes.Buffer
|
||||
err := runWithIO(t.Context(), []string{
|
||||
tc.command,
|
||||
"-plaintext",
|
||||
"-session-id", "sess",
|
||||
"-server-handle", "1",
|
||||
"-item-handles", "1",
|
||||
"-values", "1",
|
||||
tc.flag, "1",
|
||||
}, &stdout, &stderr)
|
||||
if err == nil {
|
||||
t.Fatalf("runWithIO(%s %s) error = nil, want flag-not-defined", tc.command, tc.flag)
|
||||
}
|
||||
combined := err.Error() + stderr.String()
|
||||
if !strings.Contains(combined, "flag provided but not defined") {
|
||||
t.Fatalf("runWithIO(%s %s) error/stderr = %q, want 'flag provided but not defined'", tc.command, tc.flag, combined)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunReadBulkRejectsMissingArgs pins the "session-id and items are
|
||||
// required" validation in runReadBulk before any network dial happens.
|
||||
func TestRunReadBulkRejectsMissingArgs(t *testing.T) {
|
||||
cases := []struct {
|
||||
name string
|
||||
args []string
|
||||
}{
|
||||
{"no flags", []string{"read-bulk"}},
|
||||
{"missing items", []string{"read-bulk", "-plaintext", "-session-id", "sess"}},
|
||||
{"missing session-id", []string{"read-bulk", "-plaintext", "-items", "Tag.Attr"}},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
var stdout, stderr bytes.Buffer
|
||||
err := runWithIO(t.Context(), tc.args, &stdout, &stderr)
|
||||
if err == nil {
|
||||
t.Fatalf("runWithIO(%v) error = nil, want validation error", tc.args)
|
||||
}
|
||||
if !strings.Contains(err.Error(), "session-id and items are required") {
|
||||
t.Fatalf("runWithIO(%v) error = %q, want 'session-id and items are required'", tc.args, err.Error())
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunBenchReadBulkRejectsNonPositiveBulkSize pins the bulk-size>=1 check
|
||||
// at runBenchReadBulk's flag-parsing stage so a future refactor cannot drop
|
||||
// the positivity guard without breaking this test.
|
||||
func TestRunBenchReadBulkRejectsNonPositiveBulkSize(t *testing.T) {
|
||||
var stdout, stderr bytes.Buffer
|
||||
err := runWithIO(t.Context(), []string{
|
||||
"bench-read-bulk",
|
||||
"-plaintext",
|
||||
"-bulk-size", "0",
|
||||
}, &stdout, &stderr)
|
||||
if err == nil {
|
||||
t.Fatalf("runWithIO(bench-read-bulk -bulk-size 0) error = nil, want positivity error")
|
||||
}
|
||||
if !strings.Contains(err.Error(), "bulk-size must be positive") {
|
||||
t.Fatalf("runWithIO error = %q, want 'bulk-size must be positive'", err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunBenchReadBulkRejectsNonPositiveDuration pins the duration-seconds>=1
|
||||
// check at runBenchReadBulk's flag-parsing stage.
|
||||
func TestRunBenchReadBulkRejectsNonPositiveDuration(t *testing.T) {
|
||||
var stdout, stderr bytes.Buffer
|
||||
err := runWithIO(t.Context(), []string{
|
||||
"bench-read-bulk",
|
||||
"-plaintext",
|
||||
"-duration-seconds", "0",
|
||||
}, &stdout, &stderr)
|
||||
if err == nil {
|
||||
t.Fatalf("runWithIO(bench-read-bulk -duration-seconds 0) error = nil, want positivity error")
|
||||
}
|
||||
if !strings.Contains(err.Error(), "duration-seconds must be positive") {
|
||||
t.Fatalf("runWithIO error = %q, want 'duration-seconds must be positive'", err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunWriteBulkVariantRejectsMismatchedHandlesAndValues pins the explicit
|
||||
// "item-handles count ... does not match values count ..." check at the CLI
|
||||
// surface so the validation error surfaces before any dial happens.
|
||||
func TestRunWriteBulkVariantRejectsMismatchedHandlesAndValues(t *testing.T) {
|
||||
var stdout, stderr bytes.Buffer
|
||||
err := runWithIO(t.Context(), []string{
|
||||
"write-bulk",
|
||||
"-plaintext",
|
||||
"-session-id", "sess",
|
||||
"-server-handle", "1",
|
||||
"-item-handles", "1,2,3",
|
||||
"-values", "10,20",
|
||||
}, &stdout, &stderr)
|
||||
if err == nil {
|
||||
t.Fatalf("runWithIO(write-bulk mismatched counts) error = nil, want mismatch error")
|
||||
}
|
||||
if !strings.Contains(err.Error(), "item-handles count") || !strings.Contains(err.Error(), "values count") {
|
||||
t.Fatalf("runWithIO error = %q, want 'item-handles count ... values count ...'", err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
+45
-11
@@ -857,6 +857,10 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
try {
|
||||
List<BulkReadResult> results = session.readBulk(serverHandle, tags, timeoutMs);
|
||||
long elapsed = System.nanoTime() - callStart;
|
||||
// Only record successful-call latencies — including failed-call
|
||||
// durations would pollute the p50/p95/p99 percentile summary
|
||||
// (Client.Java-024, mirrors Client.Rust-015). The cross-language
|
||||
// bench matrix expects success-only latency histograms.
|
||||
if (latencyCount >= latenciesNanos.length) {
|
||||
long[] grown = new long[latenciesNanos.length * 2];
|
||||
System.arraycopy(latenciesNanos, 0, grown, 0, latencyCount);
|
||||
@@ -871,13 +875,9 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
}
|
||||
}
|
||||
} catch (Exception ex) {
|
||||
long elapsed = System.nanoTime() - callStart;
|
||||
if (latencyCount >= latenciesNanos.length) {
|
||||
long[] grown = new long[latenciesNanos.length * 2];
|
||||
System.arraycopy(latenciesNanos, 0, grown, 0, latencyCount);
|
||||
latenciesNanos = grown;
|
||||
}
|
||||
latenciesNanos[latencyCount++] = elapsed;
|
||||
// Failed-call duration is intentionally NOT recorded into
|
||||
// the success-latency histogram — only count the failure so
|
||||
// the failedCalls JSON field reflects it.
|
||||
failed++;
|
||||
}
|
||||
}
|
||||
@@ -1051,7 +1051,13 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
if (json) {
|
||||
client.out().println(protoJson(event));
|
||||
} else {
|
||||
client.out().printf("%d %s%n", event.getWorkerSequence(), event.getFamily());
|
||||
// worker_sequence is a proto uint64 — print as unsigned so
|
||||
// values past 2^63 do not render as negative signed longs.
|
||||
// JSON path goes through JsonFormat which already does this.
|
||||
client.out().printf(
|
||||
"%s %s%n",
|
||||
Long.toUnsignedString(event.getWorkerSequence()),
|
||||
event.getFamily());
|
||||
}
|
||||
count++;
|
||||
if (limit > 0 && count >= limit) {
|
||||
@@ -1134,6 +1140,12 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
@Option(names = "--timeout", defaultValue = "30s", description = "Per-call timeout.")
|
||||
String timeout;
|
||||
|
||||
@Option(
|
||||
names = "--shutdown-timeout",
|
||||
description =
|
||||
"Channel shutdown timeout (e.g. 10s, 500ms). When unset, the library default applies.")
|
||||
String shutdownTimeout;
|
||||
|
||||
/**
|
||||
* Returns this options object unchanged.
|
||||
*
|
||||
@@ -1173,15 +1185,35 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
return parseDuration(timeout);
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolves the effective channel-shutdown timeout from the
|
||||
* {@code --shutdown-timeout} option, or {@code null} when the user did
|
||||
* not pass one (in which case the {@link MxGatewayClientOptions}
|
||||
* default applies). Computed on each call so there is no stale cached
|
||||
* state.
|
||||
*
|
||||
* @return the resolved shutdown timeout, or {@code null} when unset
|
||||
*/
|
||||
Duration resolvedShutdownTimeout() {
|
||||
if (shutdownTimeout == null || shutdownTimeout.isBlank()) {
|
||||
return null;
|
||||
}
|
||||
return parseDuration(shutdownTimeout);
|
||||
}
|
||||
|
||||
MxGatewayClientOptions toClientOptions() {
|
||||
return MxGatewayClientOptions.builder()
|
||||
MxGatewayClientOptions.Builder builder = MxGatewayClientOptions.builder()
|
||||
.endpoint(endpoint)
|
||||
.apiKey(resolvedApiKey())
|
||||
.plaintext(plaintext)
|
||||
.caCertificatePath(caFile)
|
||||
.serverNameOverride(serverNameOverride)
|
||||
.callTimeout(resolvedTimeout())
|
||||
.build();
|
||||
.callTimeout(resolvedTimeout());
|
||||
Duration resolvedShutdownTimeout = resolvedShutdownTimeout();
|
||||
if (resolvedShutdownTimeout != null) {
|
||||
builder.shutdownTimeout(resolvedShutdownTimeout);
|
||||
}
|
||||
return builder.build();
|
||||
}
|
||||
|
||||
Map<String, Object> redactedJsonMap() {
|
||||
@@ -1193,6 +1225,8 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
values.put("caFile", caFile == null ? "" : caFile.toString());
|
||||
values.put("serverNameOverride", serverNameOverride);
|
||||
values.put("timeout", timeout);
|
||||
Duration resolvedShutdownTimeout = resolvedShutdownTimeout();
|
||||
values.put("shutdownTimeout", resolvedShutdownTimeout == null ? "" : resolvedShutdownTimeout.toString());
|
||||
return values;
|
||||
}
|
||||
}
|
||||
|
||||
+283
-5
@@ -149,6 +149,21 @@ final class MxGatewayCliTests {
|
||||
assertFalse(text.contains("seq=-1"), "must not render as signed -1");
|
||||
}
|
||||
|
||||
@Test
|
||||
void streamEventsWorkerSequenceRendersAsUnsignedForHighUint64() {
|
||||
// Client.Java-023 regression: stream-events text output now uses
|
||||
// Long.toUnsignedString to format the proto uint64 worker_sequence
|
||||
// field, mirroring the Client.Java-020 fix for DeployEvent.sequence.
|
||||
long highUnsigned = -1L; // bit-pattern for 2^64 - 1, i.e. 18446744073709551615 unsigned
|
||||
String text = String.format(
|
||||
"%s %s",
|
||||
Long.toUnsignedString(highUnsigned),
|
||||
"MX_EVENT_FAMILY_DATA_CHANGE");
|
||||
|
||||
assertTrue(text.startsWith("18446744073709551615 "), "expected unsigned rendering, got: " + text);
|
||||
assertFalse(text.startsWith("-1 "), "must not render as signed -1");
|
||||
}
|
||||
|
||||
@Test
|
||||
void unsubscribeBulkCommandPrintsResults() {
|
||||
CliRun run = execute(
|
||||
@@ -168,6 +183,209 @@ final class MxGatewayCliTests {
|
||||
assertTrue(run.output().contains("\"wasSuccessful\":true"));
|
||||
}
|
||||
|
||||
// ---- Client.Java-026: CLI-level coverage for bulk subcommands ----
|
||||
|
||||
@Test
|
||||
void readBulkCommandForwardsTimeoutAndPrintsResults() {
|
||||
FakeClientFactory factory = new FakeClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"read-bulk",
|
||||
"--session-id",
|
||||
"session-cli",
|
||||
"--server-handle",
|
||||
"42",
|
||||
"--items",
|
||||
"TestMachine_001.TestChangingInt,TestMachine_002.TestChangingInt",
|
||||
"--timeout-ms",
|
||||
"750",
|
||||
"--json");
|
||||
|
||||
assertEquals(0, run.exitCode());
|
||||
assertEquals(750, factory.client.session.lastReadBulkTimeoutMs);
|
||||
assertEquals(2, factory.client.session.lastReadBulkItems.size());
|
||||
assertTrue(run.output().contains("\"command\":\"read-bulk\""));
|
||||
assertTrue(run.output().contains("\"tagAddress\":\"TestMachine_001.TestChangingInt\""));
|
||||
assertTrue(run.output().contains("\"itemHandle\":200"));
|
||||
assertTrue(run.output().contains("\"wasCached\":true"));
|
||||
assertTrue(run.output().contains("\"quality\":192"));
|
||||
}
|
||||
|
||||
@Test
|
||||
void writeBulkCommandParsesTypedValuesAndPrintsResults() {
|
||||
FakeClientFactory factory = new FakeClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"write-bulk",
|
||||
"--session-id",
|
||||
"session-cli",
|
||||
"--server-handle",
|
||||
"42",
|
||||
"--item-handles",
|
||||
"100,101",
|
||||
"--type",
|
||||
"int32",
|
||||
"--values",
|
||||
"111,222",
|
||||
"--user-id",
|
||||
"5",
|
||||
"--json");
|
||||
|
||||
assertEquals(0, run.exitCode());
|
||||
assertEquals(2, factory.client.session.lastWriteBulkEntries.size());
|
||||
assertEquals(111, factory.client.session.lastWriteBulkEntries.get(0).getValue().getInt32Value());
|
||||
assertEquals(222, factory.client.session.lastWriteBulkEntries.get(1).getValue().getInt32Value());
|
||||
assertEquals(5, factory.client.session.lastWriteBulkEntries.get(0).getUserId());
|
||||
assertTrue(run.output().contains("\"command\":\"write-bulk\""));
|
||||
assertTrue(run.output().contains("\"itemHandle\":100"));
|
||||
assertTrue(run.output().contains("\"wasSuccessful\":true"));
|
||||
}
|
||||
|
||||
@Test
|
||||
void write2BulkCommandForwardsTimestampAndPrintsResults() {
|
||||
FakeClientFactory factory = new FakeClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"write2-bulk",
|
||||
"--session-id",
|
||||
"session-cli",
|
||||
"--server-handle",
|
||||
"42",
|
||||
"--item-handles",
|
||||
"100",
|
||||
"--type",
|
||||
"string",
|
||||
"--values",
|
||||
"hello",
|
||||
"--timestamp",
|
||||
"2026-05-20T00:00:00Z",
|
||||
"--json");
|
||||
|
||||
assertEquals(0, run.exitCode());
|
||||
assertEquals(1, factory.client.session.lastWrite2BulkEntries.size());
|
||||
assertEquals(
|
||||
"hello",
|
||||
factory.client.session.lastWrite2BulkEntries.get(0).getValue().getStringValue());
|
||||
assertTrue(
|
||||
factory.client.session.lastWrite2BulkEntries.get(0).hasTimestampValue(),
|
||||
"expected timestampValue to be forwarded");
|
||||
assertTrue(run.output().contains("\"command\":\"write2-bulk\""));
|
||||
assertTrue(run.output().contains("\"itemHandle\":100"));
|
||||
assertTrue(run.output().contains("\"wasSuccessful\":true"));
|
||||
}
|
||||
|
||||
@Test
|
||||
void writeSecuredBulkCommandForwardsUserIdsAndPrintsResults() {
|
||||
FakeClientFactory factory = new FakeClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"write-secured-bulk",
|
||||
"--session-id",
|
||||
"session-cli",
|
||||
"--server-handle",
|
||||
"42",
|
||||
"--item-handles",
|
||||
"100",
|
||||
"--type",
|
||||
"int32",
|
||||
"--values",
|
||||
"9",
|
||||
"--current-user-id",
|
||||
"7",
|
||||
"--verifier-user-id",
|
||||
"8",
|
||||
"--json");
|
||||
|
||||
assertEquals(0, run.exitCode());
|
||||
assertEquals(1, factory.client.session.lastWriteSecuredBulkEntries.size());
|
||||
assertEquals(7, factory.client.session.lastWriteSecuredBulkEntries.get(0).getCurrentUserId());
|
||||
assertEquals(8, factory.client.session.lastWriteSecuredBulkEntries.get(0).getVerifierUserId());
|
||||
assertEquals(9, factory.client.session.lastWriteSecuredBulkEntries.get(0).getValue().getInt32Value());
|
||||
assertTrue(run.output().contains("\"command\":\"write-secured-bulk\""));
|
||||
assertTrue(run.output().contains("\"wasSuccessful\":true"));
|
||||
}
|
||||
|
||||
@Test
|
||||
void writeSecured2BulkCommandForwardsTimestampAndUserIdsAndPrintsResults() {
|
||||
FakeClientFactory factory = new FakeClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"write-secured2-bulk",
|
||||
"--session-id",
|
||||
"session-cli",
|
||||
"--server-handle",
|
||||
"42",
|
||||
"--item-handles",
|
||||
"100",
|
||||
"--type",
|
||||
"string",
|
||||
"--values",
|
||||
"value",
|
||||
"--timestamp",
|
||||
"2026-05-20T00:00:00Z",
|
||||
"--current-user-id",
|
||||
"7",
|
||||
"--verifier-user-id",
|
||||
"8",
|
||||
"--json");
|
||||
|
||||
assertEquals(0, run.exitCode());
|
||||
assertEquals(1, factory.client.session.lastWriteSecured2BulkEntries.size());
|
||||
assertEquals(7, factory.client.session.lastWriteSecured2BulkEntries.get(0).getCurrentUserId());
|
||||
assertEquals(8, factory.client.session.lastWriteSecured2BulkEntries.get(0).getVerifierUserId());
|
||||
assertTrue(
|
||||
factory.client.session.lastWriteSecured2BulkEntries.get(0).hasTimestampValue(),
|
||||
"expected timestampValue to be forwarded");
|
||||
assertTrue(run.output().contains("\"command\":\"write-secured2-bulk\""));
|
||||
assertTrue(run.output().contains("\"wasSuccessful\":true"));
|
||||
}
|
||||
|
||||
@Test
|
||||
void benchReadBulkCommandEmitsJsonSchemaKeys() {
|
||||
// Short bench window (1 s steady, 0 s warmup) keeps the test fast; we assert
|
||||
// the JSON schema rather than numeric values so the cross-language matrix
|
||||
// (.NET / Go / Rust / Python) and the Java path agree on the output shape.
|
||||
FakeClientFactory factory = new FakeClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"bench-read-bulk",
|
||||
"--duration-seconds",
|
||||
"1",
|
||||
"--warmup-seconds",
|
||||
"0",
|
||||
"--bulk-size",
|
||||
"2",
|
||||
"--tag-start",
|
||||
"1",
|
||||
"--tag-prefix",
|
||||
"TestMachine_",
|
||||
"--tag-attribute",
|
||||
"TestChangingInt",
|
||||
"--timeout-ms",
|
||||
"100",
|
||||
"--json");
|
||||
|
||||
assertEquals(0, run.exitCode());
|
||||
String output = run.output();
|
||||
assertTrue(output.contains("\"language\":\"java\""), output);
|
||||
assertTrue(output.contains("\"command\":\"bench-read-bulk\""), output);
|
||||
assertTrue(output.contains("\"bulkSize\":2"), output);
|
||||
assertTrue(output.contains("\"durationSeconds\":1"), output);
|
||||
assertTrue(output.contains("\"warmupSeconds\":0"), output);
|
||||
assertTrue(output.contains("\"totalCalls\":"), output);
|
||||
assertTrue(output.contains("\"successfulCalls\":"), output);
|
||||
assertTrue(output.contains("\"failedCalls\":"), output);
|
||||
assertTrue(output.contains("\"callsPerSecond\":"), output);
|
||||
assertTrue(output.contains("\"latencyMs\":"), output);
|
||||
assertTrue(output.contains("\"p50\":"), output);
|
||||
assertTrue(output.contains("\"p95\":"), output);
|
||||
assertTrue(output.contains("\"p99\":"), output);
|
||||
assertTrue(output.contains("\"tags\":"), output);
|
||||
// Bench tag synthesis: TestMachine_001.TestChangingInt, TestMachine_002.TestChangingInt.
|
||||
assertTrue(output.contains("TestMachine_001.TestChangingInt"), output);
|
||||
assertTrue(output.contains("TestMachine_002.TestChangingInt"), output);
|
||||
}
|
||||
|
||||
private static CliRun execute(MxGatewayCli.MxGatewayCliClientFactory factory, String... args) {
|
||||
StringWriter output = new StringWriter();
|
||||
StringWriter errors = new StringWriter();
|
||||
@@ -322,29 +540,89 @@ final class MxGatewayCliTests {
|
||||
return results;
|
||||
}
|
||||
|
||||
// Recorded so tests can assert the CLI forwarded the parsed options through to
|
||||
// the session interface. The bulk subcommands return at least one result so the
|
||||
// JSON output assertions exercise the *Map serialisers in MxGatewayCli.
|
||||
|
||||
private int lastReadBulkTimeoutMs;
|
||||
private List<String> lastReadBulkItems = new ArrayList<>();
|
||||
private List<WriteBulkEntry> lastWriteBulkEntries = new ArrayList<>();
|
||||
private List<Write2BulkEntry> lastWrite2BulkEntries = new ArrayList<>();
|
||||
private List<WriteSecuredBulkEntry> lastWriteSecuredBulkEntries = new ArrayList<>();
|
||||
private List<WriteSecured2BulkEntry> lastWriteSecured2BulkEntries = new ArrayList<>();
|
||||
|
||||
@Override
|
||||
public List<BulkReadResult> readBulk(int serverHandle, List<String> items, int timeoutMs) {
|
||||
return new ArrayList<>();
|
||||
lastReadBulkTimeoutMs = timeoutMs;
|
||||
lastReadBulkItems = items;
|
||||
List<BulkReadResult> results = new ArrayList<>();
|
||||
for (int index = 0; index < items.size(); index++) {
|
||||
results.add(BulkReadResult.newBuilder()
|
||||
.setServerHandle(serverHandle)
|
||||
.setTagAddress(items.get(index))
|
||||
.setItemHandle(200 + index)
|
||||
.setWasSuccessful(true)
|
||||
.setWasCached(index % 2 == 0)
|
||||
.setQuality(192)
|
||||
.build());
|
||||
}
|
||||
return results;
|
||||
}
|
||||
|
||||
@Override
|
||||
public List<BulkWriteResult> writeBulk(int serverHandle, List<WriteBulkEntry> entries) {
|
||||
return new ArrayList<>();
|
||||
lastWriteBulkEntries = entries;
|
||||
List<BulkWriteResult> results = new ArrayList<>();
|
||||
for (WriteBulkEntry entry : entries) {
|
||||
results.add(BulkWriteResult.newBuilder()
|
||||
.setServerHandle(serverHandle)
|
||||
.setItemHandle(entry.getItemHandle())
|
||||
.setWasSuccessful(true)
|
||||
.build());
|
||||
}
|
||||
return results;
|
||||
}
|
||||
|
||||
@Override
|
||||
public List<BulkWriteResult> write2Bulk(int serverHandle, List<Write2BulkEntry> entries) {
|
||||
return new ArrayList<>();
|
||||
lastWrite2BulkEntries = entries;
|
||||
List<BulkWriteResult> results = new ArrayList<>();
|
||||
for (Write2BulkEntry entry : entries) {
|
||||
results.add(BulkWriteResult.newBuilder()
|
||||
.setServerHandle(serverHandle)
|
||||
.setItemHandle(entry.getItemHandle())
|
||||
.setWasSuccessful(true)
|
||||
.build());
|
||||
}
|
||||
return results;
|
||||
}
|
||||
|
||||
@Override
|
||||
public List<BulkWriteResult> writeSecuredBulk(int serverHandle, List<WriteSecuredBulkEntry> entries) {
|
||||
return new ArrayList<>();
|
||||
lastWriteSecuredBulkEntries = entries;
|
||||
List<BulkWriteResult> results = new ArrayList<>();
|
||||
for (WriteSecuredBulkEntry entry : entries) {
|
||||
results.add(BulkWriteResult.newBuilder()
|
||||
.setServerHandle(serverHandle)
|
||||
.setItemHandle(entry.getItemHandle())
|
||||
.setWasSuccessful(true)
|
||||
.build());
|
||||
}
|
||||
return results;
|
||||
}
|
||||
|
||||
@Override
|
||||
public List<BulkWriteResult> writeSecured2Bulk(int serverHandle, List<WriteSecured2BulkEntry> entries) {
|
||||
return new ArrayList<>();
|
||||
lastWriteSecured2BulkEntries = entries;
|
||||
List<BulkWriteResult> results = new ArrayList<>();
|
||||
for (WriteSecured2BulkEntry entry : entries) {
|
||||
results.add(BulkWriteResult.newBuilder()
|
||||
.setServerHandle(serverHandle)
|
||||
.setItemHandle(entry.getItemHandle())
|
||||
.setWasSuccessful(true)
|
||||
.build());
|
||||
}
|
||||
return results;
|
||||
}
|
||||
|
||||
@Override
|
||||
|
||||
+50
-13
@@ -11,20 +11,29 @@ import java.util.NoSuchElementException;
|
||||
import java.util.Objects;
|
||||
import java.util.concurrent.ArrayBlockingQueue;
|
||||
import java.util.concurrent.BlockingQueue;
|
||||
import java.util.concurrent.atomic.AtomicBoolean;
|
||||
|
||||
/**
|
||||
* Iterator-style adaptor over the {@code WatchDeployEvents} server-streaming
|
||||
* RPC. Mirrors {@link MxEventStream}: events arrive on a background gRPC thread
|
||||
* and are buffered in a bounded blocking queue; the iterator drains them.
|
||||
* Closing the stream cancels the underlying gRPC call.
|
||||
*
|
||||
* <p><strong>Threading:</strong> the iterator methods ({@link #hasNext()} and
|
||||
* {@link #next()}) are <em>not</em> thread-safe and must be driven by a single
|
||||
* consumer thread. {@link #close()} may be called from any thread. Terminal
|
||||
* state transitions (queue overflow, server completion, and {@code close()})
|
||||
* are serialised so that the first terminal condition wins deterministically:
|
||||
* once an overflow exception has been observed it is never silently replaced
|
||||
* by an end-of-stream marker.
|
||||
*/
|
||||
public final class DeployEventStream implements Iterator<DeployEvent>, AutoCloseable {
|
||||
private static final Object END = new Object();
|
||||
|
||||
private final BlockingQueue<Object> queue;
|
||||
private final AtomicBoolean closed = new AtomicBoolean();
|
||||
private final Object terminalLock = new Object();
|
||||
private volatile ClientCallStreamObserver<WatchDeployEventsRequest> requestStream;
|
||||
private volatile boolean closed;
|
||||
private boolean terminated;
|
||||
private Object next;
|
||||
|
||||
DeployEventStream(int capacity) {
|
||||
@@ -36,7 +45,7 @@ public final class DeployEventStream implements Iterator<DeployEvent>, AutoClose
|
||||
@Override
|
||||
public void beforeStart(ClientCallStreamObserver<WatchDeployEventsRequest> requestStream) {
|
||||
DeployEventStream.this.requestStream = requestStream;
|
||||
if (closed.get()) {
|
||||
if (closed) {
|
||||
requestStream.cancel("client cancelled deploy event stream", null);
|
||||
}
|
||||
}
|
||||
@@ -48,7 +57,7 @@ public final class DeployEventStream implements Iterator<DeployEvent>, AutoClose
|
||||
|
||||
@Override
|
||||
public void onError(Throwable error) {
|
||||
if (Status.fromThrowable(error).getCode() == Status.Code.CANCELLED && closed.get()) {
|
||||
if (Status.fromThrowable(error).getCode() == Status.Code.CANCELLED && closed) {
|
||||
offer(END);
|
||||
return;
|
||||
}
|
||||
@@ -94,12 +103,12 @@ public final class DeployEventStream implements Iterator<DeployEvent>, AutoClose
|
||||
|
||||
@Override
|
||||
public void close() {
|
||||
closed.set(true);
|
||||
closed = true;
|
||||
ClientCallStreamObserver<WatchDeployEventsRequest> stream = requestStream;
|
||||
if (stream != null) {
|
||||
stream.cancel("client cancelled deploy event stream", null);
|
||||
}
|
||||
offer(END);
|
||||
terminate(null);
|
||||
}
|
||||
|
||||
private Object take() {
|
||||
@@ -117,10 +126,7 @@ public final class DeployEventStream implements Iterator<DeployEvent>, AutoClose
|
||||
private void offer(Object value) {
|
||||
Objects.requireNonNull(value, "value");
|
||||
if (value == END) {
|
||||
if (!queue.offer(value)) {
|
||||
queue.clear();
|
||||
queue.offer(value);
|
||||
}
|
||||
terminate(null);
|
||||
return;
|
||||
}
|
||||
if (!queue.offer(value)) {
|
||||
@@ -128,9 +134,40 @@ public final class DeployEventStream implements Iterator<DeployEvent>, AutoClose
|
||||
if (stream != null) {
|
||||
stream.cancel("client deploy event stream queue overflowed", null);
|
||||
}
|
||||
queue.clear();
|
||||
queue.offer(new MxGatewayException("galaxy watch deploy events queue overflowed"));
|
||||
queue.offer(END);
|
||||
terminate(new MxGatewayException("galaxy watch deploy events queue overflowed"));
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Drives the single terminal transition. The first caller wins: a later
|
||||
* end-of-stream or {@code close()} cannot overwrite or discard an overflow
|
||||
* exception that has already been published to the consumer. Mirrors the
|
||||
* {@link MxEventStream#terminate} contract — see Client.Java-002 for the
|
||||
* race this guards against.
|
||||
*
|
||||
* @param fault the fault to surface to the consumer, or {@code null} for a
|
||||
* clean end-of-stream
|
||||
*/
|
||||
private void terminate(MxGatewayException fault) {
|
||||
synchronized (terminalLock) {
|
||||
if (terminated) {
|
||||
return;
|
||||
}
|
||||
terminated = true;
|
||||
if (fault != null) {
|
||||
// Make room for the fault marker; the consumer only needs the
|
||||
// terminal signal, queued data events are no longer relevant.
|
||||
queue.clear();
|
||||
queue.offer(fault);
|
||||
queue.offer(END);
|
||||
return;
|
||||
}
|
||||
// Clean end-of-stream: ensure the END marker is delivered even when
|
||||
// the queue is currently full of undrained data events.
|
||||
if (!queue.offer(END)) {
|
||||
queue.clear();
|
||||
queue.offer(END);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
+31
-10
@@ -160,16 +160,37 @@ final class MxGatewayChannels {
|
||||
*
|
||||
* <p><strong>Cancellation contract:</strong> the returned future is a
|
||||
* {@link CancellingCompletableFuture} that overrides
|
||||
* {@link CompletableFuture#cancel(boolean)} so cancellation always forwards
|
||||
* to the source {@link ListenableFuture}, even when callers wrap the
|
||||
* future in additional {@code thenApply}/{@code thenCompose} stages. The
|
||||
* historical {@code whenComplete}-based forwarder was buggy because
|
||||
* {@code thenApply} returns a new {@code CompletableFuture} whose
|
||||
* cancellation does <em>not</em> propagate back to this future; with the
|
||||
* override-based design, calling {@code cancel(true)} on either the
|
||||
* direct return value or the user-facing chained future ultimately
|
||||
* invokes {@code source.cancel(true)} (chained futures forward to the
|
||||
* upstream stage they were derived from, which is this future).
|
||||
* {@link CompletableFuture#cancel(boolean)} so cancelling the
|
||||
* <em>direct return value</em> forwards to the source
|
||||
* {@link ListenableFuture}, aborting the underlying gRPC call. This is the
|
||||
* fix for Client.Java-015.
|
||||
*
|
||||
* <p><strong>Important — derived stages do <em>not</em> propagate
|
||||
* cancellation upstream.</strong> Calling
|
||||
* {@code cancel(...)} on a future obtained via
|
||||
* {@code thenApply}/{@code thenCompose}/{@code thenAccept}/{@code whenComplete}
|
||||
* of the value returned by this method only marks <em>that</em> derived stage
|
||||
* as cancelled; it does <strong>not</strong> propagate back to this
|
||||
* {@code CancellingCompletableFuture}, so the source RPC continues until its
|
||||
* deadline expires. {@link CompletableFuture#thenApply} (and the other
|
||||
* chaining methods) deliberately do not forward cancellation to the upstream
|
||||
* stage they were derived from.
|
||||
*
|
||||
* <p>If a caller needs cancellation through a chained pipeline, either:
|
||||
* <ul>
|
||||
* <li>use the {@link #toCompletable(ListenableFuture, String, Function)}
|
||||
* overload below, which inlines a validator into the
|
||||
* {@code FutureCallback} so the user-visible future is the same
|
||||
* future cancellation is bound to (this is what the {@code *Async}
|
||||
* methods on {@link MxGatewayClient} and the unary methods on
|
||||
* {@link GalaxyRepositoryClient} do); or</li>
|
||||
* <li>follow {@link GalaxyRepositoryClient#discoverHierarchyAsync}'s
|
||||
* pattern of returning a custom {@link CompletableFuture} subclass
|
||||
* that tracks the current in-flight stage via an
|
||||
* {@link java.util.concurrent.atomic.AtomicReference} and forwards
|
||||
* {@code cancel(...)} to it (necessary when chaining
|
||||
* {@code thenCompose} stages across paged calls).</li>
|
||||
* </ul>
|
||||
*
|
||||
* @param source the gRPC future-stub result
|
||||
* @param operation the operation name used in normalised error messages
|
||||
|
||||
+58
@@ -175,6 +175,64 @@ final class GalaxyRepositoryClientTests {
|
||||
assertFalse(stream.hasNext());
|
||||
}
|
||||
|
||||
@Test
|
||||
void deployEventStreamOverflowExceptionSurvivesASubsequentClose() {
|
||||
// Client.Java-021 regression: mirror Client.Java-002's terminal-state
|
||||
// serialisation in DeployEventStream — an overflow enqueues the overflow
|
||||
// exception, and a later close() must NOT discard it. The first terminal
|
||||
// condition (overflow) must win and stay observable by next().
|
||||
DeployEventStream stream = new DeployEventStream(2);
|
||||
ClientResponseObserver<WatchDeployEventsRequest, DeployEvent> observer = stream.observer();
|
||||
observer.beforeStart(new RecordingClientCallStreamObserver());
|
||||
|
||||
// Force a queue overflow on a capacity-2 stream.
|
||||
for (int i = 0; i < 8; i++) {
|
||||
observer.onNext(DeployEvent.newBuilder().setSequence(i).build());
|
||||
}
|
||||
|
||||
// A close() arriving after the overflow must not erase the overflow signal.
|
||||
stream.close();
|
||||
|
||||
MxGatewayException error = assertThrows(MxGatewayException.class, () -> {
|
||||
while (stream.hasNext()) {
|
||||
stream.next();
|
||||
}
|
||||
});
|
||||
assertTrue(error.getMessage().contains("overflow"), error::getMessage);
|
||||
}
|
||||
|
||||
@Test
|
||||
void deployEventStreamConcurrentOverflowAndCloseAlwaysTerminate() throws Exception {
|
||||
// Client.Java-021 regression: the terminal-state transition must be
|
||||
// serialised so whatever the interleaving of overflow and close,
|
||||
// hasNext() always reaches a terminal state (no stuck consumer).
|
||||
for (int iteration = 0; iteration < 300; iteration++) {
|
||||
DeployEventStream stream = new DeployEventStream(2);
|
||||
ClientResponseObserver<WatchDeployEventsRequest, DeployEvent> observer = stream.observer();
|
||||
observer.beforeStart(new RecordingClientCallStreamObserver());
|
||||
|
||||
Thread filler = new Thread(() -> {
|
||||
for (int i = 0; i < 8; i++) {
|
||||
observer.onNext(DeployEvent.newBuilder().setSequence(i).build());
|
||||
}
|
||||
});
|
||||
Thread closer = new Thread(stream::close);
|
||||
filler.start();
|
||||
closer.start();
|
||||
filler.join();
|
||||
closer.join();
|
||||
|
||||
try {
|
||||
while (stream.hasNext()) {
|
||||
stream.next();
|
||||
}
|
||||
} catch (MxGatewayException expected) {
|
||||
assertTrue(expected.getMessage().contains("overflow"), expected::getMessage);
|
||||
}
|
||||
assertFalse(stream.hasNext());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void discoverHierarchyRejectsRepeatedPageToken() throws Exception {
|
||||
TestService service = new TestService() {
|
||||
|
||||
@@ -59287,9 +59287,11 @@ public final class MxaccessGateway extends com.google.protobuf.GeneratedFile {
|
||||
* <pre>
|
||||
* Per-item result for the four bulk write families. `item_handle` mirrors the
|
||||
* request entry's item_handle so callers can correlate inputs to outputs even
|
||||
* when the gateway's tag-allowlist filter dropped some entries before reaching
|
||||
* the worker. Per-item failures populate `error_message` + `hresult` and never
|
||||
* raise — callers iterate and inspect each entry.
|
||||
* when the gateway's per-entry `IConstraintEnforcer.CheckWriteHandleAsync`
|
||||
* filter (see `MxAccessGatewayService.ReplaceWriteBulkEntries` and
|
||||
* `docs/Authorization.md`) dropped some entries before reaching the worker.
|
||||
* Per-item failures populate `error_message` + `hresult` and never raise —
|
||||
* callers iterate and inspect each entry.
|
||||
* </pre>
|
||||
*
|
||||
* Protobuf type {@code mxaccess_gateway.v1.BulkWriteResult}
|
||||
@@ -59686,9 +59688,11 @@ public final class MxaccessGateway extends com.google.protobuf.GeneratedFile {
|
||||
* <pre>
|
||||
* Per-item result for the four bulk write families. `item_handle` mirrors the
|
||||
* request entry's item_handle so callers can correlate inputs to outputs even
|
||||
* when the gateway's tag-allowlist filter dropped some entries before reaching
|
||||
* the worker. Per-item failures populate `error_message` + `hresult` and never
|
||||
* raise — callers iterate and inspect each entry.
|
||||
* when the gateway's per-entry `IConstraintEnforcer.CheckWriteHandleAsync`
|
||||
* filter (see `MxAccessGatewayService.ReplaceWriteBulkEntries` and
|
||||
* `docs/Authorization.md`) dropped some entries before reaching the worker.
|
||||
* Per-item failures populate `error_message` + `hresult` and never raise —
|
||||
* callers iterate and inspect each entry.
|
||||
* </pre>
|
||||
*
|
||||
* Protobuf type {@code mxaccess_gateway.v1.BulkWriteResult}
|
||||
@@ -61295,6 +61299,20 @@ public final class MxaccessGateway extends com.google.protobuf.GeneratedFile {
|
||||
* an existing live subscription's last OnDataChange (the worker did not touch
|
||||
* the subscription); false when the worker took the AddItem + Advise + wait +
|
||||
* UnAdvise + RemoveItem snapshot lifecycle itself.
|
||||
*
|
||||
* On `was_successful = true`, `value`, `quality`, `source_timestamp`, and
|
||||
* `statuses` carry the read data (from the cached subscription or the snapshot
|
||||
* lifecycle, depending on `was_cached`) and `error_message` is empty. On
|
||||
* `was_successful = false`, only `server_handle`, `tag_address`, `item_handle`
|
||||
* (when allocated), `was_cached`, and `error_message` are populated; `value`,
|
||||
* `quality`, `source_timestamp`, and `statuses` are left at their proto3
|
||||
* defaults (null / 0 / null / empty) and must not be read as data — they are
|
||||
* wire-indistinguishable from "value is null with quality bad" data and serve
|
||||
* only as absent markers. ReadBulk has no `hresult` field by design (its
|
||||
* outcomes are timeout / cache / lifecycle states, not MXAccess COM return
|
||||
* codes — see `docs/DesignDecisions.md` "Bulk Command Family"). Per-tag
|
||||
* failures populate `error_message` and never raise — callers iterate and
|
||||
* inspect each entry.
|
||||
* </pre>
|
||||
*
|
||||
* Protobuf type {@code mxaccess_gateway.v1.BulkReadResult}
|
||||
@@ -61837,6 +61855,20 @@ public final class MxaccessGateway extends com.google.protobuf.GeneratedFile {
|
||||
* an existing live subscription's last OnDataChange (the worker did not touch
|
||||
* the subscription); false when the worker took the AddItem + Advise + wait +
|
||||
* UnAdvise + RemoveItem snapshot lifecycle itself.
|
||||
*
|
||||
* On `was_successful = true`, `value`, `quality`, `source_timestamp`, and
|
||||
* `statuses` carry the read data (from the cached subscription or the snapshot
|
||||
* lifecycle, depending on `was_cached`) and `error_message` is empty. On
|
||||
* `was_successful = false`, only `server_handle`, `tag_address`, `item_handle`
|
||||
* (when allocated), `was_cached`, and `error_message` are populated; `value`,
|
||||
* `quality`, `source_timestamp`, and `statuses` are left at their proto3
|
||||
* defaults (null / 0 / null / empty) and must not be read as data — they are
|
||||
* wire-indistinguishable from "value is null with quality bad" data and serve
|
||||
* only as absent markers. ReadBulk has no `hresult` field by design (its
|
||||
* outcomes are timeout / cache / lifecycle states, not MXAccess COM return
|
||||
* codes — see `docs/DesignDecisions.md` "Bulk Command Family"). Per-tag
|
||||
* failures populate `error_message` and never raise — callers iterate and
|
||||
* inspect each entry.
|
||||
* </pre>
|
||||
*
|
||||
* Protobuf type {@code mxaccess_gateway.v1.BulkReadResult}
|
||||
|
||||
@@ -256,6 +256,31 @@ Use TLS options for a secured gateway:
|
||||
mxgw-py smoke --endpoint mxgateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item Object.Attribute --json
|
||||
```
|
||||
|
||||
### CLI Parity Gaps
|
||||
|
||||
The `mxgw-py` CLI does not currently ship the Galaxy Repository
|
||||
subcommands that the .NET (`mxgw`), Go (`mxgw-go`), Rust (`mxgw`), and
|
||||
Java (`mxgw-java`) CLIs expose:
|
||||
|
||||
- `galaxy-test-connection` — ping the Galaxy Repository SQL DB.
|
||||
- `galaxy-last-deploy` — fetch the last deploy timestamp.
|
||||
- `galaxy-discover` — enumerate the deployed object hierarchy with
|
||||
attributes.
|
||||
- `galaxy-watch` — stream `DeployEvent`s as the Galaxy is re-deployed.
|
||||
|
||||
The Python `GalaxyRepositoryClient` library wrapper is fully
|
||||
implemented and exercised by `tests/test_galaxy.py` and
|
||||
`tests/test_galaxy_iter_hierarchy.py` — use the library API (see
|
||||
[Galaxy Repository Browse](#galaxy-repository-browse) above) when
|
||||
calling these RPCs from Python. The four CLI subcommands above are a
|
||||
forward-looking parity item; see the matching .NET / Go / Rust / Java
|
||||
CLI implementations for the expected JSON shape when they are added.
|
||||
|
||||
The .NET CLI also ships `bench-stream-events`, which is .NET-only today
|
||||
and not yet present in Go / Rust / Java / Python. It will need
|
||||
matching coverage if the cross-language benchmark matrix grows a
|
||||
stream-events driver under `scripts/`.
|
||||
|
||||
## Integration Checks
|
||||
|
||||
Run live checks only when a gateway and MXAccess-backed worker are available:
|
||||
|
||||
@@ -8,7 +8,6 @@ version = "0.1.0"
|
||||
description = "Async Python client for MXAccess Gateway."
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.12"
|
||||
license = "Proprietary"
|
||||
authors = [
|
||||
{ name = "MXAccess Gateway Authors" },
|
||||
]
|
||||
@@ -24,6 +23,7 @@ classifiers = [
|
||||
"Development Status :: 4 - Beta",
|
||||
"Intended Audience :: Developers",
|
||||
"Intended Audience :: Information Technology",
|
||||
"License :: Other/Proprietary License",
|
||||
"Operating System :: Microsoft :: Windows",
|
||||
"Operating System :: POSIX",
|
||||
"Programming Language :: Python",
|
||||
@@ -59,6 +59,7 @@ where = ["src"]
|
||||
|
||||
[tool.setuptools.package-data]
|
||||
mxgateway = ["py.typed"]
|
||||
mxgateway_cli = ["py.typed"]
|
||||
|
||||
[tool.pytest.ini_options]
|
||||
addopts = "-ra"
|
||||
|
||||
@@ -0,0 +1,55 @@
|
||||
"""Packaging smoke test.
|
||||
|
||||
Guards against ``pyproject.toml`` regressions (see Client.Python-018) that
|
||||
break ``pip wheel`` / ``pip install -e`` while leaving the in-tree
|
||||
``pytest`` suite green via ``[tool.pytest.ini_options] pythonpath = ["src"]``.
|
||||
|
||||
The test invokes ``python -m pip wheel . --no-deps`` against the package
|
||||
root and asserts a wheel file is produced. Any future PEP 639 / SPDX
|
||||
violation (or any other ``setuptools.build_meta`` configuration error)
|
||||
will be caught here at test time rather than at first install on a clean
|
||||
machine.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import pathlib
|
||||
import subprocess
|
||||
import sys
|
||||
|
||||
_PACKAGE_ROOT = pathlib.Path(__file__).resolve().parent.parent
|
||||
|
||||
|
||||
def test_pip_wheel_build_succeeds(tmp_path: pathlib.Path) -> None:
|
||||
"""``pip wheel .`` against the package root produces a wheel.
|
||||
|
||||
This exercises ``setuptools.build_meta`` end-to-end — the same path
|
||||
used by ``pip install -e .`` — and would have caught
|
||||
Client.Python-018 at commit time.
|
||||
"""
|
||||
|
||||
result = subprocess.run(
|
||||
[
|
||||
sys.executable,
|
||||
"-m",
|
||||
"pip",
|
||||
"wheel",
|
||||
".",
|
||||
"--no-deps",
|
||||
"--wheel-dir",
|
||||
str(tmp_path),
|
||||
],
|
||||
cwd=str(_PACKAGE_ROOT),
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
assert result.returncode == 0, (
|
||||
f"pip wheel failed (exit {result.returncode}):\n"
|
||||
f"--- stdout ---\n{result.stdout}\n"
|
||||
f"--- stderr ---\n{result.stderr}"
|
||||
)
|
||||
wheels = list(tmp_path.glob("mxaccess_gateway_client-*.whl"))
|
||||
assert wheels, (
|
||||
"expected a mxaccess_gateway_client wheel in "
|
||||
f"{tmp_path}; got {list(tmp_path.iterdir())}"
|
||||
)
|
||||
@@ -93,23 +93,38 @@ impl Session {
|
||||
pub async fn subscribe_bulk(&self, server_handle: i32, tag_addresses: Vec<String>) -> Result<Vec<SubscribeResult>, Error>;
|
||||
pub async fn unsubscribe_bulk(&self, server_handle: i32, item_handles: Vec<i32>) -> Result<Vec<SubscribeResult>, Error>;
|
||||
pub async fn write(&self, server_handle: i32, item_handle: i32, value: MxValue, user_id: i32) -> Result<(), Error>;
|
||||
pub async fn write_bulk(&self, server_handle: i32, entries: Vec<WriteBulkEntry>, user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn write2_bulk(&self, server_handle: i32, entries: Vec<Write2BulkEntry>, timestamp: prost_types::Timestamp, user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn write_secured_bulk(&self, server_handle: i32, entries: Vec<WriteSecuredBulkEntry>, current_user_id: i32, verifier_user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn write_secured2_bulk(&self, server_handle: i32, entries: Vec<WriteSecured2BulkEntry>, timestamp: prost_types::Timestamp, current_user_id: i32, verifier_user_id: i32) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn read_bulk(&self, server_handle: i32, tags: &[String], timeout_ms: u32) -> Result<Vec<ReadBulkResult>, Error>;
|
||||
pub async fn write_bulk(&self, server_handle: i32, entries: Vec<WriteBulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn write2_bulk(&self, server_handle: i32, entries: Vec<Write2BulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn write_secured_bulk(&self, server_handle: i32, entries: Vec<WriteSecuredBulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn write_secured2_bulk(&self, server_handle: i32, entries: Vec<WriteSecured2BulkEntry>) -> Result<Vec<BulkWriteResult>, Error>;
|
||||
pub async fn read_bulk<S: AsRef<str>>(&self, server_handle: i32, tag_addresses: &[S], timeout_ms: u32) -> Result<Vec<BulkReadResult>, Error>;
|
||||
pub async fn events(&self) -> Result<impl Stream<Item = Result<MxEvent, Error>>, Error>;
|
||||
pub async fn close(&self) -> Result<(), Error>;
|
||||
}
|
||||
```
|
||||
|
||||
The five bulk-write helpers (`write_bulk`, `write2_bulk`, `write_secured_bulk`,
|
||||
The four bulk-write helpers (`write_bulk`, `write2_bulk`, `write_secured_bulk`,
|
||||
`write_secured2_bulk`) and `read_bulk` mirror the worker's bulk command shapes
|
||||
in `mxaccess_gateway.proto` and use the same correlation-id discipline as the
|
||||
unary helpers — `session::next_correlation_id` is `pub` so that consumers
|
||||
constructing raw `MxCommandRequest`/`CloseSessionRequest` payloads outside
|
||||
the `Session` helpers (notably the `mxgw` test CLI's `ping` and
|
||||
`close-session` subcommands) share the same id generation.
|
||||
unary helpers — `next_correlation_id` is part of the public SDK surface,
|
||||
re-exported at the crate root (`mxgateway_client::next_correlation_id`), so
|
||||
that consumers constructing raw `MxCommandRequest`/`CloseSessionRequest`
|
||||
payloads outside the `Session` helpers (notably the `mxgw` test CLI's `ping`
|
||||
and `close-session` subcommands) share the same id generation. The returned
|
||||
id is documented as an opaque token with three guaranteed properties
|
||||
(embeds the caller's label, unique within a process, carries no secret);
|
||||
its textual format is intentionally *not* part of the contract.
|
||||
|
||||
The per-entry fields that the matching MXAccess COM calls accept once per
|
||||
batch — `user_id` (`WriteBulkEntry`/`Write2BulkEntry`), `timestamp_value`
|
||||
(`Write2BulkEntry`/`WriteSecured2BulkEntry`), and `current_user_id` /
|
||||
`verifier_user_id` (`WriteSecuredBulkEntry`/`WriteSecured2BulkEntry`) — live
|
||||
on the entry structs themselves rather than as trailing positional arguments
|
||||
on the helper, matching the protobuf shapes in
|
||||
`mxaccess_gateway.proto` (`WriteBulkCommand` / `Write2BulkCommand` /
|
||||
`WriteSecuredBulkCommand` / `WriteSecured2BulkCommand`). `read_bulk` is
|
||||
generic over `AsRef<str>` so callers can pass `&[String]` or `&[&str]`
|
||||
without cloning at the call site.
|
||||
|
||||
## Authentication
|
||||
|
||||
|
||||
@@ -447,9 +447,7 @@ async fn run(cli: Cli) -> Result<(), Error> {
|
||||
let client = connect(connection).await?;
|
||||
let reply = client
|
||||
.invoke(MxCommandRequest {
|
||||
client_correlation_id: mxgateway_client::session::next_correlation_id(
|
||||
"cli-ping",
|
||||
),
|
||||
client_correlation_id: mxgateway_client::next_correlation_id("cli-ping"),
|
||||
command: Some(MxCommand {
|
||||
kind: MxCommandKind::Ping as i32,
|
||||
payload: Some(mxgateway_client::generated::mxaccess_gateway::v1::mx_command::Payload::Ping(
|
||||
@@ -496,7 +494,7 @@ async fn run(cli: Cli) -> Result<(), Error> {
|
||||
let reply = client
|
||||
.close_session_raw(CloseSessionRequest {
|
||||
session_id,
|
||||
client_correlation_id: mxgateway_client::session::next_correlation_id(
|
||||
client_correlation_id: mxgateway_client::next_correlation_id(
|
||||
"cli-close-session",
|
||||
),
|
||||
})
|
||||
@@ -1088,16 +1086,17 @@ async fn run_bench_read_bulk(
|
||||
|
||||
/// Per-iteration accounting for `bench-read-bulk`.
|
||||
///
|
||||
/// Only successful `read_bulk` calls contribute to the success-latency
|
||||
/// histogram (`success_latencies_ms`). Failures are tracked separately in
|
||||
/// `failure_latencies_ms` and the first failure's redacted error string is
|
||||
/// stashed in `first_failure` so a partial-failure run is visible in the
|
||||
/// emitted JSON. This keeps the cross-language `latencyMs.p99`/`max`
|
||||
/// contract honest: it reports successful-call latency only and never
|
||||
/// folds in a per-call timeout from a failed RPC.
|
||||
/// Every `read_bulk` call's elapsed time contributes to the all-calls
|
||||
/// histogram (`latencies_ms`), matching the .NET/Go/Python/Java bench
|
||||
/// implementations whose `latencyMs` field is the cross-language comparison
|
||||
/// contract collated by `scripts/bench-read-bulk.ps1`. Failures additionally
|
||||
/// land in `failure_latencies_ms` and the first failure's redacted error
|
||||
/// string is stashed in `first_failure`, both surfaced through the JSON as
|
||||
/// Rust-only enrichment so a partial-failure run is still visible at the
|
||||
/// report layer without breaking the side-by-side comparison.
|
||||
#[derive(Default)]
|
||||
struct BenchReadBulkStats {
|
||||
success_latencies_ms: Vec<f64>,
|
||||
latencies_ms: Vec<f64>,
|
||||
failure_latencies_ms: Vec<f64>,
|
||||
total_read_results: u64,
|
||||
cached_read_results: u64,
|
||||
@@ -1112,7 +1111,7 @@ impl BenchReadBulkStats {
|
||||
elapsed_ms: f64,
|
||||
results: &[mxgateway_client::generated::mxaccess_gateway::v1::BulkReadResult],
|
||||
) {
|
||||
self.success_latencies_ms.push(elapsed_ms);
|
||||
self.latencies_ms.push(elapsed_ms);
|
||||
self.successful_calls += 1;
|
||||
for result in results {
|
||||
self.total_read_results += 1;
|
||||
@@ -1123,6 +1122,7 @@ impl BenchReadBulkStats {
|
||||
}
|
||||
|
||||
fn record_failure(&mut self, elapsed_ms: f64, error: &Error) {
|
||||
self.latencies_ms.push(elapsed_ms);
|
||||
self.failure_latencies_ms.push(elapsed_ms);
|
||||
self.failed_calls += 1;
|
||||
if self.first_failure.is_none() {
|
||||
@@ -1145,7 +1145,7 @@ impl BenchReadBulkStats {
|
||||
|
||||
fn to_json(&self, context: &BenchReadBulkContext<'_>) -> serde_json::Value {
|
||||
let calls_per_second = self.calls_per_second(context.steady_elapsed);
|
||||
let success_summary = percentile_summary(&self.success_latencies_ms);
|
||||
let latency_summary = percentile_summary(&self.latencies_ms);
|
||||
let failure_summary = percentile_summary(&self.failure_latencies_ms);
|
||||
serde_json::json!({
|
||||
"language": "rust",
|
||||
@@ -1163,7 +1163,7 @@ impl BenchReadBulkStats {
|
||||
"totalReadResults": self.total_read_results,
|
||||
"cachedReadResults": self.cached_read_results,
|
||||
"callsPerSecond": round_to(calls_per_second, 2),
|
||||
"latencyMs": success_summary,
|
||||
"latencyMs": latency_summary,
|
||||
"failureLatencyMs": failure_summary,
|
||||
"firstFailure": self.first_failure,
|
||||
})
|
||||
@@ -1737,7 +1737,7 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bench_read_bulk_stats_keeps_failures_out_of_success_latency_histogram() {
|
||||
fn bench_read_bulk_stats_tracks_all_calls_in_latency_and_failures_separately() {
|
||||
use mxgateway_client::generated::mxaccess_gateway::v1::BulkReadResult;
|
||||
use mxgateway_client::Error;
|
||||
|
||||
@@ -1753,8 +1753,10 @@ mod tests {
|
||||
..BulkReadResult::default()
|
||||
};
|
||||
|
||||
// Two fast successes and one slow failure: the slow failure must
|
||||
// not pollute the success p99/max histogram.
|
||||
// Two fast successes and one slow failure: every call lands in the
|
||||
// all-calls histogram (the cross-language `latencyMs` contract) and
|
||||
// the failure additionally surfaces through `failureLatencyMs` plus
|
||||
// `firstFailure` as Rust-only enrichment.
|
||||
stats.record_success(1.5, std::slice::from_ref(&cached));
|
||||
stats.record_success(2.0, std::slice::from_ref(&uncached));
|
||||
let failure = Error::MalformedReply {
|
||||
@@ -1762,7 +1764,7 @@ mod tests {
|
||||
};
|
||||
stats.record_failure(1_500.0, &failure);
|
||||
|
||||
assert_eq!(stats.success_latencies_ms, vec![1.5, 2.0]);
|
||||
assert_eq!(stats.latencies_ms, vec![1.5, 2.0, 1_500.0]);
|
||||
assert_eq!(stats.failure_latencies_ms, vec![1_500.0]);
|
||||
assert_eq!(stats.successful_calls, 2);
|
||||
assert_eq!(stats.failed_calls, 1);
|
||||
@@ -1786,10 +1788,12 @@ mod tests {
|
||||
tags: &[],
|
||||
};
|
||||
let payload = stats.to_json(&context);
|
||||
// The success-latency histogram must never see the 1_500 ms failure.
|
||||
assert_eq!(payload["latencyMs"]["max"].as_f64().unwrap(), 2.0);
|
||||
assert!(payload["latencyMs"]["p99"].as_f64().unwrap() <= 2.0);
|
||||
// The failure-latency histogram must own it instead.
|
||||
// The all-calls histogram (cross-language `latencyMs` contract)
|
||||
// includes the failure latency so the side-by-side comparison with
|
||||
// .NET/Go/Python/Java stays apples-to-apples.
|
||||
assert_eq!(payload["latencyMs"]["max"].as_f64().unwrap(), 1_500.0);
|
||||
// The Rust-only `failureLatencyMs` enrichment surfaces failures
|
||||
// separately for partial-failure diagnostics.
|
||||
assert_eq!(
|
||||
payload["failureLatencyMs"]["max"].as_f64().unwrap(),
|
||||
1_500.0
|
||||
|
||||
@@ -32,7 +32,7 @@ pub use galaxy::{DeployEventStream, GalaxyClient};
|
||||
#[doc(inline)]
|
||||
pub use options::ClientOptions;
|
||||
#[doc(inline)]
|
||||
pub use session::Session;
|
||||
pub use session::{next_correlation_id, Session};
|
||||
#[doc(inline)]
|
||||
pub use value::{MxArrayProjection, MxArrayValue, MxStatus, MxValue, MxValueProjection};
|
||||
#[doc(inline)]
|
||||
|
||||
@@ -37,8 +37,20 @@ static CORRELATION_SEQUENCE: AtomicU64 = AtomicU64::new(0);
|
||||
/// Exposed so consumers that construct raw [`MxCommandRequest`] /
|
||||
/// [`CloseSessionRequest`] payloads outside the `Session` helpers — notably
|
||||
/// the `mxgw` test CLI — share the same correlation-id discipline as the
|
||||
/// library. The returned id is `rust-client-{label}-{N}` where `N` comes
|
||||
/// from a process-wide atomic sequence.
|
||||
/// library. Also re-exported at the crate root as
|
||||
/// [`mxgateway_client::next_correlation_id`](crate::next_correlation_id).
|
||||
///
|
||||
/// The returned id has the following guaranteed properties:
|
||||
///
|
||||
/// - it embeds the supplied `label` verbatim so log readers can pick out
|
||||
/// which call site emitted it;
|
||||
/// - it is unique within the lifetime of a single process (driven by an
|
||||
/// internal monotonically-increasing atomic sequence);
|
||||
/// - it carries no embedded secret or user-supplied payload beyond `label`.
|
||||
///
|
||||
/// The exact textual format (currently `rust-client-{label}-{N}`) is *not*
|
||||
/// part of the public contract and may change between releases — callers
|
||||
/// must not parse it. Treat the returned `String` as an opaque token.
|
||||
#[must_use]
|
||||
pub fn next_correlation_id(label: &str) -> String {
|
||||
let sequence = CORRELATION_SEQUENCE.fetch_add(1, Ordering::Relaxed);
|
||||
|
||||
Reference in New Issue
Block a user