Code-review 2026-05-20 sweep #2: re-review at a020350, resolve 48 findings

Second re-review pass at commit a020350 caught 48 new findings, including one High-severity regression I introduced in the prior sweep, and fixed them all in one parallel wave.

High (1)
- Client.Python-018: prior sweep set `license = "Proprietary"` in pyproject.toml. setuptools >= 77 enforces PEP 639 and rejects the string (it must be a valid SPDX expression), so `pip wheel .` and `pip install -e .` both fail before any source compiles. Tests still pass because pytest bypasses the build backend via `pythonpath`. Dropped the invalid license string, kept the `License :: Other/Proprietary License` classifier, and added `tests/test_packaging.py` so a future regression of the same shape is caught in CI.

Mediums (6)
- Worker-023: `HeartbeatStuckCeiling` (default 75s = 5x HeartbeatGrace) on WorkerPipeSessionOptions bounds the in-flight-command watchdog suppression so a truly stuck COM call still triggers StaHung instead of permanently defeating the watchdog.
- Client.Rust-018: reverted Rust's `latencyMs` split so the cross-language bench comparison is apples-to-apples again; `failureLatencyMs` kept as Rust-only enrichment.
- Client.Java-021: applied Client.Java-002's terminal-state serialisation pattern to DeployEventStream so close() arriving after queue-overflow can't erase the overflow exception.
- IntegrationTests-017: teardown-parity test now uses a two-window stability check after UnAdvise instead of strict equality against the pre-UnAdvise count (which raced against in-flight events).
- IntegrationTests-019: new RecordingTestOutputHelper wraps every log sink the WriteSecured live test owns (worker stdout/stderr, gateway logs, direct WriteLine) so the credential is proven absent from the full output buffer, not just the diagnostic message.
- Tests-020: added MxAccessGatewayServiceConstraintTests coverage for the previously-uncovered Write2Bulk and WriteSecured2Bulk arms of WriteBulkConstraintPlan.SetPayload.

Lows (41, highlights)
- Server: Galaxy glob cache eviction is race-free (Server-024); GalaxyRepositoryGrpcService takes IGalaxyRepository (Server-025); AlarmsOptions validated at startup (Server-026); Authorization.md Constraint Enforcement snippet/prose enumerate the bulk write/read family (Server-027); bulk-read-commands and bulk-write-commands capability tokens added to OpenSession (Server-029); NotWiredAlarmRpcDispatcher XML doc and missing scope-resolver and state-machine tests cleaned up (023, 028).
- Worker: AlarmCommandHandler now invokes the same STA-affinity guard the poll path uses, at every command entry (Worker-024); RunAsync null-checks the runtime-session factory result (Worker-025).
- Worker.Tests: shared LiveMxAccessOptInVariableName lives on GatewayContractInfo (Worker.Tests-025); MxAccessSession.CreateForTesting rejects production sinks (Worker.Tests-026); FakeRuntimeSession's CancelCommandReturnValue serialised under lock (Worker.Tests-027); Probes namespace lifted to MxGateway.Worker.Tests.Probes (Worker.Tests-029); cancel-envelope sequence numbers monotonised (Worker.Tests-030); docs/GatewayTesting.md gains a "Dev-rig Probes" section (Worker.Tests-028).
- Tests: ManualTimeProvider consolidated into one TestSupport/ copy (Tests-021); SessionManagerBulkTests adds a mid-flight cancellation test backed by a TaskCompletionSource fake (Tests-022); companion FakeWorkerProcess.WaitForExitAsync no longer fakes its exit signal (Tests-023); constraint plan reply-count divergence pinned (Tests-024).
- IntegrationTests: TryGetSession chain carries [MaybeNullWhen(false)] end-to-end (IntegrationTests-018); abnormal-exit keyword set tightened to pipe-disconnected/end-of-stream and the test now asserts streamTask.IsFaulted (020, 021).
- Client.Dotnet: bench commands added to isLongRunning so the default 30s wall-clock budget doesn't kill them (015); BenchStreamEventsAsync observes the inner stream task on every exit path (016).
- Client.Go: parseValue wraps strconv errors with flag context and %w (017); bench loops honour ctx.Done() (018); galaxy-watch parses RFC3339Nano with fractional seconds (019); runStreamEvents installs signal.NotifyContext like runGalaxyWatch (020); five new CLI-level table-driven tests cover the bulk/bench subcommands (021).
- Client.Java: toCompletable Javadoc rewritten to match the actual cancellation contract Client.Java-015 established (022); stream-events text path uses Long.toUnsignedString for worker_sequence (023); bench-read-bulk no longer pollutes success-latency histogram with failure durations (024); --shutdown-timeout CLI option propagates through to ClientOptions (025); seven new MxGatewayCliTests cover the bulk and bench commands (026).
- Client.Python: mxgateway_cli ships its own py.typed marker (019); wheel-build smoke test added under tests/test_packaging.py (020); README documents the Galaxy CLI parity gap explicitly (021).
- Client.Rust: RustClientDesign.md signatures match session.rs and document the AsRef<str> read_bulk genericism (019); next_correlation_id re-exported at the crate root, with a property-style doc contract and an explicit disclaimer that the literal textual format is not part of the contract (020).
- Contracts: BulkWriteResult comment names the actual IConstraintEnforcer mechanism instead of "tag-allowlist filter" (014); BulkReadResult gains explicit per-arm payload-population documentation for the success vs failure cases (015).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:28:54 -04:00
parent a0203503a7
commit 1aafd6bde4
74 changed files with 3349 additions and 395 deletions
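
To make the Client.Rust-018 revert concrete, here is a minimal, self-contained sketch of the accounting rule the commit message describes: every call's elapsed time goes into the all-calls histogram, and failures are additionally recorded on their own. The `Stats` type, `max_ms`, `percentile_ms`, and `sample_run` are hypothetical stand-ins written for illustration; the nearest-rank percentile in particular is an assumption, since the real `percentile_summary` helper is not shown in this diff.

```rust
/// Hypothetical, simplified stand-in for the bench accounting struct.
#[derive(Default)]
struct Stats {
    /// Every call, success or failure (the cross-language `latencyMs` view).
    latencies_ms: Vec<f64>,
    /// Failures only (the Rust-only `failureLatencyMs` enrichment).
    failure_latencies_ms: Vec<f64>,
}

impl Stats {
    fn record_success(&mut self, elapsed_ms: f64) {
        self.latencies_ms.push(elapsed_ms);
    }

    fn record_failure(&mut self, elapsed_ms: f64) {
        // A failed call counts toward both histograms.
        self.latencies_ms.push(elapsed_ms);
        self.failure_latencies_ms.push(elapsed_ms);
    }
}

/// Largest sample, or `None` for an empty histogram.
fn max_ms(samples: &[f64]) -> Option<f64> {
    samples.iter().copied().fold(None, |acc, x| match acc {
        None => Some(x),
        Some(m) => Some(m.max(x)),
    })
}

/// Nearest-rank percentile. This definition is an assumption made for the
/// sketch, not necessarily what the real summary helper computes.
fn percentile_ms(samples: &[f64], pct: f64) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let rank = ((pct / 100.0) * sorted.len() as f64).ceil() as usize;
    let idx = rank.clamp(1, sorted.len()) - 1;
    Some(sorted[idx])
}

/// Two fast successes and one slow failure, mirroring the unit test's inputs.
fn sample_run() -> Stats {
    let mut stats = Stats::default();
    stats.record_success(1.5);
    stats.record_success(2.0);
    stats.record_failure(1_500.0);
    stats
}
```

With these inputs the all-calls maximum is the 1,500 ms failure, which is the behaviour the updated test asserts on `latencyMs.max`. The trade-off is that a slow failure now dominates the tail of the shared histogram, which is why the failure-only histogram is kept alongside it for diagnosing partial-failure runs.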
@@ -447,9 +447,7 @@ async fn run(cli: Cli) -> Result<(), Error> {
            let client = connect(connection).await?;
            let reply = client
                .invoke(MxCommandRequest {
-                    client_correlation_id: mxgateway_client::session::next_correlation_id(
-                        "cli-ping",
-                    ),
+                    client_correlation_id: mxgateway_client::next_correlation_id("cli-ping"),
                    command: Some(MxCommand {
                        kind: MxCommandKind::Ping as i32,
                        payload: Some(mxgateway_client::generated::mxaccess_gateway::v1::mx_command::Payload::Ping(
@@ -496,7 +494,7 @@ async fn run(cli: Cli) -> Result<(), Error> {
            let reply = client
                .close_session_raw(CloseSessionRequest {
                    session_id,
-                    client_correlation_id: mxgateway_client::session::next_correlation_id(
+                    client_correlation_id: mxgateway_client::next_correlation_id(
                        "cli-close-session",
                    ),
                })
@@ -1088,16 +1086,17 @@ async fn run_bench_read_bulk(

 /// Per-iteration accounting for `bench-read-bulk`.
 ///
-/// Only successful `read_bulk` calls contribute to the success-latency
-/// histogram (`success_latencies_ms`). Failures are tracked separately in
-/// `failure_latencies_ms` and the first failure's redacted error string is
-/// stashed in `first_failure` so a partial-failure run is visible in the
-/// emitted JSON. This keeps the cross-language `latencyMs.p99`/`max`
-/// contract honest: it reports successful-call latency only and never
-/// folds in a per-call timeout from a failed RPC.
+/// Every `read_bulk` call's elapsed time contributes to the all-calls
+/// histogram (`latencies_ms`), matching the .NET/Go/Python/Java bench
+/// implementations whose `latencyMs` field is the cross-language comparison
+/// contract collated by `scripts/bench-read-bulk.ps1`. Failures additionally
+/// land in `failure_latencies_ms` and the first failure's redacted error
+/// string is stashed in `first_failure`, both surfaced through the JSON as
+/// Rust-only enrichment so a partial-failure run is still visible at the
+/// report layer without breaking the side-by-side comparison.
 #[derive(Default)]
 struct BenchReadBulkStats {
-    success_latencies_ms: Vec<f64>,
+    latencies_ms: Vec<f64>,
    failure_latencies_ms: Vec<f64>,
    total_read_results: u64,
    cached_read_results: u64,
@@ -1112,7 +1111,7 @@ impl BenchReadBulkStats {
        elapsed_ms: f64,
        results: &[mxgateway_client::generated::mxaccess_gateway::v1::BulkReadResult],
    ) {
-        self.success_latencies_ms.push(elapsed_ms);
+        self.latencies_ms.push(elapsed_ms);
        self.successful_calls += 1;
        for result in results {
            self.total_read_results += 1;
@@ -1123,6 +1122,7 @@ impl BenchReadBulkStats {
    }

    fn record_failure(&mut self, elapsed_ms: f64, error: &Error) {
+        self.latencies_ms.push(elapsed_ms);
        self.failure_latencies_ms.push(elapsed_ms);
        self.failed_calls += 1;
        if self.first_failure.is_none() {
@@ -1145,7 +1145,7 @@ impl BenchReadBulkStats {

    fn to_json(&self, context: &BenchReadBulkContext<'_>) -> serde_json::Value {
        let calls_per_second = self.calls_per_second(context.steady_elapsed);
-        let success_summary = percentile_summary(&self.success_latencies_ms);
+        let latency_summary = percentile_summary(&self.latencies_ms);
        let failure_summary = percentile_summary(&self.failure_latencies_ms);
        serde_json::json!({
            "language": "rust",
@@ -1163,7 +1163,7 @@ impl BenchReadBulkStats {
            "totalReadResults": self.total_read_results,
            "cachedReadResults": self.cached_read_results,
            "callsPerSecond": round_to(calls_per_second, 2),
-            "latencyMs": success_summary,
+            "latencyMs": latency_summary,
            "failureLatencyMs": failure_summary,
            "firstFailure": self.first_failure,
        })
@@ -1737,7 +1737,7 @@ mod tests {
    }

    #[test]
-    fn bench_read_bulk_stats_keeps_failures_out_of_success_latency_histogram() {
+    fn bench_read_bulk_stats_tracks_all_calls_in_latency_and_failures_separately() {
        use mxgateway_client::generated::mxaccess_gateway::v1::BulkReadResult;
        use mxgateway_client::Error;

@@ -1753,8 +1753,10 @@ mod tests {
            ..BulkReadResult::default()
        };

-        // Two fast successes and one slow failure: the slow failure must
-        // not pollute the success p99/max histogram.
+        // Two fast successes and one slow failure: every call lands in the
+        // all-calls histogram (the cross-language `latencyMs` contract) and
+        // the failure additionally surfaces through `failureLatencyMs` plus
+        // `firstFailure` as Rust-only enrichment.
        stats.record_success(1.5, std::slice::from_ref(&cached));
        stats.record_success(2.0, std::slice::from_ref(&uncached));
        let failure = Error::MalformedReply {
@@ -1762,7 +1764,7 @@ mod tests {
        };
        stats.record_failure(1_500.0, &failure);

-        assert_eq!(stats.success_latencies_ms, vec![1.5, 2.0]);
+        assert_eq!(stats.latencies_ms, vec![1.5, 2.0, 1_500.0]);
        assert_eq!(stats.failure_latencies_ms, vec![1_500.0]);
        assert_eq!(stats.successful_calls, 2);
        assert_eq!(stats.failed_calls, 1);
@@ -1786,10 +1788,12 @@ mod tests {
            tags: &[],
        };
        let payload = stats.to_json(&context);
-        // The success-latency histogram must never see the 1_500 ms failure.
-        assert_eq!(payload["latencyMs"]["max"].as_f64().unwrap(), 2.0);
-        assert!(payload["latencyMs"]["p99"].as_f64().unwrap() <= 2.0);
-        // The failure-latency histogram must own it instead.
+        // The all-calls histogram (cross-language `latencyMs` contract)
+        // includes the failure latency so the side-by-side comparison with
+        // .NET/Go/Python/Java stays apples-to-apples.
+        assert_eq!(payload["latencyMs"]["max"].as_f64().unwrap(), 1_500.0);
+        // The Rust-only `failureLatencyMs` enrichment surfaces failures
+        // separately for partial-failure diagnostics.
        assert_eq!(
            payload["failureLatencyMs"]["max"].as_f64().unwrap(),
            1_500.0