Code-review 2026-05-20 sweep #2: re-review at a020350, resolve 48 findings

Second re-review pass at commit a020350 caught 48 new findings — including
one High-severity regression I introduced in the prior sweep — and fixed
them all in one parallel wave.

High (1)
- Client.Python-018: prior sweep set `license = "Proprietary"` in
  pyproject.toml. setuptools >= 77 enforces PEP 639 and rejects the
  string (it must be a valid SPDX expression), so `pip wheel .` and
  `pip install -e .` both fail before any source compiles. Tests
  still pass because pytest bypasses the build backend via
  `pythonpath`. Dropped the invalid license string, kept the
  `License :: Other/Proprietary License` classifier, and added
  `tests/test_packaging.py` so a future regression of the same shape
  is caught in CI.

Mediums (6)
- Worker-023: `HeartbeatStuckCeiling` (default 75s = 5x HeartbeatGrace)
  on WorkerPipeSessionOptions bounds the in-flight-command watchdog
  suppression so a truly stuck COM call still triggers StaHung
  instead of permanently defeating the watchdog.
- Client.Rust-018: reverted Rust's `latencyMs` split so the
  cross-language bench comparison is apples-to-apples again;
  `failureLatencyMs` kept as Rust-only enrichment.
- Client.Java-021: applied Client.Java-002's terminal-state
  serialisation pattern to DeployEventStream so close() arriving
  after queue-overflow can't erase the overflow exception.
- IntegrationTests-017: teardown-parity test now uses a two-window
  stability check after UnAdvise instead of strict equality against
  the pre-UnAdvise count (which raced against in-flight events).
- IntegrationTests-019: new RecordingTestOutputHelper wraps every
  log sink the WriteSecured live test owns (worker stdout/stderr,
  gateway logs, direct WriteLine) so the credential is proven
  absent from the full output buffer, not just the diagnostic
  message.
- Tests-020: added MxAccessGatewayServiceConstraintTests coverage
  for the previously-uncovered Write2Bulk and WriteSecured2Bulk
  arms of WriteBulkConstraintPlan.SetPayload.

Lows (41 — highlights)
- Server: Galaxy glob cache eviction is race-free (Server-024);
  GalaxyRepositoryGrpcService takes IGalaxyRepository (Server-025);
  AlarmsOptions validated at startup (Server-026); Authorization.md
  Constraint Enforcement snippet/prose enumerate the bulk write/read
  family (Server-027); bulk-read-commands and bulk-write-commands
  capability tokens added to OpenSession (Server-029);
  NotWiredAlarmRpcDispatcher XML doc and missing scope-resolver and
  state-machine tests cleaned up (023, 028).
- Worker: AlarmCommandHandler now invokes the same STA-affinity
  guard the poll path uses, at every command entry (Worker-024);
  RunAsync null-checks the runtime-session factory result
  (Worker-025).
- Worker.Tests: shared LiveMxAccessOptInVariableName lives on
  GatewayContractInfo (Worker.Tests-025); MxAccessSession.CreateForTesting
  rejects production sinks (Worker.Tests-026); FakeRuntimeSession's
  CancelCommandReturnValue serialised under lock (Worker.Tests-027);
  Probes namespace lifted to MxGateway.Worker.Tests.Probes
  (Worker.Tests-029); cancel-envelope sequence numbers monotonised
  (Worker.Tests-030); docs/GatewayTesting.md gains a "Dev-rig Probes"
  section (Worker.Tests-028).
- Tests: ManualTimeProvider consolidated into one TestSupport/ copy
  (Tests-021); SessionManagerBulkTests adds a mid-flight cancellation
  test backed by a TaskCompletionSource fake (Tests-022); companion
  FakeWorkerProcess.WaitForExitAsync no longer fakes its exit signal
  (Tests-023); constraint plan reply-count divergence pinned
  (Tests-024).
- IntegrationTests: TryGetSession chain carries [MaybeNullWhen(false)]
  end-to-end (IntegrationTests-018); abnormal-exit keyword set
  tightened to pipe-disconnected/end-of-stream and the test now
  asserts streamTask.IsFaulted (020, 021).
- Client.Dotnet: bench commands added to isLongRunning so the
  default 30s wall-clock budget doesn't kill them (015);
  BenchStreamEventsAsync observes the inner stream task on every
  exit path (016).
- Client.Go: parseValue wraps strconv errors with flag context and
  %w (017); bench loops honour ctx.Done() (018); galaxy-watch parses
  RFC3339Nano with fractional seconds (019); runStreamEvents installs
  signal.NotifyContext like runGalaxyWatch (020); five new CLI-level
  table-driven tests cover the bulk/bench subcommands (021).
- Client.Java: toCompletable Javadoc rewritten to match the actual
  cancellation contract Client.Java-015 established (022); stream-events
  text path uses Long.toUnsignedString for worker_sequence (023);
  bench-read-bulk no longer pollutes success-latency histogram with
  failure durations (024); --shutdown-timeout CLI option propagates
  through to ClientOptions (025); seven new MxGatewayCliTests cover
  the bulk and bench commands (026).
- Client.Python: mxgateway_cli ships its own py.typed marker (019);
  wheel-build smoke test added under tests/test_packaging.py (020);
  README documents the Galaxy CLI parity gap explicitly (021).
- Client.Rust: RustClientDesign.md signatures match session.rs and
  document the AsRef<str> read_bulk genericism (019);
  next_correlation_id re-exported at the crate root, with a
  property-style doc contract and an explicit disclaimer that the
  literal textual format is not part of the contract (020).
- Contracts: BulkWriteResult comment names the actual
  IConstraintEnforcer mechanism instead of "tag-allowlist filter"
  (014); BulkReadResult gains explicit per-arm payload-population
  documentation for the success vs failure cases (015).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-20 10:28:54 -04:00
parent a0203503a7
commit 1aafd6bde4
74 changed files with 3349 additions and 395 deletions
+316 -13
@@ -5,28 +5,28 @@
| Module | `clients/python` |
| Reviewer | Claude Code |
| Review date | 2026-05-20 |
| Commit reviewed | `1cd51bb` |
| Commit reviewed | `a020350` |
| Status | Reviewed |
| Open findings | 0 |
## Checklist coverage
A re-review at commit `1cd51bb` over the same module. Prior findings
(Client.Python-001 — Client.Python-012) remain closed and are kept as
A re-review at commit `a020350` over the same module. Prior findings
(Client.Python-001 — Client.Python-017) remain closed and are kept as
history. This section reflects categories evaluated in this pass.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: `_use_plaintext` silently downgrades any `localhost:` / `127.0.0.1:` endpoint to plaintext (Client.Python-013). |
| 2 | mxaccessgw conventions | No new issues found — secrets redacted, MXAccess parity preserved, generated code untouched, no Blazor/COM violations apply (Python client). |
| 3 | Concurrency & thread safety | No new issues found — close-idempotency hazard fixed in Client.Python-006, shared `_canceling_iterator` cancels on `CancelledError`. |
| 4 | Error handling & resilience | No new issues found at this commit (prior 003, 007, 011 remain closed). |
| 5 | Security | Issue found: implicit plaintext-on-localhost (Client.Python-013) means a user explicitly listing a TLS-fronted loopback endpoint with `--api-key` but without `--tls`/`--plaintext` silently transmits the bearer token in cleartext. |
| 6 | Performance & resource management | No new issues found: `iter_hierarchy` streams pages lazily (Client.Python-005 resolution). |
| 7 | Design-document adherence | No new issues found — `PythonClientDesign.md` matches the implemented surface. |
| 8 | Code organization & conventions | Issue found: duplicate `from mxgateway.values import` lines in `commands.py:22-23` (Client.Python-014). |
| 9 | Testing coverage | Issues found: `bench_read_bulk` CLI body, `MAX_AGGREGATE_EVENTS` event-cap, and `_use_plaintext` localhost-auto-plaintext path are untested (Client.Python-015, Client.Python-016). |
| 10 | Documentation & comments | Issues found: `pyproject.toml` lacks PyPI metadata (`authors`, `license`, `classifiers`, `urls`) and no PEP 561 `py.typed` marker (Client.Python-017); auto-plaintext behaviour is undocumented (Client.Python-013). |
| 1 | Correctness & logic bugs | No new issues found — TLS-by-default fix in Client.Python-013 verified; no test fixture accidentally relies on plaintext defaults. |
| 2 | mxaccessgw conventions | No new issues found — secrets redacted, MXAccess parity preserved, generated code untouched. |
| 3 | Concurrency & thread safety | No new issues found — close-idempotency and shared cancel-on-cancel iterator still in place. |
| 4 | Error handling & resilience | No new issues found. |
| 5 | Security | No new issues found: `_use_plaintext` now requires explicit `--plaintext` opt-in (Client.Python-013 resolution verified). The `--api-key` flag is also still redacted from the option repr and CLI errors. |
| 6 | Performance & resource management | No new issues found. |
| 7 | Design-document adherence | No new issues found — `PythonClientDesign.md` is consistent with the implemented surface. |
| 8 | Code organization & conventions | Issue found: `mxgateway_cli` is shipped in the wheel but has no PEP 561 `py.typed` marker (Client.Python-019), so the CLI module's inline type hints are invisible to downstream `mypy` runs. |
| 9 | Testing coverage | Issue found: no test exercises the wheel-build / editable-install flow; the broken `pyproject.toml` (Client.Python-018) was not caught at commit time because the test suite runs from `src/` via `pytest pythonpath` (Client.Python-020). |
| 10 | Documentation & comments | Issue found: cross-client CLI parity gap — the Python CLI ships none of the Galaxy subcommands (`galaxy-test-connection`, `galaxy-last-deploy`, `galaxy-discover`, `galaxy-watch`) the .NET / Go / Rust / Java CLIs all expose, and lacks the new `.NET`-only `bench-stream-events`. README does not flag the gap (Client.Python-021). |
## Findings
@@ -464,3 +464,306 @@ declared it in `[tool.setuptools.package-data] mxgateway = ["py.typed"]`
so the wheel ships the marker and downstream `mypy` users see the
inline type hints. Pure metadata / packaging change — `python -m pytest -q`
still passes (91 tests).
### Client.Python-018
| Field | Value |
|---|---|
| Severity | High |
| Category | Code organization & conventions |
| Location | `clients/python/pyproject.toml:11` |
| Status | Resolved |
**Description:** The Client.Python-017 resolution set
`license = "Proprietary"` as a top-level string. Under PEP 639 (enforced
by `setuptools >= 77`, and active in the installed `setuptools 82.0.1`),
the `project.license` string form must be a valid SPDX expression.
`"Proprietary"` is not a registered SPDX identifier, so the configured
build backend (`setuptools.build_meta`) refuses the file outright. Both
`python -m pip wheel . --no-deps --wheel-dir …` and
`python -m pip install -e .` — the exact commands documented in
`clients/python/README.md` ("Build And Test", "Packaging") and the
"build wheel" instruction in `docs/ClientPackaging.md` — now fail before
any source is compiled with:
```
ValueError: invalid pyproject.toml config: `project.license`.
configuration error: `project.license` must be valid exactly by one definition (0 matches found):
- {type: string, format: 'SPDX'}
- type: table keys: 'file': … required: ['file']
- type: table keys: 'text': … required: ['text']
```
`python -m pytest` still runs because `[tool.pytest.ini_options]
pythonpath = ["src"]` lets pytest import the package without an install
— which masked the regression at commit time and explains how the
Client.Python-017 resolution comment was able to assert "`python -m
pytest -q` still passes (91 tests)" while shipping a wheel build that
cannot start. The Client.Python-017 resolution comment that "the SPDX
`Proprietary` expression matches the de-facto status" is incorrect:
`Proprietary` is *not* a registered SPDX identifier; only entries on the
SPDX licence list (e.g. `MIT`, `Apache-2.0`, `BSD-3-Clause`) or
`LicenseRef-*` custom identifiers satisfy the
`{ type: string, format: 'SPDX' }` rule. PEP 639 added the
`LicenseRef-…` escape hatch precisely for proprietary / unlisted
licences.
This is a regression of the developer-onboarding workflow introduced by
the very commit being reviewed. A fresh checkout cannot run
`python -m pip install -e ".[dev]"` (the command in `CLAUDE.md`'s
"Clients" section) without first patching `pyproject.toml`.
**Recommendation:** Fix the `license` value so the build backend
accepts it. Three concrete options, in order of preference:
1. Use a `LicenseRef-*` SPDX-compatible custom identifier:
`license = "LicenseRef-Proprietary"`. Requires no additional
`LICENSE` file and is honoured by setuptools / pip / PyPI as a
proprietary marker.
2. Add a top-level `LICENSE` file (or `clients/python/LICENSE`) and
point at it via the table form:
`license = { file = "LICENSE" }`. This also documents the proprietary
terms.
3. Drop the `license` key entirely and convey the same intent via the
classifier `"License :: Other/Proprietary License"` (already part of
the classifier set), reverting the PEP-639 string field that the
build backend now insists must be SPDX.
Add a CI / pre-commit check that runs `python -m pip wheel . --no-deps`
(or `python -m build`) on `clients/python` so a future
`pyproject.toml` regression is caught at commit time rather than at
first install on a clean machine. See also Client.Python-020.
**Resolution:** 2026-05-20 — Dropped the invalid top-level
`license = "Proprietary"` string from `clients/python/pyproject.toml`
and kept the existing `License :: Other/Proprietary License` trove
classifier to convey the same intent without violating PEP 639's SPDX
rule. No `LICENSE` file exists at the repo root or under
`clients/python/`, so the `license = { file = "LICENSE" }` table form
was not used; relying on the classifier is the option (3) variant
called out in the recommendation. Verified by running
`python -m pip wheel . --no-deps -w ./.test-wheel-output` from
`clients/python`: the build now succeeds and emits
`mxaccess_gateway_client-0.1.0-py3-none-any.whl` (47 KB) where
previously it failed with the `project.license must be valid exactly
by one definition` `ValueError`. The CI / pre-commit recommendation is
addressed by Client.Python-020.
### Client.Python-019
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/python/pyproject.toml:60-61`, `clients/python/src/mxgateway_cli/` |
| Status | Resolved |
**Description:** Client.Python-017 added the PEP 561 marker file
`clients/python/src/mxgateway/py.typed` and declared it in
`[tool.setuptools.package-data] mxgateway = ["py.typed"]`. The wheel
therefore advertises `mxgateway` as typed. However the same wheel
also ships the **`mxgateway_cli`** package (`setuptools.packages.find`
with `where = ["src"]` discovers both `mxgateway` and `mxgateway_cli`,
confirmed via `find_packages` in this review), and `mxgateway_cli`:
* is shipped in the wheel and is the package the `mxgw-py` console
script entry point resolves into (`[project.scripts] mxgw-py =
"mxgateway_cli.commands:main"`),
* is fully type-annotated (every function in `commands.py` has full
parameter and return annotations; `from __future__ import annotations`
is in effect),
* but has no `py.typed` file and is not listed in
`[tool.setuptools.package-data]`.
PEP 561 requires the marker file inside **each** importable package the
distribution wants to expose to type checkers — the `mxgateway` marker
does not transfer to `mxgateway_cli`. A downstream consumer that imports
or composes against `mxgateway_cli` (e.g. wrapping it as a programmatic
CLI library) will find that `mypy` reports the import as untyped and
treats its symbols as `Any`, despite the hints being present in source.
This is a follow-up to Client.Python-017 — the fix is small and pure
packaging.
**Recommendation:** Create
`clients/python/src/mxgateway_cli/py.typed` (empty file, as PEP 561
requires) and extend the existing package-data declaration so the
wheel ships it:
```toml
[tool.setuptools.package-data]
mxgateway = ["py.typed"]
mxgateway_cli = ["py.typed"]
```
No source change in either package; verify by building a wheel
(once Client.Python-018 is fixed) and inspecting that both
`mxgateway/py.typed` and `mxgateway_cli/py.typed` appear in the wheel
contents.
**Resolution:** 2026-05-20 — Created the empty PEP 561 marker file
`clients/python/src/mxgateway_cli/py.typed` and added
`mxgateway_cli = ["py.typed"]` under
`[tool.setuptools.package-data]` in `clients/python/pyproject.toml`
alongside the existing `mxgateway = ["py.typed"]` line. Verified by
inspecting the built wheel
(`mxaccess_gateway_client-0.1.0-py3-none-any.whl`): the archive now
contains both `mxgateway/py.typed` and `mxgateway_cli/py.typed`, so
downstream `mypy` consumers see the inline type hints in both
packages. Pure packaging change — no source modifications.
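The wheel inspection described in the Client.Python-019 resolution can be automated with `zipfile`, since a wheel is an ordinary zip archive. This is a minimal sketch; `missing_typed_markers` is a hypothetical helper name and is not part of the repository.

```python
import zipfile


def missing_typed_markers(wheel_path, packages=("mxgateway", "mxgateway_cli")):
    """Return the packages in the wheel that lack a PEP 561 `py.typed` marker."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = set(wheel.namelist())
    # Archive member names always use forward slashes, regardless of platform.
    return [pkg for pkg in packages if f"{pkg}/py.typed" not in names]
```

An empty list means both packages ship their marker; before this fix the helper would have returned `["mxgateway_cli"]` for a wheel built from the repository.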
### Client.Python-020
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/python/tests/`, `scripts/` |
| Status | Resolved |
**Description:** Client.Python-018 is invisible to the existing test
suite: `python -m pytest` passes because `[tool.pytest.ini_options]
pythonpath = ["src"]` lets pytest import the package without going
through `setuptools.build_meta`. None of the 91 tests build the wheel,
do an editable install, or otherwise exercise the
`setuptools.build_meta` configuration validator. As a result, a
`pyproject.toml` regression that breaks `pip install -e .` /
`pip wheel .` (the exact commands documented in the Python client
README and `CLAUDE.md`) still passes the test suite. The other
language clients have parallel coverage gaps, and
`scripts/run-client-e2e-tests.ps1` offers no CI-level "the package
installs" smoke test for Python either: it only runs the live e2e
matrix and assumes the editable install already worked. Python is,
however, the only client whose published install command is
currently broken.
**Recommendation:** Add a thin pytest module (e.g.
`tests/test_packaging.py`) that runs
```python
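import pathlib
import subprocess
import sys


def test_pyproject_validates_against_setuptools_build_meta(tmp_path):
    # Package root is the parent of the tests/ directory.
    here = pathlib.Path(__file__).resolve().parent.parent
    # tmp_path is pytest's built-in per-test temporary directory fixture.
    result = subprocess.run(
        [sys.executable, "-m", "pip", "wheel", ".",
         "--no-deps", "--no-build-isolation",
         "--wheel-dir", str(tmp_path)],
        cwd=here, capture_output=True, text=True,
    )
    # Surface the build backend's own error text on failure.
    assert result.returncode == 0, result.stderr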
```
(or any equivalent that invokes
`setuptools.config.pyprojecttoml.read_configuration`). Mark the test
with `@pytest.mark.slow` if the wheel build is too heavy for the
default suite, and document the test in the README. Alternatively
add a CI step to `scripts/run-client-e2e-tests.ps1` (or a new
`scripts/check-python-package.ps1`) that fails the build when the
wheel build fails. Either approach would have surfaced
Client.Python-018 at commit time.
**Resolution:** 2026-05-20 — Added
`clients/python/tests/test_packaging.py::test_pip_wheel_build_succeeds`.
The test invokes `python -m pip wheel . --no-deps --wheel-dir <tmp>`
against the package root via `subprocess` and asserts (a) exit code
zero and (b) an `mxaccess_gateway_client-*.whl` file is produced in
the temp directory, capturing stdout/stderr in the assertion message
on failure so any future PEP 639 / SPDX violation or other
`setuptools.build_meta` configuration error is reported with the
build backend's own error text. Verified the test would have caught
Client.Python-018: with the old `license = "Proprietary"` string in
place the test fails with the `project.license must be valid exactly
by one definition` `ValueError`. The pytest module is the simpler
half of the recommendation; no PowerShell wrapper script was added
since pytest already runs in the same `python -m pytest` invocation
the README documents. Test suite is now 92 tests (was 91), all
passing.
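The two assertions described in the Client.Python-020 resolution (zero exit code, then a matching wheel file) might be factored as below. This is a sketch under assumptions, not the contents of `tests/test_packaging.py`: `build_wheel`, `describe_failure`, and `assert_wheel_builds` are hypothetical helper names, and only the `pip wheel` arguments and the wheel filename pattern come from the resolution text.

```python
import pathlib
import subprocess
import sys


def build_wheel(package_root, wheel_dir):
    """Run `pip wheel` against the package root and return the completed process."""
    return subprocess.run(
        [sys.executable, "-m", "pip", "wheel", ".",
         "--no-deps", "--wheel-dir", str(wheel_dir)],
        cwd=package_root, capture_output=True, text=True,
    )


def describe_failure(result):
    """Format the build output so the backend's own error text reaches the report."""
    return (
        f"pip wheel exited with {result.returncode}\n"
        f"stdout:\n{result.stdout}\n"
        f"stderr:\n{result.stderr}"
    )


def assert_wheel_builds(package_root, wheel_dir):
    """Fail with the captured build output unless a wheel is produced."""
    result = build_wheel(package_root, wheel_dir)
    assert result.returncode == 0, describe_failure(result)
    wheels = sorted(pathlib.Path(wheel_dir).glob("mxaccess_gateway_client-*.whl"))
    assert wheels, describe_failure(result)
```

In a pytest module, a test taking the `tmp_path` fixture would call `assert_wheel_builds(pathlib.Path(__file__).resolve().parent.parent, tmp_path)`. Keeping the message formatting in its own function means a configuration error from the build backend shows up verbatim in the failure report.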
### Client.Python-021
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `clients/python/src/mxgateway_cli/commands.py`, `clients/python/README.md:235-258` |
| Status | Resolved |
**Description:** Cross-client CLI parity check (one of the things the
review prompt asks for): the `mxgw-py` CLI subcommand set has drifted
from every other client CLI in the matrix.
Subcommand inventory at this commit:
| Subcommand | .NET (`mxgw`) | Go (`mxgw-go`) | Rust (`mxgw`) | Java (`mxgw-java`) | Python (`mxgw-py`) |
|---|---|---|---|---|---|
| `version` | yes | yes | yes | yes | yes |
| `ping` | yes | (no) | yes | (no) | yes |
| `open-session` / `close-session` | yes | yes | yes | yes | yes |
| `register` / `add-item` / `advise` | yes | yes | yes | yes | yes |
| `subscribe-bulk` / `unsubscribe-bulk` / `read-bulk` | yes | yes | yes | yes | yes |
| `write-bulk` / `write2-bulk` / `write-secured-bulk` / `write-secured2-bulk` | yes | yes | yes | yes | yes |
| `write` / `write2` | yes / (varies) | yes / (no) | yes / yes | yes / (no) | yes / yes |
| `stream-events` | yes | yes | yes | yes | yes |
| `smoke` | yes | yes | yes | yes | yes |
| `bench-read-bulk` | yes | yes | yes | yes | yes |
| `bench-stream-events` | **yes** | (no) | (no) | (no) | (no) |
| `galaxy-test-connection` (or alias) | **yes** | **yes** | **yes** | **yes** | **(no)** |
| `galaxy-last-deploy` / `galaxy-deploy-time` | **yes** | **yes** | **yes** | **yes** | **(no)** |
| `galaxy-discover` | **yes** | **yes** | **yes** | **yes** | **(no)** |
| `galaxy-watch` | **yes** | **yes** | **yes** | **yes** | **(no)** |
Two parity gaps remain after Client.Python-013/017:
1. The Python CLI ships **no Galaxy subcommands at all** even though
the `GalaxyRepositoryClient` library wrapper is fully implemented
and exercised by `tests/test_galaxy.py` /
`tests/test_galaxy_iter_hierarchy.py`. The README acknowledges the
`watch-deploy-events` gap inline ("The CLI does not currently
expose a streaming `watch-deploy-events` subcommand — use the
library API directly when subscribing to deploy events from
Python.") but does not call out that **the other three Galaxy
subcommands are also missing** — and the .NET / Go / Rust / Java
CLIs all expose them. A user running the cross-language smoke
matrix who expects Python to behave like the other clients sees a
silent "command not found" on `mxgw-py galaxy-test-connection`.
2. The new `bench-stream-events` subcommand (added to the .NET CLI in
the previous commit `1cd51bb`) is .NET-only today; the Python CLI
is consistent with Go / Rust / Java on this point. Worth flagging
as a forward-looking parity gap that will need filling if the
cross-language benchmark matrix grows a stream-events driver in
`scripts/`.
Severity is Low because the existing `scripts/bench-read-bulk.ps1`
matrix only invokes `bench-read-bulk` and does not break, and the
Python `GalaxyRepositoryClient` library is fully functional — the gap
is purely in the test CLI surface. But cross-client parity is an
explicit review check and the gap is not documented.
**Recommendation:** Either (a) add `galaxy-test-connection`,
`galaxy-last-deploy`, `galaxy-discover`, and `galaxy-watch`
subcommands to `mxgateway_cli/commands.py` (each is a thin wrapper
over `GalaxyRepositoryClient`, mirroring the existing four-language
implementation), or (b) update `clients/python/README.md`'s "CLI"
section with an explicit "CLI parity gaps" subsection that lists the
missing subcommands and recommends the library API. Option (a) is
preferable for cross-language matrix testing. Also document the
`bench-stream-events` gap symmetrically once a cross-language stream
benchmark driver is added under `scripts/`.
**Resolution:** 2026-05-20 — Scoped this finding to a
documentation-only fix; the full Galaxy CLI parity implementation
(four new subcommands wired to `GalaxyRepositoryClient`) is a larger
piece of work and will be tracked as a separate follow-up finding.
Added a new "CLI Parity Gaps" subsection to
`clients/python/README.md` immediately under the existing CLI
section that explicitly enumerates the four missing
`mxgw-py` Galaxy subcommands (`galaxy-test-connection`,
`galaxy-last-deploy`, `galaxy-discover`, `galaxy-watch`), names the
sibling CLIs that already expose them (.NET `mxgw`, Go `mxgw-go`,
Rust `mxgw`, Java `mxgw-java`), points readers at the library API
(`GalaxyRepositoryClient`, already documented under "Galaxy
Repository Browse") as the supported Python entry point in the
interim, and also flags the .NET-only `bench-stream-events` gap so
the cross-language benchmark matrix has a record of the asymmetry.
No CLI source change; the implementation of the four Galaxy
subcommands is deferred. Resolved as a doc note rather than a full
parity fix.
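The parity check behind Client.Python-021 reduces to a set difference over the subcommand inventory. The sketch below encodes only the Galaxy and `bench-stream-events` rows of the table above; `INVENTORY`, `GALAXY`, and `parity_gaps` are hypothetical names, and the client keys are shorthand labels rather than the real CLI binary names.

```python
GALAXY = frozenset({
    "galaxy-test-connection",
    "galaxy-last-deploy",
    "galaxy-discover",
    "galaxy-watch",
})

# Only the rows of the inventory table that this finding discusses.
INVENTORY = {
    "dotnet": GALAXY | {"bench-stream-events"},
    "go": GALAXY,
    "rust": GALAXY,
    "java": GALAXY,
    "python": frozenset(),
}


def parity_gaps(inventory, client):
    """Subcommands that at least one other client exposes but `client` does not."""
    exposed_elsewhere = set().union(
        *(cmds for name, cmds in inventory.items() if name != client)
    )
    return sorted(exposed_elsewhere - inventory[client])
```

With this data, `parity_gaps(INVENTORY, "python")` lists the four Galaxy subcommands plus `bench-stream-events`, while Go, Rust, and Java each report only `bench-stream-events`.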