code-reviews: re-review Client.Python at 42b0037

Append 5 new findings (Client.Python-022..026): README flags for new alarm subcommands do not exist; Client.Python-013 regression — the silent localhost auto-plaintext branch is still present (the prior Resolution did not survive the rename); production batch path uses the click.testing.CliRunner helper; no behavioural tests for new SDK + CLI; bench cleanup swallows exceptions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:55 -04:00
parent 15fceed536
commit bc28fee641
1 changed files with 268 additions and 3 deletions
@@ -5,9 +5,9 @@
 | Module | `clients/python` |
 | Reviewer | Claude Code |
 | Review date | 2026-05-24 |
-| Commit reviewed | `d692232` |
-| Status | Reviewed |
-| Open findings | 0 |
+| Commit reviewed | `42b0037` |
+| Status | Re-reviewed |
+| Open findings | 5 |

 ## Checklist coverage

@@ -28,6 +28,38 @@ history. This section reflects categories evaluated in this pass.
 | 9 | Testing coverage | Issue found: no test exercises the wheel-build / editable-install flow; the broken `pyproject.toml` (Client.Python-018) was not caught at commit time because the test suite runs from `src/` via `pytest pythonpath` (Client.Python-020). |
 | 10 | Documentation & comments | Issue found: cross-client CLI parity gap — the Python CLI ships none of the Galaxy subcommands (`galaxy-test-connection`, `galaxy-last-deploy`, `galaxy-discover`, `galaxy-watch`) the .NET / Go / Rust / Java CLIs all expose, and lacks the new `.NET`-only `bench-stream-events`. README does not flag the gap (Client.Python-021). |

+### 2026-05-24 re-review (commit 42b0037)
+
+Re-review pass at `42b0037`. The diff against the previous review base
+`d692232` is four commits affecting `clients/python`:
+
+- `71d2c39` e2e: port `batch` subcommand to all five client CLIs
+- `6add4b4` Python client: port bulk read/write SDK methods + CLI subcommands
+- `828e3e6` Python client: port stream-alarms and acknowledge-alarm
+- `8738735` clients: document StreamAlarms + AcknowledgeAlarm in each README
+
+Surface area added: `Session.read_bulk` / `write_bulk` / `write2_bulk` /
+`write_secured_bulk` / `write_secured2_bulk`; `GatewayClient.stream_alarms`
+ `_canceling_alarm_feed_iterator`; the corresponding CLI subcommands
+`read-bulk`, `write-bulk`, `write2-bulk`, `write-secured-bulk`,
+`write-secured2-bulk`, `bench-read-bulk`, `stream-alarms`,
+`acknowledge-alarm`, and `batch`; new README CLI examples for the alarm
+subcommands; new CLI tests for `stream-alarms` / `acknowledge-alarm`
+registration and `batch` semantics.
+
+| # | Category | Result |
+|---|---|---|
+| 1 | Correctness & logic bugs | Issue found: `bench-read-bulk` does a function-local `import time` and uses bare `except Exception: pass` in cleanup blocks (Client.Python-026). |
+| 2 | mxaccessgw conventions | No new issues found — secured writes still redact, generated code untouched, MXAccess parity preserved. |
+| 3 | Concurrency & thread safety | No new issues found — `_canceling_alarm_feed_iterator` follows the same shape as `_canceling_iterator` (Client.Python-007 helper). |
+| 4 | Error handling & resilience | No new issues found in the new SDK methods; new RPC mapping `map_rpc_error("stream alarms", ...)` is correct. |
+| 5 | Security | Issue found: `_use_plaintext` localhost auto-plaintext branch resolved under Client.Python-013 is back in the renamed CLI module and was carried forward through the new commit untouched (Client.Python-023). |
+| 6 | Performance & resource management | No new issues found. |
+| 7 | Design-document adherence | No new issues found in the new alarm / bulk surface — matches the cross-client parity matrix expectation. |
+| 8 | Code organization & conventions | Issue found: the new `batch` subcommand uses `click.testing.CliRunner` from production code (Client.Python-024). |
+| 9 | Testing coverage | Issue found: the new SDK methods and most of the new CLI subcommand bodies have no behavioural tests — only `--help` smoke tests for the alarm CLI (Client.Python-025). |
+| 10 | Documentation & comments | Issue found: the README CLI examples for `stream-alarms` and `acknowledge-alarm` use flags that do not exist on the implemented commands (Client.Python-022). |
+
 ### 2026-05-24 review (commit d692232)

 Re-review pass at `d692232`. Diff against `a020350` is commit `397d3c5`:
@@ -795,3 +827,236 @@ the cross-language benchmark matrix has a record of the asymmetry.
 No CLI source change; the implementation of the four Galaxy
 subcommands is deferred. Resolved as a doc note rather than a full
 parity fix.
+
+### Client.Python-022
+
+| Field | Value |
+|---|---|
+| Severity | High |
+| Category | Documentation & comments |
+| Location | `clients/python/README.md:201-202`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:389-420` |
+| Status | Open |
+
+**Description:** The README CLI examples added by commit `8738735` for the
+new alarm subcommands cite flags the CLI does not accept:
+
+```
+mxgw-py stream-alarms --session-id <id> --max-messages 1 --json
+mxgw-py acknowledge-alarm --session-id <id> --alarm-reference "\\Galaxy\Area001.Pump001.PumpFault" --json
+```
+
+Both subcommands are session-less (the alarm feed is served by the gateway
+itself, not a worker session — see the docstring on `acknowledge_alarm`,
+"Acknowledge an active MXAccess alarm condition (session-less)"). Neither
+`@main.command("stream-alarms")` nor `@main.command("acknowledge-alarm")`
+declares a `--session-id` option, and `acknowledge-alarm` declares the
+ack-target as `--reference`, **not** `--alarm-reference`. A user copy-pasting
+either example gets `Error: no such option: --session-id` (or
+`--alarm-reference`) and exits non-zero before any RPC is attempted.
+
+This drift is invisible to the test suite because
+`tests/test_cli.py::test_acknowledge_alarm_requires_reference` only asserts
+that the missing-flag error mentions `--reference` — it does not validate
+the README at all. The .NET / Go / Rust / Java alarm CLI examples in the
+sibling READMEs are consistent with their actual flag names, so the Python
+README is the only one out of step with its implementation.
+
+**Recommendation:** Either fix the README examples to match the implementation
+(remove `--session-id` from both lines, rename `--alarm-reference` to
+`--reference`), or — if cross-client parity wants the longer flag name —
+rename the CLI option to `--alarm-reference` and add a test that copy-pastes
+the README examples through `CliRunner` to assert they parse. Option (1) is
+the smaller change.
+
+### Client.Python-023
+
+| Field | Value |
+|---|---|
+| Severity | Medium |
+| Category | Security |
+| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:901-906` |
+| Status | Open |
+
+**Description:** Client.Python-013 (severity Medium, Security) was marked
+**Resolved** on 2026-05-20 with the explicit claim that the silent
+`localhost:` / `127.0.0.1:` auto-plaintext branch had been removed from
+`_use_plaintext`. The re-review at `d692232` re-asserted this in its
+checklist ("`_use_plaintext` now requires explicit `--plaintext` opt-in
+(Client.Python-013 resolution verified)").
+
+The branch is still present in the reviewed source at HEAD `42b0037`:
+
+```python
+def _use_plaintext(kwargs: dict[str, Any]) -> bool:
+    if kwargs.get("use_tls"):
+        return False
+    if kwargs.get("plaintext"):
+        return True
+    return kwargs["endpoint"].startswith("localhost:") or kwargs["endpoint"].startswith("127.0.0.1:")
+```
+
+The same code is present in `git show d692232:clients/python/src/zb_mom_ww_mxgateway_cli/commands.py`, so the regression entered at or
+before the rename commit `397d3c5` (which created
+`src/zb_mom_ww_mxgateway_cli/commands.py` from scratch with the
+pre-Client.Python-013 body) and was not noticed in the prior re-review.
+
+The original security argument is unchanged: a user who runs the gateway
+behind TLS on loopback for production-shaped local testing and passes
+`--api-key mxgw_<secret>` against `localhost:5001` silently gets a plaintext
+gRPC channel, with the bearer token attached to it. The other clients
+(.NET https-prefix detection, Go / Java explicit `--plaintext`, Rust
+opt-in) do not auto-downgrade. The Client.Python-013 resolution also added
+six regression tests in `tests/test_cli.py` that asserted the explicit-flag
+contract; those tests do not exist in the current `tests/test_cli.py` —
+either they were lost in the rename or never carried over.
+
+**Recommendation:** Re-apply the Client.Python-013 fix on the current
+source: drop the `endpoint.startswith("localhost:") or endpoint.startswith("127.0.0.1:")`
+branch, make `--plaintext` and `--tls` mutually exclusive, default to TLS,
+and add an assertion-time test that copy-pastes the Client.Python-013
+regression-test fixture into `tests/test_cli.py`. Because Client.Python-013
+is marked Resolved with a 2026-05-20 commit reference, do **not** silently
+re-resolve this finding — keep it Open with a fresh ID so the regression
+audit trail is preserved.
+
+### Client.Python-024
+
+| Field | Value |
+|---|---|
+| Severity | Medium |
+| Category | Code organization & conventions |
+| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:13,48-119` |
+| Status | Open |
+
+**Description:** The new `batch` subcommand (commit `71d2c39`) implements
+the cross-language batch protocol by importing `click.testing.CliRunner`
+into production code and calling `runner.invoke(main, args, catch_exceptions=True)`
+in a `for raw_line in sys.stdin:` loop. `CliRunner` is documented as a
+**testing** helper:
+
+* It replaces `sys.stdin` / `sys.stdout` / `sys.stderr` with `io.StringIO`
+  during each `invoke()`, swallowing any side-channel output the inner
+  command writes directly to the real streams (the existing CLI bodies do
+  not, but any future helper that calls `print()` mid-command will be
+  silently captured into `result.output` rather than reaching the
+  harness real-time).
+* It captures the inner command into an `Exception` rather than letting
+  Click's normal exit code propagate, so `result.exit_code` is the
+  pseudo-exit of a `SystemExit` translation, not the real process exit.
+* Click does not guarantee `CliRunner` is stable across versions —
+  click 9 has already deprecated `runner.invoke(..., mix_stderr=...)`,
+  and a future Click release could change the return-tuple shape.
+* It is recursively re-entrant: `runner.invoke(main, ["batch"], ...)`
+  inside batch silently spawns a nested batch reading from the same
+  StringIO-replaced stdin (empty), so a stdin line of `batch` exits
+  cleanly with no error — almost certainly not the intended semantics.
+
+The other client CLIs in the matrix (.NET, Go, Rust, Java) implement
+`batch` by dispatching to their command parser directly, not by
+re-invoking the test runner.
+
+**Recommendation:** Replace `CliRunner` with a direct call into the Click
+parser, e.g. `main.main(args, standalone_mode=False)` wrapped in a
+`try/except click.ClickException` to convert Click exit conditions into
+the `{"error": ..., "type": ...}` payload. Capture stdout via a per-line
+context manager (e.g. `contextlib.redirect_stdout(io.StringIO())`) so the
+batch loop can interleave inner-command output with the
+`__MXGW_BATCH_EOR__` sentinel without depending on the testing API. Add
+a regression test that drives `batch` with `batch\n` on stdin and asserts
+recursive invocation is either rejected or correctly bounded.
+
+### Client.Python-025
+
+| Field | Value |
+|---|---|
+| Severity | Low |
+| Category | Testing coverage |
+| Location | `clients/python/tests/test_cli.py`, `clients/python/src/zb_mom_ww_mxgateway/{client.py,session.py}`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py` |
+| Status | Open |
+
+**Description:** Commits `6add4b4` and `828e3e6` added five new SDK
+methods (`Session.read_bulk`, `Session.write_bulk`, `Session.write2_bulk`,
+`Session.write_secured_bulk`, `Session.write_secured2_bulk`),
+`GatewayClient.stream_alarms`, the helper
+`_canceling_alarm_feed_iterator`, and eight new CLI subcommands
+(`read-bulk`, `write-bulk`, `write2-bulk`, `write-secured-bulk`,
+`write-secured2-bulk`, `bench-read-bulk`, `stream-alarms`,
+`acknowledge-alarm`). The only test coverage added in `tests/test_cli.py`
+is:
+
+* `test_stream_alarms_is_registered` — `--help` smoke only.
+* `test_acknowledge_alarm_requires_reference` — verifies the missing-flag
+  Click error contains `--reference`; no happy-path test.
+
+There is no test that:
+
+1. Asserts `Session.read_bulk` / `write_bulk` / `write2_bulk` /
+   `write_secured_bulk` / `write_secured2_bulk` builds the expected
+   `MxCommand` shape (kind, sub-message, server_handle, entries) — the
+   prior Client.Python-009 coverage pattern (`test_add_item2_sends_*`,
+   `test_write2_sends_value_and_timestamp_value`) is not extended to the
+   bulk family even though they ship the same wire-shape risk.
+2. Exercises `_canceling_alarm_feed_iterator` cancel-on-task-cancellation
+   (the Client.Python-007 helper test pattern is not extended).
+3. Drives `stream-alarms` / `acknowledge-alarm` / `read-bulk` /
+   `write-bulk` / `write-secured-bulk` happy paths through `CliRunner`
+   with a fake stub — the existing
+   `test_cli_register_happy_path_emits_server_handle` pattern is not
+   extended.
+4. Asserts `bench-read-bulk` emits the cross-language schema for the new
+   `read-bulk`-shaped fields (the Client.Python-015 pattern existed for
+   the previous bench command but no equivalent exists for this one).
+
+A silent drift in any of the four bulk-write request shapes — or a
+schema drift on `bench-read-bulk` — would not be caught.
+
+**Recommendation:** Extend the Client.Python-009 / Client.Python-015 /
+Client.Python-016 patterns: add request-shape tests for the four new
+bulk methods, a CLI happy-path test for `read-bulk` / `write-bulk` /
+`stream-alarms` / `acknowledge-alarm` against fake stubs, and a
+cross-language schema test for `bench-read-bulk` mirroring
+`test_bench_read_bulk_emits_cross_language_schema` (with `--read-bulk`
+applied to the renamed bench). At minimum, add a request-shape test for
+`write_secured_bulk` since the secured family is the highest-risk
+parity surface.
+
+### Client.Python-026
+
+| Field | Value |
+|---|---|
+| Severity | Low |
+| Category | Correctness & logic bugs |
+| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:674-738` |
+| Status | Open |
+
+**Description:** Two minor quality issues in the new `_bench_read_bulk`
+body (commit `6add4b4`):
+
+1. `import time` is done inside the function body (line 676) rather than
+   at module top. `PythonStyleGuide.md` does not state this explicitly,
+   but every other helper in `commands.py` imports its dependencies at
+   module top, and `time` is already imported (transitively) elsewhere
+   in the package. The function-local import is a vestige of incremental
+   development and should be hoisted.
+2. The `finally` cleanup block uses two consecutive bare
+   `except Exception: pass` blocks to swallow `unsubscribe_bulk` and
+   `session.close()` failures (lines 733-734 and 737-738). This silently
+   discards diagnostic information about cleanup failures — e.g. a
+   transient gateway crash mid-benchmark or a protocol error during
+   unsubscribe — and matches an anti-pattern the rest of the module
+   avoids (the `_bench_read_bulk` analogues in the other clients log the
+   cleanup failure before swallowing it).
+
+Both are Low severity. The bench command is best-effort by design (the
+benchmark output is what matters; cleanup failures are not user-facing),
+but a single log line on cleanup failure would make a future regression
+visible at the next benchmark run rather than silently corrupting the
+worker's subscription bookkeeping until a session-level GC sweep.
+
+**Recommendation:** Move `import time` to the module-level import block.
+Replace each `except Exception: pass` with `except Exception as exc:
+logger.warning("bench-read-bulk cleanup: %s", exc)` against a
+module-level `logger = logging.getLogger(__name__)`. No behavioural
+change in the happy path; failure path becomes diagnosable. No new test
+required for the import hoist; the logger change is exercised by the
+existing bench smoke test once `caplog` is added to the test signature.