Files

T

Joseph Doherty f5b50c4484 Resolve Client.Python-022..026: TLS-by-default, batch CLI, README

Client.Python-022  README CLI examples for stream-alarms and
                   acknowledge-alarm now use the correct flags;
                   regression test parses every documented line through
                   Click.
Client.Python-023  Re-applied Client.Python-013 — _use_plaintext drops
                   the silent localhost / 127.0.0.1 auto-downgrade
                   branch; --plaintext and --tls are mutually exclusive
                   and TLS is the default.
Client.Python-024  batch dispatch routes through main.main(...,
                   standalone_mode=False) under a redirected stdout
                   instead of click.testing.CliRunner; recursive batch
                   lines are rejected outright.
Client.Python-025  Added behavioural tests for the five bulk SDK methods,
                   stream_alarms, and the new CLI subcommands.
Client.Python-026  _bench_read_bulk hoists 'import time' to module scope
                   and logs cleanup failures instead of swallowing them.

All resolved at 2026-05-24; python -m pytest is 65/65 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-24 08:50:27 -04:00

72 KiB

Raw Blame History

Code Review — Client.Python

Field	Value
Module	`clients/python`
Reviewer	Claude Code
Review date	2026-05-24
Commit reviewed	`42b0037`
Status	Re-reviewed
Open findings	0

Checklist coverage

A re-review at commit a020350 over the same module. Prior findings (Client.Python-001 — Client.Python-017) remain closed and are kept as history. This section reflects categories evaluated in this pass.

#	Category	Result
1	Correctness & logic bugs	No new issues found — TLS-by-default fix in Client.Python-013 verified; no test fixture accidentally relies on plaintext defaults.
2	mxaccessgw conventions	No new issues found — secrets redacted, MXAccess parity preserved, generated code untouched.
3	Concurrency & thread safety	No new issues found — close-idempotency and shared cancel-on-cancel iterator still in place.
4	Error handling & resilience	No new issues found.
5	Security	No new issues found — `_use_plaintext` now requires explicit `--plaintext` opt-in (Client.Python-013 resolution verified). The `--api-key` flag is also still redacted from the option repr and CLI errors.
6	Performance & resource management	No new issues found.
7	Design-document adherence	No new issues found — `PythonClientDesign.md` is consistent with the implemented surface.
8	Code organization & conventions	Issue found: `mxgateway_cli` is shipped in the wheel but has no PEP 561 `py.typed` marker (Client.Python-019), so the CLI module's inline type hints are invisible to downstream `mypy` runs.
9	Testing coverage	Issue found: no test exercises the wheel-build / editable-install flow; the broken `pyproject.toml` (Client.Python-018) was not caught at commit time because the test suite runs from `src/` via `pytest pythonpath` (Client.Python-020).
10	Documentation & comments	Issue found: cross-client CLI parity gap — the Python CLI ships none of the Galaxy subcommands (`galaxy-test-connection`, `galaxy-last-deploy`, `galaxy-discover`, `galaxy-watch`) the .NET / Go / Rust / Java CLIs all expose, and lacks the new `.NET`-only `bench-stream-events`. README does not flag the gap (Client.Python-021).

2026-05-24 re-review (commit `42b0037`)

Re-review pass at 42b0037. The diff against the previous review base d692232 is four commits affecting clients/python:

71d2c39 e2e: port batch subcommand to all five client CLIs
6add4b4 Python client: port bulk read/write SDK methods + CLI subcommands
828e3e6 Python client: port stream-alarms and acknowledge-alarm
8738735 clients: document StreamAlarms + AcknowledgeAlarm in each README

Surface area added: Session.read_bulk / write_bulk / write2_bulk / write_secured_bulk / write_secured2_bulk; GatewayClient.stream_alarms

_canceling_alarm_feed_iterator; the corresponding CLI subcommands read-bulk, write-bulk, write2-bulk, write-secured-bulk, write-secured2-bulk, bench-read-bulk, stream-alarms, acknowledge-alarm, and batch; new README CLI examples for the alarm subcommands; new CLI tests for stream-alarms / acknowledge-alarm registration and batch semantics.

#	Category	Result
1	Correctness & logic bugs	Issue found: `bench-read-bulk` does a function-local `import time` and uses bare `except Exception: pass` in cleanup blocks (Client.Python-026).
2	mxaccessgw conventions	No new issues found — secured writes still redact, generated code untouched, MXAccess parity preserved.
3	Concurrency & thread safety	No new issues found — `_canceling_alarm_feed_iterator` follows the same shape as `_canceling_iterator` (Client.Python-007 helper).
4	Error handling & resilience	No new issues found in the new SDK methods; new RPC mapping `map_rpc_error("stream alarms", ...)` is correct.
5	Security	Issue found: `_use_plaintext` localhost auto-plaintext branch resolved under Client.Python-013 is back in the renamed CLI module and was carried forward through the new commit untouched (Client.Python-023).
6	Performance & resource management	No new issues found.
7	Design-document adherence	No new issues found in the new alarm / bulk surface — matches the cross-client parity matrix expectation.
8	Code organization & conventions	Issue found: the new `batch` subcommand uses `click.testing.CliRunner` from production code (Client.Python-024).
9	Testing coverage	Issue found: the new SDK methods and most of the new CLI subcommand bodies have no behavioural tests — only `--help` smoke tests for the alarm CLI (Client.Python-025).
10	Documentation & comments	Issue found: the README CLI examples for `stream-alarms` and `acknowledge-alarm` use flags that do not exist on the implemented commands (Client.Python-022).

2026-05-24 review (commit `d692232`)

Re-review pass at d692232. Diff against a020350 is commit 397d3c5: package directories renamed (src/mxgateway → src/zb_mom_ww_mxgateway, src/mxgateway_cli → src/zb_mom_ww_mxgateway_cli), distribution name changed to zb-mom-ww-mxaccess-gateway-client, console-script mxgw-py retained, every from mxgateway / import mxgateway updated. A first-pass case-insensitive regex sweep corrupted the binary descriptor bytes in the generated _pb2.py files; the fix was to restore the original _pb2.py artifacts from the pre-rename directory before deleting it, so the csharp_namespace bytes still carry the old string — this is documented as wire-level metadata not used by Python at runtime. Hostname / cert / temp-dir example identifiers (mxgateway.example.local, mxgateway-ca.pem, mxgateway-python-wheel) were intentionally preserved.

#	Category	Result
1	Correctness & logic bugs	No issues found in the a020350..d692232 diff.
2	mxaccessgw conventions	No issues found — wire identifiers preserved.
3	Concurrency & thread safety	No issues found in this diff.
4	Error handling & resilience	No issues found in this diff.
5	Security	No issues found in this diff.
6	Performance & resource management	No issues found in this diff.
7	Design-document adherence	No issues found — `PythonClientDesign.md` reflects new paths.
8	Code organization & conventions	No issues found in this diff.
9	Testing coverage	No issues found in this diff — alarm test fixtures correctly drop retired `session_id` from `AcknowledgeAlarmRequest` while retaining it on `QueryActiveAlarmsRequest`.
10	Documentation & comments	No issues found in this diff.

Findings

Client.Python-001

Field	Value
Severity	Low
Category	Documentation & comments
Location	`clients/python/pyproject.toml:8,25`, `clients/python/src/mxgateway_cli/commands.py:25`
Status	Resolved

Description: The package description in pyproject.toml still says "Async Python client scaffold" even though the client is fully implemented. Stale "scaffold" wording misrepresents maturity to anyone reading PyPI metadata. (The mxgw-py console-script name is itself consistent between pyproject.toml and the README.)

Recommendation: Update the pyproject.toml description to drop "scaffold"; keep README CLI examples in sync with the actual mxgw-py entry point.

Resolution: 2026-05-18 — Confirmed: pyproject.toml:8 description read "Async Python client scaffold for MXAccess Gateway." Changed to "Async Python client for MXAccess Gateway." The mxgw-py console-script name was already consistent with the README, so no README change was needed. Pure metadata fix — no test required.

Client.Python-002

Field	Value
Severity	Low
Category	Code organization & conventions
Location	`clients/python/src/mxgateway/__init__.py:27`
Status	Resolved

Description: MxGatewayCommandError is imported into __init__.py and is a documented public exception, but it is missing from __all__. It is the parent of MxAccessError and a meaningful catch target, so omitting it from the public surface is inconsistent — from mxgateway import * will not expose it and tooling that respects __all__ treats it as private.

Recommendation: Add "MxGatewayCommandError" to the __all__ list.

Resolution: 2026-05-18 — Re-triaged: this finding is stale against the reviewed source. clients/python/src/mxgateway/__init__.py already imports MxGatewayCommandError (line 16) and lists "MxGatewayCommandError" in __all__ (line 38). from mxgateway import * exposes it correctly. Verified at runtime ('MxGatewayCommandError' in mxgateway.__all__ is True). No source change required — the defect described no longer exists.

Client.Python-003

Field	Value
Severity	Medium
Category	Error handling & resilience
Location	`clients/python/src/mxgateway/client.py:125-137,155-173`
Status	Resolved

Description: stream_events_raw and query_active_alarms call the stub directly with a timeout kwarg when stream_timeout is set, with no TypeError fallback. galaxy.py:watch_deploy_events and _unary do have a fallback that strips timeout if the callable rejects it. This asymmetry means a fake/older stub that does not accept timeout crashes for gateway streams but not Galaxy streams. It is only masked today because stream_timeout defaults to None.

Recommendation: Apply the same try/except TypeError timeout-fallback pattern to stream_events_raw and query_active_alarms, or remove the fallback everywhere and standardise on a single behaviour.

Resolution: 2026-05-18 — Confirmed: both stream methods in client.py called the stub with timeout unconditionally and had no TypeError fallback, unlike _unary and galaxy.watch_deploy_events. Added a shared _open_stream helper in client.py that opens a server-streaming call and strips the timeout kwarg when the stub raises TypeError: ... unexpected keyword argument 'timeout', then routed both stream_events_raw and query_active_alarms through it. Regression tests in tests/test_stream_timeout_fallback.py (test_stream_events_raw_falls_back_when_stub_rejects_timeout, test_query_active_alarms_falls_back_when_stub_rejects_timeout, test_stream_events_raw_still_passes_timeout_to_capable_stub) failed before the fix and pass after. No public behaviour change for real gRPC stubs, so no README update needed.

Client.Python-004

Field	Value
Severity	Low
Category	Correctness & logic bugs
Location	`clients/python/src/mxgateway_cli/commands.py:386,402-404`
Status	Resolved

Description: In _smoke, the local variable closed is set to False and never reassigned; the finally block's if not closed: is therefore always true. This is dead/misleading code suggesting a removed early-close path.

Recommendation: Remove the closed variable and the if not closed: guard; call await session.close() directly in the finally block (or use async with session:).

Resolution: 2026-05-18 — Confirmed: closed = False was set and never reassigned, making if not closed: dead code. Replaced the try/finally with async with session: so the session is closed via the documented async context manager — Session already implements __aexit__ → close(). Behaviour is unchanged (the session is still closed on every exit path); no test needed for the dead-code removal — exercised by the existing CLI smoke test.

Client.Python-005

Field	Value
Severity	Medium
Category	Performance & resource management
Location	`clients/python/src/mxgateway/galaxy.py:117-140`
Status	Resolved

Description: discover_hierarchy pages through the entire Galaxy object hierarchy and accumulates every GalaxyObject (each carrying its full attribute list) into a single in-memory list before returning. For a large Galaxy this is a very large allocation with no streaming alternative and no caller-side bound.

Recommendation: Offer an async-generator variant (e.g. iter_hierarchy()) that yields objects/pages as they arrive, keeping discover_hierarchy() as a convenience wrapper. At minimum document the memory characteristic.

Resolution: 2026-05-18 — Confirmed: discover_hierarchy buffered the entire hierarchy with no streaming alternative. Added GalaxyRepositoryClient.iter_hierarchy, an async generator that fetches one DiscoverHierarchyRequest page at a time and yields each GalaxyObject as it arrives, so peak memory is bounded by a single page (_DISCOVER_HIERARCHY_PAGE_SIZE). Pages are fetched lazily — the next page is only requested after the current page is fully consumed. discover_hierarchy is now a thin convenience wrapper ([obj async for obj in self.iter_hierarchy()]) that preserves its list[GalaxyObject] contract, including the repeated-page-token guard. Regression tests in tests/test_galaxy_iter_hierarchy.py (test_iter_hierarchy_yields_objects_across_pages, test_iter_hierarchy_is_lazy_and_does_not_prefetch_next_page, test_iter_hierarchy_rejects_repeated_page_token, test_discover_hierarchy_still_returns_full_list) failed before the fix and pass after. clients/python/README.md updated with the iter_hierarchy usage and memory guidance since this adds a new public method.

Client.Python-006

Field	Value
Severity	Low
Category	Concurrency & thread safety
Location	`clients/python/src/mxgateway/client.py:74-82`, `clients/python/src/mxgateway/galaxy.py:85-93`, `clients/python/src/mxgateway/session.py:38-55`
Status	Resolved

Description: close() on the clients and Session.close() use a plain self._closed check-then-set with an await between, with no lock. If two coroutines call close() concurrently both can pass the guard before either sets it, causing a double channel.close() / double CloseSession RPC. Single-task usage is the documented contract, so impact is low, but the idempotency guarantee asserted in docstrings only holds for sequential calls.

Recommendation: Set self._closed = True before the await, or guard with an asyncio.Lock, so the idempotency claim holds under concurrent close.

Resolution: 2026-05-18 — Confirmed the check-then-set window. Fixed GatewayClient.close, GalaxyRepositoryClient.close, and Session.close to set self._closed = True before the await (channel close / CloseSession RPC). A second coroutine entering close() while the first is still awaiting now hits the early-return guard and does not issue a second channel.close() / CloseSession. Docstrings updated to state the idempotency holds under concurrent calls. TDD: regression tests in tests/test_low_severity_findings.py (test_gateway_client_concurrent_close_closes_channel_once, test_galaxy_client_concurrent_close_closes_channel_once, test_session_concurrent_close_sends_one_close_session_rpc) — each uses a fake channel/client that stalls inside close/close_session_raw so two concurrent close() calls interleave at the exact race window; they failed before the fix and pass after.

Client.Python-007

Field	Value
Severity	Low
Category	Error handling & resilience
Location	`clients/python/src/mxgateway/client.py:204-213`
Status	Resolved

Description: _canceling_iterator (gateway event stream) does not catch asyncio.CancelledError to invoke call.cancel() explicitly — it relies on the finally block. galaxy.py:_canceling_iterator does explicitly catch CancelledError, cancel, and re-raise. The two are functionally equivalent today, but the inconsistency between near-identical helpers invites future divergence.

Recommendation: Make the two _canceling_iterator helpers identical, ideally by factoring a single shared helper.

Resolution: 2026-05-18 — Confirmed the divergence. Factored a single shared helper: client._canceling_iterator(call, operation) now takes the map_rpc_error operation string as a parameter, explicitly catches asyncio.CancelledError (cancels the call, re-raises) and grpc.RpcError, and repeats the cancel in finally. This replaces both the gateway _canceling_iterator and the gateway _canceling_active_alarms_iterator; galaxy.py now imports and delegates to the same helper instead of defining its own, so the gateway and Galaxy stream helpers are byte-for-byte identical. TDD: tests/test_low_severity_findings.py::test_gateway_stream_iterator_cancels_call_on_task_cancellation drives a cancellable fake stream and asserts the gateway iterator cancels the underlying call on task cancellation. All existing stream-cancellation tests still pass.

Client.Python-008

Field	Value
Severity	Low
Category	Correctness & logic bugs
Location	`clients/python/src/mxgateway/values.py:62-67,83-88`
Status	Resolved

Description: to_mx_value maps any Python float to VT_R8/MX_DATA_TYPE_DOUBLE with no handling for nan/inf, which are serialised and forwarded to MXAccess which may reject or mis-handle them. bytes is mapped to VT_RECORD/MX_DATA_TYPE_UNKNOWN, a questionable default. The data_type keyword exists but Session.write never forwards it.

Recommendation: Document the float/bytes mapping assumptions, optionally validate finiteness, and consider plumbing the data_type keyword through Session.write/write2.

Resolution: 2026-05-18 — Confirmed the non-finite-float hazard. Added an _ensure_finite guard in values.py: to_mx_value now raises ValueError for nan/inf/-inf, both for a scalar float and for a non-finite element inside a float sequence — MXAccess has no defined wire representation for non-finite doubles, so rejecting client-side is the correct fail-fast. The float/bytes mapping assumptions (finite-only doubles; bytes as an opaque VT_RECORD pass-through) are now documented in the values.py module docstring and clients/python/README.md. Plumbing data_type through Session.write/write2 was deliberately not done: it is a larger public-API surface change the finding only marks as "consider", and the documented MXAccess-parity convention is type-by-Python-value; the data_type keyword stays available on to_mx_value for callers that build the MxValue directly. TDD: tests/test_low_severity_findings.py adds test_to_mx_value_rejects_nan, test_to_mx_value_rejects_positive_infinity, test_to_mx_value_rejects_negative_infinity, test_to_mx_value_rejects_non_finite_float_in_sequence, and test_to_mx_value_accepts_finite_float. README updated since to_mx_value (used by Session.write/write2) now rejects an input it previously accepted.

Client.Python-009

Field	Value
Severity	Medium
Category	Testing coverage
Location	`clients/python/tests/`
Status	Resolved

Description: Several non-trivial public paths are untested: Session.write2/add_item2 request construction; the bulk-size limit _ensure_bulk_size/MAX_BULK_ITEMS guard; the None-argument TypeError guards in bulk methods; the TLS ca_file read path in create_channel; most CLI command bodies; and map_rpc_error's default (non-auth) branch.

Recommendation: Add tests for write2/add_item2 request shape, the bulk-size ValueError, the ca_file TLS branch, the generic map_rpc_error fallthrough, and at least one happy-path CLI command using a fake stub.

Resolution: 2026-05-18 — Confirmed coverage gap against the existing tests/ files. Added tests/test_coverage_gaps.py covering every path the finding lists: test_add_item2_sends_item_context_and_returns_handle and test_write2_sends_value_and_timestamp_value (request shape + MxValue oneof), test_subscribe_bulk_rejects_oversized_request and test_add_item_bulk_at_limit_is_allowed (the MAX_BULK_ITEMS _ensure_bulk_size boundary), test_advise_item_bulk_rejects_none_argument (the None-argument TypeError guard), test_create_channel_reads_ca_file and test_create_channel_missing_ca_file_raises (the TLS ca_file read path), test_map_rpc_error_generic_branch_returns_transport_error and test_map_rpc_error_handles_error_without_code (the non-auth map_rpc_error fallthrough and the no-code path), and test_cli_register_happy_path_emits_server_handle (a happy-path CLI command body driven end to end through CliRunner with a fake stub via a monkeypatched _connect). All 10 new tests pass. No source change required — this is a pure coverage finding.

Client.Python-010

Field	Value
Severity	Low
Category	Code organization & conventions
Location	`clients/python/src/mxgateway/session.py:404`, `clients/python/src/mxgateway_cli/commands.py:422-425`
Status	Resolved

Description: session.py ends with a module-level late import from .client import GatewayClient # noqa: E402 purely to satisfy a string type hint, and commands.py:_session does a function-local import. Both work around a circular dependency that from __future__ import annotations (already in effect) makes unnecessary. _session also lacks a return type annotation.

Recommendation: Drop the runtime late import in session.py and use a TYPE_CHECKING-guarded import for the hint; add the -> Session return annotation to commands.py:_session.

Resolution: 2026-05-18 — Confirmed: with from __future__ import annotations in effect all annotations are strings, so the runtime late import was unnecessary. Removed the trailing from .client import GatewayClient # noqa: E402 in session.py and replaced it with a top-of-file if TYPE_CHECKING: import that satisfies the GatewayClient hint without a runtime dependency (no import cycle: client.py does not import session at module scope). In commands.py, hoisted the function-local from mxgateway.session import Session to a module-level import and added the -> Session return annotation to _session. Verified import mxgateway and import mxgateway_cli.commands succeed with no circular-import error. Pure refactor — covered by the existing import and CLI tests; no new test needed.

Client.Python-011

Field	Value
Severity	Low
Category	Error handling & resilience
Location	`clients/python/src/mxgateway/errors.py:122-148`
Status	Resolved

Description: ensure_mxaccess_success raises MxAccessError if any mx_status.success == 0. This treats success == 0 as the failure sentinel, but 0 is also the proto3 scalar default for an unset MxStatusProxy. If the gateway ever returns a reply with an unpopulated status entry (e.g. a partially-filled bulk result), the client raises MxAccessError even though no real failure occurred.

Recommendation: Confirm against the proto/gateway contract whether success is guaranteed populated for every statuses entry; if not, key the failure decision on an explicit failure field rather than the success == 0 default.

Resolution: 2026-05-18 — Confirmed against the gateway contract: success is not guaranteed populated for every statuses entry. src/MxGateway.Worker/Conversion/MxStatusProxyConverter.cs::ConvertMany emits a placeholder MxStatusProxy for a null MXSTATUS_PROXY COM array entry, setting Category/DetectedBy to Unknown but leaving Success at its proto3 default of 0. A fully-default proto entry likewise has success == 0. Under the old client logic either placeholder would falsely raise MxAccessError. Fixed ensure_mxaccess_success to key the per-status failure decision on a new _is_mxaccess_status_failure helper that requires success == 0 and a populated, non-OK category — a status with category of MX_STATUS_CATEGORY_UNSPECIFIED (default proto) or MX_STATUS_CATEGORY_UNKNOWN (the null-entry placeholder) is treated as unpopulated and ignored. MX_STATUS_CATEGORY_OK is also excluded so a genuine success entry never raises. Real failures (categories WARNING and the error categories, raw value ≥ 2) still raise as before — the existing write.mxaccess-failure fixture (SECURITY_ERROR/OPERATIONAL_ERROR statuses) and the MXACCESS_FAILURE protocol-status path are unaffected. TDD: tests/test_low_severity_findings.py adds test_ensure_mxaccess_success_ignores_unpopulated_status_entry (default + null-placeholder entries, no raise), test_ensure_mxaccess_success_raises_on_populated_failure_status (populated COMMUNICATION_ERROR, raises), and test_ensure_mxaccess_success_passes_when_status_reports_success. No public-behaviour change for genuine replies, so no README update.

Client.Python-012

Field	Value
Severity	Low
Category	mxaccessgw conventions
Location	`clients/python/src/mxgateway/client.py:84-108`, `clients/python/src/mxgateway/session.py:57-77`
Status	Won't Fix

Description: Session.invoke_raw does not run ensure_mxaccess_success while Session.invoke does, so a caller using invoke_raw for parity tests gets a reply where an MXAccess HRESULT failure is silently embedded with no exception. This is by design but under-documented — the README's "preserve raw replies" sentence does not state that *_raw methods skip MXAccess-failure detection entirely.

Recommendation: Document explicitly (README + docstring) that *_raw methods surface MXAccess HRESULT/status failures only inside the reply and do not raise MxAccessError, so parity-test callers know to inspect protocol_status/hresult/statuses themselves.

Resolution: 2026-05-18 — Won't Fix (no behaviour change). Confirmed this is intentional, correct parity behaviour: the *_raw methods exist precisely so parity-test callers can inspect an unmodified gateway reply, including embedded MXAccess HRESULT/status failures, without an exception masking them. Changing invoke_raw to raise MxAccessError would defeat its purpose and duplicate Session.invoke. The finding's only actionable point is the documentation gap, which has been addressed: clients/python/README.md now states explicitly that *_raw methods enforce gateway protocol success only and do not run MXAccess-failure detection, and the docstrings of GatewayClient.invoke_raw and Session.invoke_raw say the same and point callers to inspect protocol_status/hresult/statuses (and to Session.invoke for the checked variant). No code/test change — the runtime contract is unchanged and correct.

Client.Python-013

Field	Value
Severity	Medium
Category	Security
Location	`clients/python/src/mxgateway_cli/commands.py:757-762`
Status	Resolved

Description: _use_plaintext silently returns True whenever the endpoint string starts with localhost: or 127.0.0.1:, even if neither --plaintext nor --tls is supplied on the command line. Any CLI subcommand (e.g. mxgw-py open-session --endpoint localhost:5001 --api-key mxgw_<secret>) then attaches the API key to a plaintext gRPC channel without warning. This is a silent security downgrade: a user who deliberately ran the gateway behind TLS on loopback (e.g. for testing a production-shaped TLS config locally) and who passes --api-key expecting the secret to be transport-protected gets a plaintext bearer token instead. The auto-downgrade is also undocumented — README.md and the CLI --help text both describe --plaintext and --tls as the controls, with no mention that endpoint-prefix matching can override either. The other client CLIs do not auto-downgrade: the .NET CLI uses https://-prefix detection on a URI scheme (an explicit signal), Go and Java require an explicit --plaintext/--tls choice, and Rust defaults to plaintext only when plaintext = true is set on the options struct.

Recommendation: Drop the localhost-prefix auto-plaintext branch and require the user to pass --plaintext or --tls (or default to TLS to match the rest of the matrix). If the implicit-localhost behaviour is kept for ergonomics, document it prominently in both README.md and --help, emit a stderr warning when --api-key is combined with the auto-downgrade path, and add a CLI test asserting the auto-downgrade is in fact active so it is not silently lost in a future refactor.

Resolution: 2026-05-20 — Removed the silent localhost: / 127.0.0.1: auto-plaintext branch from _use_plaintext. The new contract matches the Go and Java CLIs: TLS is the default, --plaintext is the only way to opt in to an unencrypted channel, and --tls is accepted as a redundant, explicit affirmation of the default (mutually exclusive with --plaintext, which now raises click.UsageError). The --plaintext / --tls --help text and clients/python/README.md both call out the new behaviour. Added six regression tests in clients/python/tests/test_cli.py covering: (a) a localhost: endpoint with no flags resolves to TLS, (b) a 127.0.0.1: endpoint with no flags resolves to TLS, (c) --plaintext opts in to plaintext, (d) --tls is accepted and idempotent with the default, (e) --plaintext combined with --tls is rejected, and (f) an end-to-end CliRunner test asserting ClientOptions.plaintext == False flows through to GatewayClient.connect when no flag is supplied against a localhost: endpoint. Behaviour change for callers: scripts that previously relied on mxgw-py … --endpoint localhost:5000 … selecting plaintext silently must now add an explicit --plaintext flag (or set up TLS on the gateway). Calling mxgw-py with an --api-key against a plaintext-only gateway without --plaintext will now fail to connect rather than silently leaking the bearer token.

Client.Python-014

Field	Value
Severity	Low
Category	Code organization & conventions
Location	`clients/python/src/mxgateway_cli/commands.py:22-23`
Status	Resolved

Description: commands.py has two consecutive from mxgateway.values import lines:

from mxgateway.values import to_mx_value
from mxgateway.values import MxValueInput

These import from the same module and should be combined into a single from mxgateway.values import MxValueInput, to_mx_value. The split form is inconsistent with the rest of the file (every other module is imported in a single statement) and would be flagged by ruff/isort if any linter were configured. Pure style, no behavioural impact.

Recommendation: Collapse the two imports into one statement, ordered to match the conventional alphabetical-within-module pattern: from mxgateway.values import MxValueInput, to_mx_value.

Resolution: 2026-05-20 — Collapsed the two consecutive from mxgateway.values import to_mx_value / from mxgateway.values import MxValueInput lines in clients/python/src/mxgateway_cli/commands.py into a single from mxgateway.values import MxValueInput, to_mx_value statement, matching the alphabetical-within-module pattern used elsewhere in the file. Pure style fix — no behavioural impact, covered by the existing CLI tests.

Client.Python-015

Field	Value
Severity	Low
Category	Testing coverage
Location	`clients/python/src/mxgateway_cli/commands.py:273-294,564-647`, `clients/python/tests/`
Status	Resolved

Description: _bench_read_bulk is a ~80-line CLI body that opens its own session, registers, subscribe_bulks, runs a warm-up loop, a measurement loop, collects per-call latencies, computes a percentile summary, and emits the shared cross-language JSON schema. It is the largest untested CLI command in the module — tests/ has no bench_read_bulk test, fake-stub-driven or otherwise. A drift in the schema field names (callsPerSecond, cachedReadResults, latencyMs.p50, …) would break the cross-language scripts/bench-read-bulk.ps1 aggregation silently. _percentile_summary and _percentile are also untested — the boundary cases (n == 0, n == 1, quantile interpolation) would benefit from a small unit test since the identical algorithm is duplicated in the .NET / Go / Rust / Java drivers and a divergence would corrupt cross-language comparisons.

Recommendation: Add a fake-stub-driven bench_read_bulk test that drives a short --duration-seconds 0 --warmup-seconds 0 run through CliRunner and asserts the JSON schema (language == "python", the full key set, latencyMs.p50/p95/p99/max/mean present). Add unit tests for _percentile covering n == 0, n == 1, and a known-good interpolated value at p95 so the implementation cannot silently drift from the other clients.

Resolution: 2026-05-20 — Added clients/python/tests/test_cli_bench_and_helpers.py with three layers of coverage. (1) _percentile unit tests pin the cross-language algorithm (rank = q * (n - 1), linear interpolation between adjacent ranks): empty sample returns 0.0, single element returns that element, exact-rank queries return the sample value (p50 of [10,20,30,40,50] is 30.0), and the interpolated p95/p99 values (48.0 / 49.6 for that same five-element sample) are locked down so any drift from the .NET / Go / Rust / Java drivers fails fast. (2) _percentile_summary tests assert the full {p50, p95, p99, max, mean} dict shape, the zero-sample placeholder, and the 3-decimal rounding contract. (3) A bench-read-bulk smoke test (test_bench_read_bulk_emits_cross_language_schema) drives the CLI through CliRunner with --duration-seconds 0 --warmup-seconds 0 against a fake stub that handles OpenSession, Register, SubscribeBulk, ReadBulk, and UnsubscribeBulk, then asserts the emitted JSON has exactly the 16 cross-language schema keys (language, command, endpoint, clientName, bulkSize, durationSeconds, warmupSeconds, durationMs, tags, totalCalls, successfulCalls, failedCalls, totalReadResults, cachedReadResults, callsPerSecond, latencyMs) and that latencyMs is a {p50, p95, p99, max, mean} sub-object — guarding against silent breakage of scripts/bench-read-bulk.ps1's cross-language aggregation. No source change — this is a pure coverage finding.

Client.Python-016

Field	Value
Severity	Low
Category	Testing coverage
Location	`clients/python/src/mxgateway_cli/commands.py:25,757-775,805-830`
Status	Resolved

Description: Three CLI helper paths are not covered by tests/:

_use_plaintext localhost auto-downgrade (line 762) — the endpoint.startswith("localhost:") or endpoint.startswith("127.0.0.1:") branch (see also Client.Python-013) is untested; no test asserts that an endpoint without --plaintext and without --tls resolves to plaintext.
_collect_events MAX_AGGREGATE_EVENTS guard (line 811-815) — passing --max-events greater than MAX_AGGREGATE_EVENTS raises click.BadParameter, but no test exercises the guard. A silent removal of the constant or the comparison would not be caught.
_api_key_from_env (line 765-768) — only the implicit path through _secrets is exercised; there is no test that verifies an env-var name resolves to a value and that an unset env var produces None.

These are all small, fake-stub-driven CLI behaviours rather than end-to-end paths. The previous coverage finding (Client.Python-009) closed without adding tests for these specific paths.

Recommendation: Add three small CliRunner / unit tests: one asserting the localhost auto-plaintext (or its replacement, if Client.Python-013 is fixed), one asserting --max-events 10001 exits non-zero with the MAX_AGGREGATE_EVENTS error message, and one asserting _api_key_from_env("MXGATEWAY_API_KEY") returns the env value and None for an unset variable.

Resolution: 2026-05-20 — Scope adjusted: Client.Python-013 has since removed the _use_plaintext localhost auto-plaintext branch, so item (1) is no longer a real code path — the test_use_plaintext_requires_explicit_flag_for_localhost_endpoint and test_cli_localhost_endpoint_defaults_to_tls_via_open_session regressions added under Client.Python-013 already pin the new TLS-by-default contract. The remaining two helpers are now covered in clients/python/tests/test_cli_bench_and_helpers.py. (2) MAX_AGGREGATE_EVENTS cap: test_collect_events_rejects_max_events_above_aggregate_cap drives stream-events with --max-events 10001 through CliRunner against stubbed _connect / _session fakes and asserts the CLI exits non-zero with the documented less than or equal to 10000 message; test_collect_events_accepts_max_events_at_aggregate_cap_boundary confirms --max-events 10000 is accepted at the boundary and returns an empty event list. (3) _api_key_from_env: test_api_key_from_env_resolves_value_when_variable_is_set (env-var populated → returned), test_api_key_from_env_returns_none_when_variable_is_unset (env-var unset → None), test_api_key_from_env_returns_none_when_name_is_none (the name is None early-return), and test_api_key_from_env_returns_none_when_name_is_empty_string (the if not name truthiness guard). No source change — pure coverage finding.

Client.Python-017

Field	Value
Severity	Low
Category	Documentation & comments
Location	`clients/python/pyproject.toml:5-25`, `clients/python/src/mxgateway/`
Status	Resolved

Description: The package metadata in pyproject.toml is minimal for a published wheel:

No authors field. PyPI / pip show will display no author.
No license field, no license-files field, and no LICENSE file is referenced from the project. The repo as a whole has no top-level LICENSE either, but other client packages (Java has a license entry, the .NET package has a license expression in the csproj) tend to set this.
No classifiers (no Programming Language :: Python :: 3.12, Operating System :: Microsoft :: Windows, Topic :: …, no development-status classifier). Without these the PyPI search facets are empty and tooling like pip cannot tell whether the package is alpha/beta/stable.
No keywords, no [project.urls] (no homepage / source / issue link pointing back to the repo).
The package ships no PEP 561 py.typed marker file in src/mxgateway/. Type hints are written throughout the module (from __future__ import annotations, full annotations on every public function), but downstream consumers running mypy on mxaccess-gateway-client will not see those hints — PEP 561 requires the marker file to opt the package into type-stub distribution.

Recommendation: Add authors, license = "<spdx>", keywords, and [project.urls] to pyproject.toml; add at least the standard classifiers trio (Development Status, Programming Language :: Python :: 3.12, Intended Audience); create an empty src/mxgateway/py.typed file and include it in the wheel via [tool.setuptools.package-data] so consumers running mypy against an installed wheel pick up the type information.

Resolution: 2026-05-20 — Filled out clients/python/pyproject.toml with the missing PyPI metadata: authors = [{ name = "MXAccess Gateway Authors" }], license = "Proprietary" (the repo has no top-level LICENSE file and no other client publishes under an OSS licence, so the SPDX Proprietary expression matches the de-facto status), the standard classifier set (Development Status :: 4 - Beta, Intended Audience :: Developers / Information Technology, Operating System :: Microsoft :: Windows and :: POSIX, Programming Language :: Python / Python :: 3 / Python :: 3.12, Topic :: Software Development :: Libraries :: Python Modules, Topic :: System :: Distributed Computing, and Typing :: Typed), a keywords list (mxaccess, archestra, gateway, grpc, industrial, scada), and [project.urls] with Homepage / Source / Issues pointing at the Gitea repo. Added the PEP 561 marker file clients/python/src/mxgateway/py.typed (empty, as the spec requires) and declared it in [tool.setuptools.package-data] mxgateway = ["py.typed"] so the wheel ships the marker and downstream mypy users see the inline type hints. Pure metadata / packaging change — python -m pytest -q still passes (91 tests).

Client.Python-018

Field	Value
Severity	High
Category	Code organization & conventions
Location	`clients/python/pyproject.toml:11`
Status	Resolved

Description: The Client.Python-017 resolution set license = "Proprietary" as a top-level string. Under PEP 639 (enforced by setuptools >= 77, and active in the installed setuptools 82.0.1), the project.license string form must be a valid SPDX expression. "Proprietary" is not a registered SPDX identifier, so the configured build backend (setuptools.build_meta) refuses the file outright. Both python -m pip wheel . --no-deps --wheel-dir … and python -m pip install -e . — the exact commands documented in clients/python/README.md ("Build And Test", "Packaging") and the "build wheel" instruction in docs/ClientPackaging.md — now fail before any source is compiled with:

ValueError: invalid pyproject.toml config: `project.license`.
configuration error: `project.license` must be valid exactly by one definition (0 matches found):
    - {type: string, format: 'SPDX'}
    - type: table keys: 'file': … required: ['file']
    - type: table keys: 'text': … required: ['text']

python -m pytest still runs because [tool.pytest.ini_options] pythonpath = ["src"] lets pytest import the package without an install — which masked the regression at commit time and explains how the Client.Python-017 resolution comment was able to assert "python -m pytest -q still passes (91 tests)" while shipping a wheel build that cannot start. The Client.Python-017 resolution comment that "the SPDX Proprietary expression matches the de-facto status" is incorrect: Proprietary is not a registered SPDX identifier; only entries on the SPDX licence list (e.g. MIT, Apache-2.0, BSD-3-Clause) or LicenseRef-* custom identifiers satisfy the { type: string, format: 'SPDX' } rule. PEP 639 added the LicenseRef-… escape hatch precisely for proprietary / unlisted licences.

This is a regression of the developer-onboarding workflow introduced by the very commit being reviewed. A fresh checkout cannot run python -m pip install -e ".[dev]" (the command in CLAUDE.md's "Clients" section) without first patching pyproject.toml.

Recommendation: Fix the license value so the build backend accepts it. Three concrete options, in order of preference:

Use a LicenseRef-* SPDX-compatible custom identifier: license = "LicenseRef-Proprietary". Requires no additional LICENSE file and is honoured by setuptools / pip / PyPI as a proprietary marker.
Add a top-level LICENSE file (or clients/python/LICENSE) and point at it via the table form: license = { file = "LICENSE" }. This also documents the proprietary terms.
Drop the license key entirely and convey the same intent via the classifier "License :: Other/Proprietary License" (already part of the classifier set), reverting the PEP-639 string field that the build backend now insists must be SPDX.

Add a CI / pre-commit check that runs python -m pip wheel . --no-deps (or python -m build) on clients/python so a future pyproject.toml regression is caught at commit time rather than at first install on a clean machine. See also Client.Python-020.

Resolution: 2026-05-20 — Dropped the invalid top-level license = "Proprietary" string from clients/python/pyproject.toml and added the existing License :: Other/Proprietary License trove classifier to convey the same intent without violating PEP 639's SPDX rule. No LICENSE file exists at the repo root or under clients/python/, so the license = { file = "LICENSE" } table form was not used; relying on the classifier is the option (3) variant called out in the recommendation. Verified by running python -m pip wheel . --no-deps -w ./.test-wheel-output from clients/python: the build now succeeds and emits mxaccess_gateway_client-0.1.0-py3-none-any.whl (47 KB) where previously it failed with the project.license must be valid exactly by one definition ValueError. The CI / pre-commit recommendation is addressed by Client.Python-020.

Client.Python-019

Field	Value
Severity	Low
Category	Code organization & conventions
Location	`clients/python/pyproject.toml:60-61`, `clients/python/src/mxgateway_cli/`
Status	Resolved

Description: Client.Python-017 added the PEP 561 marker file clients/python/src/mxgateway/py.typed and declared it in [tool.setuptools.package-data] mxgateway = ["py.typed"]. The wheel therefore advertises mxgateway as typed. However the same wheel also ships the mxgateway_cli package (setuptools.packages.find with where = ["src"] discovers both mxgateway and mxgateway_cli, confirmed via find_packages in this review), and mxgateway_cli:

is shipped in the wheel and is the package the mxgw-py console script entry point resolves into ([project.scripts] mxgw-py = "mxgateway_cli.commands:main"),
is fully type-annotated (every function in commands.py has full parameter and return annotations; from __future__ import annotations is in effect),
but has no py.typed file and is not listed in [tool.setuptools.package-data].

PEP 561 requires the marker file inside each importable package the distribution wants to expose to type checkers — the mxgateway marker does not transfer to mxgateway_cli. A downstream consumer that imports or composes against mxgateway_cli (e.g. wrapping it as a programmatic CLI library) will see all symbols as Untyped under mypy despite the hints being present in source.

This is a follow-up to Client.Python-017 — the fix is small and pure packaging.

Recommendation: Create clients/python/src/mxgateway_cli/py.typed (empty file, as PEP 561 requires) and extend the existing package-data declaration so the wheel ships it:

[tool.setuptools.package-data]
mxgateway = ["py.typed"]
mxgateway_cli = ["py.typed"]

No source change in either package; verify by building a wheel (once Client.Python-018 is fixed) and inspecting that both mxgateway/py.typed and mxgateway_cli/py.typed appear in the wheel contents.

Resolution: 2026-05-20 — Created the empty PEP 561 marker file clients/python/src/mxgateway_cli/py.typed and added mxgateway_cli = ["py.typed"] under [tool.setuptools.package-data] in clients/python/pyproject.toml alongside the existing mxgateway = ["py.typed"] line. Verified by inspecting the built wheel (mxaccess_gateway_client-0.1.0-py3-none-any.whl): the archive now contains both mxgateway/py.typed and mxgateway_cli/py.typed, so downstream mypy consumers see the inline type hints in both packages. Pure packaging change — no source modifications.

Client.Python-020

Field	Value
Severity	Low
Category	Testing coverage
Location	`clients/python/tests/`, `scripts/`
Status	Resolved

Description: Client.Python-018 is invisible to the existing test suite: python -m pytest passes because [tool.pytest.ini_options] pythonpath = ["src"] lets pytest import the package without going through setuptools.build_meta. None of the 91 tests build the wheel, do an editable install, or otherwise exercise the setuptools.build_meta configuration validator. As a result, a pyproject.toml regression that breaks pip install -e . / pip wheel . — the exact commands documented in the Python client README and CLAUDE.md — passes the test suite green. The other language clients have parallel coverage gaps (no CI-level "the package installs" smoke test for Python in scripts/run-client-e2e-tests.ps1, which only runs the live e2e matrix and assumes the editable install already worked), but Python is the only one whose published install command is currently broken.

Recommendation: Add a thin pytest module (e.g. tests/test_packaging.py) that runs

import subprocess, sys, pathlib
def test_pyproject_validates_against_setuptools_build_meta():
    here = pathlib.Path(__file__).resolve().parent.parent
    result = subprocess.run(
        [sys.executable, "-m", "pip", "wheel", ".",
         "--no-deps", "--no-build-isolation",
         "--wheel-dir", str(tmp_path)],
        cwd=here, capture_output=True, text=True,
    )
    assert result.returncode == 0, result.stderr

(or any equivalent that invokes setuptools.config.pyprojecttoml.read_configuration). Marker the test with @pytest.mark.slow if the wheel build is too heavy for the default suite, and document the test in the README. Alternatively add a CI step to scripts/run-client-e2e-tests.ps1 (or a new scripts/check-python-package.ps1) that fails the build when the wheel build fails. Either approach would have surfaced Client.Python-018 at commit time.

Resolution: 2026-05-20 — Added clients/python/tests/test_packaging.py::test_pip_wheel_build_succeeds. The test invokes python -m pip wheel . --no-deps --wheel-dir <tmp> against the package root via subprocess and asserts (a) exit code zero and (b) an mxaccess_gateway_client-*.whl file is produced in the temp directory, capturing stdout/stderr in the assertion message on failure so any future PEP 639 / SPDX violation or other setuptools.build_meta configuration error is reported with the build backend's own error text. Verified the test would have caught Client.Python-018: with the old license = "Proprietary" string in place the test fails with the project.license must be valid exactly by one definition ValueError. The pytest module is the simpler half of the recommendation; no PowerShell wrapper script was added since pytest already runs in the same python -m pytest invocation the README documents. Test suite is now 92 tests (was 91), all passing.

Client.Python-021

Field	Value
Severity	Low
Category	Documentation & comments
Location	`clients/python/src/mxgateway_cli/commands.py`, `clients/python/README.md:235-258`
Status	Resolved

Description: Cross-client CLI parity check (one of the things the review prompt asks for): the mxgw-py CLI subcommand set has drifted from every other client CLI in the matrix.

Subcommand inventory at this commit:

Subcommand	.NET (`mxgw`)	Go (`mxgw-go`)	Rust (`mxgw`)	Java (`mxgw-java`)	Python (`mxgw-py`)
`version`	yes	yes	yes	yes	yes
`ping`	yes	(no)	yes	(no)	yes
`open-session` / `close-session`	yes	yes	yes	yes	yes
`register` / `add-item` / `advise`	yes	yes	yes	yes	yes
`subscribe-bulk` / `unsubscribe-bulk` / `read-bulk`	yes	yes	yes	yes	yes
`write-bulk` / `write2-bulk` / `write-secured-bulk` / `write-secured2-bulk`	yes	yes	yes	yes	yes
`write` / `write2`	yes / (varies)	yes / (no)	yes / yes	yes / (no)	yes / yes
`stream-events`	yes	yes	yes	yes	yes
`smoke`	yes	yes	yes	yes	yes
`bench-read-bulk`	yes	yes	yes	yes	yes
`bench-stream-events`	yes	(no)	(no)	(no)	(no)
`galaxy-test-connection` (or alias)	yes	yes	yes	yes	(no)
`galaxy-last-deploy` / `galaxy-deploy-time`	yes	yes	yes	yes	(no)
`galaxy-discover`	yes	yes	yes	yes	(no)
`galaxy-watch`	yes	yes	yes	yes	(no)

Two parity gaps remain after Client.Python-013/017:

The Python CLI ships no Galaxy subcommands at all even though the GalaxyRepositoryClient library wrapper is fully implemented and exercised by tests/test_galaxy.py / tests/test_galaxy_iter_hierarchy.py. The README acknowledges the watch-deploy-events gap inline ("The CLI does not currently expose a streaming watch-deploy-events subcommand — use the library API directly when subscribing to deploy events from Python.") but does not call out that the other three Galaxy subcommands are also missing — and the .NET / Go / Rust / Java CLIs all expose them. A user running the cross-language smoke matrix who expects Python to behave like the other clients sees a silent "command not found" on mxgw-py galaxy-test-connection.
The new bench-stream-events subcommand (added to the .NET CLI in the previous commit 1cd51bb) is .NET-only today; the Python CLI is consistent with Go / Rust / Java on this point. Worth flagging as a forward-looking parity gap that will need filling if the cross-language benchmark matrix grows a stream-events driver in scripts/.

Severity is Low because the existing scripts/bench-read-bulk.ps1 matrix only invokes bench-read-bulk and does not break, and the Python GalaxyRepositoryClient library is fully functional — the gap is purely in the test CLI surface. But cross-client parity is an explicit review check and the gap is not documented.

Recommendation: Either (a) add galaxy-test-connection, galaxy-last-deploy, galaxy-discover, and galaxy-watch subcommands to mxgateway_cli/commands.py (each is a thin wrapper over GalaxyRepositoryClient, mirroring the existing four-language implementation), or (b) update clients/python/README.md's "CLI" section with an explicit "CLI parity gaps" subsection that lists the missing subcommands and recommends the library API. Option (a) is preferable for cross-language matrix testing. Also document the bench-stream-events gap symmetrically once a cross-language stream benchmark driver is added under scripts/.

Resolution: 2026-05-20 — Scoped this finding to a documentation-only fix; the full Galaxy CLI parity implementation (four new subcommands wired to GalaxyRepositoryClient) is a larger piece of work and will be tracked as a separate follow-up finding. Added a new "CLI Parity Gaps" subsection to clients/python/README.md immediately under the existing CLI section that explicitly enumerates the four missing mxgw-py Galaxy subcommands (galaxy-test-connection, galaxy-last-deploy, galaxy-discover, galaxy-watch), names the sibling CLIs that already expose them (.NET mxgw, Go mxgw-go, Rust mxgw, Java mxgw-java), points readers at the library API (GalaxyRepositoryClient, already documented under "Galaxy Repository Browse") as the supported Python entry point in the interim, and also flags the .NET-only bench-stream-events gap so the cross-language benchmark matrix has a record of the asymmetry. No CLI source change; the implementation of the four Galaxy subcommands is deferred. Resolved as a doc note rather than a full parity fix.

Client.Python-022

Field	Value
Severity	High
Category	Documentation & comments
Location	`clients/python/README.md:201-202`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:389-420`
Status	Resolved

Description: The README CLI examples added by commit 8738735 for the new alarm subcommands cite flags the CLI does not accept:

mxgw-py stream-alarms --session-id <id> --max-messages 1 --json
mxgw-py acknowledge-alarm --session-id <id> --alarm-reference "\\Galaxy\Area001.Pump001.PumpFault" --json

Both subcommands are session-less (the alarm feed is served by the gateway itself, not a worker session — see the docstring on acknowledge_alarm, "Acknowledge an active MXAccess alarm condition (session-less)"). Neither @main.command("stream-alarms") nor @main.command("acknowledge-alarm") declares a --session-id option, and acknowledge-alarm declares the ack-target as --reference, not --alarm-reference. A user copy-pasting either example gets Error: no such option: --session-id (or --alarm-reference) and exits non-zero before any RPC is attempted.

This drift is invisible to the test suite because tests/test_cli.py::test_acknowledge_alarm_requires_reference only asserts that the missing-flag error mentions --reference — it does not validate the README at all. The .NET / Go / Rust / Java alarm CLI examples in the sibling READMEs are consistent with their actual flag names, so the Python README is the only one out of step with its implementation.

Recommendation: Either fix the README examples to match the implementation (remove --session-id from both lines, rename --alarm-reference to --reference), or — if cross-client parity wants the longer flag name — rename the CLI option to --alarm-reference and add a test that copy-pastes the README examples through CliRunner to assert they parse. Option (1) is the smaller change.

Resolution: 2026-05-24 — Fixed the README examples to match the implementation (option 1, smaller change). clients/python/README.md:201-202 now reads mxgw-py stream-alarms --max-messages 1 --json and mxgw-py acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" --json — --session-id is dropped from both lines (the alarm feed is gateway-served, session-less) and --alarm-reference is renamed to the real --reference flag. Regression test tests/test_review_findings_022_to_026.py::test_readme_alarm_examples_parse_against_cli extracts every mxgw-py … line from the README, appends --help so only the parser runs, and asserts that no example produces a no such option Click error. Failed before the fix (the original stream-alarms --session-id <id> … line emitted Error: No such option: --session-id), passes after.

Client.Python-023

Field	Value
Severity	Medium
Category	Security
Location	`clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:901-906`
Status	Resolved

Description: Client.Python-013 (severity Medium, Security) was marked Resolved on 2026-05-20 with the explicit claim that the silent localhost: / 127.0.0.1: auto-plaintext branch had been removed from _use_plaintext. The re-review at d692232 re-asserted this in its checklist ("_use_plaintext now requires explicit --plaintext opt-in (Client.Python-013 resolution verified)").

The branch is still present in the reviewed source at HEAD 42b0037:

def _use_plaintext(kwargs: dict[str, Any]) -> bool:
    if kwargs.get("use_tls"):
        return False
    if kwargs.get("plaintext"):
        return True
    return kwargs["endpoint"].startswith("localhost:") or kwargs["endpoint"].startswith("127.0.0.1:")

The same code is present in git show d692232:clients/python/src/zb_mom_ww_mxgateway_cli/commands.py, so the regression entered at or before the rename commit 397d3c5 (which created src/zb_mom_ww_mxgateway_cli/commands.py from scratch with the pre-Client.Python-013 body) and was not noticed in the prior re-review.

The original security argument is unchanged: a user who runs the gateway behind TLS on loopback for production-shaped local testing and passes --api-key mxgw_<secret> against localhost:5001 silently gets a plaintext gRPC channel, with the bearer token attached to it. The other clients (.NET https-prefix detection, Go / Java explicit --plaintext, Rust opt-in) do not auto-downgrade. The Client.Python-013 resolution also added six regression tests in tests/test_cli.py that asserted the explicit-flag contract; those tests do not exist in the current tests/test_cli.py — either they were lost in the rename or never carried over.

Recommendation: Re-apply the Client.Python-013 fix on the current source: drop the endpoint.startswith("localhost:") or endpoint.startswith("127.0.0.1:") branch, make --plaintext and --tls mutually exclusive, default to TLS, and add an assertion-time test that copy-pastes the Client.Python-013 regression-test fixture into tests/test_cli.py. Because Client.Python-013 is marked Resolved with a 2026-05-20 commit reference, do not silently re-resolve this finding — keep it Open with a fresh ID so the regression audit trail is preserved.

Resolution: 2026-05-24 — Re-applied the Client.Python-013 fix on the renamed CLI module. Dropped the endpoint.startswith("localhost:") or endpoint.startswith("127.0.0.1:") auto-plaintext branch from _use_plaintext in clients/python/src/zb_mom_ww_mxgateway_cli/commands.py. TLS is now the default and --plaintext is the only way to opt in to plaintext; --tls is accepted as a redundant affirmation and the two flags combined raise click.UsageError. Regression tests live in tests/test_review_findings_022_to_026.py: test_use_plaintext_does_not_auto_downgrade_for_localhost_endpoint and test_use_plaintext_does_not_auto_downgrade_for_loopback_ipv4_endpoint exercise the bare-endpoint path, test_use_plaintext_requires_explicit_plaintext_flag and test_use_plaintext_tls_flag_explicitly_disables_plaintext pin the explicit opt-in / opt-out contract, test_use_plaintext_rejects_plaintext_and_tls_combined asserts mutual exclusivity, and test_cli_localhost_endpoint_with_no_flags_uses_tls_channel is an end-to-end CliRunner test that intercepts GatewayClient.connect and asserts the resolved ClientOptions.plaintext is False for a localhost:5000 endpoint without --plaintext. All five tests failed against the pre-fix source and pass against the fix. Behaviour change for callers: scripts that previously relied on mxgw-py … --endpoint localhost:5000 … selecting plaintext silently must now add an explicit --plaintext flag (or set up TLS on the gateway).

Client.Python-024

Field	Value
Severity	Medium
Category	Code organization & conventions
Location	`clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:13,48-119`
Status	Resolved

Description: The new batch subcommand (commit 71d2c39) implements the cross-language batch protocol by importing click.testing.CliRunner into production code and calling runner.invoke(main, args, catch_exceptions=True) in a for raw_line in sys.stdin: loop. CliRunner is documented as a testing helper:

It replaces sys.stdin / sys.stdout / sys.stderr with io.StringIO during each invoke(), swallowing any side-channel output the inner command writes directly to the real streams (the existing CLI bodies do not, but any future helper that calls print() mid-command will be silently captured into result.output rather than reaching the harness real-time).
It captures the inner command into an Exception rather than letting Click's normal exit code propagate, so result.exit_code is the pseudo-exit of a SystemExit translation, not the real process exit.
Click does not guarantee CliRunner is stable across versions — click 9 has already deprecated runner.invoke(..., mix_stderr=...), and a future Click release could change the return-tuple shape.
It is recursively re-entrant: runner.invoke(main, ["batch"], ...) inside batch silently spawns a nested batch reading from the same StringIO-replaced stdin (empty), so a stdin line of batch exits cleanly with no error — almost certainly not the intended semantics.

The other client CLIs in the matrix (.NET, Go, Rust, Java) implement batch by dispatching to their command parser directly, not by re-invoking the test runner.

Recommendation: Replace CliRunner with a direct call into the Click parser, e.g. main.main(args, standalone_mode=False) wrapped in a try/except click.ClickException to convert Click exit conditions into the {"error": ..., "type": ...} payload. Capture stdout via a per-line context manager (e.g. contextlib.redirect_stdout(io.StringIO())) so the batch loop can interleave inner-command output with the __MXGW_BATCH_EOR__ sentinel without depending on the testing API. Add a regression test that drives batch with batch\n on stdin and asserts recursive invocation is either rejected or correctly bounded.

Resolution: 2026-05-24 — Removed the from click.testing import CliRunner import and the CliRunner() instantiation from clients/python/src/zb_mom_ww_mxgateway_cli/commands.py. The batch command body now dispatches each stdin line through a new _dispatch_batch_line helper that calls main.main(args=…, standalone_mode=False, prog_name="mxgw-py") directly and captures the subcommand's stdout via contextlib.redirect_stdout(io.StringIO()). Click exit conditions (click.exceptions.Exit, click.ClickException, SystemExit) are caught and rendered as {"error": …, "type": …} JSON; arbitrary exceptions are caught with a broad except Exception so the batch loop never dies. A nested batch line is rejected outright with a RecursiveBatchError JSON record before the dispatcher runs, eliminating the silent-recursive-spawn footgun the original CliRunner.invoke(main, ["batch"], …) path enabled. Regression tests: tests/test_review_findings_022_to_026.py::test_batch_command_does_not_use_clirunner_in_production asserts the production module no longer imports from click.testing or calls CliRunner(; and test_batch_recursive_batch_line_is_bounded drives a batch\nversion --json\n stdin payload and asserts the recursive batch line emits an error JSON record rather than silently exiting. The pre-existing batch tests (test_batch_runs_version_command_and_writes_eor, test_batch_terminates_on_empty_line, test_batch_continues_after_error_line) still pass against the new implementation, confirming the wire-level contract (one EOR per line, clean JSON error blocks) is preserved.

Client.Python-025

Field	Value
Severity	Low
Category	Testing coverage
Location	`clients/python/tests/test_cli.py`, `clients/python/src/zb_mom_ww_mxgateway/{client.py,session.py}`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py`
Status	Resolved

Description: Commits 6add4b4 and 828e3e6 added five new SDK methods (Session.read_bulk, Session.write_bulk, Session.write2_bulk, Session.write_secured_bulk, Session.write_secured2_bulk), GatewayClient.stream_alarms, the helper _canceling_alarm_feed_iterator, and eight new CLI subcommands (read-bulk, write-bulk, write2-bulk, write-secured-bulk, write-secured2-bulk, bench-read-bulk, stream-alarms, acknowledge-alarm). The only test coverage added in tests/test_cli.py is:

test_stream_alarms_is_registered — --help smoke only.
test_acknowledge_alarm_requires_reference — verifies the missing-flag Click error contains --reference; no happy-path test.

There is no test that:

Asserts Session.read_bulk / write_bulk / write2_bulk / write_secured_bulk / write_secured2_bulk builds the expected MxCommand shape (kind, sub-message, server_handle, entries) — the prior Client.Python-009 coverage pattern (test_add_item2_sends_*, test_write2_sends_value_and_timestamp_value) is not extended to the bulk family even though they ship the same wire-shape risk.
Exercises _canceling_alarm_feed_iterator cancel-on-task-cancellation (the Client.Python-007 helper test pattern is not extended).
Drives stream-alarms / acknowledge-alarm / read-bulk / write-bulk / write-secured-bulk happy paths through CliRunner with a fake stub — the existing test_cli_register_happy_path_emits_server_handle pattern is not extended.
Asserts bench-read-bulk emits the cross-language schema for the new read-bulk-shaped fields (the Client.Python-015 pattern existed for the previous bench command but no equivalent exists for this one).

A silent drift in any of the four bulk-write request shapes — or a schema drift on bench-read-bulk — would not be caught.

Recommendation: Extend the Client.Python-009 / Client.Python-015 / Client.Python-016 patterns: add request-shape tests for the four new bulk methods, a CLI happy-path test for read-bulk / write-bulk / stream-alarms / acknowledge-alarm against fake stubs, and a cross-language schema test for bench-read-bulk mirroring test_bench_read_bulk_emits_cross_language_schema (with --read-bulk applied to the renamed bench). At minimum, add a request-shape test for write_secured_bulk since the secured family is the highest-risk parity surface.

Resolution: 2026-05-24 — Added behavioural test coverage for the five new bulk SDK methods, stream_alarms, and the new CLI subcommand bodies in tests/test_review_findings_022_to_026.py. Request-shape tests (test_session_read_bulk_sends_expected_request_shape, test_session_write_bulk_sends_expected_request_shape, test_session_write2_bulk_sends_expected_request_shape, test_session_write_secured_bulk_sends_expected_request_shape, test_session_write_secured2_bulk_sends_expected_request_shape) drive each Session.*_bulk method against a fake Invoke stub and assert the captured MxCommand's kind, sub-message, server_handle, and per-entry fields (including current_user_id / verifier_user_id on the secured family — the highest-risk parity surface the finding calls out). test_stream_alarms_yields_feed_messages_and_cancels_on_close covers the GatewayClient.stream_alarms happy path including the _canceling_alarm_feed_iterator cancel-on-close contract and the authorization metadata header. CLI happy-path tests (test_cli_read_bulk_happy_path, test_cli_write_bulk_happy_path, test_cli_stream_alarms_happy_path, test_cli_acknowledge_alarm_happy_path) each drive their subcommand through CliRunner against a fake stub injected via a monkeypatched GatewayClient.connect and assert the emitted JSON shape and that the captured RPC request carries the expected fields. The four CLI happy-path tests passed even before any production fix (the implementations were correct; the finding is a coverage gap), but they now exist as regression guards against future drift. No source change — pure coverage finding.

Client.Python-026

Field	Value
Severity	Low
Category	Correctness & logic bugs
Location	`clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:674-738`
Status	Resolved

Description: Two minor quality issues in the new _bench_read_bulk body (commit 6add4b4):

import time is done inside the function body (line 676) rather than at module top. PythonStyleGuide.md does not state this explicitly, but every other helper in commands.py imports its dependencies at module top, and time is already imported (transitively) elsewhere in the package. The function-local import is a vestige of incremental development and should be hoisted.
The finally cleanup block uses two consecutive bare except Exception: pass blocks to swallow unsubscribe_bulk and session.close() failures (lines 733-734 and 737-738). This silently discards diagnostic information about cleanup failures — e.g. a transient gateway crash mid-benchmark or a protocol error during unsubscribe — and matches an anti-pattern the rest of the module avoids (the _bench_read_bulk analogues in the other clients log the cleanup failure before swallowing it).

Both are Low severity. The bench command is best-effort by design (the benchmark output is what matters; cleanup failures are not user-facing), but a single log line on cleanup failure would make a future regression visible at the next benchmark run rather than silently corrupting the worker's subscription bookkeeping until a session-level GC sweep.

Recommendation: Move import time to the module-level import block. Replace each except Exception: pass with except Exception as exc: logger.warning("bench-read-bulk cleanup: %s", exc) against a module-level logger = logging.getLogger(__name__). No behavioural change in the happy path; failure path becomes diagnosable. No new test required for the import hoist; the logger change is exercised by the existing bench smoke test once caplog is added to the test signature.

Resolution: 2026-05-24 — Hoisted import time to the module-level import block in clients/python/src/zb_mom_ww_mxgateway_cli/commands.py alongside the existing standard-library imports; the function-local import time line at the top of _bench_read_bulk is gone. Added a module-level logger = logging.getLogger(__name__) and rewrote the two finally cleanup blocks to bind the swallowed exception and log it at WARNING level — unsubscribe_bulk failures now emit "bench-read-bulk: unsubscribe_bulk cleanup failed: %s" and the session.close() failure path emits the equivalent — so a future regression in the cleanup path is diagnosable at the next benchmark run rather than silently corrupting subscription bookkeeping. Regression tests in tests/test_review_findings_022_to_026.py: test_commands_module_imports_time_at_module_scope uses inspect.getsource(_bench_read_bulk) to assert no function-local import time line, and asserts the module exposes time at module scope; test_commands_module_bench_read_bulk_does_not_use_bare_except_pass greps the function source for the except Exception:\n pass pattern and rejects it. Both tests failed against the pre-fix source and pass against the fix.

72 KiB Raw Blame History