Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).
Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
GatewayGrpcScopeResolver so non-admin keys can use them; document
the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
in generated tonic code by reformatting the ReadBulkCommand proto
comment and scoping a #![allow(...)] to the generated submodules.
Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
make DisposeAsync race-safe against in-flight CloseAsync (-016);
add constraint-enforcement test coverage for the bulk-plan path
(-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
can distinguish graceful shutdown from a real STA-affinity
violation (-016); have the watchdog skip StaHung while
CurrentCommandCorrelationId is non-empty so a legitimate slow
ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
11 GatewaySession bulk methods (-013); replace the real TCP probe
in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
(-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
test and assert OnWriteComplete (-012); add live tests for
Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
CreateForTesting factory (-016); cover WorkerCancel and
unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
beforeStart() (-014); return a CancellingCompletableFuture that
actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
histograms with failed-call durations (-015); add coverage for
the five MalformedReply paths, the bulk-write helpers, the
Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
command family (-009).
Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
WorkerAlarmRpcDispatcher missing-session handling; drop the
duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
subscriptionExpression / ExecutingCommand arms; preserve
factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
FakeWorkerProcess.WaitForExitAsync against a real
TaskCompletionSource; switch the heartbeat-expires test to ManualTimeProvider;
add InvariantCulture to the remaining DateTimeOffset.Parse sites;
document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
IDisposable, class-level [Trait], single-source ZB default
connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
so absent env vars SKIP not pass; PascalCase rename of probe
[Fact]s; deterministic deadline test; new frame-protocol error
tests; ComputeTransitions diff-coverage; relocate dev-rig probes
to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
TreatWarningsAsErrors / analysers apply; document
DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
bulk-read handles in CLI; surface AcknowledgeAlarm transport
faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
runWriteBulkVariant; document the six new subcommands in
writeUsage; drain galaxy-watch events on limit; switch io.EOF
comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
option; regex-based credential redaction; Long.toUnsignedString
for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
_percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
_api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
stop hard-coding correlation IDs; resync RustClientDesign.md
with the current Session / Error surface and CLI subcommand set.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review — Client.Python
| Field | Value |
|---|---|
| Module | clients/python |
| Reviewer | Claude Code |
| Review date | 2026-05-20 |
| Commit reviewed | 1cd51bb |
| Status | Reviewed |
| Open findings | 0 |
Checklist coverage
This is a re-review of the same module at commit 1cd51bb. Prior findings
(Client.Python-001 through Client.Python-012) remain closed and are kept as
history. This section reflects the categories evaluated in this pass.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: _use_plaintext silently downgrades any localhost: / 127.0.0.1: endpoint to plaintext (Client.Python-013). |
| 2 | mxaccessgw conventions | No new issues found — secrets redacted, MXAccess parity preserved, generated code untouched, no Blazor/COM violations apply (Python client). |
| 3 | Concurrency & thread safety | No new issues found — close-idempotency hazard fixed in Client.Python-006, shared _canceling_iterator cancels on CancelledError. |
| 4 | Error handling & resilience | No new issues found at this commit (prior 003, 007, 011 remain closed). |
| 5 | Security | Issue found: implicit plaintext-on-localhost (Client.Python-013) means a user explicitly listing a TLS-fronted loopback endpoint with --api-key but without --tls/--plaintext silently transmits the bearer token in cleartext. |
| 6 | Performance & resource management | No new issues found — iter_hierarchy streams pages lazily (Client.Python-005 resolution). |
| 7 | Design-document adherence | No new issues found — PythonClientDesign.md matches the implemented surface. |
| 8 | Code organization & conventions | Issue found: duplicate from mxgateway.values import lines in commands.py:22-23 (Client.Python-014). |
| 9 | Testing coverage | Issues found: bench_read_bulk CLI body, MAX_AGGREGATE_EVENTS event-cap, and _use_plaintext localhost-auto-plaintext path are untested (Client.Python-015, Client.Python-016). |
| 10 | Documentation & comments | Issues found: pyproject.toml lacks PyPI metadata (authors, license, classifiers, urls) and the package ships no PEP 561 py.typed marker (Client.Python-017); auto-plaintext behaviour is undocumented (Client.Python-013). |
Findings
Client.Python-001
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | clients/python/pyproject.toml:8,25, clients/python/src/mxgateway_cli/commands.py:25 |
| Status | Resolved |
Description: The package description in pyproject.toml still says "Async Python client scaffold" even though the client is fully implemented. Stale "scaffold" wording misrepresents maturity to anyone reading PyPI metadata. (The mxgw-py console-script name is itself consistent between pyproject.toml and the README.)
Recommendation: Update the pyproject.toml description to drop "scaffold"; keep README CLI examples in sync with the actual mxgw-py entry point.
Resolution: 2026-05-18 — Confirmed: pyproject.toml:8 description read "Async Python client scaffold for MXAccess Gateway." Changed to "Async Python client for MXAccess Gateway." The mxgw-py console-script name was already consistent with the README, so no README change was needed. Pure metadata fix — no test required.
Client.Python-002
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | clients/python/src/mxgateway/__init__.py:27 |
| Status | Resolved |
Description: MxGatewayCommandError is imported into __init__.py and is a documented public exception, but it is missing from __all__. It is the parent of MxAccessError and a meaningful catch target, so omitting it from the public surface is inconsistent — from mxgateway import * will not expose it and tooling that respects __all__ treats it as private.
Recommendation: Add "MxGatewayCommandError" to the __all__ list.
Resolution: 2026-05-18 — Re-triaged: this finding is stale against the reviewed source. clients/python/src/mxgateway/__init__.py already imports MxGatewayCommandError (line 16) and lists "MxGatewayCommandError" in __all__ (line 38). from mxgateway import * exposes it correctly. Verified at runtime ('MxGatewayCommandError' in mxgateway.__all__ is True). No source change required — the defect described no longer exists.
Client.Python-003
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | clients/python/src/mxgateway/client.py:125-137,155-173 |
| Status | Resolved |
Description: stream_events_raw and query_active_alarms call the stub directly with a timeout kwarg when stream_timeout is set, with no TypeError fallback. galaxy.py:watch_deploy_events and _unary do have a fallback that strips timeout if the callable rejects it. This asymmetry means a fake/older stub that does not accept timeout crashes for gateway streams but not Galaxy streams. It is only masked today because stream_timeout defaults to None.
Recommendation: Apply the same try/except TypeError timeout-fallback pattern to stream_events_raw and query_active_alarms, or remove the fallback everywhere and standardise on a single behaviour.
Resolution: 2026-05-18 — Confirmed: both stream methods in client.py called the stub with timeout unconditionally and had no TypeError fallback, unlike _unary and galaxy.watch_deploy_events. Added a shared _open_stream helper in client.py that opens a server-streaming call and strips the timeout kwarg when the stub raises TypeError: ... unexpected keyword argument 'timeout', then routed both stream_events_raw and query_active_alarms through it. Regression tests in tests/test_stream_timeout_fallback.py (test_stream_events_raw_falls_back_when_stub_rejects_timeout, test_query_active_alarms_falls_back_when_stub_rejects_timeout, test_stream_events_raw_still_passes_timeout_to_capable_stub) failed before the fix and pass after. No public behaviour change for real gRPC stubs, so no README update needed.
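For reference, the timeout-fallback pattern described in this resolution can be sketched as below. This is a minimal illustration, not the code in client.py: the name open_stream, its signature, and the substring match on the TypeError message are assumptions made for the example.

```python
from typing import Any, Callable, Optional


def open_stream(
    method: Callable[..., Any],
    request: Any,
    timeout: Optional[float] = None,
) -> Any:
    """Open a streaming call, retrying without ``timeout`` if the stub rejects it.

    Hypothetical helper: the real _open_stream may differ in name and shape.
    """
    if timeout is None:
        return method(request)
    try:
        return method(request, timeout=timeout)
    except TypeError as exc:
        # Swallow only the "unexpected keyword" failure. Any other TypeError
        # is a genuine bug in the callee and must propagate unchanged.
        if "unexpected keyword argument 'timeout'" not in str(exc):
            raise
        return method(request)
```

Matching on the message keeps the fallback narrow: a stub that accepts timeout but raises TypeError for an unrelated reason is not silently retried.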
Client.Python-004
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | clients/python/src/mxgateway_cli/commands.py:386,402-404 |
| Status | Resolved |
Description: In _smoke, the local variable closed is set to False and never reassigned; the finally block's if not closed: is therefore always true. This is dead/misleading code suggesting a removed early-close path.
Recommendation: Remove the closed variable and the if not closed: guard; call await session.close() directly in the finally block (or use async with session:).
Resolution: 2026-05-18 — Confirmed: closed = False was set and never reassigned, making if not closed: dead code. Replaced the try/finally with async with session: so the session is closed via the documented async context manager — Session already implements __aexit__ → close(). Behaviour is unchanged (the session is still closed on every exit path); no test needed for the dead-code removal — exercised by the existing CLI smoke test.
Client.Python-005
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Performance & resource management |
| Location | clients/python/src/mxgateway/galaxy.py:117-140 |
| Status | Resolved |
Description: discover_hierarchy pages through the entire Galaxy object hierarchy and accumulates every GalaxyObject (each carrying its full attribute list) into a single in-memory list before returning. For a large Galaxy this is a very large allocation with no streaming alternative and no caller-side bound.
Recommendation: Offer an async-generator variant (e.g. iter_hierarchy()) that yields objects/pages as they arrive, keeping discover_hierarchy() as a convenience wrapper. At minimum document the memory characteristic.
Resolution: 2026-05-18 — Confirmed: discover_hierarchy buffered the entire hierarchy with no streaming alternative. Added GalaxyRepositoryClient.iter_hierarchy, an async generator that fetches one DiscoverHierarchyRequest page at a time and yields each GalaxyObject as it arrives, so peak memory is bounded by a single page (_DISCOVER_HIERARCHY_PAGE_SIZE). Pages are fetched lazily — the next page is only requested after the current page is fully consumed. discover_hierarchy is now a thin convenience wrapper ([obj async for obj in self.iter_hierarchy()]) that preserves its list[GalaxyObject] contract, including the repeated-page-token guard. Regression tests in tests/test_galaxy_iter_hierarchy.py (test_iter_hierarchy_yields_objects_across_pages, test_iter_hierarchy_is_lazy_and_does_not_prefetch_next_page, test_iter_hierarchy_rejects_repeated_page_token, test_discover_hierarchy_still_returns_full_list) failed before the fix and pass after. clients/python/README.md updated with the iter_hierarchy usage and memory guidance since this adds a new public method.
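The lazy paging behaviour described above can be sketched roughly as follows. The fetch_page callable, the use of plain strings for objects, and the RuntimeError raised on a repeated token are assumptions for illustration; the real iter_hierarchy issues DiscoverHierarchyRequest calls and yields GalaxyObject instances.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List, Tuple

# Hypothetical page fetcher: takes a page token, returns (objects, next_token).
# An empty next_token marks the last page.
FetchPage = Callable[[str], Awaitable[Tuple[List[str], str]]]


async def iter_hierarchy(fetch_page: FetchPage) -> AsyncIterator[str]:
    """Yield objects one page at a time; only one page is held in memory."""
    seen_tokens = set()
    token = ""
    while True:
        objects, next_token = await fetch_page(token)
        for obj in objects:
            yield obj
        # The next page is requested only after this page is fully consumed.
        if not next_token:
            return
        if next_token in seen_tokens:
            raise RuntimeError(f"repeated page token: {next_token!r}")
        seen_tokens.add(next_token)
        token = next_token


async def discover_hierarchy(fetch_page: FetchPage) -> List[str]:
    """Convenience wrapper that buffers the whole hierarchy into a list."""
    return [obj async for obj in iter_hierarchy(fetch_page)]
```

A caller that only needs the first few objects can break out of the async for loop early, and no further pages are fetched.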
Client.Python-006
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | clients/python/src/mxgateway/client.py:74-82, clients/python/src/mxgateway/galaxy.py:85-93, clients/python/src/mxgateway/session.py:38-55 |
| Status | Resolved |
Description: close() on the clients and Session.close() use a plain self._closed check-then-set with an await between, with no lock. If two coroutines call close() concurrently both can pass the guard before either sets it, causing a double channel.close() / double CloseSession RPC. Single-task usage is the documented contract, so impact is low, but the idempotency guarantee asserted in docstrings only holds for sequential calls.
Recommendation: Set self._closed = True before the await, or guard with an asyncio.Lock, so the idempotency claim holds under concurrent close.
Resolution: 2026-05-18 — Confirmed the check-then-set window. Fixed GatewayClient.close, GalaxyRepositoryClient.close, and Session.close to set self._closed = True before the await (channel close / CloseSession RPC). A second coroutine entering close() while the first is still awaiting now hits the early-return guard and does not issue a second channel.close() / CloseSession. Docstrings updated to state the idempotency holds under concurrent calls. TDD: regression tests in tests/test_low_severity_findings.py (test_gateway_client_concurrent_close_closes_channel_once, test_galaxy_client_concurrent_close_closes_channel_once, test_session_concurrent_close_sends_one_close_session_rpc) — each uses a fake channel/client that stalls inside close/close_session_raw so two concurrent close() calls interleave at the exact race window; they failed before the fix and pass after.
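The set-before-await fix can be illustrated with a small stand-alone sketch. FakeChannel and Client here are hypothetical stand-ins, not the real GatewayClient or its channel; the point is only the ordering of the flag assignment relative to the await.

```python
import asyncio


class FakeChannel:
    """Stand-in channel that yields inside close() so calls can interleave."""

    def __init__(self) -> None:
        self.close_calls = 0

    async def close(self) -> None:
        self.close_calls += 1
        await asyncio.sleep(0)  # give a second close() a chance to run


class Client:
    def __init__(self, channel: FakeChannel) -> None:
        self._channel = channel
        self._closed = False

    async def close(self) -> None:
        if self._closed:
            return
        # Flip the flag BEFORE awaiting. There is no suspension point between
        # the check and the assignment, so a concurrent close() sees True.
        self._closed = True
        await self._channel.close()


async def demo() -> int:
    channel = FakeChannel()
    client = Client(channel)
    await asyncio.gather(client.close(), client.close())
    return channel.close_calls
```

Because asyncio coroutines only switch at await points, no lock is needed for this guard; an asyncio.Lock would only matter if the second caller also had to wait for the first close to finish.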
Client.Python-007
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | clients/python/src/mxgateway/client.py:204-213 |
| Status | Resolved |
Description: _canceling_iterator (gateway event stream) does not catch asyncio.CancelledError to invoke call.cancel() explicitly — it relies on the finally block. galaxy.py:_canceling_iterator does explicitly catch CancelledError, cancel, and re-raise. The two are functionally equivalent today, but the inconsistency between near-identical helpers invites future divergence.
Recommendation: Make the two _canceling_iterator helpers identical, ideally by factoring a single shared helper.
Resolution: 2026-05-18 — Confirmed the divergence. Factored a single shared helper: client._canceling_iterator(call, operation) now takes the map_rpc_error operation string as a parameter, explicitly catches asyncio.CancelledError (cancels the call, re-raises) and grpc.RpcError, and repeats the cancel in finally. This replaces both the gateway _canceling_iterator and the gateway _canceling_active_alarms_iterator; galaxy.py now imports and delegates to the same helper instead of defining its own, so the gateway and Galaxy stream helpers are byte-for-byte identical. TDD: tests/test_low_severity_findings.py::test_gateway_stream_iterator_cancels_call_on_task_cancellation drives a cancellable fake stream and asserts the gateway iterator cancels the underlying call on task cancellation. All existing stream-cancellation tests still pass.
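A reduced sketch of the shared helper's cancellation handling is shown below. It omits the grpc.RpcError mapping and the operation parameter mentioned in the resolution so that it runs without gRPC installed; the function name and the assumption that call is an async iterator with a cancel() method are illustrative only.

```python
import asyncio
from typing import Any, AsyncIterator


async def canceling_iterator(call: Any) -> AsyncIterator[Any]:
    """Yield items from a streaming call and cancel it on every exit path.

    Assumes ``call`` is an async iterator exposing ``cancel()`` and that
    calling ``cancel()`` more than once is harmless.
    """
    try:
        async for item in call:
            yield item
    except asyncio.CancelledError:
        # Cancel explicitly when the consuming task is cancelled, then
        # re-raise so the cancellation is not swallowed.
        call.cancel()
        raise
    finally:
        # Repeat the cancel for normal completion and early generator close.
        call.cancel()
```

In this sketch a cancelled consumer triggers cancel() twice, once in the except branch and once in finally, which mirrors the "repeats the cancel in finally" wording above.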
Client.Python-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | clients/python/src/mxgateway/values.py:62-67,83-88 |
| Status | Resolved |
Description: to_mx_value maps any Python float to VT_R8/MX_DATA_TYPE_DOUBLE with no handling for nan/inf; these values are serialised and forwarded to MXAccess, which may reject or mishandle them. bytes is mapped to VT_RECORD/MX_DATA_TYPE_UNKNOWN, a questionable default. The data_type keyword exists but Session.write never forwards it.
Recommendation: Document the float/bytes mapping assumptions, optionally validate finiteness, and consider plumbing the data_type keyword through Session.write/write2.
Resolution: 2026-05-18 — Confirmed the non-finite-float hazard. Added an _ensure_finite guard in values.py: to_mx_value now raises ValueError for nan/inf/-inf, both for a scalar float and for a non-finite element inside a float sequence — MXAccess has no defined wire representation for non-finite doubles, so rejecting client-side is the correct fail-fast. The float/bytes mapping assumptions (finite-only doubles; bytes as an opaque VT_RECORD pass-through) are now documented in the values.py module docstring and clients/python/README.md. Plumbing data_type through Session.write/write2 was deliberately not done: it is a larger public-API surface change the finding only marks as "consider", and the documented MXAccess-parity convention is type-by-Python-value; the data_type keyword stays available on to_mx_value for callers that build the MxValue directly. TDD: tests/test_low_severity_findings.py adds test_to_mx_value_rejects_nan, test_to_mx_value_rejects_positive_infinity, test_to_mx_value_rejects_negative_infinity, test_to_mx_value_rejects_non_finite_float_in_sequence, and test_to_mx_value_accepts_finite_float. README updated since to_mx_value (used by Session.write/write2) now rejects an input it previously accepted.
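The finiteness guard could look roughly like the sketch below. The function name, the error message, and the decision to treat only list and tuple as sequences are assumptions; the real _ensure_finite in values.py may differ.

```python
import math
from typing import Any


def ensure_finite(value: Any) -> Any:
    """Reject nan/inf/-inf, both as a scalar float and inside a sequence.

    Hypothetical stand-in for the guard described above.
    """
    items = value if isinstance(value, (list, tuple)) else (value,)
    for item in items:
        if isinstance(item, float) and not math.isfinite(item):
            raise ValueError(f"non-finite float is not supported: {item!r}")
    return value
```

Returning the value unchanged lets the guard be dropped into an existing conversion path without restructuring it.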
Client.Python-009
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | clients/python/tests/ |
| Status | Resolved |
Description: Several non-trivial public paths are untested: Session.write2/add_item2 request construction; the bulk-size limit _ensure_bulk_size/MAX_BULK_ITEMS guard; the None-argument TypeError guards in bulk methods; the TLS ca_file read path in create_channel; most CLI command bodies; and map_rpc_error's default (non-auth) branch.
Recommendation: Add tests for write2/add_item2 request shape, the bulk-size ValueError, the ca_file TLS branch, the generic map_rpc_error fallthrough, and at least one happy-path CLI command using a fake stub.
Resolution: 2026-05-18 — Confirmed coverage gap against the existing tests/ files. Added tests/test_coverage_gaps.py covering every path the finding lists: test_add_item2_sends_item_context_and_returns_handle and test_write2_sends_value_and_timestamp_value (request shape + MxValue oneof), test_subscribe_bulk_rejects_oversized_request and test_add_item_bulk_at_limit_is_allowed (the MAX_BULK_ITEMS _ensure_bulk_size boundary), test_advise_item_bulk_rejects_none_argument (the None-argument TypeError guard), test_create_channel_reads_ca_file and test_create_channel_missing_ca_file_raises (the TLS ca_file read path), test_map_rpc_error_generic_branch_returns_transport_error and test_map_rpc_error_handles_error_without_code (the non-auth map_rpc_error fallthrough and the no-code path), and test_cli_register_happy_path_emits_server_handle (a happy-path CLI command body driven end to end through CliRunner with a fake stub via a monkeypatched _connect). All 10 new tests pass. No source change required — this is a pure coverage finding.
Client.Python-010
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | clients/python/src/mxgateway/session.py:404, clients/python/src/mxgateway_cli/commands.py:422-425 |
| Status | Resolved |
Description: session.py ends with a module-level late import from .client import GatewayClient # noqa: E402 purely to satisfy a string type hint, and commands.py:_session does a function-local import. Both work around a circular dependency that from __future__ import annotations (already in effect) makes unnecessary. _session also lacks a return type annotation.
Recommendation: Drop the runtime late import in session.py and use a TYPE_CHECKING-guarded import for the hint; add the -> Session return annotation to commands.py:_session.
Resolution: 2026-05-18 — Confirmed: with from __future__ import annotations in effect all annotations are strings, so the runtime late import was unnecessary. Removed the trailing from .client import GatewayClient # noqa: E402 in session.py and replaced it with a top-of-file if TYPE_CHECKING: import that satisfies the GatewayClient hint without a runtime dependency (no import cycle: client.py does not import session at module scope). In commands.py, hoisted the function-local from mxgateway.session import Session to a module-level import and added the -> Session return annotation to _session. Verified import mxgateway and import mxgateway_cli.commands succeed with no circular-import error. Pure refactor — covered by the existing import and CLI tests; no new test needed.
Client.Python-011
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | clients/python/src/mxgateway/errors.py:122-148 |
| Status | Resolved |
Description: ensure_mxaccess_success raises MxAccessError if any mx_status.success == 0. This treats success == 0 as the failure sentinel, but 0 is also the proto3 scalar default for an unset MxStatusProxy. If the gateway ever returns a reply with an unpopulated status entry (e.g. a partially-filled bulk result), the client raises MxAccessError even though no real failure occurred.
Recommendation: Confirm against the proto/gateway contract whether success is guaranteed populated for every statuses entry; if not, key the failure decision on an explicit failure field rather than the success == 0 default.
Resolution: 2026-05-18 — Confirmed against the gateway contract: success is not guaranteed populated for every statuses entry. src/MxGateway.Worker/Conversion/MxStatusProxyConverter.cs::ConvertMany emits a placeholder MxStatusProxy for a null MXSTATUS_PROXY COM array entry, setting Category/DetectedBy to Unknown but leaving Success at its proto3 default of 0. A fully-default proto entry likewise has success == 0. Under the old client logic either placeholder would falsely raise MxAccessError. Fixed ensure_mxaccess_success to key the per-status failure decision on a new _is_mxaccess_status_failure helper that requires success == 0 and a populated, non-OK category — a status with category of MX_STATUS_CATEGORY_UNSPECIFIED (default proto) or MX_STATUS_CATEGORY_UNKNOWN (the null-entry placeholder) is treated as unpopulated and ignored. MX_STATUS_CATEGORY_OK is also excluded so a genuine success entry never raises. Real failures (categories WARNING and the error categories, raw value ≥ 2) still raise as before — the existing write.mxaccess-failure fixture (SECURITY_ERROR/OPERATIONAL_ERROR statuses) and the MXACCESS_FAILURE protocol-status path are unaffected. TDD: tests/test_low_severity_findings.py adds test_ensure_mxaccess_success_ignores_unpopulated_status_entry (default + null-placeholder entries, no raise), test_ensure_mxaccess_success_raises_on_populated_failure_status (populated COMMUNICATION_ERROR, raises), and test_ensure_mxaccess_success_passes_when_status_reports_success. No public-behaviour change for genuine replies, so no README update.
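The per-status decision described above can be sketched as follows. The Category enum and Status dataclass are simplified stand-ins for the generated proto types, and the enum values are placeholders rather than the real wire numbers.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # Placeholder values; the real proto enum numbering is not reproduced here.
    UNSPECIFIED = "unspecified"  # proto3 default for an unset entry
    UNKNOWN = "unknown"          # placeholder emitted for a null COM entry
    OK = "ok"
    WARNING = "warning"
    COMMUNICATION_ERROR = "communication_error"


# Categories that never count as a failure, whatever ``success`` says.
_IGNORED = {Category.UNSPECIFIED, Category.UNKNOWN, Category.OK}


@dataclass
class Status:
    success: int = 0
    category: Category = Category.UNSPECIFIED


def is_status_failure(status: Status) -> bool:
    """A failure needs success == 0 AND a populated, non-OK category."""
    return status.success == 0 and status.category not in _IGNORED
```

With this rule a fully default Status() is treated as unpopulated, while Status(0, Category.COMMUNICATION_ERROR) still counts as a failure.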
Client.Python-012
| Field | Value |
|---|---|
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | clients/python/src/mxgateway/client.py:84-108, clients/python/src/mxgateway/session.py:57-77 |
| Status | Won't Fix |
Description: Session.invoke_raw does not run ensure_mxaccess_success while Session.invoke does, so a caller using invoke_raw for parity tests gets a reply where an MXAccess HRESULT failure is silently embedded with no exception. This is by design but under-documented — the README's "preserve raw replies" sentence does not state that *_raw methods skip MXAccess-failure detection entirely.
Recommendation: Document explicitly (README + docstring) that *_raw methods surface MXAccess HRESULT/status failures only inside the reply and do not raise MxAccessError, so parity-test callers know to inspect protocol_status/hresult/statuses themselves.
Resolution: 2026-05-18 — Won't Fix (no behaviour change). Confirmed this is intentional, correct parity behaviour: the *_raw methods exist precisely so parity-test callers can inspect an unmodified gateway reply, including embedded MXAccess HRESULT/status failures, without an exception masking them. Changing invoke_raw to raise MxAccessError would defeat its purpose and duplicate Session.invoke. The finding's only actionable point is the documentation gap, which has been addressed: clients/python/README.md now states explicitly that *_raw methods enforce gateway protocol success only and do not run MXAccess-failure detection, and the docstrings of GatewayClient.invoke_raw and Session.invoke_raw say the same and point callers to inspect protocol_status/hresult/statuses (and to Session.invoke for the checked variant). No code/test change — the runtime contract is unchanged and correct.
Client.Python-013
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | clients/python/src/mxgateway_cli/commands.py:757-762 |
| Status | Resolved |
Description: _use_plaintext silently returns True whenever the endpoint
string starts with localhost: or 127.0.0.1:, even if neither --plaintext
nor --tls is supplied on the command line. Any CLI subcommand (e.g.
mxgw-py open-session --endpoint localhost:5001 --api-key mxgw_<secret>) then
attaches the API key to a plaintext gRPC channel without warning. This is a
silent security downgrade: a user who deliberately ran the gateway behind TLS
on loopback (e.g. for testing a production-shaped TLS config locally) and who
passes --api-key expecting the secret to be transport-protected gets a
plaintext bearer token instead. The auto-downgrade is also undocumented —
README.md and the CLI --help text both describe --plaintext and --tls
as the controls, with no mention that endpoint-prefix matching can override
either. The other client CLIs do not auto-downgrade: the .NET CLI uses
https://-prefix detection on a URI scheme (an explicit signal), Go and Java
require an explicit --plaintext/--tls choice, and Rust uses
plaintext only when plaintext = true is set on the options struct.
Recommendation: Drop the localhost-prefix auto-plaintext branch and
require the user to pass --plaintext or --tls (or default to TLS to match
the rest of the matrix). If the implicit-localhost behaviour is kept for
ergonomics, document it prominently in both README.md and --help, emit a
stderr warning when --api-key is combined with the auto-downgrade path, and
add a CLI test asserting the auto-downgrade is in fact active so it is not
silently lost in a future refactor.
Resolution: 2026-05-20 — Removed the silent localhost: / 127.0.0.1:
auto-plaintext branch from _use_plaintext. The new contract matches the Go
and Java CLIs: TLS is the default, --plaintext is the only way to opt
in to an unencrypted channel, and --tls is accepted as a redundant, explicit
affirmation of the default (combining it with --plaintext now raises
click.UsageError). The --help text for --plaintext / --tls and
clients/python/README.md both call out the new behaviour. Added six
regression tests in clients/python/tests/test_cli.py covering: (a) a
localhost: endpoint with no flags resolves to TLS, (b) a 127.0.0.1:
endpoint with no flags resolves to TLS, (c) --plaintext opts in to plaintext,
(d) --tls is accepted and idempotent with the default, (e) --plaintext
combined with --tls is rejected, and (f) an end-to-end CliRunner test
asserting ClientOptions.plaintext == False flows through to
GatewayClient.connect when no flag is supplied against a localhost:
endpoint. Behaviour change for callers: scripts that previously relied on
mxgw-py … --endpoint localhost:5000 … selecting plaintext silently must now
add an explicit --plaintext flag (or set up TLS on the gateway). Calling
mxgw-py with an --api-key against a plaintext-only gateway without
--plaintext will now fail to connect rather than silently leaking the bearer
token.
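The new flag contract can be summarised in a short sketch. The function name and signature are hypothetical, and ValueError stands in for the click.UsageError the real CLI raises so that the example runs without click installed.

```python
def resolve_plaintext(endpoint: str, plaintext: bool, tls: bool) -> bool:
    """Return True only when the caller explicitly asked for plaintext.

    ``endpoint`` is accepted but deliberately ignored: a localhost or
    127.0.0.1 prefix no longer influences the transport choice.
    """
    if plaintext and tls:
        # The real CLI raises click.UsageError here (per the resolution).
        raise ValueError("--plaintext and --tls are mutually exclusive")
    return plaintext
```

Under this sketch, resolve_plaintext("localhost:5001", False, False) selects TLS, and only an explicit plaintext=True opts in to an unencrypted channel.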
Client.Python-014
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | clients/python/src/mxgateway_cli/commands.py:22-23 |
| Status | Resolved |
Description: commands.py has two consecutive from mxgateway.values import lines:
from mxgateway.values import to_mx_value
from mxgateway.values import MxValueInput
These import from the same module and should be combined into a single
from mxgateway.values import MxValueInput, to_mx_value. The split form is
inconsistent with the rest of the file (every other module is imported in a
single statement) and would be flagged by ruff/isort if any linter were
configured. Pure style, no behavioural impact.
Recommendation: Collapse the two imports into one statement, ordered to
match the conventional alphabetical-within-module pattern:
from mxgateway.values import MxValueInput, to_mx_value.
Resolution: 2026-05-20 — Collapsed the two consecutive
from mxgateway.values import to_mx_value / from mxgateway.values import MxValueInput
lines in clients/python/src/mxgateway_cli/commands.py into a single
from mxgateway.values import MxValueInput, to_mx_value statement, matching
the alphabetical-within-module pattern used elsewhere in the file. Pure style
fix — no behavioural impact, covered by the existing CLI tests.
Client.Python-015
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | clients/python/src/mxgateway_cli/commands.py:273-294,564-647, clients/python/tests/ |
| Status | Resolved |
Description: _bench_read_bulk is a ~80-line CLI body that opens its own
session, registers, subscribe_bulks, runs a warm-up loop, a measurement loop,
collects per-call latencies, computes a percentile summary, and emits the
shared cross-language JSON schema. It is the largest untested CLI command in
the module — tests/ has no bench_read_bulk test, fake-stub-driven or
otherwise. A drift in the schema field names (callsPerSecond,
cachedReadResults, latencyMs.p50, …) would break the cross-language
scripts/bench-read-bulk.ps1 aggregation silently. _percentile_summary and
_percentile are also untested — the boundary cases (n == 0, n == 1,
quantile interpolation) would benefit from a small unit test since the
identical algorithm is duplicated in the .NET / Go / Rust / Java drivers and
a divergence would corrupt cross-language comparisons.
Recommendation: Add a fake-stub-driven bench_read_bulk test that drives
a short --duration-seconds 0 --warmup-seconds 0 run through CliRunner and
asserts the JSON schema (language == "python", the full key set,
latencyMs.p50/p95/p99/max/mean present). Add unit tests for _percentile
covering n == 0, n == 1, and a known-good interpolated value at p95 so
the implementation cannot silently drift from the other clients.
Resolution: 2026-05-20 — Added clients/python/tests/test_cli_bench_and_helpers.py
with three layers of coverage. (1) _percentile unit tests pin the
cross-language algorithm (rank = q * (n - 1), linear interpolation between
adjacent ranks): empty sample returns 0.0, single element returns that
element, exact-rank queries return the sample value (p50 of [10,20,30,40,50]
is 30.0), and the interpolated p95/p99 values (48.0 / 49.6 for that same
five-element sample) are locked down so any drift from the .NET / Go / Rust /
Java drivers fails fast. (2) _percentile_summary tests assert the full
{p50, p95, p99, max, mean} dict shape, the zero-sample placeholder, and the
3-decimal rounding contract. (3) A bench-read-bulk smoke test
(test_bench_read_bulk_emits_cross_language_schema) drives the CLI through
CliRunner with --duration-seconds 0 --warmup-seconds 0 against a fake stub
that handles OpenSession, Register, SubscribeBulk, ReadBulk, and
UnsubscribeBulk, then asserts the emitted JSON has exactly the 16
cross-language schema keys (language, command, endpoint, clientName,
bulkSize, durationSeconds, warmupSeconds, durationMs, tags,
totalCalls, successfulCalls, failedCalls, totalReadResults,
cachedReadResults, callsPerSecond, latencyMs) and that latencyMs is a
{p50, p95, p99, max, mean} sub-object — guarding against silent breakage of
scripts/bench-read-bulk.ps1's cross-language aggregation. No source change —
this is a pure coverage finding.
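The algorithm pinned by those unit tests can be sketched as follows. This is a minimal illustration, not the code in commands.py: the function names, the sort inside percentile, and the all-zero placeholder for an empty sample are assumptions. Only the rank formula, the linear interpolation, the {p50, p95, p99, max, mean} shape, and the 3-decimal rounding come from the resolution text above.

```python
from statistics import fmean
from typing import Dict, Sequence

SUMMARY_KEYS = ("p50", "p95", "p99", "max", "mean")


def percentile(samples: Sequence[float], q: float) -> float:
    """Return the q-quantile (0.0 <= q <= 1.0) using rank = q * (n - 1)."""
    if not samples:
        return 0.0  # empty sample returns 0.0
    ordered = sorted(samples)  # assumption: input may arrive unsorted
    if len(ordered) == 1:
        return float(ordered[0])  # single element returns that element
    rank = q * (len(ordered) - 1)
    lower = int(rank)  # index of the sample at or below the rank
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    # Linear interpolation between the two adjacent ranks.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_summary(samples: Sequence[float]) -> Dict[str, float]:
    """Build the latency summary dict, rounded to 3 decimals."""
    if not samples:
        # Assumption: the zero-sample placeholder is all zeros.
        return {key: 0.0 for key in SUMMARY_KEYS}
    return {
        "p50": round(percentile(samples, 0.50), 3),
        "p95": round(percentile(samples, 0.95), 3),
        "p99": round(percentile(samples, 0.99), 3),
        "max": round(float(max(samples)), 3),
        "mean": round(fmean(samples), 3),
    }
```

For the five-element sample [10, 20, 30, 40, 50], the p95 rank is 0.95 * 4 = 3.8, so the result interpolates 80% of the way from 40 to 50. Floating-point arithmetic can leave the raw value a hair under 48.0, so comparisons on percentile itself are better done with a tolerance than with exact equality.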
Client.Python-016
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | clients/python/src/mxgateway_cli/commands.py:25,757-775,805-830 |
| Status | Resolved |
Description: Three CLI helper paths are not covered by tests/:
1. _use_plaintext localhost auto-downgrade (line 762): the
   endpoint.startswith("localhost:") or endpoint.startswith("127.0.0.1:")
   branch (see also Client.Python-013) is untested; no test asserts that an
   endpoint without --plaintext and without --tls resolves to plaintext.
2. _collect_events MAX_AGGREGATE_EVENTS guard (line 811-815): passing
   --max-events greater than MAX_AGGREGATE_EVENTS raises click.BadParameter,
   but no test exercises the guard. A silent removal of the constant or the
   comparison would not be caught.
3. _api_key_from_env (line 765-768): only the implicit path through _secrets
   is exercised; there is no test that verifies an env-var name resolves to a
   value and that an unset env var produces None.
These are all small, fake-stub-driven CLI behaviours rather than end-to-end paths. The previous coverage finding (Client.Python-009) closed without adding tests for these specific paths.
Recommendation: Add three small CliRunner / unit tests: one asserting
the localhost auto-plaintext (or its replacement, if Client.Python-013 is
fixed), one asserting --max-events 10001 exits non-zero with the
MAX_AGGREGATE_EVENTS error message, and one asserting
_api_key_from_env("MXGATEWAY_API_KEY") returns the env value and None for
an unset variable.
Resolution: 2026-05-20 — Scope adjusted: Client.Python-013 has since
removed the _use_plaintext localhost auto-plaintext branch, so item (1) is
no longer a real code path — the
test_use_plaintext_requires_explicit_flag_for_localhost_endpoint and
test_cli_localhost_endpoint_defaults_to_tls_via_open_session regressions
added under Client.Python-013 already pin the new TLS-by-default contract.
The remaining two helpers are now covered in
clients/python/tests/test_cli_bench_and_helpers.py. (2)
MAX_AGGREGATE_EVENTS cap:
test_collect_events_rejects_max_events_above_aggregate_cap drives
stream-events with --max-events 10001 through CliRunner against
stubbed _connect / _session fakes and asserts the CLI exits non-zero with
the documented less than or equal to 10000 message;
test_collect_events_accepts_max_events_at_aggregate_cap_boundary confirms
--max-events 10000 is accepted at the boundary and returns an empty event
list. (3) _api_key_from_env:
test_api_key_from_env_resolves_value_when_variable_is_set (env-var
populated → returned),
test_api_key_from_env_returns_none_when_variable_is_unset (env-var unset
→ None), test_api_key_from_env_returns_none_when_name_is_none (the
name is None early-return), and
test_api_key_from_env_returns_none_when_name_is_empty_string (the
if not name truthiness guard). No source change — pure coverage finding.
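The two helper behaviours covered here can be illustrated with a small stand-alone sketch. It is hypothetical rather than a copy of commands.py: the function names are chosen for illustration, and ValueError stands in for the click.BadParameter the real guard raises so the example needs only the standard library. The 10000 cap and the "less than or equal to 10000" wording are taken from the resolution text.

```python
import os
from typing import Optional

# Cap documented by the CLI error message (10000).
MAX_AGGREGATE_EVENTS = 10_000


def api_key_from_env(name: Optional[str]) -> Optional[str]:
    """Resolve an API key from the environment variable called `name`."""
    if not name:
        # Covers both name=None and name="" (the truthiness guard).
        return None
    # os.environ.get returns None when the variable is unset.
    return os.environ.get(name)


def validate_max_events(max_events: int) -> int:
    """Reject values above the aggregate cap; accept the boundary itself."""
    if max_events > MAX_AGGREGATE_EVENTS:
        # Stand-in for click.BadParameter in the real CLI.
        raise ValueError(
            f"--max-events must be less than or equal to {MAX_AGGREGATE_EVENTS}"
        )
    return max_events
```

The four _api_key_from_env tests map onto the two branches of the first function, and the two _collect_events tests map onto the values on either side of the comparison in the second.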
Client.Python-017
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | clients/python/pyproject.toml:5-25, clients/python/src/mxgateway/ |
| Status | Resolved |
Description: The package metadata in pyproject.toml is minimal for a
published wheel:
- No authors field. PyPI / pip show will display no author.
- No license field, no license-files field, and no LICENSE file is
  referenced from the project. The repo as a whole has no top-level LICENSE
  either, but other client packages (Java has a license entry, the .NET
  package has a license expression in the csproj) tend to set this.
- No classifiers (no Programming Language :: Python :: 3.12,
  Operating System :: Microsoft :: Windows, Topic :: …, no development-status
  classifier). Without these the PyPI search facets are empty and tooling
  like pip cannot tell whether the package is alpha/beta/stable.
- No keywords, no [project.urls] (no homepage / source / issue link pointing
  back to the repo).
- The package ships no PEP 561 py.typed marker file in src/mxgateway/. Type
  hints are written throughout the module (from __future__ import annotations,
  full annotations on every public function), but downstream consumers
  running mypy on mxaccess-gateway-client will not see those hints: PEP 561
  requires the marker file to opt the package into type-stub distribution.
Recommendation: Add authors, license = "<spdx>", keywords, and
[project.urls] to pyproject.toml; add at least the standard classifiers
trio (Development Status, Programming Language :: Python :: 3.12,
Intended Audience); create an empty src/mxgateway/py.typed file and
include it in the wheel via [tool.setuptools.package-data] so consumers
running mypy against an installed wheel pick up the type information.
Resolution: 2026-05-20 — Filled out clients/python/pyproject.toml
with the missing PyPI metadata:
authors = [{ name = "MXAccess Gateway Authors" }], license = "Proprietary"
(the repo has no top-level LICENSE file and no other client publishes under
an OSS licence, so the SPDX Proprietary expression matches the de-facto
status), the standard classifier set (Development Status :: 4 - Beta,
Intended Audience :: Developers / Information Technology,
Operating System :: Microsoft :: Windows and :: POSIX,
Programming Language :: Python / Python :: 3 / Python :: 3.12,
Topic :: Software Development :: Libraries :: Python Modules,
Topic :: System :: Distributed Computing, and Typing :: Typed), a keywords list
(mxaccess, archestra, gateway, grpc, industrial, scada), and
[project.urls] with Homepage / Source / Issues pointing at the
Gitea repo. Added the PEP 561 marker file
clients/python/src/mxgateway/py.typed (empty, as the spec requires) and
declared it in [tool.setuptools.package-data] mxgateway = ["py.typed"]
so the wheel ships the marker and downstream mypy users see the
inline type hints. Pure metadata / packaging change — python -m pytest -q
still passes (91 tests).
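One way to confirm that an installed distribution actually ships the marker is a small check built on importlib.resources. This is a hedged sketch: the helper name ships_py_typed is hypothetical and is not part of the repo's test suite, and it assumes the package is importable in the current environment.

```python
from importlib import resources


def ships_py_typed(package: str) -> bool:
    """Return True if the importable package contains a PEP 561 py.typed marker."""
    # resources.files() imports the package and returns a traversable root.
    return resources.files(package).joinpath("py.typed").is_file()
```

Run in an environment where the built wheel is installed, ships_py_typed("mxgateway") would be expected to return True after this change; a False result would suggest the package-data declaration did not take effect.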