a0203503a7
Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).
Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
GatewayGrpcScopeResolver so non-admin keys can use them; document
the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
in generated tonic code by reformatting the ReadBulkCommand proto
comment and scoping a #![allow(...)] to the generated submodules.
Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
make DisposeAsync race-safe against in-flight CloseAsync (-016);
add constraint-enforcement test coverage for the bulk-plan path
(-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
can distinguish graceful shutdown from a real STA-affinity
violation (-016); have the watchdog skip StaHung while
CurrentCommandCorrelationId is non-empty so a legitimate slow
ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
11 GatewaySession bulk methods (-013); replace the real TCP probe
in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
(-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
test and assert OnWriteComplete (-012); add live tests for
Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
CreateForTesting factory (-016); cover WorkerCancel and
unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
beforeStart() (-014); return a CancellingCompletableFuture that
actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
histograms with failed-call durations (-015); add coverage for
the five MalformedReply paths, the bulk-write helpers, the
Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
command family (-009).
Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
WorkerAlarmRpcDispatcher missing-session handling; drop the
duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
subscriptionExpression / ExecutingCommand arms; preserve
factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
FakeWorkerProcess.WaitForExitAsync against a real TaskCompletion
source; switch the heartbeat-expires test to ManualTimeProvider;
add InvariantCulture to the remaining DateTimeOffset.Parse sites;
document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
IDisposable, class-level [Trait], single-source ZB default
connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
so absent env vars SKIP not pass; PascalCase rename of probe
[Fact]s; deterministic deadline test; new frame-protocol error
tests; ComputeTransitions diff-coverage; relocate dev-rig probes
to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
TreatWarningsAsErrors / analysers apply; document
DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
bulk-read handles in CLI; surface AcknowledgeAlarm transport
faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
runWriteBulkVariant; document the six new subcommands in
writeUsage; drain galaxy-watch events on limit; switch io.EOF
comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
option; regex-based credential redaction; Long.toUnsignedString
for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
_percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
_api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
stop hard-coding correlation IDs; resync RustClientDesign.md
with the current Session / Error surface and CLI subcommand set.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
467 lines
35 KiB
Markdown
# Code Review — Client.Python

| Field | Value |
|---|---|
| Module | `clients/python` |
| Reviewer | Claude Code |
| Review date | 2026-05-20 |
| Commit reviewed | `1cd51bb` |
| Status | Reviewed |
| Open findings | 0 |

## Checklist coverage

A re-review at commit `1cd51bb` over the same module. Prior findings
(Client.Python-001 through Client.Python-012) remain closed and are kept as
history. This section reflects the categories evaluated in this pass.

| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: `_use_plaintext` silently downgrades any `localhost:` / `127.0.0.1:` endpoint to plaintext (Client.Python-013). |
| 2 | mxaccessgw conventions | No new issues found: secrets redacted, MXAccess parity preserved, generated code untouched, no Blazor/COM violations apply (Python client). |
| 3 | Concurrency & thread safety | No new issues found: close-idempotency hazard fixed in Client.Python-006, shared `_canceling_iterator` cancels on `CancelledError`. |
| 4 | Error handling & resilience | No new issues found at this commit (prior 003, 007, 011 remain closed). |
| 5 | Security | Issue found: implicit plaintext-on-localhost (Client.Python-013) means a user who targets a TLS-fronted loopback endpoint with `--api-key` but without `--tls`/`--plaintext` silently transmits the bearer token in cleartext. |
| 6 | Performance & resource management | No new issues found: `iter_hierarchy` streams pages lazily (Client.Python-005 resolution). |
| 7 | Design-document adherence | No new issues found: `PythonClientDesign.md` matches the implemented surface. |
| 8 | Code organization & conventions | Issue found: duplicate `from mxgateway.values import` lines in `commands.py:22-23` (Client.Python-014). |
| 9 | Testing coverage | Issues found: `bench_read_bulk` CLI body, `MAX_AGGREGATE_EVENTS` event-cap, and `_use_plaintext` localhost-auto-plaintext path are untested (Client.Python-015, Client.Python-016). |
| 10 | Documentation & comments | Issues found: `pyproject.toml` lacks PyPI metadata (`authors`, `license`, `classifiers`, `urls`) and there is no PEP 561 `py.typed` marker (Client.Python-017); auto-plaintext behaviour is undocumented (Client.Python-013). |

## Findings

### Client.Python-001

| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `clients/python/pyproject.toml:8,25`, `clients/python/src/mxgateway_cli/commands.py:25` |
| Status | Resolved |

**Description:** The package `description` in `pyproject.toml` still says "Async Python client *scaffold*" even though the client is fully implemented. Stale "scaffold" wording misrepresents maturity to anyone reading PyPI metadata. (The `mxgw-py` console-script name is itself consistent between `pyproject.toml` and the README.)

**Recommendation:** Update the `pyproject.toml` description to drop "scaffold"; keep README CLI examples in sync with the actual `mxgw-py` entry point.

**Resolution:** 2026-05-18: Confirmed that the `pyproject.toml:8` `description` read "Async Python client scaffold for MXAccess Gateway." Changed to "Async Python client for MXAccess Gateway." The `mxgw-py` console-script name was already consistent with the README, so no README change was needed. Pure metadata fix; no test required.

### Client.Python-002

| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/python/src/mxgateway/__init__.py:27` |
| Status | Resolved |

**Description:** `MxGatewayCommandError` is imported into `__init__.py` and is a documented public exception, but it is missing from `__all__`. It is the parent of `MxAccessError` and a meaningful catch target, so omitting it from the public surface is inconsistent: `from mxgateway import *` will not expose it and tooling that respects `__all__` treats it as private.

**Recommendation:** Add `"MxGatewayCommandError"` to the `__all__` list.

**Resolution:** 2026-05-18: Re-triaged. This finding is stale against the reviewed source. `clients/python/src/mxgateway/__init__.py` already imports `MxGatewayCommandError` (line 16) **and** lists `"MxGatewayCommandError"` in `__all__` (line 38). `from mxgateway import *` exposes it correctly. Verified at runtime (`'MxGatewayCommandError' in mxgateway.__all__` is `True`). No source change required; the defect described no longer exists.

### Client.Python-003

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `clients/python/src/mxgateway/client.py:125-137,155-173` |
| Status | Resolved |

**Description:** `stream_events_raw` and `query_active_alarms` call the stub directly with a `timeout` kwarg when `stream_timeout` is set, with no `TypeError` fallback. `galaxy.py:watch_deploy_events` and `_unary` *do* have a fallback that strips `timeout` if the callable rejects it. This asymmetry means a fake/older stub that does not accept `timeout` crashes for gateway streams but not Galaxy streams. It is only masked today because `stream_timeout` defaults to `None`.

**Recommendation:** Apply the same `try/except TypeError` timeout-fallback pattern to `stream_events_raw` and `query_active_alarms`, or remove the fallback everywhere and standardise on a single behaviour.

**Resolution:** 2026-05-18: Confirmed that both stream methods in `client.py` called the stub with `timeout` unconditionally and had no `TypeError` fallback, unlike `_unary` and `galaxy.watch_deploy_events`. Added a shared `_open_stream` helper in `client.py` that opens a server-streaming call and strips the `timeout` kwarg when the stub raises `TypeError: ... unexpected keyword argument 'timeout'`, then routed both `stream_events_raw` and `query_active_alarms` through it. Regression tests in `tests/test_stream_timeout_fallback.py` (`test_stream_events_raw_falls_back_when_stub_rejects_timeout`, `test_query_active_alarms_falls_back_when_stub_rejects_timeout`, `test_stream_events_raw_still_passes_timeout_to_capable_stub`) failed before the fix and pass after. No public behaviour change for real gRPC stubs, so no README update needed.

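
The timeout fallback described in this resolution can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual `_open_stream`: the helper name `open_stream_with_fallback` and the two fake stubs are hypothetical, and real gRPC stubs would take the first path.

```python
from typing import Any, Callable, Optional


def open_stream_with_fallback(
    method: Callable[..., Any],
    request: Any,
    timeout: Optional[float] = None,
) -> Any:
    """Open a streaming call, retrying without `timeout` if the stub rejects it."""
    if timeout is None:
        return method(request)
    try:
        return method(request, timeout=timeout)
    except TypeError as exc:
        # Only swallow the "unexpected keyword argument 'timeout'" case;
        # any other TypeError is a genuine bug and must propagate.
        if "timeout" not in str(exc):
            raise
        return method(request)


def legacy_stub(request):
    """Hypothetical older/fake stub that does not accept `timeout`."""
    return ("legacy", request)


def modern_stub(request, timeout=None):
    """Hypothetical stub that accepts `timeout`."""
    return ("modern", request, timeout)
```

Routing every streaming call through one helper like this keeps the gateway and Galaxy paths symmetric, which is the asymmetry the finding flagged.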
### Client.Python-004

| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/python/src/mxgateway_cli/commands.py:386,402-404` |
| Status | Resolved |

**Description:** In `_smoke`, the local variable `closed` is set to `False` and never reassigned; the `finally` block's `if not closed:` is therefore always true. This is dead/misleading code suggesting a removed early-close path.

**Recommendation:** Remove the `closed` variable and the `if not closed:` guard; call `await session.close()` directly in the `finally` block (or use `async with session:`).

**Resolution:** 2026-05-18: Confirmed that `closed = False` was set and never reassigned, making `if not closed:` dead code. Replaced the `try/finally` with `async with session:` so the session is closed via the documented async context manager (`Session` already implements `__aexit__` → `close()`). Behaviour is unchanged (the session is still closed on every exit path); no test needed for the dead-code removal, which is exercised by the existing CLI smoke test.

### Client.Python-005

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Performance & resource management |
| Location | `clients/python/src/mxgateway/galaxy.py:117-140` |
| Status | Resolved |

**Description:** `discover_hierarchy` pages through the entire Galaxy object hierarchy and accumulates every `GalaxyObject` (each carrying its full attribute list) into a single in-memory `list` before returning. For a large Galaxy this is a very large allocation with no streaming alternative and no caller-side bound.

**Recommendation:** Offer an async-generator variant (e.g. `iter_hierarchy()`) that yields objects/pages as they arrive, keeping `discover_hierarchy()` as a convenience wrapper. At minimum document the memory characteristic.

**Resolution:** 2026-05-18: Confirmed that `discover_hierarchy` buffered the entire hierarchy with no streaming alternative. Added `GalaxyRepositoryClient.iter_hierarchy`, an async generator that fetches one `DiscoverHierarchyRequest` page at a time and yields each `GalaxyObject` as it arrives, so peak memory is bounded by a single page (`_DISCOVER_HIERARCHY_PAGE_SIZE`). Pages are fetched lazily: the next page is only requested after the current page is fully consumed. `discover_hierarchy` is now a thin convenience wrapper (`[obj async for obj in self.iter_hierarchy()]`) that preserves its `list[GalaxyObject]` contract, including the repeated-page-token guard. Regression tests in `tests/test_galaxy_iter_hierarchy.py` (`test_iter_hierarchy_yields_objects_across_pages`, `test_iter_hierarchy_is_lazy_and_does_not_prefetch_next_page`, `test_iter_hierarchy_rejects_repeated_page_token`, `test_discover_hierarchy_still_returns_full_list`) failed before the fix and pass after. `clients/python/README.md` updated with the `iter_hierarchy` usage and memory guidance since this adds a new public method.

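
A lazy paging generator of the kind described here might look like the sketch below. The `Page` type, `iter_pages`, `collect_all`, and the `fetch_page` callable are all hypothetical stand-ins; the real client works against `DiscoverHierarchyRequest` replies, whose exact fields are not shown in this review.

```python
from dataclasses import dataclass
from typing import AsyncIterator, Awaitable, Callable, List


@dataclass
class Page:
    """Hypothetical stand-in for one hierarchy reply page."""
    objects: List[str]
    next_page_token: str = ""


async def iter_pages(
    fetch_page: Callable[[str], Awaitable[Page]],
) -> AsyncIterator[str]:
    """Yield objects page by page; request the next page only when needed."""
    token = ""
    seen_tokens = set()
    while True:
        page = await fetch_page(token)
        for obj in page.objects:
            yield obj
        token = page.next_page_token
        if not token:
            return
        if token in seen_tokens:
            # A repeated token would loop forever, so fail fast.
            raise RuntimeError(f"repeated page token: {token!r}")
        seen_tokens.add(token)


async def collect_all(fetch_page: Callable[[str], Awaitable[Page]]) -> List[str]:
    """Convenience wrapper that buffers everything, like `discover_hierarchy`."""
    return [obj async for obj in iter_pages(fetch_page)]
```

Because the generator only awaits `fetch_page` after the previous page's objects have been consumed, a caller that stops iterating early never triggers the remaining requests.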
### Client.Python-006

| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `clients/python/src/mxgateway/client.py:74-82`, `clients/python/src/mxgateway/galaxy.py:85-93`, `clients/python/src/mxgateway/session.py:38-55` |
| Status | Resolved |

**Description:** `close()` on the clients and `Session.close()` use a plain `self._closed` check-then-set with an `await` between, with no lock. If two coroutines call `close()` concurrently both can pass the guard before either sets it, causing a double `channel.close()` / double `CloseSession` RPC. Single-task usage is the documented contract, so impact is low, but the idempotency guarantee asserted in docstrings only holds for sequential calls.

**Recommendation:** Set `self._closed = True` before the `await`, or guard with an `asyncio.Lock`, so the idempotency claim holds under concurrent close.

**Resolution:** 2026-05-18: Confirmed the check-then-set window. Fixed `GatewayClient.close`, `GalaxyRepositoryClient.close`, and `Session.close` to set `self._closed = True` *before* the `await` (channel close / `CloseSession` RPC). A second coroutine entering `close()` while the first is still awaiting now hits the early-return guard and does not issue a second `channel.close()` / `CloseSession`. Docstrings updated to state the idempotency holds under concurrent calls. TDD: regression tests in `tests/test_low_severity_findings.py` (`test_gateway_client_concurrent_close_closes_channel_once`, `test_galaxy_client_concurrent_close_closes_channel_once`, `test_session_concurrent_close_sends_one_close_session_rpc`); each uses a fake channel/client that stalls inside `close`/`close_session_raw` so two concurrent `close()` calls interleave at the exact race window. They failed before the fix and pass after.

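
The set-before-await pattern from this resolution can be illustrated with a small self-contained sketch. `FakeChannel`, `Client`, and `close_twice_concurrently` are hypothetical names chosen for the example; they only mimic the shape of the real clients.

```python
import asyncio


class FakeChannel:
    """Hypothetical channel that counts close() calls and yields inside them."""

    def __init__(self) -> None:
        self.close_calls = 0

    async def close(self) -> None:
        self.close_calls += 1
        await asyncio.sleep(0)  # yield so a concurrent close() can interleave


class Client:
    def __init__(self, channel: FakeChannel) -> None:
        self._channel = channel
        self._closed = False

    async def close(self) -> None:
        if self._closed:
            return
        # Set the flag before the first await: no other coroutine can run
        # between the check and the assignment, so only one caller proceeds.
        self._closed = True
        await self._channel.close()


async def close_twice_concurrently() -> int:
    """Run two close() calls concurrently and report how often the channel closed."""
    channel = FakeChannel()
    client = Client(channel)
    await asyncio.gather(client.close(), client.close())
    return channel.close_calls
```

No lock is needed here because asyncio coroutines on one event loop only switch at `await` points, so the check and the assignment execute atomically with respect to each other.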
### Client.Python-007

| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `clients/python/src/mxgateway/client.py:204-213` |
| Status | Resolved |

**Description:** `_canceling_iterator` (gateway event stream) does not catch `asyncio.CancelledError` to invoke `call.cancel()` explicitly; it relies on the `finally` block. `galaxy.py:_canceling_iterator` *does* explicitly catch `CancelledError`, cancel, and re-raise. The two are functionally equivalent today, but the inconsistency between near-identical helpers invites future divergence.

**Recommendation:** Make the two `_canceling_iterator` helpers identical, ideally by factoring a single shared helper.

**Resolution:** 2026-05-18: Confirmed the divergence. Factored a single shared helper: `client._canceling_iterator(call, operation)` now takes the `map_rpc_error` operation string as a parameter, explicitly catches `asyncio.CancelledError` (cancels the call, re-raises) and `grpc.RpcError`, and repeats the cancel in `finally`. This replaces both the gateway `_canceling_iterator` and the gateway `_canceling_active_alarms_iterator`; `galaxy.py` now imports and delegates to the same helper instead of defining its own, so the gateway and Galaxy stream helpers are byte-for-byte identical. TDD: `tests/test_low_severity_findings.py::test_gateway_stream_iterator_cancels_call_on_task_cancellation` drives a cancellable fake stream and asserts the gateway iterator cancels the underlying call on task cancellation. All existing stream-cancellation tests still pass.

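
A simplified sketch of such a cancelling iterator is shown below. It leaves out the `grpc.RpcError` mapping and the `operation` parameter mentioned in the resolution so that it runs without gRPC installed; `FakeCall`, `canceling_iterator`, and `demo` are hypothetical names for illustration.

```python
import asyncio
from typing import Any, AsyncIterator


class FakeCall:
    """Hypothetical streaming call that never produces a message."""

    def __init__(self) -> None:
        self.cancelled = False

    def __aiter__(self) -> "FakeCall":
        return self

    async def __anext__(self) -> Any:
        await asyncio.sleep(3600)  # block until the consuming task is cancelled

    def cancel(self) -> None:
        self.cancelled = True


async def canceling_iterator(call: FakeCall) -> AsyncIterator[Any]:
    """Yield messages from `call`, cancelling it if the consumer is cancelled."""
    try:
        async for message in call:
            yield message
    except asyncio.CancelledError:
        call.cancel()  # explicit cancel on task cancellation
        raise  # never swallow CancelledError
    finally:
        call.cancel()  # also covers early exit and errors


async def demo() -> bool:
    """Cancel a task that is consuming the stream; report whether the call was cancelled."""
    call = FakeCall()

    async def consume() -> None:
        async for _ in canceling_iterator(call):
            pass

    task = asyncio.create_task(consume())
    await asyncio.sleep(0)  # let the consumer start waiting on the stream
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return call.cancelled
```

The explicit `except` branch and the `finally` block overlap on purpose, mirroring the resolution's "cancels the call, re-raises ... and repeats the cancel in `finally`".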
### Client.Python-008

| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/python/src/mxgateway/values.py:62-67,83-88` |
| Status | Resolved |

**Description:** `to_mx_value` maps any Python `float` to `VT_R8`/`MX_DATA_TYPE_DOUBLE` with no handling for `nan`/`inf`; these values are serialised and forwarded to MXAccess, which may reject or mis-handle them. `bytes` is mapped to `VT_RECORD`/`MX_DATA_TYPE_UNKNOWN`, a questionable default. The `data_type` keyword exists but `Session.write` never forwards it.

**Recommendation:** Document the float/bytes mapping assumptions, optionally validate finiteness, and consider plumbing the `data_type` keyword through `Session.write`/`write2`.

**Resolution:** 2026-05-18: Confirmed the non-finite-float hazard. Added an `_ensure_finite` guard in `values.py`: `to_mx_value` now raises `ValueError` for `nan`/`inf`/`-inf`, both for a scalar `float` and for a non-finite element inside a float sequence. MXAccess has no defined wire representation for non-finite doubles, so rejecting client-side is the correct fail-fast. The `float`/`bytes` mapping assumptions (finite-only doubles; `bytes` as an opaque `VT_RECORD` pass-through) are now documented in the `values.py` module docstring and `clients/python/README.md`. Plumbing `data_type` through `Session.write`/`write2` was deliberately *not* done: it is a larger public-API surface change the finding only marks as "consider", and the documented MXAccess-parity convention is type-by-Python-value; the `data_type` keyword stays available on `to_mx_value` for callers that build the `MxValue` directly. TDD: `tests/test_low_severity_findings.py` adds `test_to_mx_value_rejects_nan`, `test_to_mx_value_rejects_positive_infinity`, `test_to_mx_value_rejects_negative_infinity`, `test_to_mx_value_rejects_non_finite_float_in_sequence`, and `test_to_mx_value_accepts_finite_float`. README updated since `to_mx_value` (used by `Session.write`/`write2`) now rejects an input it previously accepted.

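
A finiteness guard along these lines could be written as in the sketch below. The function name `ensure_finite` and the error message are assumptions; the actual `_ensure_finite` in `values.py` may differ in signature and wording.

```python
import math
from typing import Sequence, Union

FloatInput = Union[float, Sequence[float]]


def ensure_finite(value: FloatInput) -> FloatInput:
    """Raise ValueError for nan/inf/-inf, as a scalar or inside a sequence."""
    items = [value] if isinstance(value, float) else list(value)
    for item in items:
        # Non-float elements (e.g. ints) are always finite and pass through.
        if isinstance(item, float) and not math.isfinite(item):
            raise ValueError(f"non-finite float is not supported: {item!r}")
    return value
```

Rejecting the value on the client keeps the failure close to the caller instead of depending on how the remote side treats non-finite doubles.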
### Client.Python-009

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `clients/python/tests/` |
| Status | Resolved |

**Description:** Several non-trivial public paths are untested: `Session.write2`/`add_item2` request construction; the bulk-size limit `_ensure_bulk_size`/`MAX_BULK_ITEMS` guard; the `None`-argument `TypeError` guards in bulk methods; the TLS `ca_file` read path in `create_channel`; most CLI command bodies; and `map_rpc_error`'s default (non-auth) branch.

**Recommendation:** Add tests for `write2`/`add_item2` request shape, the bulk-size `ValueError`, the `ca_file` TLS branch, the generic `map_rpc_error` fallthrough, and at least one happy-path CLI command using a fake stub.

**Resolution:** 2026-05-18: Confirmed the coverage gap against the existing `tests/` files. Added `tests/test_coverage_gaps.py` covering every path the finding lists: `test_add_item2_sends_item_context_and_returns_handle` and `test_write2_sends_value_and_timestamp_value` (request shape + `MxValue` oneof), `test_subscribe_bulk_rejects_oversized_request` and `test_add_item_bulk_at_limit_is_allowed` (the `MAX_BULK_ITEMS` `_ensure_bulk_size` boundary), `test_advise_item_bulk_rejects_none_argument` (the `None`-argument `TypeError` guard), `test_create_channel_reads_ca_file` and `test_create_channel_missing_ca_file_raises` (the TLS `ca_file` read path), `test_map_rpc_error_generic_branch_returns_transport_error` and `test_map_rpc_error_handles_error_without_code` (the non-auth `map_rpc_error` fallthrough and the no-`code` path), and `test_cli_register_happy_path_emits_server_handle` (a happy-path CLI command body driven end to end through `CliRunner` with a fake stub via a monkeypatched `_connect`). All 10 new tests pass. No source change required; this is a pure coverage finding.

### Client.Python-010

| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/python/src/mxgateway/session.py:404`, `clients/python/src/mxgateway_cli/commands.py:422-425` |
| Status | Resolved |

**Description:** `session.py` ends with a module-level late import `from .client import GatewayClient # noqa: E402` purely to satisfy a string type hint, and `commands.py:_session` does a function-local import. Both work around a circular dependency that `from __future__ import annotations` (already in effect) makes unnecessary. `_session` also lacks a return type annotation.

**Recommendation:** Drop the runtime late import in `session.py` and use a `TYPE_CHECKING`-guarded import for the hint; add the `-> Session` return annotation to `commands.py:_session`.

**Resolution:** 2026-05-18: Confirmed that, with `from __future__ import annotations` in effect, all annotations are strings, so the runtime late import was unnecessary. Removed the trailing `from .client import GatewayClient # noqa: E402` in `session.py` and replaced it with a top-of-file `if TYPE_CHECKING:` import that satisfies the `GatewayClient` hint without a runtime dependency (no import cycle: `client.py` does not import `session` at module scope). In `commands.py`, hoisted the function-local `from mxgateway.session import Session` to a module-level import and added the `-> Session` return annotation to `_session`. Verified `import mxgateway` and `import mxgateway_cli.commands` succeed with no circular-import error. Pure refactor, covered by the existing import and CLI tests; no new test needed.

### Client.Python-011

| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `clients/python/src/mxgateway/errors.py:122-148` |
| Status | Resolved |

**Description:** `ensure_mxaccess_success` raises `MxAccessError` if any `mx_status.success == 0`. This treats `success == 0` as the failure sentinel, but `0` is also the proto3 scalar default for an unset `MxStatusProxy`. If the gateway ever returns a reply with an unpopulated status entry (e.g. a partially-filled bulk result), the client raises `MxAccessError` even though no real failure occurred.

**Recommendation:** Confirm against the proto/gateway contract whether `success` is guaranteed populated for every `statuses` entry; if not, key the failure decision on an explicit failure field rather than the `success == 0` default.

**Resolution:** 2026-05-18: Confirmed against the gateway contract that `success` is **not** guaranteed populated for every `statuses` entry. `src/MxGateway.Worker/Conversion/MxStatusProxyConverter.cs::ConvertMany` emits a placeholder `MxStatusProxy` for a null `MXSTATUS_PROXY` COM array entry, setting `Category`/`DetectedBy` to `Unknown` but **leaving `Success` at its proto3 default of 0**. A fully-default proto entry likewise has `success == 0`. Under the old client logic either placeholder would falsely raise `MxAccessError`. Fixed `ensure_mxaccess_success` to key the per-status failure decision on a new `_is_mxaccess_status_failure` helper that requires `success == 0` **and** a populated, non-OK `category`; a status with `category` of `MX_STATUS_CATEGORY_UNSPECIFIED` (default proto) or `MX_STATUS_CATEGORY_UNKNOWN` (the null-entry placeholder) is treated as unpopulated and ignored. `MX_STATUS_CATEGORY_OK` is also excluded so a genuine success entry never raises. Real failures (categories `WARNING` and the error categories, raw value ≥ 2) still raise as before; the existing `write.mxaccess-failure` fixture (`SECURITY_ERROR`/`OPERATIONAL_ERROR` statuses) and the `MXACCESS_FAILURE` protocol-status path are unaffected. TDD: `tests/test_low_severity_findings.py` adds `test_ensure_mxaccess_success_ignores_unpopulated_status_entry` (default + null-placeholder entries, no raise), `test_ensure_mxaccess_success_raises_on_populated_failure_status` (populated `COMMUNICATION_ERROR`, raises), and `test_ensure_mxaccess_success_passes_when_status_reports_success`. No public-behaviour change for genuine replies, so no README update.

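
The decision rule in this resolution can be modelled with a short sketch. The `Category` enum, the `Status` dataclass, and `is_status_failure` are hypothetical stand-ins for the generated proto types and for `_is_mxaccess_status_failure`; the enum deliberately uses string values because the real numeric values are not given in this review.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """Hypothetical mirror of the status categories named in the resolution."""

    UNSPECIFIED = "unspecified"  # fully-default proto entry
    UNKNOWN = "unknown"  # null-entry placeholder emitted by the worker
    OK = "ok"
    WARNING = "warning"
    COMMUNICATION_ERROR = "communication_error"


@dataclass
class Status:
    success: int = 0  # 0 is also the proto3 default for an unset field
    category: Category = Category.UNSPECIFIED


# Categories that never count as a failure, even when success == 0.
_NOT_A_FAILURE = {Category.UNSPECIFIED, Category.UNKNOWN, Category.OK}


def is_status_failure(status: Status) -> bool:
    """A status fails only if success == 0 AND the category is populated and non-OK."""
    return status.success == 0 and status.category not in _NOT_A_FAILURE
```

Keying on the category as well as `success` is what lets unpopulated placeholder entries pass while populated warning or error entries still raise.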
### Client.Python-012

| Field | Value |
|---|---|
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `clients/python/src/mxgateway/client.py:84-108`, `clients/python/src/mxgateway/session.py:57-77` |
| Status | Won't Fix |

**Description:** `Session.invoke_raw` does not run `ensure_mxaccess_success` while `Session.invoke` does, so a caller using `invoke_raw` for parity tests gets a reply where an MXAccess HRESULT failure is silently embedded with no exception. This is by design but under-documented: the README's "preserve raw replies" sentence does not state that `*_raw` methods skip MXAccess-failure detection entirely.

**Recommendation:** Document explicitly (README + docstring) that `*_raw` methods surface MXAccess HRESULT/status failures only inside the reply and do not raise `MxAccessError`, so parity-test callers know to inspect `protocol_status`/`hresult`/`statuses` themselves.

**Resolution:** 2026-05-18: Won't Fix (no behaviour change). Confirmed this is intentional, correct parity behaviour: the `*_raw` methods exist precisely so parity-test callers can inspect an unmodified gateway reply, including embedded MXAccess HRESULT/status failures, without an exception masking them. Changing `invoke_raw` to raise `MxAccessError` would defeat its purpose and duplicate `Session.invoke`. The finding's only actionable point is the documentation gap, which has been addressed: `clients/python/README.md` now states explicitly that `*_raw` methods enforce gateway protocol success only and do **not** run MXAccess-failure detection, and the docstrings of `GatewayClient.invoke_raw` and `Session.invoke_raw` say the same and point callers to inspect `protocol_status`/`hresult`/`statuses` (and to `Session.invoke` for the checked variant). No code/test change; the runtime contract is unchanged and correct.

### Client.Python-013

| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | `clients/python/src/mxgateway_cli/commands.py:757-762` |
| Status | Resolved |

**Description:** `_use_plaintext` silently returns `True` whenever the endpoint
string starts with `localhost:` or `127.0.0.1:`, even if neither `--plaintext`
nor `--tls` is supplied on the command line. Any CLI subcommand (e.g.
`mxgw-py open-session --endpoint localhost:5001 --api-key mxgw_<secret>`) then
attaches the API key to a plaintext gRPC channel without warning. This is a
silent security downgrade: a user who deliberately ran the gateway behind TLS
on loopback (e.g. for testing a production-shaped TLS config locally) and who
passes `--api-key` expecting the secret to be transport-protected gets a
plaintext bearer token instead. The auto-downgrade is also undocumented:
`README.md` and the CLI `--help` text both describe `--plaintext` and `--tls`
as the controls, with no mention that endpoint-prefix matching can override
either. The other client CLIs do not auto-downgrade: the .NET CLI uses
`https://`-prefix detection on a URI scheme (an explicit signal), Go and Java
require an explicit `--plaintext`/`--tls` choice, and Rust uses plaintext only
when `plaintext = true` is set on the options struct.

**Recommendation:** Drop the localhost-prefix auto-plaintext branch and
require the user to pass `--plaintext` or `--tls` (or default to TLS to match
the rest of the matrix). If the implicit-localhost behaviour is kept for
ergonomics, document it prominently in both `README.md` and `--help`, emit a
stderr warning when `--api-key` is combined with the auto-downgrade path, and
add a CLI test asserting the auto-downgrade is in fact active so it is not
silently lost in a future refactor.

**Resolution:** 2026-05-20: Removed the silent `localhost:` / `127.0.0.1:`
auto-plaintext branch from `_use_plaintext`. The new contract matches the Go
and Java CLIs: **TLS is the default**, `--plaintext` is the only way to opt
in to an unencrypted channel, and `--tls` is accepted as a redundant, explicit
affirmation of the default (combining it with `--plaintext` now raises
`click.UsageError`). The `--plaintext` / `--tls` `--help` text and
`clients/python/README.md` both call out the new behaviour. Added six
regression tests in `clients/python/tests/test_cli.py` covering: (a) a
`localhost:` endpoint with no flags resolves to TLS, (b) a `127.0.0.1:`
endpoint with no flags resolves to TLS, (c) `--plaintext` opts in to plaintext,
(d) `--tls` is accepted and idempotent with the default, (e) `--plaintext`
combined with `--tls` is rejected, and (f) an end-to-end CliRunner test
asserting `ClientOptions.plaintext == False` flows through to
`GatewayClient.connect` when no flag is supplied against a `localhost:`
endpoint. **Behaviour change for callers:** scripts that previously relied on
`mxgw-py … --endpoint localhost:5000 …` selecting plaintext silently must now
add an explicit `--plaintext` flag (or set up TLS on the gateway). Calling
`mxgw-py` with an `--api-key` against a plaintext-only gateway without
`--plaintext` will now fail to connect rather than silently leaking the bearer
token.

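
The resolved flag contract can be summarised in a few lines. This sketch is not the CLI's `_use_plaintext`: the function name `resolve_plaintext` and its signature are assumptions, and it raises `ValueError` where the real CLI raises `click.UsageError`, so that it runs without `click` installed.

```python
def resolve_plaintext(endpoint: str, plaintext: bool = False, tls: bool = False) -> bool:
    """Return True only when the caller explicitly opted in to plaintext.

    `endpoint` is accepted but deliberately ignored: a `localhost:` or
    `127.0.0.1:` prefix no longer influences the transport choice.
    """
    if plaintext and tls:
        # Stand-in for click.UsageError in the real CLI.
        raise ValueError("--plaintext and --tls are mutually exclusive")
    return plaintext
```

With this shape, TLS is the result for every endpoint unless `plaintext=True` is passed, which corresponds to cases (a) through (e) of the regression tests listed above.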
### Client.Python-014

| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/python/src/mxgateway_cli/commands.py:22-23` |
| Status | Resolved |

**Description:** `commands.py` has two consecutive `from mxgateway.values
import` lines:

```python
from mxgateway.values import to_mx_value
from mxgateway.values import MxValueInput
```

These import from the same module and should be combined into a single
`from mxgateway.values import MxValueInput, to_mx_value`. The split form is
inconsistent with the rest of the file (every other module is imported in a
single statement) and would be flagged by `ruff`/`isort` if any linter were
configured. Pure style, no behavioural impact.

**Recommendation:** Collapse the two imports into one statement, ordered to
match the conventional alphabetical-within-module pattern:
`from mxgateway.values import MxValueInput, to_mx_value`.

**Resolution:** 2026-05-20: Collapsed the two consecutive
`from mxgateway.values import to_mx_value` / `from mxgateway.values import MxValueInput`
lines in `clients/python/src/mxgateway_cli/commands.py` into a single
`from mxgateway.values import MxValueInput, to_mx_value` statement, matching
the alphabetical-within-module pattern used elsewhere in the file. Pure style
fix; no behavioural impact, covered by the existing CLI tests.

### Client.Python-015

| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/python/src/mxgateway_cli/commands.py:273-294,564-647`, `clients/python/tests/` |
| Status | Resolved |

**Description:** `_bench_read_bulk` is a ~80-line CLI body that opens its own
session, registers, subscribes in bulk, runs a warm-up loop and a measurement
loop, collects per-call latencies, computes a percentile summary, and emits the
shared cross-language JSON schema. It is the largest untested CLI command in
the module — `tests/` has no `bench_read_bulk` test, fake-stub-driven or
otherwise. A drift in the schema field names (`callsPerSecond`,
`cachedReadResults`, `latencyMs.p50`, …) would break the cross-language
`scripts/bench-read-bulk.ps1` aggregation silently. `_percentile_summary` and
`_percentile` are also untested — the boundary cases (`n == 0`, `n == 1`,
quantile interpolation) would benefit from a small unit test since the
identical algorithm is duplicated in the .NET / Go / Rust / Java drivers and
a divergence would corrupt cross-language comparisons.

**Recommendation:** Add a fake-stub-driven `bench_read_bulk` test that drives
a short `--duration-seconds 0 --warmup-seconds 0` run through `CliRunner` and
asserts the JSON schema (`language == "python"`, the full key set,
`latencyMs.p50/p95/p99/max/mean` present). Add unit tests for `_percentile`
covering `n == 0`, `n == 1`, and a known-good interpolated value at p95 so
the implementation cannot silently drift from the other clients.

**Resolution:** 2026-05-20 — Added `clients/python/tests/test_cli_bench_and_helpers.py`
with three layers of coverage. (1) `_percentile` unit tests pin the
cross-language algorithm (`rank = q * (n - 1)`, linear interpolation between
adjacent ranks): empty sample returns `0.0`, single element returns that
element, exact-rank queries return the sample value (p50 of `[10,20,30,40,50]`
is `30.0`), and the interpolated p95/p99 values (`48.0` / `49.6` for that same
five-element sample) are locked down so any drift from the .NET / Go / Rust /
Java drivers fails fast. (2) `_percentile_summary` tests assert the full
`{p50, p95, p99, max, mean}` dict shape, the zero-sample placeholder, and the
3-decimal rounding contract. (3) A `bench-read-bulk` smoke test
(`test_bench_read_bulk_emits_cross_language_schema`) drives the CLI through
`CliRunner` with `--duration-seconds 0 --warmup-seconds 0` against a fake stub
that handles `OpenSession`, `Register`, `SubscribeBulk`, `ReadBulk`, and
`UnsubscribeBulk`, then asserts the emitted JSON has exactly the 16
cross-language schema keys (`language`, `command`, `endpoint`, `clientName`,
`bulkSize`, `durationSeconds`, `warmupSeconds`, `durationMs`, `tags`,
`totalCalls`, `successfulCalls`, `failedCalls`, `totalReadResults`,
`cachedReadResults`, `callsPerSecond`, `latencyMs`) and that `latencyMs` is a
`{p50, p95, p99, max, mean}` sub-object — guarding against silent breakage of
`scripts/bench-read-bulk.ps1`'s cross-language aggregation. No source change —
this is a pure coverage finding.

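The percentile contract pinned by the Client.Python-015 tests (`rank = q * (n - 1)` with linear interpolation between adjacent ranks) can be sketched as follows. This is an illustration rather than the code in `commands.py`: the function names, the pre-sorted input, and the all-zero placeholder for an empty sample are assumptions made for the example.

```python
from statistics import fmean


def percentile(sorted_samples: list[float], q: float) -> float:
    """Interpolated percentile of an ascending sample, rank = q * (n - 1)."""
    n = len(sorted_samples)
    if n == 0:
        return 0.0  # empty sample returns 0.0, as the tests above pin
    if n == 1:
        return float(sorted_samples[0])
    rank = q * (n - 1)
    lower = int(rank)              # floor, because rank is never negative
    upper = min(lower + 1, n - 1)  # clamp so q == 1.0 stays in range
    fraction = rank - lower
    low, high = sorted_samples[lower], sorted_samples[upper]
    return low + (high - low) * fraction


def percentile_summary(samples: list[float]) -> dict[str, float]:
    """Build the {p50, p95, p99, max, mean} shape, rounded to 3 decimals."""
    if not samples:
        # Assumed placeholder: the source only says a zero-sample placeholder exists.
        return {"p50": 0.0, "p95": 0.0, "p99": 0.0, "max": 0.0, "mean": 0.0}
    ordered = sorted(samples)
    return {
        "p50": round(percentile(ordered, 0.50), 3),
        "p95": round(percentile(ordered, 0.95), 3),
        "p99": round(percentile(ordered, 0.99), 3),
        "max": round(float(ordered[-1]), 3),
        "mean": round(fmean(ordered), 3),
    }
```

For the five-element sample `[10, 20, 30, 40, 50]`, p95 has rank `0.95 * 4 = 3.8`, so it sits 80% of the way from `40` to `50`, which gives the `48.0` quoted in the resolution; p99 has rank `3.96` and gives `49.6`.
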
### Client.Python-016

| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/python/src/mxgateway_cli/commands.py:25,757-775,805-830` |
| Status | Resolved |

**Description:** Three CLI helper paths are not covered by `tests/`:

1. `_use_plaintext` localhost auto-downgrade (line 762) — the
   `endpoint.startswith("localhost:") or endpoint.startswith("127.0.0.1:")`
   branch (see also Client.Python-013) is untested; no test asserts that an
   endpoint without `--plaintext` and without `--tls` resolves to plaintext.
2. `_collect_events` `MAX_AGGREGATE_EVENTS` guard (lines 811-815) — passing
   `--max-events` greater than `MAX_AGGREGATE_EVENTS` raises
   `click.BadParameter`, but no test exercises the guard. A silent removal of
   the constant or the comparison would not be caught.
3. `_api_key_from_env` (lines 765-768) — only the implicit path through
   `_secrets` is exercised; there is no test that verifies an env-var name
   resolves to a value and that an unset env var produces `None`.

These are all small, fake-stub-driven CLI behaviours rather than end-to-end
paths. The previous coverage finding (Client.Python-009) closed without
adding tests for these specific paths.

**Recommendation:** Add three small `CliRunner` / unit tests: one asserting
the localhost auto-plaintext (or its replacement, if Client.Python-013 is
fixed), one asserting `--max-events 10001` exits non-zero with the
`MAX_AGGREGATE_EVENTS` error message, and one asserting
`_api_key_from_env("MXGATEWAY_API_KEY")` returns the env value and `None` for
an unset variable.

**Resolution:** 2026-05-20 — Scope adjusted: Client.Python-013 has since
removed the `_use_plaintext` localhost auto-plaintext branch, so item (1) is
no longer a real code path — the
`test_use_plaintext_requires_explicit_flag_for_localhost_endpoint` and
`test_cli_localhost_endpoint_defaults_to_tls_via_open_session` regressions
added under Client.Python-013 already pin the new TLS-by-default contract.
The remaining two helpers are now covered in
`clients/python/tests/test_cli_bench_and_helpers.py`. (2)
`MAX_AGGREGATE_EVENTS` cap:
`test_collect_events_rejects_max_events_above_aggregate_cap` drives
`stream-events` with `--max-events 10001` through `CliRunner` against
stubbed `_connect` / `_session` fakes and asserts the CLI exits non-zero with
the documented `less than or equal to 10000` message;
`test_collect_events_accepts_max_events_at_aggregate_cap_boundary` confirms
`--max-events 10000` is accepted at the boundary and returns an empty event
list. (3) `_api_key_from_env`:
`test_api_key_from_env_resolves_value_when_variable_is_set` (env-var
populated → returned),
`test_api_key_from_env_returns_none_when_variable_is_unset` (env-var unset
→ `None`), `test_api_key_from_env_returns_none_when_name_is_none` (the
`name is None` early-return), and
`test_api_key_from_env_returns_none_when_name_is_empty_string` (the
`if not name` truthiness guard). No source change — pure coverage finding.

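The two helper behaviours covered under Client.Python-016 are small enough to sketch in a few lines. The version below is a hedged approximation, not the repository code: the function names and the message wording are assumptions, and `ValueError` stands in for the `click.BadParameter` the real CLI raises. The cap value of `10000` is inferred from the boundary tests quoted above.

```python
import os
from typing import Optional

# Inferred from the tests above: 10000 is accepted, 10001 is rejected.
MAX_AGGREGATE_EVENTS = 10_000


def api_key_from_env(name: Optional[str]) -> Optional[str]:
    """Resolve an API key from the environment variable called `name`."""
    if not name:
        # Covers both name is None and the empty string.
        return None
    # os.environ.get returns None when the variable is unset.
    return os.environ.get(name)


def check_max_events(max_events: int) -> int:
    """Reject a --max-events value above the aggregate cap."""
    if max_events > MAX_AGGREGATE_EVENTS:
        # Stand-in for click.BadParameter in the real CLI.
        raise ValueError(
            f"--max-events must be less than or equal to {MAX_AGGREGATE_EVENTS}"
        )
    return max_events
```

The truthiness guard in `api_key_from_env` is what makes the `None` and empty-string cases share one early return, which is why the resolution lists them as two separate tests against the same branch.
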
### Client.Python-017

| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `clients/python/pyproject.toml:5-25`, `clients/python/src/mxgateway/` |
| Status | Resolved |

**Description:** The package metadata in `pyproject.toml` is minimal for a
published wheel:

* No `authors` field. PyPI / `pip show` will display no author.
* No `license` field, no `license-files` field, and no `LICENSE` file is
  referenced from the project. The repo as a whole has no top-level
  `LICENSE` either, but other client packages (Java has a license entry, the
  .NET package has a license expression in the `csproj`) tend to set this.
* No `classifiers` (no `Programming Language :: Python :: 3.12`,
  `Operating System :: Microsoft :: Windows`, `Topic :: …`, no
  development-status classifier). Without these the PyPI search facets are
  empty and tooling like `pip` cannot tell whether the package is
  alpha/beta/stable.
* No `keywords`, no `[project.urls]` (no homepage / source / issue link
  pointing back to the repo).
* The package ships no PEP 561 `py.typed` marker file in
  `src/mxgateway/`. Type hints are written throughout the module
  (`from __future__ import annotations`, full annotations on every public
  function), but downstream consumers running `mypy` on `mxaccess-gateway-client`
  will not see those hints — PEP 561 requires the marker file before type
  checkers will use an installed package's inline annotations.

**Recommendation:** Add `authors`, `license = "<spdx>"`, `keywords`, and
`[project.urls]` to `pyproject.toml`; add at least the standard `classifiers`
trio (`Development Status`, `Programming Language :: Python :: 3.12`,
`Intended Audience`); create an empty `src/mxgateway/py.typed` file and
include it in the wheel via `[tool.setuptools.package-data]` so consumers
running `mypy` against an installed wheel pick up the type information.

**Resolution:** 2026-05-20 — Filled out `clients/python/pyproject.toml`
with the missing PyPI metadata: `authors = [{ name = "MXAccess Gateway
Authors" }]`, `license = "Proprietary"` (the repo has no top-level
`LICENSE` file and no other client publishes under an OSS licence, so the
`Proprietary` value matches the de-facto status), the standard
classifier set (`Development Status :: 4 - Beta`, `Intended Audience ::
Developers` / `Information Technology`, `Operating System :: Microsoft ::
Windows` and `:: POSIX`, `Programming Language :: Python` /
`Python :: 3` / `Python :: 3.12`, `Topic :: Software Development ::
Libraries :: Python Modules`, `Topic :: System :: Distributed Computing`,
and `Typing :: Typed`), a `keywords` list
(`mxaccess`, `archestra`, `gateway`, `grpc`, `industrial`, `scada`), and
`[project.urls]` with `Homepage` / `Source` / `Issues` pointing at the
Gitea repo. Added the PEP 561 marker file
`clients/python/src/mxgateway/py.typed` (empty, as the spec requires) and
declared it in `[tool.setuptools.package-data] mxgateway = ["py.typed"]`
so the wheel ships the marker and downstream `mypy` users see the
inline type hints. Pure metadata / packaging change — `python -m pytest -q`
still passes (91 tests).

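One way to confirm that a built wheel really ships the PEP 561 marker is to look for `py.typed` inside the installed package. The helper below is a hypothetical check written for illustration (it is not part of the repository's test suite) and relies only on the standard-library `importlib.resources` API, available from Python 3.9.

```python
from importlib import resources


def ships_py_typed(package: str) -> bool:
    """Return True when the importable package contains a py.typed marker file."""
    # files() locates the package's resource root; the marker sits next to
    # the package's __init__.py when package-data is configured correctly.
    return resources.files(package).joinpath("py.typed").is_file()
```

Assuming the client is installed in the current environment, `ships_py_typed("mxgateway")` should return `True` once the marker is declared in `[tool.setuptools.package-data]`; a package without the marker, such as the standard-library `json`, returns `False`.
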