Files
mxaccessgw/code-reviews/Contracts/findings.md
T
Joseph Doherty a0203503a7 Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules
Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
  can distinguish graceful shutdown from a real STA-affinity
  violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real TaskCompletion
  source; switch the heartbeat-expires test to ManualTimeProvider;
  add InvariantCulture to the remaining DateTimeOffset.Parse sites;
  document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so absent env vars SKIP not pass; PascalCase rename of probe
  [Fact]s; deterministic deadline test; new frame-protocol error
  tests; ComputeTransitions diff-coverage; relocate dev-rig probes
  to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 09:46:47 -04:00

225 lines
29 KiB
Markdown

# Code Review — Contracts
| Field | Value |
|---|---|
| Module | `src/MxGateway.Contracts` |
| Reviewer | Claude Code |
| Review date | 2026-05-20 |
| Commit reviewed | `1cd51bb` |
| Status | Reviewed |
| Open findings | 0 |
## Checklist coverage
This re-review focuses on the contract delta introduced since the prior review at `6c64030` — primarily the new bulk write/read command family added in `5e375f6` (`WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk`) plus the resolution changes for Contracts-001/002/004/005/006/007/008.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | New bulk command kinds, `BulkWriteResult`, and `BulkReadResult` align with the worker executor, validator (`MxAccessGrpcRequestValidator.ExpectedPayload`), and `MxAccessSession.ReadBulk`. Field numbering is contiguous and additive (10-43 on `MxCommand.payload`, 20-40 on `MxCommandReply.payload`); no collisions. No new functional bugs. |
| 2 | mxaccessgw conventions | Additive-only evolution preserved across all three protos; new wire-compatibility policy comment block (added under Contracts-005) is honored by the bulk additions; generated code untouched; naming and oneof usage are consistent with the style guide. No new violations. |
| 3 | Concurrency & thread safety | N/A — pure contract definitions plus a static const class with no shared mutable state. |
| 4 | Error handling & resilience | `BulkWriteResult` carries the full `was_successful` + `hresult` + `statuses` + `error_message` carriers per entry; `BulkReadResult` carries `was_successful` + `was_cached` + per-entry value and statuses. The asymmetry (no `hresult` on `BulkReadResult`) is intentional given ReadBulk's lifecycle. No issues. |
| 5 | Security | The new `WriteSecuredBulkCommand` / `WriteSecured2BulkCommand` carry the redaction note on the outer command only, not on the inner entry's `value` field (Contracts-011); otherwise no secrets forced into loggable shapes. |
| 6 | Performance & resource management | `ReadBulk` is the only command without a 1:1 MXAccess analogue; the per-entry timeout shape (`uint32 timeout_ms`) and `was_cached` semantics avoid disturbing existing subscriptions. No bloat issues. |
| 7 | Design-document adherence | `gateway.md` documents the bulk write/read families, but `docs/Contracts.md` was not updated for them (Contracts-009). This violates the CLAUDE.md "update docs in the same commit as the source" rule for the bulk-read/write addition. |
| 8 | Code organization & conventions | Package / namespace / file layout correct; additive-only contract evolution observed; field numbers continuous and isolated by 100+ from diagnostic/control commands. No new issues. |
| 9 | Testing coverage | The bulk write/read families have no `ProtobufContractRoundTripTests` coverage (Contracts-010); Galaxy Repository protos and `MxArray` raw paths are now covered (per Contracts-007 resolution). |
| 10 | Documentation & comments | `GalaxyAttribute.mx_data_type` lacks an in-proto comment explaining it is a raw Galaxy integer (Contracts-012); the `GatewayContractInfoTests` summary is now stale (Contracts-013); credential-sensitive bulk entry `value` fields lack per-field redaction comments (Contracts-011). |
## Findings
### Contracts-001
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `docs/Grpc.md:13` (and `:3`, `:32`, `:39`) |
| Status | Resolved |
**Description:** `mxaccess_gateway.proto` now declares six RPCs on `MxAccessGateway` (`OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, `QueryActiveAlarms`). `docs/Grpc.md` still describes "the four `MxAccessGateway` RPCs" in its type table and omits `AcknowledgeAlarm`/`QueryActiveAlarms` from the Validation Rules table. CLAUDE.md requires docs to change in the same commit as the contract; the alarm RPC commits left this doc stale and misleading about the public surface.
**Recommendation:** Update `docs/Grpc.md` to enumerate all six RPCs and add `AcknowledgeAlarm`/`QueryActiveAlarms` to the type/handler and validation tables, or explicitly cross-reference `AlarmClientDiscovery.md`.
**Resolution:** _(2026-05-18)_ Confirmed against `mxaccess_gateway.proto` — six RPCs declared, doc said "four". Updated `docs/Grpc.md`: the collaborator table now says "six `MxAccessGateway` RPCs", the RPC Handlers intro enumerates all six, added dedicated `AcknowledgeAlarm` and `QueryActiveAlarms` handler subsections (noting the alarm surface routes through `IAlarmRpcDispatcher` and is validated inline rather than via `MxAccessGrpcRequestValidator`, with a cross-reference to `AlarmClientDiscovery.md`), and added both alarm RPCs to the Validation Rules table.
### Contracts-002
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:384-385`, `:95` |
| Status | Resolved |
**Description:** `MxCommandKind` includes `MX_COMMAND_KIND_ACKNOWLEDGE_ALARM_BY_NAME = 29` and `MxCommand.payload` carries `AcknowledgeAlarmByNameCommand acknowledge_alarm_by_name_command = 38`, but `MxCommandReply.payload` has only `acknowledge_alarm = 34` and `query_active_alarms = 35` — there is no by-name reply case. The by-name ack must reuse `AcknowledgeAlarmReplyPayload` or rely on the top-level `hresult`. The command/reply payload asymmetry is undocumented and easy to dispatch incorrectly.
**Recommendation:** Either add an explicit comment to `MxCommandReply` stating that by-name ack reuses the `acknowledge_alarm` payload case, or add a dedicated payload case for symmetry, and document the chosen contract in `docs/Contracts.md` / `AlarmClientDiscovery.md`.
**Resolution:** _(2026-05-18)_ Verified against both the `.proto` and the dispatch code. The asymmetry is intentional and the code is correct: the worker's `MxAccessCommandExecutor.ExecuteAcknowledgeAlarmByName` builds `reply.AcknowledgeAlarm = new AcknowledgeAlarmReplyPayload { NativeStatus = rc }` — deliberately reusing the `acknowledge_alarm` payload case — and the gateway's `WorkerAlarmRpcDispatcher.AcknowledgeAsync` only reads the top-level `hresult`/`protocol_status`, so both ack arms work. The gap was documentation only. Took the finding's preferred option (a) — comment-only, no wire-format or generated-type change: added explicit comments to the `acknowledge_alarm` reply-payload case and to the `AcknowledgeAlarmReplyPayload` message in `mxaccess_gateway.proto` stating both ack kinds reuse this case and consumers must dispatch on `MxCommandReply.kind`, and documented the contract in `docs/AlarmClientDiscovery.md` section 4. Added regression test `ProtobufContractRoundTripTests.MxCommandReply_AcknowledgeAlarmByName_ReusesAcknowledgeAlarmPayloadCase` pinning the by-name-ack → `acknowledge_alarm` reuse and asserting no by-name-specific reply oneof case exists.
### Contracts-003
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/MxGateway.Contracts/MxGateway.Contracts.csproj:10` |
| Status | Won't Fix |
**Description:** The `<Protobuf>` item for `mxaccess_worker.proto` omits `ProtoRoot="Protos"`, while the items for `mxaccess_gateway.proto` (line 9) and `galaxy_repository.proto` (line 11) both set it. `mxaccess_worker.proto` does `import "mxaccess_gateway.proto"`, which resolves only because Grpc.Tools adds the importing file's own directory to the proto path. The inconsistency is fragile — tooling changes to ProtoRoot handling could break import resolution.
**Recommendation:** Add `ProtoRoot="Protos"` to the `mxaccess_worker.proto` `<Protobuf>` item so all three entries are consistent.
**Resolution:** _(2026-05-18)_ Re-triaged as not-a-defect: the finding's premise is factually wrong. Line 10 of `MxGateway.Contracts.csproj` already carries `ProtoRoot="Protos"` — all three `<Protobuf>` items are already consistent. `git show 6c64030:src/MxGateway.Contracts/MxGateway.Contracts.csproj` (the reviewed commit) confirms the attribute was present at review time too; the csproj has not been touched since `133c830`. No code change made. Status set to Won't Fix because there is nothing to fix.
### Contracts-004
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/MxGateway.Contracts/GatewayContractInfo.cs:3-6` |
| Status | Resolved |
**Description:** The XML summary says the class exposes version metadata "before generated protobuf contracts are introduced." Generated protobuf contracts have long been introduced and are consumed across the solution. The comment is stale; the class now holds the authoritative `GatewayProtocolVersion`/`WorkerProtocolVersion` advertised in `OpenSessionReply` and used to validate `WorkerEnvelope` framing.
**Recommendation:** Reword the summary to describe the current purpose — version constants advertised in `OpenSessionReply` and used to validate `WorkerEnvelope` protocol framing.
**Resolution:** _(2026-05-18)_ Confirmed stale — the class is consumed by `GatewayApplication`/`OpenSessionReply` and `WorkerEnvelope` framing checks across the solution. Reworded the XML summary on `GatewayContractInfo` to describe the actual current purpose: `GatewayProtocolVersion` is advertised to clients in `OpenSessionReply`, and `WorkerProtocolVersion` validates `WorkerEnvelope` protocol framing on the gateway↔worker pipe.
### Contracts-005
| Field | Value |
|---|---|
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto`, `src/MxGateway.Contracts/Protos/mxaccess_worker.proto` |
| Status | Resolved |
**Description:** The ProtobufStyleGuide mandates reserving removed field numbers / enum values. Evolution to date has been purely additive, so this is not a current violation — but none of the `.proto` files contain any `reserved` declarations, leaving no in-file guardrail for the first removal. This is a latent maintainability gap.
**Recommendation:** When any field or enum value is eventually removed, add a `reserved` range/name in the same change. Consider a short comment block in each message documenting the policy so future editors apply `reserved` rather than reusing tags.
**Resolution:** _(2026-05-18)_ Confirmed: no field or enum value has ever been removed, so adding `reserved` ranges now would be incorrect (there are no retired tags to reserve, and inventing ranges for never-used numbers would itself violate the contract). Took the finding's least-invasive option — added a short wire-compatibility policy comment block at the top of all three `.proto` files (`mxaccess_gateway.proto`, `mxaccess_worker.proto`, `galaxy_repository.proto`) stating the additive-only rule and instructing future editors to add a `reserved` range + name in the same change as any removal. Comment-only, no wire-format or generated-type change. The `reserved` declarations themselves remain correctly deferred to the first actual removal.
### Contracts-006
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:647` |
| Status | Resolved |
**Description:** `MxStatusProxy.success` is declared `int32 success = 1` with no comment. The name reads like a boolean flag but the type is a 32-bit integer (mirroring MXAccess `MXSTATUS_PROXY`, which stores a numeric success/HResult-like value). Without a comment a client author can reasonably misinterpret the field (treat non-1 as failure, or expect only 0/1).
**Recommendation:** Add a comment clarifying the semantic — what range of values it carries and how 0 vs non-zero map to MXAccess status — per the style guide rule to comment fields carrying raw MXAccess status detail.
**Resolution:** _(2026-05-18)_ Confirmed: `int32 success = 1` had no comment. Cross-checked against the worker `MxStatusProxyConverter`, which reads the COM struct's `success` field verbatim (a 16-bit signed value) without reinterpretation, and against the MXAccess analysis (`MXAccess-Public-API.md`: `MxStatus`/`MXSTATUS_PROXY` are identical structs with a `short success` member). Added a field comment to `MxStatusProxy.success` stating it mirrors the COM struct's numeric `success` member (NOT a boolean), is carried verbatim for diagnostics, and that clients should branch on `category` (`MX_STATUS_CATEGORY_OK` marks success) — deliberately avoiding an over-specified 0-vs-1 claim, since the gateway never maps `success` to an outcome and `category` is the authoritative field. Comment-only change.
### Contracts-007
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| Status | Resolved |
**Description:** `ProtobufContractRoundTripTests` covers gateway command/reply/event, alarm transition, alarm ack request/reply, active-alarm snapshot, and the worker envelope. It has no coverage for: (a) any `galaxy_repository.proto` message (`DiscoverHierarchy*`, `GalaxyObject`, `GalaxyAttribute`, `DeployEvent`, the `root` oneof, wrapper-typed fields); (b) `BulkSubscribeReply`/`SubscribeResult` and the bulk command kinds; (c) `MxValue`/`MxArray` `raw_value`/`RawArray` (`bytes`) paths and the `WorkerFault`/`WorkerHeartbeat` IPC bodies.
**Recommendation:** Add round-trip tests for the Galaxy Repository messages (including the `root` oneof and proto wrapper fields), the bulk-subscribe reply, and the remaining `WorkerEnvelope` body cases.
**Resolution:** _(2026-05-18)_ Confirmed the listed gaps and added round-trip tests to `ProtobufContractRoundTripTests` covering all three areas: (a) Galaxy Repository — `GalaxyRepositoryDescriptor_ContainsBrowseServiceMethods`, `DiscoverHierarchyRequest_RoundTripsRootOneofAndWrapperFields` (a `[Theory]` exercising all three `root` oneof arms plus the `Int32Value` wrapper `max_depth`), `DiscoverHierarchyReply_RoundTripsObjectAndAttributeGraph`, `DeployEvent_RoundTripsTimestampAndCounters`, `GalaxyConnectionReplies_RoundTrip`; (b) `BulkSubscribeReply_RoundTripsSubscribeResults` and `MxCommandReply_RoundTripsBulkSubscribePayload` (bulk-subscribe command kind + payload case); (c) `MxValue_RoundTripsRawValueBytesPayload`, `MxArray_RoundTripsRawArrayPayload`, `WorkerEnvelope_RoundTripsWorkerFaultBody`, `WorkerEnvelope_RoundTripsWorkerHeartbeatBody`. All new tests pass; the full `ProtobufContractRoundTripTests` class is 27 tests green.
### Contracts-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:451-459`, `:627-636` |
| Status | Resolved |
**Description:** The worker-side `AcknowledgeAlarmReplyPayload` carries the alarm-ack outcome as `int32 native_status`, while the public `AcknowledgeAlarmReply` carries it as `MxStatusProxy status` plus `optional int32 hresult`. The comment explains the worker echoes `native_status` into `AcknowledgeAlarmReply.hresult`, but the two outcome shapes (raw `int32` vs structured `MxStatusProxy`) are not reconciled in `docs/Contracts.md` / `AlarmClientDiscovery.md`. A reader cannot tell whether `MxStatusProxy status` is always populated or only on COM-layer failure.
**Recommendation:** Document in `docs/Contracts.md` (or `AlarmClientDiscovery.md`) how the worker `native_status` maps onto the public reply's `status`/`hresult` pair so client authors know which field is authoritative.
**Resolution:** _(2026-05-18)_ Verified against `WorkerAlarmRpcDispatcher.AcknowledgeAsync`. The asymmetry is larger than the finding implies: the dispatcher copies the worker `MxCommandReply.hresult` into `AcknowledgeAlarmReply.hresult` but **never** assigns `AcknowledgeAlarmReply.status` — the `MxStatusProxy status` field is left UNSET on every reply. The proto comment on `status` ("Native MxAccess status describing the outcome of the ack") was therefore actively misleading. Fixed: (1) reworded the `mxaccess_gateway.proto` comments on `AcknowledgeAlarmReply.hresult` (now identifies it as the authoritative native-return-code field) and `AcknowledgeAlarmReply.status` (now states it is reserved/unset and clients must not depend on it); (2) extended `docs/AlarmClientDiscovery.md` section 4 with a "Worker `native_status` → public `AcknowledgeAlarmReply` mapping" subsection spelling out that `hresult` is authoritative (`0` = success) and `status` is always unset, and that clients should branch on `protocol_status` then `hresult`, never `status`.
### Contracts-009
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Design-document adherence |
| Location | `docs/Contracts.md:13-24` |
| Status | Resolved |
**Description:** Commit `5e375f6` ("Add bulk read/write command family across worker, gateway, and clients") added five new command kinds — `WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk` — plus the `BulkWriteReply` / `BulkWriteResult` and `BulkReadReply` / `BulkReadResult` shapes to `mxaccess_gateway.proto`. `gateway.md` (lines 299-322) was updated in that commit, but `docs/Contracts.md` was not. It still describes only the older bulk subscription family (`AddItemBulk`, `AdviseItemBulk`, `RemoveItemBulk`, `UnAdviseItemBulk`, `SubscribeBulk`, `UnsubscribeBulk`) returning `BulkSubscribeReply` with no mention of the bulk write/read commands or their per-entry result types. The CLAUDE.md rule "Update docs in the same change as the source. When public APIs, contracts, configuration, build steps, security behavior, event shapes, value conversion, status mapping, or lifecycle rules change, the affected docs … must change in the same commit" was violated for this addition. The result is that the canonical contracts document undercounts the public bulk surface by five commands.
**Recommendation:** Extend the bulk-commands paragraph in `docs/Contracts.md` to list the new `WriteBulk` / `Write2Bulk` / `WriteSecuredBulk` / `WriteSecured2Bulk` / `ReadBulk` command kinds, the per-entry request shape (`WriteBulkEntry` etc.), and the new reply types (`BulkWriteReply` carrying `BulkWriteResult`; `BulkReadReply` carrying `BulkReadResult`). Cross-reference `gateway.md` for the cached-vs-snapshot `ReadBulk` lifecycle and `docs/DesignDecisions.md` "Bulk Command Family" for the per-entry-result rationale rather than re-stating those details.
**Resolution:** _(2026-05-20)_ Confirmed `docs/Contracts.md` documented only the older bulk subscription family and never mentioned the bulk write/read additions from commit `5e375f6`. Cross-checked against `mxaccess_gateway.proto` (`MxCommand.payload` cases 39-43, `MxCommandKind` 30-34, the `Write*BulkCommand` / `Write*BulkEntry` shapes, `ReadBulkCommand` with `tag_addresses` + `timeout_ms`, `MxCommandReply.payload` cases 36-40, and the `BulkWriteReply`/`BulkWriteResult` + `BulkReadReply`/`BulkReadResult` messages). Extended the "Files" section of `docs/Contracts.md` with a new paragraph listing the five command kinds, the per-entry request shape for each `Write*Bulk` family (with the credential-sensitive redaction rule carried through to `WriteSecuredBulkEntry`/`WriteSecured2BulkEntry`), the `BulkWriteReply` + `BulkWriteResult` reply (including the `optional int32 hresult` field and the no-raise per-entry failure contract), and the `ReadBulkCommand``BulkReadReply` + `BulkReadResult` reply with the cached-vs-snapshot dual-mode semantics and the deliberate absence of `hresult` on `BulkReadResult`. Cross-references to `gateway.md` (lifecycle + scopes) and `docs/DesignDecisions.md` "Bulk Command Family" (rationale) added rather than re-stating those details.
### Contracts-010
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| Status | Resolved |
**Description:** Contracts-007 (closed 2026-05-18) added Galaxy Repository, bulk-subscribe, `MxValue.raw_value` / `MxArray.raw_values`, and `WorkerFault`/`WorkerHeartbeat` round-trip coverage. The bulk write/read messages added in commit `5e375f6` were never given equivalent coverage. `ProtobufContractRoundTripTests` has no test that exercises any of: `WriteBulkCommand` / `Write2BulkCommand` / `WriteSecuredBulkCommand` / `WriteSecured2BulkCommand` / `ReadBulkCommand`; `BulkWriteReply` / `BulkWriteResult`; `BulkReadReply` / `BulkReadResult`; the new `MxCommandReply.payload` oneof cases (`write_bulk`, `write2_bulk`, `write_secured_bulk`, `write_secured2_bulk`, `read_bulk`). The asymmetry that `BulkWriteResult` carries `hresult` and `BulkReadResult` does not, and the `optional int32 hresult` semantics on `BulkWriteResult`, are exactly the kind of wire-shape details prior contract tests have been written to pin.
**Recommendation:** Add `ProtobufContractRoundTripTests` cases mirroring the existing `BulkSubscribeReply_RoundTripsSubscribeResults` / `MxCommandReply_RoundTripsBulkSubscribePayload` pattern: at minimum one round-trip per new request-side message (`WriteBulkCommand` covers the entry-list case; one secured variant proves the credential-sensitive shape; `ReadBulkCommand` covers `timeout_ms`), one round-trip for each new reply payload (`BulkWriteReply` carrying `BulkWriteResult` with `hresult` set + unset to exercise the proto3 `optional` presence; `BulkReadReply` carrying a `was_cached = true` and a `was_cached = false` entry), and at least one `MxCommandReply` test pinning a new payload-oneof case (e.g. `MxCommandReply.PayloadCase == PayloadOneofCase.ReadBulk` for `MxCommandKind.ReadBulk`).
**Resolution:** _(2026-05-20)_ Added round-trip tests in `ProtobufContractRoundTripTests` covering every gap listed: per-request `WriteBulkCommand_RoundTripsEntries`, `Write2BulkCommand_RoundTripsEntriesWithTimestampValue`, `WriteSecuredBulkCommand_RoundTripsCredentialBearingEntries`, `WriteSecured2BulkCommand_RoundTripsCredentialBearingEntriesWithTimestamp`, `ReadBulkCommand_RoundTripsTagAddressesAndTimeout`; per-reply `BulkWriteReply_RoundTripsResultsWithOptionalHresultPresence` (asserts both `HasHresult == true` and `HasHresult == false` arms of the proto3 `optional int32 hresult`) and `BulkReadReply_RoundTripsCachedAndSnapshotResults` (covers `was_cached = true`, `was_cached = false`, and a per-entry failure with `error_message`; additionally pins the deliberate absence of an `hresult` field on `BulkReadResult` via the descriptor); and `MxCommandReply` oneof-case pinning via `MxCommandReply_RoundTripsBulkWritePayloadCases` (a `[Theory]` exercising the four bulk-write payload-oneof cases) plus `MxCommandReply_RoundTripsReadBulkPayload`. All new tests pass; the full `ProtobufContractRoundTripTests` + `GatewayContractInfoTests` filter is 42 tests green.
### Contracts-011
| Field | Value |
|---|---|
| Severity | Low |
| Category | Security |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:392-397`, `:406-412` |
| Status | Resolved |
**Description:** The single-item `WriteSecuredCommand` (line 234-242) and `WriteSecured2Command` (line 244-253) put the credential-sensitivity redaction note on the `value` field directly ("Credential-sensitive write value. Implementations must not log this field unless an explicit redacted value-logging path is enabled."). The bulk equivalents move the note to the outer message instead — `WriteSecuredBulkCommand` (line 383-386) and `WriteSecured2BulkCommand` (line 399-400) carry it as a header comment — and the inner `WriteSecuredBulkEntry.value` (line 396) and `WriteSecured2BulkEntry.value` (line 410) are left without per-field comments. A future editor reading just `WriteSecuredBulkEntry` to add a new field or change the entry shape will not see the redaction rule. The ProtobufStyleGuide explicitly requires "Mark credential-bearing request fields clearly in comments"; the single-item path follows that rule, the bulk path does not.
**Recommendation:** Add per-field credential-sensitivity comments to `WriteSecuredBulkEntry.value` and `WriteSecured2BulkEntry.value` matching the wording on `WriteSecuredCommand.value` / `WriteSecured2Command.value`. Comment-only change with no wire-format or generated-type impact.
**Resolution:** _(2026-05-20)_ Added per-field credential-sensitivity comments to `WriteSecuredBulkEntry.value` and `WriteSecured2BulkEntry.value` in `mxaccess_gateway.proto`, mirroring verbatim the wording carried on `WriteSecuredCommand.value` / `WriteSecured2Command.value` ("Credential-sensitive write value. Implementations must not log this field unless an explicit redacted value-logging path is enabled."). The outer-message header redaction comment on `WriteSecuredBulkCommand` / `WriteSecured2BulkCommand` is retained so the rule is visible at both scopes. Comment-only change; no wire-format or generated-type impact (the `MxGateway.Contracts` build is clean against the regenerated code).
### Contracts-012
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/MxGateway.Contracts/Protos/galaxy_repository.proto:120` |
| Status | Resolved |
**Description:** `GalaxyAttribute.mx_data_type` is declared as `int32` with no in-proto comment. The field carries the raw Galaxy SQL DB type identifier (from `dbo.data_type`), which deliberately does NOT correspond to the public `MxDataType` enum in `mxaccess_gateway.proto`; `docs/Contracts.md` calls this out ("The service is metadata-only and does not share types with mxaccess_gateway.proto") and `docs/GalaxyRepository.md:190` documents the choice ("`mx_data_type` is returned as the raw Galaxy integer rather than mapped to a language-neutral enum"), but the proto file itself gives the reader no signal. A client author looking at the .proto without those docs is likely to assume the field is a `MxDataType` value and write a `(MxDataType)` cast that silently misclassifies most attributes. The ProtobufStyleGuide rule "Comment fields that carry MXAccess parity details, raw HRESULT/status information, or compatibility constraints" applies — this is exactly a parity-detail / compatibility-constraint field where the int32 has non-obvious semantics. The accompanying `data_type_name`, `mx_attribute_category`, and `security_classification` int fields share the same gap.
**Recommendation:** Add a short comment on `GalaxyAttribute.mx_data_type` (and ideally on `mx_attribute_category` and `security_classification`) clarifying that the value is a raw Galaxy SQL identifier passed through unchanged, NOT a member of the `mxaccess_gateway.v1.MxDataType` enum, with a pointer to `docs/GalaxyRepository.md`. Comment-only change; no wire-format impact.
**Resolution:** _(2026-05-20)_ Added in-proto comments to `GalaxyAttribute.mx_data_type`, `data_type_name`, `mx_attribute_category`, and `security_classification` in `galaxy_repository.proto`. The `mx_data_type` comment explicitly calls out that the value is a raw Galaxy SQL `dbo.data_type` identifier passed through unchanged, that it is NOT a member of `mxaccess_gateway.v1.MxDataType`, and that the two enumerations must not be cast or compared (closing the silent-misclassification trap the finding describes). The `data_type_name` comment clarifies it is free-form Galaxy text from the same table, not a stable enum. `mx_attribute_category` and `security_classification` comments mark them as raw Galaxy-specific identifiers not mapped to any gateway enum. All four comments cross-reference `docs/GalaxyRepository.md` for the rationale rather than restating it. Comment-only change; no wire-format impact.
### Contracts-013
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/MxGateway.Tests/Contracts/GatewayContractInfoTests.cs:14` |
| Status | Resolved |
**Description:** The XML summary on `GatewayContractInfoTests.GatewayProtocolVersion_IsVersionThree` reads "Verifies that the gateway protocol version is bumped to three after the alarm proto extension." That description is now incomplete: since the comment was written, the contract has been extended again (the bulk write/read command family in commit `5e375f6`) without a corresponding `GatewayProtocolVersion` bump. The test name says "IsVersionThree" but the summary attributes the value-of-3 to a single historical event (the alarm extension) — readers checking whether subsequent contract additions should have bumped the version will get a misleading rationale. This is the same class of stale-summary issue as Contracts-004 (`GatewayContractInfo` class summary), just relocated to the test that pins the constant.
**Recommendation:** Reword the summary to describe what the test pins (the current `GatewayProtocolVersion` constant equals 3) rather than narrating a specific historical bump, OR explicitly enumerate the alarm- and bulk-write/read additions covered under version 3 so readers know both extensions were additive and intentionally did not require a bump.
**Resolution:** _(2026-05-20)_ Reworded the XML summary on `GatewayContractInfoTests.GatewayProtocolVersion_IsVersionThree` to describe what the test actually pins: the current `GatewayProtocolVersion` constant equals 3, with both the alarm proto extension (`AcknowledgeAlarm` / `QueryActiveAlarms` RPCs, `OnAlarmTransitionEvent`, the alarm command/reply payload cases) AND the bulk write/read command family extension (`WriteBulk` / `Write2Bulk` / `WriteSecuredBulk` / `WriteSecured2Bulk` / `ReadBulk` with their `BulkWriteReply` / `BulkReadReply` payloads) shipping under version 3 as strictly additive changes that did not require a further bump. The new summary also instructs that a future breaking contract change should bump the constant and update the test in lock-step. Test logic is unchanged; the test still passes.