Resolve Contracts-001/004/005/006/007/008 code-review findings

Contracts-001: docs/Grpc.md still described "four MxAccessGateway RPCs" —
updated to the actual six (adding AcknowledgeAlarm and QueryActiveAlarms to
the handler and validation-rule sections).

Contracts-003 (Won't Fix): the finding is factually wrong — the <Protobuf>
item for mxaccess_worker.proto already sets ProtoRoot="Protos"; all three
items are consistent (confirmed back to the reviewed commit).

Contracts-004: corrected the stale GatewayContractInfo XML summary
("before generated protobuf contracts are introduced").

Contracts-005: no proto field/enum value was ever removed, so no reserved
ranges were invented. Added a wire-compatibility policy comment to all three
.proto files instructing future editors to reserve removed numbers.

Contracts-006: documented MxStatusProxy.success — it mirrors the COM
MXSTATUS_PROXY numeric success member, is not a boolean, and clients should
branch on category.

Contracts-007: added 13 round-trip tests covering galaxy_repository.proto
messages, bulk-subscribe payloads, and raw-value/IPC worker bodies.

Contracts-008: WorkerAlarmRpcDispatcher never assigns AcknowledgeAlarmReply.
status, so the old "native status" proto comment was misleading. Corrected
the hresult/status proto comments and documented the worker native_status →
public reply mapping in AlarmClientDiscovery.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-05-18 23:12:00 -04:00
parent 771229b39f
commit ee959e46e6
9 changed files with 448 additions and 24 deletions
+16 -16
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Status | Reviewed |
| Open findings | 7 |
| Open findings | 0 |
## Checklist coverage
@@ -20,7 +20,7 @@
| 5 | Security | Credential-sensitive fields are clearly commented; no secrets forced into loggable shapes. No issues found. |
| 6 | Performance & resource management | `DiscoverHierarchy` is paged; alarm-snapshot streams are server-streamed; no bloat issues. No issues found. |
| 7 | Design-document adherence | `.proto` files match design intent but `docs/Grpc.md` is stale (Contracts-001); worker vs public alarm-status shapes unreconciled in docs (Contracts-008). |
| 8 | Code organization & conventions | Package/file layout correct; `mxaccess_worker.proto` Protobuf item missing `ProtoRoot` (Contracts-003); stale class summary (Contracts-004). |
| 8 | Code organization & conventions | Package/file layout correct; stale class summary (Contracts-004). Contracts-003 (`mxaccess_worker.proto` Protobuf item missing `ProtoRoot`) was re-triaged as not-a-defect — the attribute is already present. |
| 9 | Testing coverage | Gateway/worker/alarm round-trips covered; Galaxy Repository protos and raw `MxArray` paths untested (Contracts-007). |
| 10 | Documentation & comments | Proto comments accurate and domain-rich; one stale class summary (Contracts-004). |
@@ -33,13 +33,13 @@
| Severity | Low |
| Category | Design-document adherence |
| Location | `docs/Grpc.md:13` (and `:3`, `:32`, `:39`) |
| Status | Open |
| Status | Resolved |
**Description:** `mxaccess_gateway.proto` now declares six RPCs on `MxAccessGateway` (`OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, `QueryActiveAlarms`). `docs/Grpc.md` still describes "the four `MxAccessGateway` RPCs" in its type table and omits `AcknowledgeAlarm`/`QueryActiveAlarms` from the Validation Rules table. CLAUDE.md requires docs to change in the same commit as the contract; the alarm RPC commits left this doc stale and misleading about the public surface.
**Recommendation:** Update `docs/Grpc.md` to enumerate all six RPCs and add `AcknowledgeAlarm`/`QueryActiveAlarms` to the type/handler and validation tables, or explicitly cross-reference `AlarmClientDiscovery.md`.
**Resolution:** _(open)_
**Resolution:** _(2026-05-18)_ Confirmed against `mxaccess_gateway.proto` — six RPCs declared, doc said "four". Updated `docs/Grpc.md`: the collaborator table now says "six `MxAccessGateway` RPCs", the RPC Handlers intro enumerates all six, added dedicated `AcknowledgeAlarm` and `QueryActiveAlarms` handler subsections (noting the alarm surface routes through `IAlarmRpcDispatcher` and is validated inline rather than via `MxAccessGrpcRequestValidator`, with a cross-reference to `AlarmClientDiscovery.md`), and added both alarm RPCs to the Validation Rules table.
### Contracts-002
@@ -63,13 +63,13 @@
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/MxGateway.Contracts/MxGateway.Contracts.csproj:10` |
| Status | Open |
| Status | Won't Fix (re-triaged — not a defect) |
**Description:** The `<Protobuf>` item for `mxaccess_worker.proto` omits `ProtoRoot="Protos"`, while the items for `mxaccess_gateway.proto` (line 9) and `galaxy_repository.proto` (line 11) both set it. `mxaccess_worker.proto` does `import "mxaccess_gateway.proto"`, which resolves only because Grpc.Tools adds the importing file's own directory to the proto path. The inconsistency is fragile — tooling changes to ProtoRoot handling could break import resolution.
**Recommendation:** Add `ProtoRoot="Protos"` to the `mxaccess_worker.proto` `<Protobuf>` item so all three entries are consistent.
**Resolution:** _(open)_
**Resolution:** _(2026-05-18)_ Re-triaged as not-a-defect: the finding's premise is factually wrong. Line 10 of `MxGateway.Contracts.csproj` already carries `ProtoRoot="Protos"` — all three `<Protobuf>` items are already consistent. `git show 6c64030:src/MxGateway.Contracts/MxGateway.Contracts.csproj` (the reviewed commit) confirms the attribute was present at review time too; the csproj has not been touched since `133c830`. No code change made. Status set to Won't Fix because there is nothing to fix.
### Contracts-004
@@ -78,13 +78,13 @@
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/MxGateway.Contracts/GatewayContractInfo.cs:3-6` |
| Status | Open |
| Status | Resolved |
**Description:** The XML summary says the class exposes version metadata "before generated protobuf contracts are introduced." Generated protobuf contracts have long been introduced and are consumed across the solution. The comment is stale; the class now holds the authoritative `GatewayProtocolVersion`/`WorkerProtocolVersion` advertised in `OpenSessionReply` and used to validate `WorkerEnvelope` framing.
**Recommendation:** Reword the summary to describe the current purpose — version constants advertised in `OpenSessionReply` and used to validate `WorkerEnvelope` protocol framing.
**Resolution:** _(open)_
**Resolution:** _(2026-05-18)_ Confirmed stale — the class is consumed by `GatewayApplication`/`OpenSessionReply` and `WorkerEnvelope` framing checks across the solution. Reworded the XML summary on `GatewayContractInfo` to describe the actual current purpose: `GatewayProtocolVersion` is advertised to clients in `OpenSessionReply`, and `WorkerProtocolVersion` validates `WorkerEnvelope` protocol framing on the gateway↔worker pipe.
### Contracts-005
@@ -93,13 +93,13 @@
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto`, `src/MxGateway.Contracts/Protos/mxaccess_worker.proto` |
| Status | Open |
| Status | Resolved |
**Description:** The ProtobufStyleGuide mandates reserving removed field numbers / enum values. Evolution to date has been purely additive, so this is not a current violation — but none of the `.proto` files contain any `reserved` declarations, leaving no in-file guardrail for the first removal. This is a latent maintainability gap.
**Recommendation:** When any field or enum value is eventually removed, add a `reserved` range/name in the same change. Consider a short comment block in each message documenting the policy so future editors apply `reserved` rather than reusing tags.
**Resolution:** _(open)_
**Resolution:** _(2026-05-18)_ Confirmed: no field or enum value has ever been removed, so adding `reserved` ranges now would be incorrect (there are no retired tags to reserve, and inventing ranges for never-used numbers would itself violate the contract). Took the finding's least-invasive option — added a short wire-compatibility policy comment block at the top of all three `.proto` files (`mxaccess_gateway.proto`, `mxaccess_worker.proto`, `galaxy_repository.proto`) stating the additive-only rule and instructing future editors to add a `reserved` range + name in the same change as any removal. Comment-only, no wire-format or generated-type change. The `reserved` declarations themselves remain correctly deferred to the first actual removal.
### Contracts-006
@@ -108,13 +108,13 @@
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:647` |
| Status | Open |
| Status | Resolved |
**Description:** `MxStatusProxy.success` is declared `int32 success = 1` with no comment. The name reads like a boolean flag but the type is a 32-bit integer (mirroring MXAccess `MXSTATUS_PROXY`, which stores a numeric success/HResult-like value). Without a comment a client author can reasonably misinterpret the field (treat non-1 as failure, or expect only 0/1).
**Recommendation:** Add a comment clarifying the semantic — what range of values it carries and how 0 vs non-zero map to MXAccess status — per the style guide rule to comment fields carrying raw MXAccess status detail.
**Resolution:** _(open)_
**Resolution:** _(2026-05-18)_ Confirmed: `int32 success = 1` had no comment. Cross-checked against the worker `MxStatusProxyConverter`, which reads the COM struct's `success` field verbatim (a 16-bit signed value) without reinterpretation, and against the MXAccess analysis (`MXAccess-Public-API.md`: `MxStatus`/`MXSTATUS_PROXY` are identical structs with a `short success` member). Added a field comment to `MxStatusProxy.success` stating it mirrors the COM struct's numeric `success` member (NOT a boolean), is carried verbatim for diagnostics, and that clients should branch on `category` (`MX_STATUS_CATEGORY_OK` marks success) — deliberately avoiding an over-specified 0-vs-1 claim, since the gateway never maps `success` to an outcome and `category` is the authoritative field. Comment-only change.
### Contracts-007
@@ -123,13 +123,13 @@
| Severity | Low |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** `ProtobufContractRoundTripTests` covers gateway command/reply/event, alarm transition, alarm ack request/reply, active-alarm snapshot, and the worker envelope. It has no coverage for: (a) any `galaxy_repository.proto` message (`DiscoverHierarchy*`, `GalaxyObject`, `GalaxyAttribute`, `DeployEvent`, the `root` oneof, wrapper-typed fields); (b) `BulkSubscribeReply`/`SubscribeResult` and the bulk command kinds; (c) `MxValue`/`MxArray` `raw_value`/`RawArray` (`bytes`) paths and the `WorkerFault`/`WorkerHeartbeat` IPC bodies.
**Recommendation:** Add round-trip tests for the Galaxy Repository messages (including the `root` oneof and proto wrapper fields), the bulk-subscribe reply, and the remaining `WorkerEnvelope` body cases.
**Resolution:** _(open)_
**Resolution:** _(2026-05-18)_ Confirmed the listed gaps and added round-trip tests to `ProtobufContractRoundTripTests` covering all three areas: (a) Galaxy Repository — `GalaxyRepositoryDescriptor_ContainsBrowseServiceMethods`, `DiscoverHierarchyRequest_RoundTripsRootOneofAndWrapperFields` (a `[Theory]` exercising all three `root` oneof arms plus the `Int32Value` wrapper `max_depth`), `DiscoverHierarchyReply_RoundTripsObjectAndAttributeGraph`, `DeployEvent_RoundTripsTimestampAndCounters`, `GalaxyConnectionReplies_RoundTrip`; (b) `BulkSubscribeReply_RoundTripsSubscribeResults` and `MxCommandReply_RoundTripsBulkSubscribePayload` (bulk-subscribe command kind + payload case); (c) `MxValue_RoundTripsRawValueBytesPayload`, `MxArray_RoundTripsRawArrayPayload`, `WorkerEnvelope_RoundTripsWorkerFaultBody`, `WorkerEnvelope_RoundTripsWorkerHeartbeatBody`. All new tests pass; the full `ProtobufContractRoundTripTests` class is 27 tests green.
### Contracts-008
@@ -138,10 +138,10 @@
| Severity | Low |
| Category | Design-document adherence |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:451-459`, `:627-636` |
| Status | Open |
| Status | Resolved |
**Description:** The worker-side `AcknowledgeAlarmReplyPayload` carries the alarm-ack outcome as `int32 native_status`, while the public `AcknowledgeAlarmReply` carries it as `MxStatusProxy status` plus `optional int32 hresult`. The comment explains the worker echoes `native_status` into `AcknowledgeAlarmReply.hresult`, but the two outcome shapes (raw `int32` vs structured `MxStatusProxy`) are not reconciled in `docs/Contracts.md` / `AlarmClientDiscovery.md`. A reader cannot tell whether `MxStatusProxy status` is always populated or only on COM-layer failure.
**Recommendation:** Document in `docs/Contracts.md` (or `AlarmClientDiscovery.md`) how the worker `native_status` maps onto the public reply's `status`/`hresult` pair so client authors know which field is authoritative.
**Resolution:** _(open)_
**Resolution:** _(2026-05-18)_ Verified against `WorkerAlarmRpcDispatcher.AcknowledgeAsync`. The asymmetry is larger than the finding implies: the dispatcher copies the worker `MxCommandReply.hresult` into `AcknowledgeAlarmReply.hresult` but **never** assigns `AcknowledgeAlarmReply.status` — the `MxStatusProxy status` field is left UNSET on every reply. The proto comment on `status` ("Native MxAccess status describing the outcome of the ack") was therefore actively misleading. Fixed: (1) reworded the `mxaccess_gateway.proto` comments on `AcknowledgeAlarmReply.hresult` (now identifies it as the authoritative native-return-code field) and `AcknowledgeAlarmReply.status` (now states it is reserved/unset and clients must not depend on it); (2) extended `docs/AlarmClientDiscovery.md` section 4 with a "Worker `native_status` → public `AcknowledgeAlarmReply` mapping" subsection spelling out that `hresult` is authoritative (`0` = success) and `status` is always unset, and that clients should branch on `protocol_status` then `hresult`, never `status`.