Files
mxaccessgw/code-reviews/Contracts/findings.md
T
Joseph Doherty 24de7e21d9 Regenerate code-reviews index after Low findings Batch 3
Reflects resolution of Contracts-001/004/005/006/007/008 (and Contracts-003
re-triaged Won't Fix). All code-review findings across every module are now
closed. Also normalizes the Contracts-003 Status to the canonical
`Won't Fix` value the index generator expects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:12:34 -04:00

148 lines
15 KiB
Markdown

# Code Review — Contracts
| Field | Value |
|---|---|
| Module | `src/MxGateway.Contracts` |
| Reviewer | Claude Code |
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Status | Reviewed |
| Open findings | 0 |
## Checklist coverage
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No functional bugs; one missing reply-payload case for the by-name ack command and an `int32`-typed `success` flag that reads like a bool (Contracts-002, Contracts-006). |
| 2 | mxaccessgw conventions | Additive-only evolution honored (no renumbered/removed tags), MXAccess-aligned naming consistent, generated code untouched; no `reserved` statements declared as a guardrail (Contracts-005). |
| 3 | Concurrency & thread safety | N/A — pure contract definitions plus a static const class with no shared mutable state. |
| 4 | Error handling & resilience | HRESULT / `MxStatusProxy` / `ProtocolStatus` carriers are complete; the worker-side by-name alarm ack has no dedicated reply payload (Contracts-002). |
| 5 | Security | Credential-sensitive fields are clearly commented; no secrets forced into loggable shapes. No issues found. |
| 6 | Performance & resource management | `DiscoverHierarchy` is paged; alarm-snapshot streams are server-streamed; no bloat issues. No issues found. |
| 7 | Design-document adherence | `.proto` files match design intent but `docs/Grpc.md` is stale (Contracts-001); worker vs public alarm-status shapes unreconciled in docs (Contracts-008). |
| 8 | Code organization & conventions | Package/file layout correct; stale class summary (Contracts-004). Contracts-003 (`mxaccess_worker.proto` Protobuf item missing `ProtoRoot`) was re-triaged as not-a-defect — the attribute is already present. |
| 9 | Testing coverage | Gateway/worker/alarm round-trips covered; Galaxy Repository protos and raw `MxArray` paths untested (Contracts-007). |
| 10 | Documentation & comments | Proto comments accurate and domain-rich; one stale class summary (Contracts-004). |
## Findings
### Contracts-001
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `docs/Grpc.md:13` (and `:3`, `:32`, `:39`) |
| Status | Resolved |
**Description:** `mxaccess_gateway.proto` now declares six RPCs on `MxAccessGateway` (`OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, `QueryActiveAlarms`). `docs/Grpc.md` still describes "the four `MxAccessGateway` RPCs" in its type table and omits `AcknowledgeAlarm`/`QueryActiveAlarms` from the Validation Rules table. CLAUDE.md requires docs to change in the same commit as the contract; the alarm RPC commits left this doc stale and misleading about the public surface.
**Recommendation:** Update `docs/Grpc.md` to enumerate all six RPCs and add `AcknowledgeAlarm`/`QueryActiveAlarms` to the type/handler and validation tables, or explicitly cross-reference `AlarmClientDiscovery.md`.
**Resolution:** _(2026-05-18)_ Confirmed against `mxaccess_gateway.proto` — six RPCs declared, doc said "four". Updated `docs/Grpc.md`: the collaborator table now says "six `MxAccessGateway` RPCs", the RPC Handlers intro enumerates all six, added dedicated `AcknowledgeAlarm` and `QueryActiveAlarms` handler subsections (noting the alarm surface routes through `IAlarmRpcDispatcher` and is validated inline rather than via `MxAccessGrpcRequestValidator`, with a cross-reference to `AlarmClientDiscovery.md`), and added both alarm RPCs to the Validation Rules table.
### Contracts-002
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:384-385`, `:95` |
| Status | Resolved |
**Description:** `MxCommandKind` includes `MX_COMMAND_KIND_ACKNOWLEDGE_ALARM_BY_NAME = 29` and `MxCommand.payload` carries `AcknowledgeAlarmByNameCommand acknowledge_alarm_by_name_command = 38`, but `MxCommandReply.payload` has only `acknowledge_alarm = 34` and `query_active_alarms = 35` — there is no by-name reply case. The by-name ack must reuse `AcknowledgeAlarmReplyPayload` or rely on the top-level `hresult`. The command/reply payload asymmetry is undocumented and easy to dispatch incorrectly.
**Recommendation:** Either add an explicit comment to `MxCommandReply` stating that by-name ack reuses the `acknowledge_alarm` payload case, or add a dedicated payload case for symmetry, and document the chosen contract in `docs/Contracts.md` / `AlarmClientDiscovery.md`.
**Resolution:** _(2026-05-18)_ Verified against both the `.proto` and the dispatch code. The asymmetry is intentional and the code is correct: the worker's `MxAccessCommandExecutor.ExecuteAcknowledgeAlarmByName` builds `reply.AcknowledgeAlarm = new AcknowledgeAlarmReplyPayload { NativeStatus = rc }` — deliberately reusing the `acknowledge_alarm` payload case — and the gateway's `WorkerAlarmRpcDispatcher.AcknowledgeAsync` only reads the top-level `hresult`/`protocol_status`, so both ack arms work. The gap was documentation only. Took the finding's preferred option (a) — comment-only, no wire-format or generated-type change: added explicit comments to the `acknowledge_alarm` reply-payload case and to the `AcknowledgeAlarmReplyPayload` message in `mxaccess_gateway.proto` stating both ack kinds reuse this case and consumers must dispatch on `MxCommandReply.kind`, and documented the contract in `docs/AlarmClientDiscovery.md` section 4. Added regression test `ProtobufContractRoundTripTests.MxCommandReply_AcknowledgeAlarmByName_ReusesAcknowledgeAlarmPayloadCase` pinning the by-name-ack → `acknowledge_alarm` reuse and asserting no by-name-specific reply oneof case exists.
### Contracts-003
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/MxGateway.Contracts/MxGateway.Contracts.csproj:10` |
| Status | Won't Fix |
**Description:** The `<Protobuf>` item for `mxaccess_worker.proto` omits `ProtoRoot="Protos"`, while the items for `mxaccess_gateway.proto` (line 9) and `galaxy_repository.proto` (line 11) both set it. `mxaccess_worker.proto` does `import "mxaccess_gateway.proto"`, which resolves only because Grpc.Tools adds the importing file's own directory to the proto path. The inconsistency is fragile — tooling changes to ProtoRoot handling could break import resolution.
**Recommendation:** Add `ProtoRoot="Protos"` to the `mxaccess_worker.proto` `<Protobuf>` item so all three entries are consistent.
**Resolution:** _(2026-05-18)_ Re-triaged as not-a-defect: the finding's premise is factually wrong. Line 10 of `MxGateway.Contracts.csproj` already carries `ProtoRoot="Protos"` — all three `<Protobuf>` items are already consistent. `git show 6c64030:src/MxGateway.Contracts/MxGateway.Contracts.csproj` (the reviewed commit) confirms the attribute was present at review time too; the csproj has not been touched since `133c830`. No code change made. Status set to Won't Fix because there is nothing to fix.
### Contracts-004
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/MxGateway.Contracts/GatewayContractInfo.cs:3-6` |
| Status | Resolved |
**Description:** The XML summary says the class exposes version metadata "before generated protobuf contracts are introduced." Generated protobuf contracts have long been introduced and are consumed across the solution. The comment is stale; the class now holds the authoritative `GatewayProtocolVersion`/`WorkerProtocolVersion` advertised in `OpenSessionReply` and used to validate `WorkerEnvelope` framing.
**Recommendation:** Reword the summary to describe the current purpose — version constants advertised in `OpenSessionReply` and used to validate `WorkerEnvelope` protocol framing.
**Resolution:** _(2026-05-18)_ Confirmed stale — the class is consumed by `GatewayApplication`/`OpenSessionReply` and `WorkerEnvelope` framing checks across the solution. Reworded the XML summary on `GatewayContractInfo` to describe the actual current purpose: `GatewayProtocolVersion` is advertised to clients in `OpenSessionReply`, and `WorkerProtocolVersion` validates `WorkerEnvelope` protocol framing on the gateway↔worker pipe.
### Contracts-005
| Field | Value |
|---|---|
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto`, `src/MxGateway.Contracts/Protos/mxaccess_worker.proto` |
| Status | Resolved |
**Description:** The ProtobufStyleGuide mandates reserving removed field numbers / enum values. Evolution to date has been purely additive, so this is not a current violation — but none of the `.proto` files contain any `reserved` declarations, leaving no in-file guardrail for the first removal. This is a latent maintainability gap.
**Recommendation:** When any field or enum value is eventually removed, add a `reserved` range/name in the same change. Consider a short comment block in each message documenting the policy so future editors apply `reserved` rather than reusing tags.
**Resolution:** _(2026-05-18)_ Confirmed: no field or enum value has ever been removed, so adding `reserved` ranges now would be incorrect (there are no retired tags to reserve, and inventing ranges for never-used numbers would itself violate the contract). Took the finding's least-invasive option — added a short wire-compatibility policy comment block at the top of all three `.proto` files (`mxaccess_gateway.proto`, `mxaccess_worker.proto`, `galaxy_repository.proto`) stating the additive-only rule and instructing future editors to add a `reserved` range + name in the same change as any removal. Comment-only, no wire-format or generated-type change. The `reserved` declarations themselves remain correctly deferred to the first actual removal.
### Contracts-006
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:647` |
| Status | Resolved |
**Description:** `MxStatusProxy.success` is declared `int32 success = 1` with no comment. The name reads like a boolean flag but the type is a 32-bit integer (mirroring MXAccess `MXSTATUS_PROXY`, which stores a numeric success/HResult-like value). Without a comment a client author can reasonably misinterpret the field (treat non-1 as failure, or expect only 0/1).
**Recommendation:** Add a comment clarifying the semantic — what range of values it carries and how 0 vs non-zero map to MXAccess status — per the style guide rule to comment fields carrying raw MXAccess status detail.
**Resolution:** _(2026-05-18)_ Confirmed: `int32 success = 1` had no comment. Cross-checked against the worker `MxStatusProxyConverter`, which reads the COM struct's `success` field verbatim (a 16-bit signed value) without reinterpretation, and against the MXAccess analysis (`MXAccess-Public-API.md`: `MxStatus`/`MXSTATUS_PROXY` are identical structs with a `short success` member). Added a field comment to `MxStatusProxy.success` stating it mirrors the COM struct's numeric `success` member (NOT a boolean), is carried verbatim for diagnostics, and that clients should branch on `category` (`MX_STATUS_CATEGORY_OK` marks success) — deliberately avoiding an over-specified 0-vs-1 claim, since the gateway never maps `success` to an outcome and `category` is the authoritative field. Comment-only change.
### Contracts-007
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| Status | Resolved |
**Description:** `ProtobufContractRoundTripTests` covers gateway command/reply/event, alarm transition, alarm ack request/reply, active-alarm snapshot, and the worker envelope. It has no coverage for: (a) any `galaxy_repository.proto` message (`DiscoverHierarchy*`, `GalaxyObject`, `GalaxyAttribute`, `DeployEvent`, the `root` oneof, wrapper-typed fields); (b) `BulkSubscribeReply`/`SubscribeResult` and the bulk command kinds; (c) `MxValue`/`MxArray` `raw_value`/`RawArray` (`bytes`) paths and the `WorkerFault`/`WorkerHeartbeat` IPC bodies.
**Recommendation:** Add round-trip tests for the Galaxy Repository messages (including the `root` oneof and proto wrapper fields), the bulk-subscribe reply, and the remaining `WorkerEnvelope` body cases.
**Resolution:** _(2026-05-18)_ Confirmed the listed gaps and added round-trip tests to `ProtobufContractRoundTripTests` covering all three areas: (a) Galaxy Repository — `GalaxyRepositoryDescriptor_ContainsBrowseServiceMethods`, `DiscoverHierarchyRequest_RoundTripsRootOneofAndWrapperFields` (a `[Theory]` exercising all three `root` oneof arms plus the `Int32Value` wrapper `max_depth`), `DiscoverHierarchyReply_RoundTripsObjectAndAttributeGraph`, `DeployEvent_RoundTripsTimestampAndCounters`, `GalaxyConnectionReplies_RoundTrip`; (b) `BulkSubscribeReply_RoundTripsSubscribeResults` and `MxCommandReply_RoundTripsBulkSubscribePayload` (bulk-subscribe command kind + payload case); (c) `MxValue_RoundTripsRawValueBytesPayload`, `MxArray_RoundTripsRawArrayPayload`, `WorkerEnvelope_RoundTripsWorkerFaultBody`, `WorkerEnvelope_RoundTripsWorkerHeartbeatBody`. All new tests pass; the full `ProtobufContractRoundTripTests` class is 27 tests green.
### Contracts-008
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:451-459`, `:627-636` |
| Status | Resolved |
**Description:** The worker-side `AcknowledgeAlarmReplyPayload` carries the alarm-ack outcome as `int32 native_status`, while the public `AcknowledgeAlarmReply` carries it as `MxStatusProxy status` plus `optional int32 hresult`. The comment explains the worker echoes `native_status` into `AcknowledgeAlarmReply.hresult`, but the two outcome shapes (raw `int32` vs structured `MxStatusProxy`) are not reconciled in `docs/Contracts.md` / `AlarmClientDiscovery.md`. A reader cannot tell whether `MxStatusProxy status` is always populated or only on COM-layer failure.
**Recommendation:** Document in `docs/Contracts.md` (or `AlarmClientDiscovery.md`) how the worker `native_status` maps onto the public reply's `status`/`hresult` pair so client authors know which field is authoritative.
**Resolution:** _(2026-05-18)_ Verified against `WorkerAlarmRpcDispatcher.AcknowledgeAsync`. The asymmetry is larger than the finding implies: the dispatcher copies the worker `MxCommandReply.hresult` into `AcknowledgeAlarmReply.hresult` but **never** assigns `AcknowledgeAlarmReply.status` — the `MxStatusProxy status` field is left UNSET on every reply. The proto comment on `status` ("Native MxAccess status describing the outcome of the ack") was therefore actively misleading. Fixed: (1) reworded the `mxaccess_gateway.proto` comments on `AcknowledgeAlarmReply.hresult` (now identifies it as the authoritative native-return-code field) and `AcknowledgeAlarmReply.status` (now states it is reserved/unset and clients must not depend on it); (2) extended `docs/AlarmClientDiscovery.md` section 4 with a "Worker `native_status` → public `AcknowledgeAlarmReply` mapping" subsection spelling out that `hresult` is authoritative (`0` = success) and `status` is always unset, and that clients should branch on `protocol_status` then `hresult`, never `status`.