code-reviews: 2026-06-18 re-review of array-write-ergonomics feature at 88915c3

Re-reviewed the 10 modules touched by the MxSparseArray / array-write
ergonomics work (8df5ab3..88915c3). 16 new findings:

- Server-057 (Medium): [] AddItem normalization skips AddItemBulk/AddBufferedItem
- Client.Dotnet-030 (Medium): advise-supervisory missing from IsKnownGatewayCommand (dead command)
- 14 Low: MxSparseArray doc/test gaps, advise-supervisory CLI gaps across clients,
  Client.Java-049 / Client.Python-037 version-bump consistency misses

Worker.Tests and IntegrationTests clean. Worker unchanged by the feature, not re-reviewed.
Joseph Doherty
2026-06-18 10:33:47 -04:00
parent 88915c3d9a
commit 85ef453d0d
11 changed files with 588 additions and 39 deletions
+48 -3
@@ -4,10 +4,10 @@
|---|---|
| Module | `clients/dotnet` |
| Reviewer | Claude Code |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Review date | 2026-06-18 |
| Commit reviewed | `88915c3` |
| Status | Re-reviewed |
| Open findings | 0 |
| Open findings | 1 |
## Checklist coverage
@@ -604,6 +604,32 @@ Net effect at HEAD: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx
**Resolution:** 2026-06-15 — Confirmed against source: `Children => _children` returned the live mutable backing `List<LazyBrowseNode>` and `IsExpanded => _isExpanded` read a plain `bool`, while `ExpandAsync` appended to that same list under `_expandLock` with no release/acquire barrier to lock-free readers — so a concurrent reader could enumerate a mid-append list and throw `InvalidOperationException` ("collection was modified"). Applied option (b) (safe publication): `ExpandAsync` now accumulates children into a method-local `List<LazyBrowseNode>` and, only when fully drained across all pages, publishes it via `Volatile.Write(ref _children, children)` (release) immediately before setting the now-`volatile bool _isExpanded = true`. The `_children` field is an `IReadOnlyList<LazyBrowseNode>` read via `Volatile.Read` from the `Children` getter (acquire), so a reader that observes `IsExpanded == true` always sees the fully-populated snapshot and never enumerates a partially-built list. Updated the `ExpandAsync` `<remarks>` to document the strengthened concurrent-read guarantee. Regression test `LazyBrowseNodeTests.Expand_ConcurrentReadOfChildren_NeverTearsAndPublishesAtomically` gates the child-page RPCs (via a new `FakeGalaxyRepositoryTransport.BrowseChildrenGate` hook) to hold the expand mid-flight while a background reader spins enumerating `Children` and reading `IsExpanded`, asserting no exception escapes and that once `IsExpanded` is true the published snapshot has all five children. Verified red against the pre-fix code (the reader threw `InvalidOperationException: Collection was modified` deterministically across three runs) and green after the fix.
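The safe-publication pattern described in this resolution can be sketched in isolation. This is a minimal, synchronous illustration rather than the real `LazyBrowseNode`: the class name `LazyNodeSketch`, the `string` child type, and the `pages` parameter are hypothetical stand-ins, and the `_expandLock` and paging RPCs are omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for LazyBrowseNode; children are plain strings here.
public sealed class LazyNodeSketch
{
    // Readers only ever see a fully built, never-again-mutated snapshot.
    private IReadOnlyList<string> _children = Array.Empty<string>();
    private volatile bool _isExpanded;

    // Acquire read: pairs with the release write in Expand.
    public IReadOnlyList<string> Children => Volatile.Read(ref _children);

    public bool IsExpanded => _isExpanded;

    // Assumption: the real method is async and runs under a lock; this sketch is synchronous.
    public void Expand(IEnumerable<IEnumerable<string>> pages)
    {
        // Accumulate into a method-local list that no reader can observe.
        var local = new List<string>();
        foreach (var page in pages)
        {
            local.AddRange(page);
        }

        // Publish the finished list first (release), then raise the flag.
        Volatile.Write(ref _children, local);
        _isExpanded = true;
    }
}
```

Because the list is published before the flag is set, a reader that observes `IsExpanded == true` and then reads `Children` gets the complete snapshot; a reader that arrives earlier sees the empty initial list rather than a partially appended one.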
#### 2026-06-18 re-review (commit 88915c3)
Re-review of changes since `8df5ab3`. The diff adds `WriteArrayElementsAsync` /
`BuildSparseArray` to `MxGatewaySession`, an `advise-supervisory` CLI subcommand,
the Client.Dotnet-028 (`TryResolveApiKey`) and Client.Dotnet-029 (`IMxGatewayCliClient`
summary) in-source fixes, two tests covering the new sparse-array helper, a "Write
Semantics And Common Pitfalls" section in README.md, the `LazyBrowseNode` Client.Dotnet-027
rationale comment, and a version bump (`0.1.1` → `0.1.2`). The 028 and 029 fixes are
correctly applied. The `isLongRunning` / `galaxy-browse` fix from Client.Dotnet-026 is
correctly present. One Medium correctness bug found: `advise-supervisory` is in the
dispatch table but missing from `IsKnownGatewayCommand`, making the command
unreachable (exit 2 "Unknown command").
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found (Client.Dotnet-030): `advise-supervisory` is present in the `command switch` dispatch table but absent from `IsKnownGatewayCommand`; the guard at line 91 intercepts it first and returns exit code 2 "Unknown command", making the command completely non-functional. `WriteArrayElementsAsync` / `BuildSparseArray` logic is correct: `elementDataType` and `totalLength` are threaded through faithfully, `MxValue.SparseArrayValue` is set and the outer `MxValue.DataType` (unused by the expander) is left at the proto3 default — consistent with all other language clients. Index validation (out-of-range, duplicate, zero total_length) is correctly deferred to `SparseArrayExpander` gateway-side, consistent with Go/Rust/Python/Java. |
| 2 | mxaccessgw conventions | No issues found — no forked proto, `authorization: Bearer` metadata unchanged, MXAccess parity preserved (sparse array is a write-only helper, reset-not-preserve semantics documented). `Async` suffix on `WriteArrayElementsAsync` correct. `BuildSparseArray` is `internal static` — appropriate since it is used by both the method and tests. |
| 3 | Concurrency & thread safety | No issues found — `BuildSparseArray` is a pure static factory with no shared state; `WriteArrayElementsAsync` delegates to the existing `WriteAsync`. |
| 4 | Error handling & resilience | No issues found — `ArgumentNullException.ThrowIfNull(elements)` covers the null-dict case; invalid indices / unsupported element types surface as `InvalidArgument` from `SparseArrayExpander`, which the existing `RpcExceptionMapper` maps to `MxGatewayException` with `StatusCode`. |
| 5 | Security | No issues found — `TryResolveApiKey` correctly wired; regression test covers the env-var-sourced key path. |
| 6 | Performance & resource management | No issues found — `BuildSparseArray` is O(n) allocation with no unnecessary copies; the protobuf `repeated` list is built in one pass. |
| 7 | Design-document adherence | No issues found — sparse array semantics match the proto comment on `MxSparseArray` ("reset, NOT preserved") and `SparseArrayExpander`'s design; the README "bare-name auto-normalized to `[]` form at AddItem" claim is confirmed by `GatewaySession.cs:973` and `SessionManager.cs:52`. |
| 8 | Code organization & conventions | No issues found beyond the correctness finding above (missing `IsKnownGatewayCommand` entry is the same defect). |
| 9 | Testing coverage | No issues found — `BuildSparseArray_ProducesSparseArrayValueWithCorrectTotalLengthAndElements` and `WriteArrayElementsAsync_BuildsWriteCommandWithSparseArrayValue` cover the happy path; `RunAsync_ErrorOutput_RedactsApiKey_WhenSourcedFromEnvironmentVariable` covers the Client.Dotnet-028 path. No test for `advise-supervisory` (the new command is dead, so there is nothing to test until the `IsKnownGatewayCommand` gap is fixed). |
| 10 | Documentation & comments | No issues found — the "Write Semantics And Common Pitfalls" README section accurately describes default-fill / reset semantics, the supervisory-advise prerequisite for user attribution, and the auto-`[]` normalization. XML docs on `WriteArrayElementsAsync` and `BuildSparseArray` are accurate and complete; the `<remarks>` block on `WriteArrayElementsAsync` correctly emphasises "RESET, not preserve". |
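The "reset, not preserve" semantics referenced in rows 2, 7 and 10 can be illustrated with a small sketch. It is not the gateway's `SparseArrayExpander`: the class and method names are hypothetical, elements are `int` only, and zero-based indices are an assumption made for the illustration (the review does not state the index base).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of reset-not-preserve expansion; not the gateway's code.
public static class SparseResetSketch
{
    // Builds the full array that a sparse write would produce.
    // Assumption: indices are zero-based and elements are ints.
    public static int[] Expand(int totalLength, IReadOnlyDictionary<int, int> elements)
    {
        if (elements is null) throw new ArgumentNullException(nameof(elements));
        if (totalLength <= 0) throw new ArgumentOutOfRangeException(nameof(totalLength));

        // Every slot starts at the type default; previous values are NOT carried over.
        var result = new int[totalLength];
        foreach (var pair in elements)
        {
            if (pair.Key < 0 || pair.Key >= totalLength)
            {
                throw new ArgumentOutOfRangeException(nameof(elements), $"Index {pair.Key} is out of range.");
            }
            result[pair.Key] = pair.Value;
        }
        return result;
    }
}
```

Under these assumptions, `Expand(5, { [1] = 10, [3] = 30 })` yields a five-element array in which slots 0, 2 and 4 hold the default value regardless of what the attribute held before the write, which is the pitfall the README section warns about.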
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the .NET client delta: `LazyBrowseNode` lazy paging + tests, the new `MxGatewayClientCli` galaxy-browse surface + tests, `GalaxyClientFactory`/adapter seam. Client.Dotnet-025 (LazyBrowseNode publish ordering) confirmed resolved. One Medium security regression.
@@ -680,3 +706,22 @@ Re-review of the .NET client delta: `LazyBrowseNode` lazy paging + tests, the ne
**Recommendation:** Add a one-line `<summary>` describing the interface and noting `MxGatewayCliClientAdapter` is the production binding.
**Resolution:** 2026-06-16 — Confirmed against source: the interface declaration at `IMxGatewayCliClient.cs:6` had no type-level `<summary>` (only the members were documented). Added a type-level `<summary>` describing the interface as the CLI's transport seam over the gateway and Galaxy Repository RPCs, naming `MxGatewayCliClientAdapter` (over a real `MxGatewayClient`) as the production binding and the in-memory fake as the test substitute. Pure documentation change — no test needed.
### Client.Dotnet-030
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:91-93,113,2023-2050` |
| Status | Open |
**Description:** `advise-supervisory` was added to the `command switch` dispatch table at line 113 but was not added to `IsKnownGatewayCommand` (the exhaustive list at lines 2023-2050). The guard at line 91 evaluates `IsKnownGatewayCommand(command)` before the dispatch table is reached; because `"advise-supervisory"` is absent from that list, `WriteUnknownCommand` is called and the method returns exit code 2 with "Unknown command 'advise-supervisory'." printed to stderr. The handler at line 113 is dead code; it can never execute.
The README documents `advise-supervisory` (`clients/dotnet/README.md:159` "The CLI exposes the same command as `advise-supervisory`") and `WriteUsage` lists it (line 2093), so callers following the docs will receive a confusing failure with no obvious remedy.
Note: `"advise"` is correctly present in `IsKnownGatewayCommand` (line 2030); the omission of `"advise-supervisory"` is an oversight introduced when the command was added in this diff.
**Recommendation:** Add `or "advise-supervisory"` to the `IsKnownGatewayCommand` expression (e.g. after `"advise"` at line 2030). Add a test (`MxGatewayClientCliTests`) that invokes `advise-supervisory` through `RunAsync` with a fake client and asserts exit code 0 (not 2) and that the reply is written to stdout — this would have caught the regression immediately.
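The shape of the defect and of the recommended fix can be reproduced in a few lines. This is a reduced sketch, not the real `MxGatewayClientCli`: the class `GuardSketch`, the `write` command, and the handlers that simply return 0 are hypothetical, and only the guard-before-dispatch ordering is taken from the finding.

```csharp
using System;

// Hypothetical reduction of the CLI's guard-then-dispatch structure.
public static class GuardSketch
{
    // Before the fix: the known-command list omits the newly added command.
    public static bool IsKnownBefore(string command) =>
        command is "advise" or "write";

    // After the fix: the list includes the new command.
    public static bool IsKnownAfter(string command) =>
        command is "advise" or "advise-supervisory" or "write";

    // The guard runs before the dispatch table, so a missing list entry
    // makes the matching switch arm unreachable.
    public static int Run(string command, Func<string, bool> isKnown)
    {
        if (!isKnown(command))
        {
            Console.Error.WriteLine($"Unknown command '{command}'.");
            return 2;
        }

        return command switch
        {
            "advise" => 0,
            "advise-supervisory" => 0,
            "write" => 0,
            _ => 2,
        };
    }
}
```

With the pre-fix list, `GuardSketch.Run("advise-supervisory", GuardSketch.IsKnownBefore)` returns 2 even though a handler exists; with the corrected list it returns 0. A regression test along the lines recommended above would assert exactly that difference through `RunAsync`.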
**Resolution:** _(empty until closed)_