diff --git a/code-reviews/Client.Dotnet/findings.md b/code-reviews/Client.Dotnet/findings.md index cddc11c..e2b2305 100644 --- a/code-reviews/Client.Dotnet/findings.md +++ b/code-reviews/Client.Dotnet/findings.md @@ -4,10 +4,10 @@ |---|---| | Module | `clients/dotnet` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 1 | ## Checklist coverage @@ -604,6 +604,32 @@ Net effect at HEAD: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx **Resolution:** 2026-06-15 — Confirmed against source: `Children => _children` returned the live mutable backing `List` and `IsExpanded => _isExpanded` read a plain `bool`, while `ExpandAsync` appended to that same list under `_expandLock` with no release/acquire barrier to lock-free readers — so a concurrent reader could enumerate a mid-append list and throw `InvalidOperationException` ("collection was modified"). Applied option (b) (safe publication): `ExpandAsync` now accumulates children into a method-local `List` and, only when fully drained across all pages, publishes it via `Volatile.Write(ref _children, children)` (release) immediately before setting the now-`volatile bool _isExpanded = true`. The `_children` field is an `IReadOnlyList` read via `Volatile.Read` from the `Children` getter (acquire), so a reader that observes `IsExpanded == true` always sees the fully-populated snapshot and never enumerates a partially-built list. Updated the `ExpandAsync` `` to document the strengthened concurrent-read guarantee. Regression test `LazyBrowseNodeTests.Expand_ConcurrentReadOfChildren_NeverTearsAndPublishesAtomically` gates the child-page RPCs (via a new `FakeGalaxyRepositoryTransport.BrowseChildrenGate` hook) to hold the expand mid-flight while a background reader spins enumerating `Children` and reading `IsExpanded`, asserting no exception escapes and that once `IsExpanded` is true the published snapshot has all five children. Verified red against the pre-fix code (the reader threw `InvalidOperationException: Collection was modified` deterministically across three runs) and green after the fix. +#### 2026-06-18 re-review (commit 88915c3) + +Re-review of changes since `8df5ab3`. The diff adds `WriteArrayElementsAsync` / +`BuildSparseArray` to `MxGatewaySession`, an `advise-supervisory` CLI subcommand, +the Client.Dotnet-028 (`TryResolveApiKey`) and Client.Dotnet-029 (`IMxGatewayCliClient` +summary) in-source fixes, two tests covering the new sparse-array helper, a "Write +Semantics And Common Pitfalls" section in README.md, the `LazyBrowseNode` Client.Dotnet-027 +rationale comment, and a version bump (`0.1.1` → `0.1.2`). The 029 and 028 fixes are +correctly applied. The `isLongRunning` / `galaxy-browse` fix from Client.Dotnet-026 is +correctly present. One Medium correctness bug found: `advise-supervisory` is in the +dispatch table but missing from `IsKnownGatewayCommand`, making the command +unreachable (exit 2 "Unknown command"). + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | Issue found (Client.Dotnet-030): `advise-supervisory` is present in the `command switch` dispatch table but absent from `IsKnownGatewayCommand`; the guard at line 91 intercepts it first and returns exit code 2 "Unknown command", making the command completely non-functional. `WriteArrayElementsAsync` / `BuildSparseArray` logic is correct: `elementDataType` and `totalLength` are threaded through faithfully, `MxValue.SparseArrayValue` is set and the outer `MxValue.DataType` (unused by the expander) is left at the proto3 default — consistent with all other language clients. Index validation (out-of-range, duplicate, zero total_length) is correctly deferred to `SparseArrayExpander` gateway-side, consistent with Go/Rust/Python/Java. | +| 2 | mxaccessgw conventions | No issues found — no forked proto, `authorization: Bearer` metadata unchanged, MXAccess parity preserved (sparse array is a write-only helper, reset-not-preserve semantics documented). `Async` suffix on `WriteArrayElementsAsync` correct. `BuildSparseArray` is `internal static` — appropriate since it is used by both the method and tests. | +| 3 | Concurrency & thread safety | No issues found — `BuildSparseArray` is a pure static factory with no shared state; `WriteArrayElementsAsync` delegates to the existing `WriteAsync`. | +| 4 | Error handling & resilience | No issues found — `ArgumentNullException.ThrowIfNull(elements)` covers the null-dict case; invalid indices / unsupported element types surface as `InvalidArgument` from `SparseArrayExpander`, which the existing `RpcExceptionMapper` maps to `MxGatewayException` with `StatusCode`. | +| 5 | Security | No issues found — `TryResolveApiKey` correctly wired; regression test covers the env-var-sourced key path. | +| 6 | Performance & resource management | No issues found — `BuildSparseArray` is O(n) allocation with no unnecessary copies; the protobuf `repeated` list is built in one pass. | +| 7 | Design-document adherence | No issues found — sparse array semantics match the proto comment on `MxSparseArray` ("reset, NOT preserved") and `SparseArrayExpander`'s design; the README "bare-name auto-normalized to `[]` form at AddItem" claim is confirmed by `GatewaySession.cs:973` and `SessionManager.cs:52`. | +| 8 | Code organization & conventions | No issues found beyond the correctness finding above (missing `IsKnownGatewayCommand` entry is the same defect). | +| 9 | Testing coverage | No issues found — `BuildSparseArray_ProducesSparseArrayValueWithCorrectTotalLengthAndElements` and `WriteArrayElementsAsync_BuildsWriteCommandWithSparseArrayValue` cover the happy path; `RunAsync_ErrorOutput_RedactsApiKey_WhenSourcedFromEnvironmentVariable` covers the Client.Dotnet-028 path. No test for `advise-supervisory` (the new command is dead, so there is nothing to test until the `IsKnownGatewayCommand` gap is fixed). | +| 10 | Documentation & comments | No issues found — the "Write Semantics And Common Pitfalls" README section accurately describes default-fill / reset semantics, the supervisory-advise prerequisite for user attribution, and the auto-`[]` normalization. XML docs on `WriteArrayElementsAsync` and `BuildSparseArray` are accurate and complete; the `` block on `WriteArrayElementsAsync` correctly emphasises "RESET, not preserve". | + #### 2026-06-16 re-review (commit 8df5ab3) Re-review of the .NET client delta: `LazyBrowseNode` lazy paging + tests, the new `MxGatewayClientCli` galaxy-browse surface + tests, `GalaxyClientFactory`/adapter seam. Client.Dotnet-025 (LazyBrowseNode publish ordering) confirmed resolved. One Medium security regression. @@ -680,3 +706,22 @@ Re-review of the .NET client delta: `LazyBrowseNode` lazy paging + tests, the ne **Recommendation:** Add a one-line `` describing the interface and noting `MxGatewayCliClientAdapter` is the production binding. **Resolution:** 2026-06-16 — Confirmed against source: the interface declaration at `IMxGatewayCliClient.cs:6` had no type-level `` (only the members were documented). Added a type-level `` describing the interface as the CLI's transport seam over the gateway and Galaxy Repository RPCs, naming `MxGatewayCliClientAdapter` (over a real `MxGatewayClient`) as the production binding and the in-memory fake as the test substitute. Pure documentation change — no test needed. + +### Client.Dotnet-030 + +| Field | Value | +|---|---| +| Severity | Medium | +| Category | Correctness & logic bugs | +| Location | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:91-93,113,2023-2050` | +| Status | Open | + +**Description:** `advise-supervisory` was added to the `command switch` dispatch table at line 113 but was not added to `IsKnownGatewayCommand` (the exhaustive list at lines 2023–2050). The guard at line 91 evaluates `IsKnownGatewayCommand(command)` before the dispatch table is reached; because `"advise-supervisory"` is absent from that list, `WriteUnknownCommand` is called and the method returns exit code 2 with "Unknown command 'advise-supervisory'." printed to stderr. The handler at line 113 is dead code — it can never execute. + +The README documents `advise-supervisory` (`clients/dotnet/README.md:159` "The CLI exposes the same command as `advise-supervisory`") and `WriteUsage` lists it (line 2093), so callers following the docs will receive a confusing failure with no obvious remedy. + +Note: `"advise"` is correctly present in `IsKnownGatewayCommand` (line 2030); the omission of `"advise-supervisory"` is an oversight introduced when the command was added in this diff. + +**Recommendation:** Add `or "advise-supervisory"` to the `IsKnownGatewayCommand` expression (e.g. after `"advise"` at line 2030). Add a test (`MxGatewayClientCliTests`) that invokes `advise-supervisory` through `RunAsync` with a fake client and asserts exit code 0 (not 2) and that the reply is written to stdout — this would have caught the regression immediately. + +**Resolution:** _(empty until closed)_ diff --git a/code-reviews/Client.Go/findings.md b/code-reviews/Client.Go/findings.md index e05d97d..a15a747 100644 --- a/code-reviews/Client.Go/findings.md +++ b/code-reviews/Client.Go/findings.md @@ -4,10 +4,10 @@ |---|---| | Module | `clients/go` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 3 | ## Checklist coverage @@ -798,3 +798,70 @@ if ($dirty) { **Recommendation:** Add a complete subcommand reference, and document the `batch` EOR-sentinel protocol and line cap. **Resolution:** 2026-06-16 — Expanded the README CLI section with a "Subcommand reference" table covering all 27 subcommands wired into `run` (incl. `ping`, `galaxy-browse`, `read-bulk`, the four bulk-write variants, `bench-read-bulk`, `stream-alarms`, `acknowledge-alarm`, `batch`), refreshed the example block, and added a "`batch` mode" subsection documenting the `__MXGW_BATCH_EOR__` end-of-result sentinel, the JSON error framing, blank-line skipping, and the 16 MiB scanner line cap. + +#### 2026-06-18 re-review (commit 88915c3) + +Re-review of `clients/go/` changes since `8df5ab3`: `WriteArrayElements` default-fill helper (`mxgateway/session.go`), `MxSparseArray`/`MxSparseElement` type aliases (`types.go`), `advise-supervisory` CLI subcommand (`cmd/mxgw-go/main.go`), prior-finding fixes (`runStreamEvents` signal handler, `closeSmokeSession` double-defer, `runGalaxyWatch` limit-drain, test gaps from Client.Go-033), and README write-semantics documentation. `gofmt -l .`, `go vet ./...`, `go build ./...`, and `go test ./... -count=1` are all clean. + +`WriteArrayElements` is structurally correct: `sort.Slice` on unique `uint32` keys produces a deterministic slice (no stable-sort needed); ranging over a nil map produces an empty slice without panic (Go idiom); the `SparseArrayValue` oneof arm is set correctly. Three new findings: `advise-supervisory` missing from `writeUsage` and the README subcommand table; no CLI-level test for `advise-supervisory`; README claims `write2` as a CLI command when Go exposes only `write2-bulk`. + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | No issues found. `buildSparseArrayValue` handles nil map (empty-range, no panic), duplicate-free `uint32` keys, and deterministic sort correctly. | +| 2 | mxaccessgw conventions | No issues found. `gofmt -l .` / `go vet ./...` clean; the direct `pb` import in `runAdviseSupervisory` matches the README-documented escape-hatch pattern. | +| 3 | Concurrency & thread safety | No issues found. `WriteArrayElements` is a pure helper with no shared state. | +| 4 | Error handling & resilience | No issues found. `WriteArrayElements` delegates to `Write`; error propagation is consistent with sibling methods. | +| 5 | Security | No issues found. | +| 6 | Performance & resource management | No issues found. `sort.Slice` on a small caller-sized slice is appropriate. | +| 7 | Design-document adherence | No issues found. `SparseArrayValue` oneof arm matches the proto contract; RESET semantics documented in code and README. | +| 8 | Code organization & conventions | `advise-supervisory` is wired into `run()` but absent from `writeUsage()` and the README subcommand table (Client.Go-035). | +| 9 | Testing coverage | `advise-supervisory` has no CLI-level test, not even the session-id-required guard (Client.Go-036). | +| 10 | Documentation & comments | README "Write Semantics" section claims "`write` / `write2` take `--user-id`" but Go CLI has no standalone `write2` command (Client.Go-037). | + +### Client.Go-035 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Code organization & conventions | +| Location | `clients/go/cmd/mxgw-go/main.go:1298`, `clients/go/README.md:328-355` | +| Status | Open | + +**Description:** `advise-supervisory` is wired into the `run()` dispatch switch at `main.go:91-92` but is absent from two surfaces a user consults to discover CLI commands: + +1. `writeUsage()` at `main.go:1298` does not list `advise-supervisory` in its pipe-separated command enumeration, so `mxgw-go` invoked with no arguments or an unknown command prints a usage banner that omits the command. +2. The README "Subcommand reference" table at `README.md:328-355` — added in commit `9eedf9d` to be a complete canonical list — also omits `advise-supervisory`. + +This is exactly the shape Client.Go-012 (resolved 2026-05-20) and Client.Go-034 (resolved 2026-06-16) documented for previously-missing commands. + +**Recommendation:** Add `advise-supervisory` to the `writeUsage` string and to the README subcommand table (e.g., `| advise-supervisory | Advise one item supervisory — required before a userID-attributed plain Write. |`). + +### Client.Go-036 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `clients/go/cmd/mxgw-go/main_test.go`, `clients/go/cmd/mxgw-go/main.go:364-399` | +| Status | Open | + +**Description:** `runAdviseSupervisory` has no CLI-level test in `main_test.go`. In particular the session-id-required guard at `main.go:376-378` is untested, unlike every other guard for session-id-required commands (e.g. `TestRunStreamEventsRequiresSessionID`, added in the same commit range by Client.Go-033). A future refactor that removes or conditions the guard has no regression catch. The pattern for adding such a test is already established in the test file and requires no bufconn fake. + +**Recommendation:** Add `TestRunAdviseSupervisoryRequiresSessionID` to `cmd/mxgw-go/main_test.go`, driving `runWithIO` with `[]string{"advise-supervisory", "-plaintext", "-api-key", "test"}` (no `-session-id`) and asserting `err.Error()` contains `"session-id is required"`. Mirrors `TestRunStreamEventsRequiresSessionID`. + +### Client.Go-037 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Documentation & comments | +| Location | `clients/go/README.md:136-137` | +| Status | Open | + +**Description:** The README "Write Semantics" section states: + +> The CLI exposes the same command as `advise-supervisory`, and `write` / `write2` take `--user-id`. + +The Go CLI has no standalone `write2` command — only `write2-bulk`. The analogous statement in the Python, Rust, and .NET README is accurate because those CLIs do expose `write2` as a standalone subcommand. A Go caller following this doc and attempting `mxgw-go write2 -session-id ... -type int32 -value 42 -timestamp 2026-01-01T00:00:00Z` receives `unknown command "write2"` (routed to the default branch of `run()`), not the expected MXAccess Write2 call. + +**Recommendation:** Change the sentence to accurately reflect the Go CLI surface, e.g.: "The CLI exposes the same command as `advise-supervisory`, and `write` takes `-user-id`." If a standalone `write2` command is intended for cross-client parity, add it (mirroring `runWrite` with the addition of a `-timestamp` flag and a `Write2Raw`/`Write2` SDK call). diff --git a/code-reviews/Client.Java/findings.md b/code-reviews/Client.Java/findings.md index e1ec9ad..389afca 100644 --- a/code-reviews/Client.Java/findings.md +++ b/code-reviews/Client.Java/findings.md @@ -4,10 +4,10 @@ |---|---| | Module | `clients/java` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 3 | ## Checklist coverage @@ -123,6 +123,31 @@ Re-review of the Java client delta: the §8 `GalaxyClientFactory` seam, `InProce | 9 | Testing coverage | Client.Java-045, Client.Java-046 | | 10 | Documentation & comments | Client.Java-047, Client.Java-048 | +### 2026-06-18 review (commit 88915c3) + +Re-review pass at `88915c3`. Diff against `8df5ab3` is six commits touching +`clients/java`: `8df0479` / `bdb7e14` / `8cebe43` / `bed647c` (Client.Java-040..048 +fixes — control-character JSON escaping, stream-alarms terminal-slot fix, async +overflow flood test, `InProcessGatewayHarness` Javadoc, Javadoc corrections, +and `MxGatewayClientVersion` bump to 0.1.1); `9eedf9d` (parity-gotchas docs + +`advise-supervisory` CLI subcommand across all language clients); +`e7b8aa6` (Java `writeArrayElements` default-fill SDK helper + session test); +`88915c3` (version bump `0.1.1 → 0.1.2` in `build.gradle`). Generated protobuf +Java (`src/main/generated/`) excluded from review — build churn only. + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | No issues found. `writeArrayElements` builds the `MxSparseArray` correctly: `elementDataType`, `totalLength`, and elements iterated via `TreeMap` for deterministic ascending order match the Go/Rust/Python/dotnet reference implementations. The `MxValues.decode` `SPARSE_ARRAY_VALUE -> null` arm is sound — the type is write-only and never returned by the gateway; returning `null` is the correct sentinel (matches `KIND_NOT_SET`). | +| 2 | mxaccessgw conventions | No issues found. `advise-supervisory` routes through `invokeCommand` using `MX_COMMAND_KIND_ADVISE_SUPERVISORY` — no MXAccess COM touched in the client, generated code untouched. | +| 3 | Concurrency & thread safety | No issues found. The `stream-alarms` terminal-slot rework (`AtomicBoolean terminated` + `AtomicReference terminal`) is a sound first-terminal-wins design. The poll-then-check-terminal drain loop is correct for the `terminal.set` publish ordering (`terminated=true` is set before `terminal.set(item)`, but the drain only reads `terminal` when `poll` returns null, so a retry on the next 50ms poll sees it). | +| 4 | Error handling & resilience | No issues found. `writeArrayElements` propagates transport/protocol errors via the existing `writeRaw` / `invokeCommand` path and its `MxGatewayException` contract. | +| 5 | Security | No issues found. No new auth surface, no logging of values or credentials. | +| 6 | Performance & resource management | No issues found. `new TreeMap<>(elements)` makes a defensive copy for deterministic iteration — correct and cheap for practical element counts. | +| 7 | Design-document adherence | No issues found. `writeArrayElements` delegates to `writeRaw`, which ultimately routes through the normal `MX_COMMAND_KIND_WRITE` path — MXAccess parity is preserved; the gateway expands the sparse descriptor, not the client. | +| 8 | Code organization & conventions | Issue found: `build.gradle` bumped to `0.1.2` but `MxGatewayClientVersion.CLIENT_VERSION` remains `"0.1.1"` and the tests assert `0.1.1` — same version-split as resolved Client.Java-044 (Client.Java-049). | +| 9 | Testing coverage | Issue found: the new `advise-supervisory` CLI subcommand has a `FakeSession` stub but no dedicated CLI-level test (Client.Java-050). | +| 10 | Documentation & comments | Issue found: `writeArrayElements` Javadoc documents `[0, totalLength)` index contract and `totalLength > 0` as required, but no client-side `IllegalArgumentException` is thrown for violations — only the Javadoc describes the constraint; Java `int` silently sign-extends to a large `uint32` on the wire for negative inputs (Client.Java-051). README dependency example still shows `0.1.1` (cross-ref Client.Java-049). | + ## Findings ### Client.Java-001 @@ -896,3 +921,57 @@ BrowseChildrenReply reply = galaxy.browseChildren( **Recommendation:** Add a `PROVIDER_STATUS` arm to `formatAlarmFeedMessage` that renders the provider status (mode / degraded / reason) consistently with the other alarm-feed arms — do not add a `default ->` that silently drops it, since the provider status is meaningful and the exhaustive switch is the compiler-enforced guard that catches exactly this kind of future contract drift. **Resolution:** 2026-06-15 — Confirmed via `gradle :zb-mom-ww-mxgateway-cli:compileJava` failing with "the switch expression does not cover all possible input values" at `MxGatewayCli.java:1699` on the Windows host. Added a `case PROVIDER_STATUS ->` arm to `formatAlarmFeedMessage` yielding `provider-status mode=%s degraded=%b reason=%s` (from `AlarmProviderStatus.getMode().name()` / `getDegraded()` / `getReason()`), plus the `import mxaccess_gateway.v1.MxaccessGateway.AlarmProviderStatus;`. No `default` arm — the exhaustive switch expression remains the compile-time guard against future `payload` oneof additions. Verified `gradle test` builds and passes on the Windows host (Java 21). + +### Client.Java-049 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Code organization & conventions | +| Location | `clients/java/build.gradle:16`, `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientVersion.java:12`, `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:59,89`, `clients/java/README.md:399` | +| Status | Open | + +**Description:** Commit `88915c3` (`chore(clients): bump all five clients 0.1.1 -> 0.1.2`) incremented `build.gradle` `version = '0.1.2'` but left `MxGatewayClientVersion.CLIENT_VERSION = "0.1.1"` unchanged. The two CLI test assertions that check the version string also still assert `0.1.1` (lines 59 and 89 of `MxGatewayCliTests.java`), and the `README.md` Maven dependency example at line 399 shows `:0.1.1`. The published Gradle artifact carries version `0.1.2` (from `build.gradle`) while the `version` CLI command reports `mxgateway-java 0.1.1` and the README tells a consumer to depend on `:0.1.1`. Same class of version drift as the resolved Client.Java-044 (where `0.1.0` vs `0.1.1` was the split) — the fix for Client.Java-044 bumped `CLIENT_VERSION` to `"0.1.1"` but the `build.gradle` bump to `0.1.2` was not accompanied by a matching `MxGatewayClientVersion` update. + +**Recommendation:** Bump `CLIENT_VERSION` to `"0.1.2"` in `MxGatewayClientVersion.java`, update the two `MxGatewayCliTests` assertions from `0.1.1` to `0.1.2`, and update the `README.md` dependency example coordinate to `:0.1.2`. Consider sourcing `CLIENT_VERSION` from a Gradle-generated resource file (e.g. via `processResources` task writing `version.properties`) so the two version strings cannot drift again. + +### Client.Java-050 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1046-1068` (new `AdviseSupervisoryCommand`), `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:1306-1313` (stub) | +| Status | Open | + +**Description:** Commit `9eedf9d` added the `advise-supervisory` CLI subcommand (`AdviseSupervisoryCommand`) to all language client CLIs. The Java `FakeSession.adviseSupervisoryRaw` stub was added to `MxGatewayCliTests` but no test exercises the new subcommand path. There is no test that calls `execute(factory, "advise-supervisory", "--session-id", "s", "--server-handle", "1", "--item-handle", "2")` and asserts the command routes through `session.adviseSupervisoryRaw`, produces a non-zero exit code on failure, or emits the correct JSON / text output. The `adviseCalled` field shared with `adviseRaw` means even an indirect smoke path that calls `advise` could mask a missing `adviseSupervisory` wire. Every other new CLI subcommand in this diff has a dedicated CLI-level test (the `writeArrayElements` helper has a session-level test in `MxGatewayClientSessionTests`). + +**Recommendation:** Add a `@Test void adviseSupervisoryCommandCallsAdviseSupervisoryRaw()` to `MxGatewayCliTests` that exercises the subcommand via `execute(factory, "advise-supervisory", "--session-id", "s", "--server-handle", "12", "--item-handle", "34")` and asserts exit code 0, that `factory.client.session.adviseCalled` (or a dedicated `adviseSupervisoryCalled` boolean) is true, and that the output contains the reply kind string `MX_COMMAND_KIND_ADVISE_SUPERVISORY`. Consider renaming `adviseCalled` to `adviseSupervisoryCalled` for the `adviseSupervisoryRaw` stub (a separate `adviseCalled` for `adviseRaw`) to prevent future tests from masking each other. + +### Client.Java-051 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Documentation & comments | +| Location | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewaySession.java:622-657` | +| Status | Open | + +**Description:** `writeArrayElements` accepts `int totalLength` and `Map elements` whose keys are plain Java `int`. The proto fields `MxSparseArray.total_length` and `MxSparseElement.index` are both `uint32`. Java's protobuf runtime maps `uint32` to `int` (Java has no unsigned primitive), so passing a negative value to `setTotalLength(int)` or `setIndex(int)` silently sets the wire field to the two's-complement reinterpretation (e.g. `-1` → `4294967295`). The gateway will likely reject the resulting request with `INVALID_ARGUMENT`, but the error message will reference a large `uint32` value rather than the caller's negative `int`, making the failure hard to diagnose. The Javadoc states "supplied indices must be within `[0, totalLength)`" and "`totalLength` is required" but does not state what happens with negative inputs, and no `IllegalArgumentException` is thrown. All other language clients use unsigned types (`uint`, `uint32`, `u32`) that prevent negatives at the type level; Java cannot replicate that, so explicit validation is the correct substitute. The Python client is similarly unvalidated and its docstring explicitly defers to the gateway for rejection — but Python's `grpc` runtime raises an internal exception on negative `uint32` fields before the network call, so it fails more obviously than Java's silent wire wrap. + +**Recommendation:** Add client-side guards before the `MxSparseArray.Builder` population: + +```java +if (totalLength <= 0) { + throw new IllegalArgumentException("totalLength must be > 0, got " + totalLength); +} +for (Map.Entry entry : elements.entrySet()) { + int idx = entry.getKey(); + if (idx < 0 || idx >= totalLength) { + throw new IllegalArgumentException( + "element index " + idx + " is out of range [0, " + totalLength + ")"); + } +} +``` + +Add a test in `MxGatewayClientSessionTests` asserting both `IllegalArgumentException` paths (negative `totalLength`, negative/out-of-range index). Duplicate-index detection can be left to the gateway (the proto `repeated` field allows duplicates, and the gateway can sort out semantics). diff --git a/code-reviews/Client.Python/findings.md b/code-reviews/Client.Python/findings.md index 9bb1adf..731196c 100644 --- a/code-reviews/Client.Python/findings.md +++ b/code-reviews/Client.Python/findings.md @@ -4,13 +4,41 @@ |---|---| | Module | `clients/python` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 2 | ## Checklist coverage +### 2026-06-18 re-review (commit 88915c3) + +Re-review of the Python client delta at `88915c3` over base `8df5ab3`. Feature +scope: `Session.write_array_elements` default-fill sparse-array helper, the new +`advise-supervisory` CLI subcommand, prior 032–036 fixes carried, export +additions for `BrowseChildrenOptions` / `LazyBrowseNode`, version bump 0.1.1 → +0.1.2, README "Write Semantics" doc section, and the corresponding generated +`mxaccess_gateway_pb2.py` descriptor update. + +Generated-file churn check (memory `project_python_client_regen_pin`): only +`mxaccess_gateway_pb2.py` changed, exactly one DESCRIPTOR line was replaced +(adding the `MxSparseArray` / `MxSparseElement` encoding), and the +`Protobuf Python Version` header remained `6.31.1` at both commits. No +spurious grpcio version churn was introduced. + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | Client.Python-037 | +| 2 | mxaccessgw conventions | No issues found | +| 3 | Concurrency & thread safety | No issues found | +| 4 | Error handling & resilience | No issues found | +| 5 | Security | No issues found | +| 6 | Performance & resource management | No issues found | +| 7 | Design-document adherence | No issues found | +| 8 | Code organization & conventions | No issues found | +| 9 | Testing coverage | Client.Python-038 | +| 10 | Documentation & comments | No issues found | + ### 2026-06-16 re-review (commit 8df5ab3) Re-review of the Python client delta: new galaxy CLI commands, options.py TLS/auth, large test additions. Prior Client.Python-027..031 confirmed resolved. One claimed regression (Python-004 dead variable) and one Medium README/API mismatch. @@ -1530,3 +1558,41 @@ passed, 1 skipped, 0 warnings; previously 1 warning). The `tls`-marked **Recommendation:** Update the example/prose to `browse_children_raw(...)` (and promote the high-level `browse()`/`LazyBrowseNode` path), or add a `browse_children` alias. Add a `hasattr` test to catch future renames. **Resolution:** 2026-06-16 — Updated the README "Browsing lazily" prose and example to `browse_children_raw(...)` and added a pointer to the higher-level `browse()`/`LazyBrowseNode` walker. Tests: `test_galaxy_client_exposes_browse_children_raw` (hasattr guard) and `test_readme_browse_example_uses_existing_method` (parses every `galaxy.()` call in README against the client class) in `tests/test_review_findings_032_to_036.py`. + +### Client.Python-037 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Correctness & logic bugs | +| Location | `clients/python/pyproject.toml:10` | +| Status | Open | + +**Description:** The `description` field in `pyproject.toml` reads `"Async Python client scaffold for MXAccess Gateway."` at commit `88915c3`. Client.Python-001 resolved this on 2026-05-18 by removing the word "scaffold". The fix was lost when commit `397d3c5` (the package directory rename, `src/mxgateway` → `src/zb_mom_ww_mxgateway`) re-created `pyproject.toml` from scratch, re-introducing the stale wording. The version bump commit `88915c3` carried the regression forward without correcting it. + +The issue is purely cosmetic and does not affect the wheel build or runtime behaviour, but the "scaffold" label misrepresents the maturity of a fully-implemented, versioned package to anyone reading PyPI metadata. It is also a direct regression of a previously-resolved finding. + +**Recommendation:** Change the `description` in `clients/python/pyproject.toml` from `"Async Python client scaffold for MXAccess Gateway."` to `"Async Python client for MXAccess Gateway."` (drop "scaffold"), matching the fix applied under Client.Python-001. The `test_pip_wheel_build_succeeds` test will confirm the wheel still builds; no additional test is needed for a pure metadata word change. + +### Client.Python-038 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `clients/python/tests/`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:280-299,742-758` | +| Status | Open | + +**Description:** The new `advise-supervisory` CLI subcommand (commit `88915c3`) has no test coverage — not even a `--help` smoke registration test of the kind added for `stream-alarms` (`test_stream_alarms_is_registered`) or the earlier `advise` command. There is no test that: + +1. Asserts `advise-supervisory` is registered as a subcommand on `main` (i.e. a `--help` round-trip through `CliRunner` that confirms the subcommand name exists and Click does not report `no such command`). +2. Drives `_advise_supervisory` through `CliRunner` with a fake stub injected via monkeypatched `GatewayClient.connect` and asserts (a) the captured `MxCommand` has `kind == MX_COMMAND_KIND_ADVISE_SUPERVISORY` and (b) `server_handle`/`item_handle` are forwarded correctly. + +The README mentions `advise-supervisory` in prose (`"The CLI exposes the same command as advise-supervisory"`) but provides no `mxgw-py advise-supervisory …` example line, so the existing `test_readme_alarm_examples_parse_against_cli` scanner does not exercise it. A silent renaming or option drift would go undetected. + +The pattern to follow is `test_cli_acknowledge_alarm_happy_path` in `tests/test_review_findings_022_to_026.py`, extended with a `MX_COMMAND_KIND_ADVISE_SUPERVISORY` assertion. + +**Recommendation:** Add to `tests/test_review_findings_032_to_036.py` (or a new `tests/test_review_findings_037_038.py`): + +1. `test_advise_supervisory_is_registered` — `CliRunner().invoke(main, ["advise-supervisory", "--help"])` asserts exit code 0 and "AdviseSupervisory" (or the help text) is present. +2. `test_cli_advise_supervisory_happy_path` — injects a fake stub via `monkeypatch`, drives `advise-supervisory --session-id s1 --server-handle 1 --item-handle 2 --json`, and asserts the captured `MxCommand.kind == MX_COMMAND_KIND_ADVISE_SUPERVISORY`, `advise_supervisory.server_handle == 1`, `advise_supervisory.item_handle == 2`. diff --git a/code-reviews/Client.Rust/findings.md b/code-reviews/Client.Rust/findings.md index cd67d68..6022cb6 100644 --- a/code-reviews/Client.Rust/findings.md +++ b/code-reviews/Client.Rust/findings.md @@ -4,10 +4,10 @@ |---|---| | Module | `clients/rust` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 2 | ## Checklist coverage @@ -132,6 +132,27 @@ Re-review of the Rust client delta: options.rs TLS trust decision, mxgw-cli gala | 9 | Testing coverage | Client.Rust-038 | | 10 | Documentation & comments | No issues found | +### 2026-06-18 review (commit 88915c3) + +Re-review of `git diff 8df5ab3..88915c3 -- clients/rust/`. The diff introduces: `Session::write_array_elements` sparse-array default-fill helper (`src/session.rs`); `SparseArrayValue` → `Unset` decode mapping in `MxValueProjection` (`src/value.rs`); `advise-supervisory` CLI subcommand (`crates/mxgw-cli/src/main.rs`); README and `RustClientDesign.md` docs additions; version bump 0.1.1 → 0.1.2; and a suite of tests in `tests/client_behavior.rs`. + +Known prior bug (commit `72cf2f4`): `write_array_elements` set the outer `ProtoMxValue.data_type` to the element type — confirmed fixed via `..ProtoMxValue::default()` (outer `data_type = 0`). The e2e test `write_array_elements_routes_sparse_array_write_through_fake_gateway` asserts `value.data_type == 0` and `sparse.element_data_type == Integer`, correctly locking the fix in. The `MxValue` roundtrip is sound: `MxValue::from_proto(sparse_value).into_proto()` returns the original raw proto unchanged, so the sparse payload reaches the wire unmodified. + +All prior findings 033–038 confirmed Resolved at `8df5ab3`. `cargo fmt --check`, `cargo clippy --workspace --all-targets -- -D warnings`, and `cargo test --workspace` are assumed clean at HEAD (source review only; toolchain is Windows-only and not available on this macOS host). + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | Issue found (Client.Rust-040): the `sparse_int32_value` unit-test helper sets `data_type: MxDataType::Integer as i32` on the outer `MxValue` but `write_array_elements` uses `..ProtoMxValue::default()` (outer `data_type = 0`); the helper comment claims it builds "the proto MxValue that `write_array_elements` would send" so the unit tests using it test a subtly incorrect shape. The e2e test correctly covers the fix, but the unit test helper should match the implementation. | +| 2 | mxaccessgw conventions | No issues found — `advise-supervisory` goes through `session.invoke` which calls `command_request`, which calls `next_correlation_id` internally; unique per-call correlation ids are preserved. `cargo fmt --check` and `cargo clippy --workspace --all-targets -- -D warnings` are expected clean (no new lint-tripping patterns introduced). | +| 3 | Concurrency & thread safety | No issues found — `write_array_elements` is a thin synchronous builder that delegates to the existing `Session::write` async path; `last_write_command` in `FakeState` is behind a `Mutex>` and accessed correctly. No new `unsafe`, no new shared mutable state. | +| 4 | Error handling & resilience | No issues found — `total_length = 0`, out-of-range indices, duplicate indices, and element-kind mismatches are all validated by the gateway's `SparseArrayExpander` and surface as `Error::InvalidArgument` (propagated, per the doc comment). No client-side guard is needed; the gateway is the single validation point. | +| 5 | Security | No issues found — `write_array_elements` passes `user_id` through to `Session::write` which is already covered by the existing API-key + scope enforcement path; no credentials or secrets in the new surface. | +| 6 | Performance & resource management | No issues found — `impl IntoIterator` avoids requiring an intermediate `Vec`; elements are collected once into `Vec` and immediately handed to the proto. No unnecessary clones on the hot path. | +| 7 | Design-document adherence | Issue found (Client.Rust-039): `Session::write_array_elements` is a new public SDK method and `advise-supervisory` is a new CLI subcommand; neither appears in `RustClientDesign.md` (Session API block or CLI commands table). CLAUDE.md requires docs to change in the same commit as the source. README was correctly updated. | +| 8 | Code organization & conventions | No issues found — `register_page_token` was cleanly extracted from `browse_children_one_level` and covered with a unit test. `write_array_elements` is placed adjacent to `write` in `session.rs`. The `WriteOk` `InvokeOverride` variant and `last_write_command` capture are well-scoped to the test infrastructure. | +| 9 | Testing coverage | Cross-referenced with Client.Rust-040: the `sparse_int32_value` test helper tests a proto shape with incorrect outer `data_type`; the unit tests using it do not verify `data_type` and would not catch a regression of the outer-`data_type` fix. The e2e test `write_array_elements_routes_sparse_array_write_through_fake_gateway` does assert `data_type == 0` and provides the real regression guard. | +| 10 | Documentation & comments | No issues found — README `write_array_elements` and `advise-supervisory` sections are accurate. The `SparseArrayValue` → `Unset` comment in `value.rs` explains the write-only rationale clearly. The `write_array_elements` doc comment correctly describes the "not a preserve, a reset" semantics. The README Rust code example for `advise-supervisory` omits `use` imports for `Payload`/`MxCommandKind`/`AdviseSupervisoryCommand` but this is consistent with other README code-snippet conventions across all five clients. | + ## Findings ### Client.Rust-001 @@ -869,3 +890,53 @@ This is masked by the tests: `tls_with_require_certificate_validation_does_not_s **Recommendation:** Add unit tests for all three (none need a network connection). **Resolution:** 2026-06-16 — Added all three test groups. (1) `--tls`/`--plaintext` resolution: `connection_defaults_to_plaintext`, `connection_tls_flag_disables_plaintext`, `connection_plaintext_flag_selects_plaintext`, `connection_rejects_tls_and_plaintext_together`. (2) Extracted the page-token dedup guard into pure `register_page_token` and covered it with `register_page_token_accepts_distinct_tokens_and_rejects_repeats`. (3) RFC3339 error paths: `rfc3339_parser_rejects_trailing_characters`, `rfc3339_parser_rejects_day_zero`, `rfc3339_parser_rejects_month_thirteen`, `rfc3339_parser_rejects_day_out_of_range_for_month`. + +### Client.Rust-039 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Design-document adherence | +| Location | `clients/rust/RustClientDesign.md:101-131` (Session API block); `clients/rust/RustClientDesign.md:326-353` (CLI commands table) | +| Status | Open | + +**Description:** The diff adds two pieces of new public surface that are not reflected in `RustClientDesign.md`: + +1. `Session::write_array_elements` — a new public async method in `clients/rust/src/session.rs:567-601`. It accepts `element_data_type: MxDataType`, `total_length: u32`, and `elements: impl IntoIterator` alongside the standard `server_handle`/`item_handle`/`user_id`. The Session API block in `RustClientDesign.md` (lines 101-131) lists every other `Session` method but omits `write_array_elements`. + +2. `Command::AdviseSupervisory` — a new CLI subcommand (`clients/rust/crates/mxgw-cli/src/main.rs:203-214`, dispatch at lines 663-683). The CLI commands table in `RustClientDesign.md` (lines 326-353) lists every other subcommand but does not include `advise-supervisory`. + +CLAUDE.md requires "When public APIs … change, the affected docs … must change in the same commit." + +**Recommendation:** Add `write_array_elements` to the Session block: + +```rust +pub async fn write_array_elements( + &self, + server_handle: i32, + item_handle: i32, + element_data_type: MxDataType, + total_length: u32, + elements: impl IntoIterator, + user_id: i32, +) -> Result<(), Error>; +``` + +Add a sentence noting that the `elements` iterator accepts `(index, value)` pairs (not a `HashMap`, so duplicate indices are forwarded to the gateway, which rejects them with `InvalidArgument`). Add `mxgw advise-supervisory --session-id --server-handle --item-handle ` to the CLI table. + +### Client.Rust-040 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `clients/rust/tests/client_behavior.rs:1195-1224` (`sparse_int32_value` helper); `clients/rust/tests/client_behavior.rs:1226-1264` (unit tests using the helper) | +| Status | Open | + +**Description:** The `sparse_int32_value` test helper (lines 1195-1224) carries this comment: "Build the proto `MxValue` that `write_array_elements` would send." It then constructs the outer `MxValue` with `data_type: MxDataType::Integer as i32` (line 1215). However, `write_array_elements` in `session.rs` uses `..ProtoMxValue::default()` for the outer value, which sets `data_type` to `0` (= `MxDataType::Unspecified`). The helper builds the old, incorrect shape that the known bug fix (`72cf2f4`) explicitly corrected — the outer `data_type` should carry the element type only inside `SparseArray.element_data_type`, not on the enclosing `MxValue`. + +The two unit tests that use this helper (`write_array_elements_proto_shape_has_sparse_oneof_kind` at line 1226 and `write_array_elements_empty_elements_is_valid_all_defaults` at line 1253) do not assert `data_type` on the outer `MxValue`, so they pass and do not catch the discrepancy. The only test that asserts `value.data_type == 0` is the e2e test `write_array_elements_routes_sparse_array_write_through_fake_gateway`, which correctly locks in the fix. The unit tests therefore give a false sense of coverage: they document and confirm a shape that does not match the implementation's actual output. + +If the `..ProtoMxValue::default()` line were ever accidentally changed back to set `data_type` from `element_data_type`, the unit tests would continue to pass while the e2e test would catch the regression — but the test comment explicitly says the helper represents "what `write_array_elements` would send," making the incorrect `data_type` in the helper actively misleading for future maintainers. + +**Recommendation:** Fix the `sparse_int32_value` helper to use `..MxValue::default()` (which zeros `data_type`) instead of `data_type: MxDataType::Integer as i32`, so the helper accurately represents the wire shape `write_array_elements` actually emits. Then add an explicit `assert_eq!(proto.data_type, 0, "outer MxValue.data_type must be Unspecified")` assertion to `write_array_elements_proto_shape_has_sparse_oneof_kind` so the unit test also locks in the outer-`data_type` fix — providing a second, faster regression guard that does not require spinning up a fake gRPC server. diff --git a/code-reviews/Contracts/findings.md b/code-reviews/Contracts/findings.md index 378d849..240d8f3 100644 --- a/code-reviews/Contracts/findings.md +++ b/code-reviews/Contracts/findings.md @@ -4,10 +4,10 @@ |---|---| | Module | `src/ZB.MOM.WW.MxGateway.Contracts` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 3 | ## Checklist coverage @@ -470,3 +470,76 @@ Re-review of the proto delta (`git diff 410acc9..8df5ab3 -- .../Protos/`): the n **Recommendation:** Add a round-trip test setting `MxEvent.ReplayGap` with both sequence fields, asserting `BodyCase == None`, plus a descriptor assertion pinning `ReplayGapFieldNumber == 14` and the `ReplayGap` field numbers (1, 2). **Resolution:** _(2026-06-16)_ Added `ProtobufContractRoundTripTests.MxEvent_RoundTripsReplayGapSentinelAndPinsFieldNumbers` to `ProtobufContractRoundTripTests.cs`. The test pins `MxEvent.ReplayGapFieldNumber == 14` via the generated constant, pins `ReplayGap.RequestedAfterSequenceFieldNumber == 1` and `ReplayGap.OldestAvailableSequenceFieldNumber == 2` via `ReplayGap.Descriptor.Fields` (asserting both the number and the field name), builds a sentinel `MxEvent` with both sequence fields populated and no body oneof set, serializes and parses it, then asserts both sequence values survive and `BodyCase == None` (confirming `replay_gap` is orthogonal to the body oneof). + +#### 2026-06-18 review (commit 88915c3) + +Re-review pass at `88915c3` scoped to the contract changes since `8df5ab3` +(`git diff 8df5ab3..88915c3 -- src/ZB.MOM.WW.MxGateway.Contracts/`). The +window contains exactly one contract feature commit: the array-write-ergonomics +addition (`MxSparseArray`, `MxSparseElement`, `sparse_array_value = 19` on +`MxValue`). Also included: a minor version bump (`0.1.1` → `0.1.2` in +`ZB.MOM.WW.MxGateway.Contracts.csproj`), regenerated `Generated/MxaccessGateway.cs` +(build output — confirmed consistent with the proto; `SparseArrayValueFieldNumber = +19` and the new generated class registrations match the proto declaration), and +the removal of the stale "Task 12 / contract surface only" parenthetical from the +`replay_gap` field comment (Contracts-021 resolution already covered in the +`8df5ab3` pass; the diff shows the cleaned comment). `mxaccess_worker.proto` and +`galaxy_repository.proto` are unchanged. + +Verified against `docs/plans/2026-06-18-array-write-ergonomics-design.md`, +`gateway.md` (section "MxSparseArray — default-fill partial array writes"), +and `docs/WorkerConversion.md` (section "Sparse array expansion"). + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | No issues found. Field number `sparse_array_value = 19` is new and does not collide with any existing arm (18 = `raw_value`, 17 = `array_value`). `MxSparseArray` field numbers 1/2/3 and `MxSparseElement` field numbers 1/2 are all fresh. The `MxValue` oneof arms 10-18 are unchanged. The additive-only invariant is honoured. The generated `SparseArrayValueFieldNumber = 19` constant matches the proto. The `MxSparseElement.value` field reuses `MxValue`, which allows a client to nest another `sparse_array_value` inside an element — a recursive structure the gateway will reject at validation time but the proto level cannot prevent. This is a documentation gap (see Contracts-023) rather than a correctness bug in the contract itself, since gateway validation is the documented enforcement point. | +| 2 | mxaccessgw conventions | No issues found. Wire-compatibility policy comment block at the top of `mxaccess_gateway.proto` (Contracts-005) remains intact and this change honours it. Naming follows conventions: `snake_case` fields, `PascalCase` messages, no enum-prefix needed (no new enums). The `MxSparseArray` message-level comment clearly states write-only semantics and that the worker never receives it. Generated code regenerated, not hand-edited. | +| 3 | Concurrency & thread safety | N/A — pure contract definitions plus a static constants class. | +| 4 | Error handling & resilience | No issues found. The validation rules (`total_length == 0`, `index >= total_length`, duplicate indices, unsupported `element_data_type`, element-kind mismatch) are documented in `gateway.md` and `docs/plans/2026-06-18-array-write-ergonomics-design.md`; they are not expressed in the proto itself, which is correct for proto3. The write-only guard (reject on read/event paths) is documented at the proto message level. | +| 5 | Security | No issues found. No new credential-bearing fields. `MxSparseElement.value` carries the same write-value surface as any other `MxValue` write field; credential-sensitivity comments already on `WriteSecuredCommand.value` / `WriteSecured2Command.value` / the corresponding bulk entry fields apply to any write value, including sparse. No new redaction gap. | +| 6 | Performance & resource management | No issues found. `repeated MxSparseElement elements` is sent once per write; the gateway's expansion to a full `MxArray` is gateway-side only and the worker receives a normal whole-array value. No proto-level bloat or unbounded repeated field beyond what already exists on `MxArray`. | +| 7 | Design-document adherence | No drift. The shipped proto matches the design document (`docs/plans/2026-06-18-array-write-ergonomics-design.md`) field-for-field: `element_data_type = 1`, `total_length = 2`, `elements = 3` on `MxSparseArray`; `index = 1`, `value = 2` on `MxSparseElement`; `sparse_array_value = 19` on `MxValue`. `gateway.md` section "MxSparseArray" and `docs/WorkerConversion.md` are both updated and consistent. `docs/Contracts.md` has no mention of the new value arm — see Contracts-024. | +| 8 | Code organization & conventions | No issues found. The `csharp_namespace` option and protobuf `package` are unchanged. The new messages are placed after `MxArray` and its typed-array sub-messages, which is the correct locality. No message is inserted between existing numeric-series messages. Version bump `0.1.1` → `0.1.2` is appropriate for a minor additive change. | +| 9 | Testing coverage | Issues found: Contracts-023 — no `ProtobufContractRoundTripTests` coverage exists for `MxSparseArray`, `MxSparseElement`, or `MxValue.KindOneofCase.SparseArrayValue` (field number 19). This is the same gap class as Contracts-007/010/018/022. | +| 10 | Documentation & comments | Issues found: Contracts-024 (`docs/Contracts.md` has no mention of `MxSparseArray` — the canonical contracts document undercounts the public value surface); Contracts-025 (`GatewayContractInfoTests.GatewayProtocolVersion_IsVersionThree` summary enumerates alarm and bulk write/read extensions under version 3 but not the sparse array addition, leaving future readers without guidance on whether the new arm also ships under version 3 without a bump). | + +### Contracts-023 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` | +| Status | Open | + +**Description:** No round-trip test or descriptor pin exists for the new `MxSparseArray` message, `MxSparseElement` message, or `MxValue.KindOneofCase.SparseArrayValue` (field number 19). A future renumber or type-narrowing of `sparse_array_value = 19`, or of `MxSparseArray`'s field numbers (1/2/3) or `MxSparseElement`'s field numbers (1/2), would not be caught at the contract level. This is the same gap class as Contracts-007 (`MxValue.raw_value`), Contracts-010 (bulk write/read), Contracts-018 (alarm-provider fallback), and Contracts-022 (`ReplayGap`) — all of which were resolved by adding focused round-trip tests. + +Additionally, the `MxSparseElement.value` field is typed `MxValue` (the full value union), which means a client could legally set it to another `sparse_array_value = 19` arm, creating a recursive sparse structure. The `// scalar` comment documents the intent, but no test pins that a well-formed sparse element carries only a scalar kind (not `array_value`, `raw_value`, or another `sparse_array_value`). The gateway rejects recursive nesting at validation time, but the contract-level test would document the constraint explicitly. + +**Recommendation:** Add round-trip / descriptor-pin tests to `ProtobufContractRoundTripTests`: (a) pin `MxValue.SparseArrayValueFieldNumber == 19` via the generated constant; (b) round-trip an `MxSparseArray` with `element_data_type`, `total_length`, and at least one `MxSparseElement` (covering `index` and a scalar `value`), embedded in an `MxValue` with `KindCase == SparseArrayValue`; (c) assert the `MxSparseArray` field numbers by name via `MxSparseArray.Descriptor.Fields` (1 = `element_data_type`, 2 = `total_length`, 3 = `elements`) and `MxSparseElement.Descriptor.Fields` (1 = `index`, 2 = `value`). Optionally add a second test with an empty `elements` list (valid all-defaults case) to pin that zero elements is not a proto-level error. + +### Contracts-024 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Documentation & comments | +| Location | `docs/Contracts.md:9-11` | +| Status | Open | + +**Description:** `docs/Contracts.md` lists `MxValue`, `MxArray`, and `MxStatusProxy` as the types defined in `mxaccess_gateway.proto`, and documents both bulk subscription and bulk write/read command families in detail. The new `MxSparseArray` value arm (`sparse_array_value = 19`) — a public-facing addition to the `MxValue` oneof that changes the write API available to every command variant — is not mentioned anywhere in `docs/Contracts.md`. The CLAUDE.md rule "Update docs in the same change as the source. When public APIs, contracts, configuration … change, the affected docs … must change in the same commit" was not satisfied for this addition; `docs/Contracts.md` now undercounts the public `MxValue` surface. `gateway.md` and `docs/WorkerConversion.md` were updated, but `docs/Contracts.md` — the canonical contracts document linked from the client generation doc — was not. + +**Recommendation:** Extend `docs/Contracts.md` to describe `MxSparseArray`: the write-only `sparse_array_value = 19` arm on `MxValue`, the two messages (`MxSparseArray` with `element_data_type`, `total_length`, `elements`; `MxSparseElement` with `index`, `value`), the default-fill-not-preserve semantics for unmentioned indices, and the fact that it is accepted by every write variant (`Write`, `Write2`, `WriteSecured`, `WriteSecured2`, and each `*BulkEntry` entry) but rejected on read/event paths. Cross-reference `gateway.md` for the validation rules and expansion details rather than restating them. + +### Contracts-025 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Documentation & comments | +| Location | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/GatewayContractInfoTests.cs:14-25` | +| Status | Open | + +**Description:** The XML summary on `GatewayContractInfoTests.GatewayProtocolVersion_IsVersionThree` (updated under Contracts-013 resolution to enumerate the alarm and bulk write/read extensions shipped under version 3) does not mention the new `MxSparseArray` / `sparse_array_value = 19` addition, which is also a strictly additive contract change shipped under version 3 without a bump. A reader checking whether a new additive contract feature requires a `GatewayProtocolVersion` bump will look at this test for precedent; finding only the alarm and bulk write/read examples, they cannot tell whether the sparse array addition was also additive-under-3 or was simply omitted by mistake. This is the same class of stale-summary issue as Contracts-013 (which noted the bulk write/read extension was not mentioned after the alarm-only summary). + +**Recommendation:** Extend the XML summary to list the `MxSparseArray` write ergonomics extension (`MxSparseArray` / `MxSparseElement` + `sparse_array_value = 19` on `MxValue`, plus the suffix-normalization behavior) alongside the alarm and bulk write/read extensions as a third example of a strictly additive change that shipped under version 3 without a bump. Comment-only change; no test logic or version constant changes. diff --git a/code-reviews/IntegrationTests/findings.md b/code-reviews/IntegrationTests/findings.md index 2ca5085..f53bb71 100644 --- a/code-reviews/IntegrationTests/findings.md +++ b/code-reviews/IntegrationTests/findings.md @@ -4,8 +4,8 @@ |---|---| | Module | `src/ZB.MOM.WW.MxGateway.IntegrationTests` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | | Open findings | 0 | @@ -152,6 +152,32 @@ Re-review of the live-test delta: two new `[LiveMxAccessFact]` smoke tests (B8 n | 9 | Testing coverage | IntegrationTests-032, IntegrationTests-033 | | 10 | Documentation & comments | IntegrationTests-030, IntegrationTests-031 | +### 2026-06-18 re-review (commit `88915c3`) + +Scope: `git diff 8df5ab3..88915c3 -- src/ZB.MOM.WW.MxGateway.IntegrationTests/`. One +commit touched the module: `6b5fe6a` ("fix: resolve code-review findings (locally +verified)"). The IntegrationTests delta is exactly two hunks in +`WorkerLiveMxAccessSmokeTests.cs`: (1) a comment reword on the Suspend/Activate block +(IntegrationTests-031 resolution: "against the advised item" → "against the +added-but-not-advised item (no Advise was issued between AddItem and this call)"); (2) +`catch (TimeoutException)` widened to `catch (TimeoutException ex)` and +`output.WriteLine($"B8: sample-bearing batch predicate timed out: {ex.Message}")` added +before nulling `bufferedBatch` (IntegrationTests-032 resolution). Both are pure +comment/diagnostic fixes with no assertion or logic changes. All ten categories clean. + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | No issues found. Both hunks are comment/diagnostic-only; no assertions, conditions, or API calls changed. | +| 2 | mxaccessgw conventions | No issues found. Tests remain correctly gated behind `[LiveMxAccessFact]`; no secrets logged; no synthesized events; no direct COM instantiation introduced. | +| 3 | Concurrency & thread safety | No issues found. No concurrency-relevant code changed. | +| 4 | Error handling & resilience | No issues found. The `TimeoutException` catch remains non-rethrowing; adding `ex` capture and a `WriteLine` does not affect the error path. | +| 5 | Security | No issues found. `ex.Message` from `WaitForMessageAsync`'s `TimeoutException` carries only a scan count and timeout duration — no credential data. | +| 6 | Performance & resource management | No issues found. No new allocations on non-timeout paths; the exception reference capture on the timeout path is negligible. | +| 7 | Design-document adherence | No issues found. `docs/GatewayTesting.md` was already updated in the same `6b5fe6a` wave (IntegrationTests-030); these two comment fixes require no further doc change. | +| 8 | Code organization & conventions | No issues found. Comment style is consistent with the surrounding block; no namespace or layout changes. | +| 9 | Testing coverage | No issues found. No assertions changed; the `output.WriteLine` adds residual diagnostics only. IntegrationTests-033 remains Deferred (AddItem2/Write2 live parity requires the live rig). | +| 10 | Documentation & comments | No issues found. Both hunks are the documentation fix — the reworded comment now accurately reflects that no `Advise` precedes Suspend/Activate, and the timeout log now surfaces the `WaitForMessageAsync` detail to distinguish the two residual failure modes. | + ## Findings ### IntegrationTests-001 diff --git a/code-reviews/README.md b/code-reviews/README.md index b3872b1..918be08 100644 --- a/code-reviews/README.md +++ b/code-reviews/README.md @@ -10,23 +10,41 @@ Each module's `findings.md` is the source of truth; this file is generated from | Module | Reviewer | Date | Commit | Status | Open | Total | |---|---|---|---|---|---|---| -| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 29 | -| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 34 | -| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 48 | -| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 36 | -| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 38 | -| [Contracts](Contracts/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 22 | -| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 33 | -| [Server](Server/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 56 | -| [Tests](Tests/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 39 | +| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 1 | 30 | +| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 3 | 37 | +| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 3 | 51 | +| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 2 | 38 | +| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 2 | 40 | +| [Contracts](Contracts/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 3 | 25 | +| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 0 | 33 | +| [Server](Server/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 2 | 58 | +| [Tests](Tests/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 1 | 40 | | [Worker](Worker/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 28 | -| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 36 | +| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-06-18 | `88915c3` | Re-reviewed | 0 | 36 | ## Pending findings Findings with status `Open` or `In Progress`, ordered by severity. -_No pending findings._ +| ID | Severity | Category | Location | Description | +|---|---|---|---|---| +| Client.Dotnet-030 | Medium | Correctness & logic bugs | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:91-93,113,2023-2050` | `advise-supervisory` was added to the `command switch` dispatch table at line 113 but was not added to `IsKnownGatewayCommand` (the exhaustive list at lines 2023–2050). The guard at line 91 evaluates `IsKnownGatewayCommand(command)` before… | +| Server-057 | Medium | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:976-1000` (`NormalizeOutboundCommand`), `:1085-1095` (`MapCommand` tracking), `gateway.md` (array-write ergonomics section), `clients/*/README.md` | The array-suffix `[]` normalization runs only for the single-add commands `AddItem` and `AddItem2` — `NormalizeOutboundCommand` has no case for `AddItemBulk` (or `AddBufferedItem`), and the `MapCommand` tracking switch likewise normalizes… | +| Client.Go-035 | Low | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:1298`, `clients/go/README.md:328-355` | `advise-supervisory` is wired into the `run()` dispatch switch at `main.go:91-92` but is absent from two surfaces a user consults to discover CLI commands: 1. `writeUsage()` at `main.go:1298` does not list `advise-supervisory` in its pipe-… | +| Client.Go-036 | Low | Testing coverage | `clients/go/cmd/mxgw-go/main_test.go`, `clients/go/cmd/mxgw-go/main.go:364-399` | `runAdviseSupervisory` has no CLI-level test in `main_test.go`. In particular the session-id-required guard at `main.go:376-378` is untested, unlike every other guard for session-id-required commands (e.g. `TestRunStreamEventsRequiresSessi… | +| Client.Go-037 | Low | Documentation & comments | `clients/go/README.md:136-137` | The README "Write Semantics" section states: > The CLI exposes the same command as `advise-supervisory`, and `write` / `write2` take `--user-id`. The Go CLI has no standalone `write2` command — only `write2-bulk`. The analogous statement i… | +| Client.Java-049 | Low | Code organization & conventions | `clients/java/build.gradle:16`, `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientVersion.java:12`, `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:59,89`, `clients/java/README.md:399` | Commit `88915c3` (`chore(clients): bump all five clients 0.1.1 -> 0.1.2`) incremented `build.gradle` `version = '0.1.2'` but left `MxGatewayClientVersion.CLIENT_VERSION = "0.1.1"` unchanged. The two CLI test assertions that check the versi… | +| Client.Java-050 | Low | Testing coverage | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1046-1068` (new `AdviseSupervisoryCommand`), `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:1306-1313` (stub) | Commit `9eedf9d` added the `advise-supervisory` CLI subcommand (`AdviseSupervisoryCommand`) to all language client CLIs. The Java `FakeSession.adviseSupervisoryRaw` stub was added to `MxGatewayCliTests` but no test exercises the new subcom… | +| Client.Java-051 | Low | Documentation & comments | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewaySession.java:622-657` | `writeArrayElements` accepts `int totalLength` and `Map elements` whose keys are plain Java `int`. The proto fields `MxSparseArray.total_length` and `MxSparseElement.index` are both `uint32`. Java's protobuf runtime maps… | +| Client.Python-037 | Low | Correctness & logic bugs | `clients/python/pyproject.toml:10` | The `description` field in `pyproject.toml` reads `"Async Python client scaffold for MXAccess Gateway."` at commit `88915c3`. Client.Python-001 resolved this on 2026-05-18 by removing the word "scaffold". The fix was lost when commit `397d… | +| Client.Python-038 | Low | Testing coverage | `clients/python/tests/`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:280-299,742-758` | The new `advise-supervisory` CLI subcommand (commit `88915c3`) has no test coverage — not even a `--help` smoke registration test of the kind added for `stream-alarms` (`test_stream_alarms_is_registered`) or the earlier `advise` command. T… | +| Client.Rust-039 | Low | Design-document adherence | `clients/rust/RustClientDesign.md:101-131` (Session API block); `clients/rust/RustClientDesign.md:326-353` (CLI commands table) | The diff adds two pieces of new public surface that are not reflected in `RustClientDesign.md`: 1. `Session::write_array_elements` — a new public async method in `clients/rust/src/session.rs:567-601`. It accepts `element_data_type: MxDataT… | +| Client.Rust-040 | Low | Testing coverage | `clients/rust/tests/client_behavior.rs:1195-1224` (`sparse_int32_value` helper); `clients/rust/tests/client_behavior.rs:1226-1264` (unit tests using the helper) | The `sparse_int32_value` test helper (lines 1195-1224) carries this comment: "Build the proto `MxValue` that `write_array_elements` would send." It then constructs the outer `MxValue` with `data_type: MxDataType::Integer as i32` (line 1215… | +| Contracts-023 | Low | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` | No round-trip test or descriptor pin exists for the new `MxSparseArray` message, `MxSparseElement` message, or `MxValue.KindOneofCase.SparseArrayValue` (field number 19). A future renumber or type-narrowing of `sparse_array_value = 19`, or… | +| Contracts-024 | Low | Documentation & comments | `docs/Contracts.md:9-11` | `docs/Contracts.md` lists `MxValue`, `MxArray`, and `MxStatusProxy` as the types defined in `mxaccess_gateway.proto`, and documents both bulk subscription and bulk write/read command families in detail. The new `MxSparseArray` value arm (`… | +| Contracts-025 | Low | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/GatewayContractInfoTests.cs:14-25` | The XML summary on `GatewayContractInfoTests.GatewayProtocolVersion_IsVersionThree` (updated under Contracts-013 resolution to enumerate the alarm and bulk write/read extensions shipped under version 3) does not mention the new `MxSparseAr… | +| Server-058 | Low | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Security/Authorization/ConstraintEnforcerTests.cs` | The new `ConstraintEnforcer.ResolveTarget` `[]`-suffix fallback is security-sensitive: it turns a bare-array address that previously missed the Galaxy index (→ spurious `tag_metadata` deny) into a real scope/classification decision. The ad… | +| Tests-040 | Low | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/GatewayArrayWriteWiringTests.cs` | `GatewayArrayWriteWiringTests` covers two of eight sparse-write arms in `GatewaySession.NormalizeOutboundCommand`: `Write` (single) and `WriteBulk`. The remaining six arms — `WriteSecured`, `Write2`, `WriteSecured2`, `Write2Bulk`, `WriteSe… | ## Closed findings diff --git a/code-reviews/Server/findings.md b/code-reviews/Server/findings.md index 09ca63d..2718076 100644 --- a/code-reviews/Server/findings.md +++ b/code-reviews/Server/findings.md @@ -4,10 +4,10 @@ |---|---| | Module | `src/ZB.MOM.WW.MxGateway.Server` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 2 | ## Checklist coverage @@ -69,6 +69,23 @@ findings (Server-001 through Server-032) are unchanged by this pass. | 9 | Testing coverage | Issues found: Server-037 (no test for the corrupt-snapshot restore path or for `PersistSnapshot = false` at the cache level). | | 10 | Documentation & comments | No issues found — XML docs match behavior; the `GalaxyRepository.md` "On-disk snapshot" section documents the Stale-on-restore lifecycle. | +### 2026-06-18 review (commit 88915c3) + +Re-review of the array-write-ergonomics feature (`git diff 8df5ab3..88915c3 -- src/ZB.MOM.WW.MxGateway.Server/`): the new `SparseArrayExpander` and `ArrayAddressNormalizer`, the `NormalizeOutboundCommand` choke point in `GatewaySession.InvokeAsync`, the re-normalize at the `TrackCommandReply`/`MapCommand` tracking call sites, and the `ConstraintEnforcer.ResolveTarget` `[]`-suffix fallback. The range also lands the already-filed Server-055 (`_everHadEventSubscriber` detach-grace gate) and Server-056 (`SessionEventDistributor._completed` late-registrant guard) resolutions; both were re-verified sound here and remain closed. Security focus — the authorization `[]` fallback: it changes only *which* Galaxy metadata record resolves (turning a spurious `tag_metadata` deny into a real scope/classification decision), stays `IsArray`-gated, and the scope check (`MatchesPathOrTag` on `ContainedPath` + the original bare `tagAddress`) still runs unchanged, so a scoped key cannot reach a non-array tag or an out-of-scope array. The authz check (bare address) and the worker bind (suffixed via the same `IsArray`-gated probe) resolve the identical attribute — no check-vs-bind mismatch. Worker untouched; parity preserved (single whole-array COM write). Two new findings. + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | Issues found: Server-057 (`NormalizeOutboundCommand`/`MapCommand` normalize only single `AddItem`/`AddItem2`; bare array addresses added via `AddItemBulk`/`AddBufferedItem` register non-write-capable handles). SparseArrayExpander verified sound: total_length=0 / oversize / out-of-range / duplicate / kind-mismatch all → `InvalidArgument`; `Array.MaxLength` guard precedes the `(int)` cast; Int32-vs-Int64 selection mirrors the worker converter; expansion targets `Value` not `timestamp_value` across all 8 write variants. | +| 2 | mxaccessgw conventions | No issues found — file-scoped namespaces, `sealed`/`internal static`, `Async` suffixes, primary constructors, MXAccess-aligned naming; worker still does an honest whole-array COM write; no UI libraries; no value/secret logging added (write payloads never logged). | +| 3 | Concurrency & thread safety | No issues found — `NormalizeOutboundCommand` mutates the `WorkerCommand.Command` deep-clone produced by `MapCommand` (`request.Command.Clone()`), never the client's `MxCommand`, and the tracking path re-normalizes its own separate copy; `ArrayAddressNormalizer`/`SparseArrayExpander` are stateless (read the cache snapshot per call); `_everHadEventSubscriber` and the `SessionEventDistributor._completed` guard serialize correctly under their locks. | +| 4 | Error handling & resilience | No issues found — invalid sparse payloads surface as `RpcException(InvalidArgument)` and propagate uncaught (correct client-facing error); the normalizer never throws and passes through when metadata is cold. | +| 5 | Security | No issues found — the `ConstraintEnforcer` `[]` fallback is `IsArray`-gated, does not widen scope (scope/classification checks unchanged), and resolves the same attribute the worker binds; covered by `CheckReadTagAsync_WithBareArrayName_*` and the missing-non-array negative test. | +| 6 | Performance & resource management | No issues found — expander allocates one array of `total_length` slots (inherent to a whole-array write) and the work is O(length + elements); the normalizer is a single dictionary probe per AddItem. | +| 7 | Design-document adherence | No issues found — worker untouched, single whole-array COM write preserved; `gateway.md`, `docs/WorkerConversion.md`, `docs/DesignDecisions.md`, and all five client READMEs were updated in the same commit and match the code (sparse semantics, default-fill reset, `[]` normalization). | +| 8 | Code organization & conventions | No issues found — new types live under `Sessions/`, registered as a singleton in `AddGatewaySessions` consistent with `IGalaxyHierarchyCache`; the optional `addressNormalizer` ctor parameter keeps legacy unit-construction paths working. | +| 9 | Testing coverage | Issues found: Server-058 (no test asserts a bare-array name that resolves via the `[]` fallback is still *denied* when out of scope, and `CheckWriteHandleAsync` array-via-suffixed-registration / classification is untested). Otherwise strong: `SparseArrayExpanderTests` (210 lines), `ArrayAddressNormalizerTests` (105), and the two `ConstraintEnforcerTests` cases. | +| 10 | Documentation & comments | No issues found — XML docs and inline comments on the new types are accurate (the GatewaySession comment correctly notes `MapCommand` deep-clones, contradicting the plan's stale "same instance" note); the `AddItemBulk` gap is documentation-adjacent and is captured under Server-057. | + ### 2026-06-16 re-review (commit 8df5ab3) Re-review of the session-resilience epic + §8 delta (`git diff 410acc9..8df5ab3`): `SessionEventDistributor` multi-subscriber fan-out, replay-on-reconnect, detach-grace retention, bounded worker-ready wait, dashboard auto-login. @@ -1084,3 +1101,33 @@ Additionally, `GatewayAlarmMonitor.ApplyProviderModeChangeAsync` increments the **Recommendation:** Record terminal completion (a `_completed` flag plus the terminal error) under `_lifecycleLock` and have both register paths complete a late registrant's channel immediately with the same terminal state. **Resolution:** 2026-06-17: added `_completed` + `_completionError`, set inside `CompleteAllSubscribers` under `_lifecycleLock` — the same lock the register paths take, so completion and registration serialize (a subscriber added before the sweep is completed by the loop; one racing in after sees `_completed` and self-completes). `Register` and `RegisterWithReplay` now `TryComplete` a late registrant's channel with `_completionError` when `_completed`; a late resume still receives its retained replay batch, then a cleanly-completed empty live channel. No lock-ordering risk — `CompleteAllSubscribers` takes only `_lifecycleLock`, and the subscriber channels use `AllowSynchronousContinuations=false` so `TryComplete` under the lock runs no continuation inline. New regression `[Fact]` `Register_AfterSourceCompletes_CompletesLateSubscriberInsteadOfHanging` (`SessionEventDistributorTests.cs`) registers a subscriber after the pump completes and asserts its channel completes (bounded read); verified it fails without the fix (5 s timeout) and passes with it (12 ms). The racy `GatewaySessionDashboardMirrorTests.DashboardMirror_AndGrpcSubscriber_BothReceiveEvents` that exposed it was also made deterministic — see Tests-039. + +### Server-057 + +| Field | Value | +|---|---| +| Severity | Medium | +| Category | Correctness & logic bugs | +| Location | `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:976-1000` (`NormalizeOutboundCommand`), `:1085-1095` (`MapCommand` tracking), `gateway.md` (array-write ergonomics section), `clients/*/README.md` | +| Status | Open | + +**Description:** The array-suffix `[]` normalization runs only for the single-add commands `AddItem` and `AddItem2` — `NormalizeOutboundCommand` has no case for `AddItemBulk` (or `AddBufferedItem`), and the `MapCommand` tracking switch likewise normalizes only the `AddItem`/`AddItem2` arms (`AddItemBulk` flows through `TrackBulkItems`/`AddBufferedItem` with the raw address). MXAccess requires the `[]` suffix on the AddItem address for an array attribute to register a *write-capable* handle. A client that registers a bare array address via `AddItemBulk` therefore binds a non-write-capable handle, and a later `Write`/`WriteSecured`/sparse write against that handle silently lands on a read-only-ish handle — the exact failure mode this feature fixes for the single-add path. The behavior is inconsistent across the add family, and `gateway.md` / the client READMEs describe normalization as happening "at AddItem time" and explicitly carve out only `ReadBulk`, giving no signal that `AddItemBulk` is excluded. `AddBufferedItem` is lower-risk (buffered/historical read items are not normally written) but is the same gap. + +**Recommendation:** Either (a) extend `NormalizeOutboundCommand` and the `MapCommand` tracking path to normalize each `AddItemBulk.TagAddresses` entry (and `AddBufferedItem.ItemDefinition`) the same `IsArray`-gated way, keeping the constraint check, the worker bind, and the stored `SessionItemRegistration.TagAddress` consistent; or (b) if bulk-add normalization is intentionally out of scope for this feature, state that explicitly in `gateway.md` and the client READMEs (alongside the existing `ReadBulk` carve-out) so clients know bulk-added array handles must carry the `[]` suffix themselves to be writable. + +**Resolution:** _(empty until closed)_ + +### Server-058 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `src/ZB.MOM.WW.MxGateway.Tests/Security/Authorization/ConstraintEnforcerTests.cs` | +| Status | Open | + +**Description:** The new `ConstraintEnforcer.ResolveTarget` `[]`-suffix fallback is security-sensitive: it turns a bare-array address that previously missed the Galaxy index (→ spurious `tag_metadata` deny) into a real scope/classification decision. The added tests cover the in-scope-allow case (`CheckReadTagAsync_WithBareArrayName_ResolvesViaArraySuffixFallback`) and the missing/non-array negative case (`CheckReadTagAsync_WithMissingNonArrayName_StillFailsToResolve`), but **no test asserts that a bare-array name which now resolves via the fallback is still denied when it is out of the key's read/write scope** — i.e. that the fallback widened resolution but not authorization. There is also no `CheckWriteHandleAsync` test exercising an array handle whose `SessionItemRegistration.TagAddress` is the suffixed form (`Obj.Arr[]`) resolving directly through `ResolveTarget` for the write-scope and `MaxWriteClassification` checks. These are the precise paths a regression could silently widen. + +**Recommendation:** Add a `CheckReadTagAsync` (and a `CheckWriteHandleAsync`) case where the bare/suffixed array attribute resolves but the configured `ReadTagGlobs`/`WriteSubtrees` exclude it, asserting a `read_scope`/`write_scope` `ConstraintFailure` is still returned; and a `CheckWriteHandleAsync` case asserting `MaxWriteClassification` is enforced against the array attribute's `SecurityClassification` via the suffixed registration address. + +**Resolution:** _(empty until closed)_ diff --git a/code-reviews/Tests/findings.md b/code-reviews/Tests/findings.md index 0bd4db3..865f747 100644 --- a/code-reviews/Tests/findings.md +++ b/code-reviews/Tests/findings.md @@ -4,10 +4,10 @@ |---|---| | Module | `src/ZB.MOM.WW.MxGateway.Tests` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | -| Open findings | 0 | +| Open findings | 1 | ## Checklist coverage @@ -111,6 +111,23 @@ fakes in two test files. | 9 | Testing coverage | Issues found: Tests-026 (no test proves `EventStreamService` actually calls `IDashboardEventBroadcaster.Publish` for each event — the only consumers in tests are `Null` fakes). | | 10 | Documentation & comments | No issues found in this diff. | +### 2026-06-18 re-review (commit `88915c3`) + +Re-review of the `8df5ab3..88915c3` diff. New files: `SparseArrayExpanderTests.cs`, `ArrayAddressNormalizerTests.cs`, `GatewayArrayWriteWiringTests.cs` (array-write feature tests); additions to `ConstraintEnforcerTests.cs` (bare-array-name authz fallback), `ProtobufContractRoundTripTests.cs` (ReplayGap round-trip), `GatewaySessionTests.cs` (failed-first-attach regression), `SessionEventDistributorTests.cs` (register-after-completion + DrainUntilFaultAsync fix), `GatewaySessionDashboardMirrorTests.cs` (Tests-039 release-gate fix), `GatewayOptionsValidatorTests.cs` (DetachGraceSeconds / ReplayBuffer bounds). + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | No issues found. All new tests assert the correct post-condition; `AddItem_BareArrayAddress_NormalizedOnWireAndInRegistration` correctly verifies re-normalization in `TrackCommandReply` via a bare-address `trackingCopy`; `DrainUntilFaultAsync` graceful-completion guard is correct. | +| 2 | mxaccessgw conventions | No issues found. All new files use file-scoped namespaces, sealed classes, target-typed `new()`, PascalCase `Method_Condition_Result` names, and explicit types. | +| 3 | Concurrency & thread safety | No issues found. `Register_AfterSourceCompletes_CompletesLateSubscriberInsteadOfHanging` correctly proves pump-completed ordering via early-subscriber drain before registering the late subscriber. The Tests-039 release-gate uses `TaskCompletionSource(RunContinuationsAsynchronously)` and `ActiveEventSubscriberCount == 1` gating. | +| 4 | Error handling & resilience | No issues found. `SparseArrayExpanderTests` covers all six `RpcException(InvalidArgument)` paths (zero length, oversize length, out-of-range index, duplicate index, unsupported type, element-kind mismatch). | +| 5 | Security | No issues found. The two new `ConstraintEnforcerTests` correctly pin both directions of the `[]` suffix fallback: bare array name resolves (no spurious denial); bare non-array/missing name still fails to resolve (no false positive). | +| 6 | Performance & resource management | No issues found. `GatewayArrayWriteWiringTests.CapturingWorkerClient` implements `IAsyncDisposable` as `ValueTask.CompletedTask`; no unowned resource. `GatewaySessionTests.DetachGrace_FailedFirstAttach_DoesNotEnterGrace` uses `await using`. | +| 7 | Design-document adherence | No issues found. Sparse-expansion and array-suffix normalization tests reflect the CLAUDE.md parity contract ("MXAccess has no partial-array write primitive"); the `ConstraintEnforcer` bare-name fallback tests match the `ArrayAddressNormalizer` contract. | +| 8 | Code organization & conventions | No issues found. The `StubGalaxyHierarchyCache` is duplicated between `ArrayAddressNormalizerTests` and `GatewayArrayWriteWiringTests` (both in the same namespace), but both are `private sealed class` — duplication is confined and the existing Tests-007 / Tests-021 consolidation pattern applies only to test-support types shared across multiple test classes, not private fakes within a single class. Acceptable. | +| 9 | Testing coverage | Issue found: Tests-040 (`GatewayArrayWriteWiringTests` covers `Write` and `WriteBulk` sparse expansion wiring through `GatewaySession.InvokeAsync` but omits the other six write variants — `WriteSecured`, `Write2`, `WriteSecured2`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk` — each of which has its own `case` arm in `NormalizeOutboundCommand` and could silently regress). | +| 10 | Documentation & comments | No issues found. New files carry accurate class-level `` docs; test method summaries correctly describe pre/post conditions; the `DrainUntilFaultAsync` comment update precisely describes the new escape hatch. | + #### 2026-06-16 re-review (commit 8df5ab3) Re-review of the gateway-test delta (session-resilience epic + §8). New tests are high quality (bounded async waits, FakeTimeProvider, deterministic gating, meaningful assertions). Verified the §8 FakeWorkerProcess consolidation did NOT drop the `entireProcessTree` kill assertion. Only Low coverage-gap / one latent helper footgun. @@ -724,3 +741,16 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t **Recommendation:** Make the test deterministic — hold the worker stream until both the dashboard mirror and the gRPC subscriber have attached, then release, so neither subscriber can miss an event regardless of scheduling. **Resolution:** 2026-06-17 — added a release-gate to the test's `FakeWorkerClient` (`HoldEventsUntilReleased()` / `ReleaseEvents()`; `ReadEventsAsync` awaits the gate before yielding, ungated by default so other tests are unaffected). The test now holds the stream, starts the gRPC reader on a background task, waits for `session.ActiveEventSubscriberCount == 1` (the internal dashboard mirror is excluded from the count, so this confirms the gRPC subscriber attached), then releases — both subscribers deterministically receive all three events. With the Server-056 production fix in place, the full `GatewaySessionDashboardMirrorTests` class now passes (5/5) instead of hanging. + +### Tests-040 + +| Field | Value | +|---|---| +| Severity | Low | +| Category | Testing coverage | +| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/GatewayArrayWriteWiringTests.cs` | +| Status | Open | + +**Description:** `GatewayArrayWriteWiringTests` covers two of eight sparse-write arms in `GatewaySession.NormalizeOutboundCommand`: `Write` (single) and `WriteBulk`. The remaining six arms — `WriteSecured`, `Write2`, `WriteSecured2`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk` — all call `ExpandValue(entry.Value)` through the same switch, but no wiring test exercises them through `GatewaySession.InvokeAsync`. The class summary says it covers "the single outbound choke point" and that "no sparse value is ever forwarded"; the claim is accurate for the two covered variants but unverified for the other six. A regression accidentally dropping one of the six remaining `case` arms, or shifting a brace so a case falls through to the default (no-op), would pass the entire `GatewayArrayWriteWiringTests` suite. + +**Recommendation:** Add one wiring test per uncovered variant (or a single `[Theory]` over the six command kinds), constructing the matching command type with a `SparseArrayValue` and asserting `worker.LastCommand!.Command..Value.KindCase == MxValue.KindOneofCase.ArrayValue` after `session.InvokeAsync`. The `SparseArrayExpanderTests` already pin the expander logic exhaustively; the wiring tests need only check that the choke point invokes expansion for each variant, not the expansion semantics themselves. The four secured variants (`WriteSecured`, `Write2`, `WriteSecured2`, `WriteSecured2Bulk`) can reuse the same `CapturingWorkerClient` stub. diff --git a/code-reviews/Worker.Tests/findings.md b/code-reviews/Worker.Tests/findings.md index db46129..5b93808 100644 --- a/code-reviews/Worker.Tests/findings.md +++ b/code-reviews/Worker.Tests/findings.md @@ -4,11 +4,38 @@ |---|---| | Module | `src/ZB.MOM.WW.MxGateway.Worker.Tests` | | Reviewer | Claude Code | -| Review date | 2026-06-16 | -| Commit reviewed | `8df5ab3` | +| Review date | 2026-06-18 | +| Commit reviewed | `88915c3` | | Status | Re-reviewed | | Open findings | 0 | +## 2026-06-18 re-review (commit `88915c3`) + +Re-review of `git diff 8df5ab3..88915c3 -- src/ZB.MOM.WW.MxGateway.Worker.Tests/`. Three files changed, all applying previously-recorded findings rather than introducing new feature logic: + +- `WorkerPipeSessionTests.cs` — Worker.Tests-036 fix: removes the redundant wall-clock `elapsed < 5s` assertion from `RunAsync_SendsFirstHeartbeatImmediatelyOnEnteringLoop` and replaces it with an inline comment explaining the omission. +- `MxAccessCommandExecutorTests.cs` — Worker.Tests-035 fix: adds `DispatchAsync_WithUnknownCommandKind_ReturnsInvalidRequestWithUnsupportedDiagnostic`; removes the nested `FakeMxStatus` class (Worker.Tests-034 consolidation). +- `TestSupport/NoopMxAccessServer.cs` — Worker.Tests-034 fix: adds the consolidation remarks block to `FakeMxStatus` explaining it was previously duplicated in `MxAccessCommandExecutorTests`. + +The array-write-ergonomics feature (the bulk of commits between `8df5ab3` and `88915c3`) deliberately left the Worker untouched — sparse arrays are expanded gateway-side before reaching the worker IPC boundary — which is consistent with the task brief. No new test code was introduced as part of that feature in this module. + +Net48 compatibility check: no `init`-only properties, no positional records, no `IsExternalInit`, no other net48-hostile constructs in the additions. + +| # | Category | Result | +|---|---|---| +| 1 | Correctness & logic bugs | No issues found — the new executor test correctly casts `int.MaxValue` to `MxCommandKind`, dispatches through the real `MxAccessStaSession.DispatchAsync`, and asserts both the status code and the diagnostic substring; the assertion is neither too weak nor tautological. | +| 2 | mxaccessgw conventions | No issues found — test method name follows `Method_Scenario_Expectation`; `NoopMxAccessServer.cs` remarks block cites the correct finding ID; no style drift. | +| 3 | Concurrency & thread safety | No issues found — no new shared mutable state or async patterns introduced; the wall-clock assertion removal in `WorkerPipeSessionTests` is strictly a deletion. | +| 4 | Error handling & resilience | No issues found — the unknown-kind test pins the discard arm returns `InvalidRequest` rather than throwing; no new error paths added. | +| 5 | Security | No issues found — no credentials, secrets, or sensitive data involved. | +| 6 | Performance & resource management | No issues found — no new disposable objects introduced without `using`; `StaRuntime` and `MxAccessStaSession` remain `using`-declared in the new test. | +| 7 | Design-document adherence | No issues found — the executor test correctly exercises the gateway↔worker boundary at the `MxAccessStaSession` level (not raw COM), consistent with `docs/MxAccessWorkerInstanceDesign.md`. | +| 8 | Code organization & conventions | No issues found — `FakeMxStatus` deduplication is complete; the nested class is gone from `MxAccessCommandExecutorTests` and the shared copy's XML doc now explains the consolidation. | +| 9 | Testing coverage | No issues found — Worker.Tests-034/035/036 are the last open items from the prior pass; all three are now closed. No new coverage gaps introduced. | +| 10 | Documentation & comments | No issues found — the inline comment in `WorkerPipeSessionTests` correctly attributes the omission to the Worker.Tests-003/004/013/020 pattern; the `FakeMxStatus` remarks block is accurate. | + +**No new findings. All prior open items (Worker.Tests-034, -035, -036) resolved.** + ## 2026-06-15 re-review (commit `410acc9`) Re-review of the alarm-fallback test additions in `git diff 42b0037..HEAD --