# Still-Pending §8 Completion — Design > **Status:** Approved 2026-06-16. Next step: `superpowers-extended-cc:writing-plans`. **Goal:** Close the actionable items in `stillpending.md` §8 ("Deferred test-coverage follow-ups, never filed as findings") — the only Bucket-A work that is neither vendor-gated nor live-rig-gated and is not already covered by the session-resilience epic plan. **Scope decision:** Bucket A only (actionable code/test work). The session-resilience epic (Tasks 13–28) is already planned in `docs/plans/2026-06-15-session-resilience.md` and is explicitly **out of scope** here — resume it separately. Vendor-gated (§1.4/§3.4/§3.5) and live-rig/capture-gated (§1.3/§3.x/§5/§6.1) items cannot be completed from this dev box and are out of scope. **Approach:** "C" — the complete option, including new in-process gRPC test infrastructure for the Java streaming/galaxy CLI commands and a full bounded ready-wait in the gateway session hot path. --- ## Important correction (verified 2026-06-16) The three §8 items cite findings marked **Resolved** in the review backlog, but those resolutions did **not** survive into the current tree: - The Java bulk-family CLI tests that `Client.Java-026` (resolved 2026-05-20) describes were written against the old `com.dohertylan.mxgateway` package. After the rename to `com.zb.mom.ww`, the current `clients/java/zb-mom-ww-mxgateway-cli/.../MxGatewayCliTests.java` has **zero** coverage for `read-bulk`, `write-bulk`, `write2-bulk`, `write-secured-bulk`, `write-secured2-bulk`, `bench-read-bulk`, `stream-events`, `close-session`, `galaxy-discover`, `galaxy-watch`. (`galaxy-test-connection`/`galaxy-last-deploy`/ `galaxy-browse`/`stream-alarms` **do** have tests now.) - `Server-030` (both states in the not-ready diagnostic) **is** done — confirmed at `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:1676`. The *deferred follow-up* — should the gateway briefly wait for worker-Ready before failing fast? — is genuinely unbuilt. - `Tests-023` extracted a canonical `TestSupport/FakeWorkerProcess(int)`, yet three test files still define private nested copies. So §8's gap is real and current. --- ## Workstreams Four independently landable workstreams. | WS | Title | Files (language) | Classification | Depends on | |----|-------|------------------|----------------|------------| | A | Synchronous Java CLI tests (7 commands) | Java CLI test | small | — | | B | In-process gRPC harness + streaming/galaxy CLI tests (3 commands) | Java CLI test + small CLI seam | standard | A (shares test file) | | C | Worker-Ready bounded ready-wait | C# server session hot path | high-risk | — | | D | `FakeWorkerProcess` consolidation | C# tests | small | — | A, C, D are mutually independent (disjoint files/languages) and may be dispatched in parallel. B follows A because both edit `MxGatewayCliTests.java`. --- ## WS-A — Synchronous Java CLI tests **What:** Round-trip CLI tests for the 7 commands testable through the existing `FakeSession`/`FakeClient` seam (the same seam `subscribe-bulk`/`write` already use): `read-bulk`, `write-bulk`, `write2-bulk`, `write-secured-bulk`, `write-secured2-bulk`, `bench-read-bulk`, `close-session`. **How:** Upgrade `FakeSession` (currently returns empty lists) to per-call recorders that capture the parsed entries (timeout, typed values via the shared `parseValue(type, text)` switch, user-ids, timestamp) and synthesize one `BulkReadResult`/`BulkWriteResult` per requested handle, so JSON-shape assertions exercise the `bulkReadResultMap`/`bulkWriteResultMap` serializers. One `@Test` per command: - `read-bulk`: `--timeout-ms` reaches session; JSON carries `tagAddress`/`itemHandle`/ `wasCached`/`quality`. - `write-bulk`: `--type int32 --values 111,222 --user-id 5` parses through `parseValue`; entries built with the expected typed `MxValue` + `userId`. - `write2-bulk`: `--timestamp …Z` reaches the entry as `timestampValue` (`hasTimestampValue()` true). - `write-secured-bulk`: `--current-user-id`/`--verifier-user-id` both propagate. - `write-secured2-bulk`: timestamp + both user-ids. - `bench-read-bulk`: 1s steady / 0s warmup; assert cross-language schema keys (`language=java`, `command=bench-read-bulk`, `totalCalls`, `successfulCalls`, `failedCalls`, `callsPerSecond`, `latencyMs.p50/p95/p99`). - `close-session`: `CloseSessionReply` round-trips through `FakeClient`. **Verify:** `gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests`. --- ## WS-B — In-process gRPC harness + streaming/galaxy CLI tests **Why infra is required:** `MxEventStream` and `DeployEventStream` have package-private constructors; `GalaxyRepositoryClient` is `final` with a static `connect()` and `GalaxyCommand` has **no** injectable factory. None of `stream-events`/`galaxy-watch`/ `galaxy-discover` can be faked through the `FakeSession` seam. **What:** A JUnit fixture that starts a gRPC **`InProcessServer`** hosting scripted `MxAccessGateway` + `GalaxyRepository` service implementations and exposes an in-process `Channel`. The **real** `MxGatewayClient`/`GalaxyRepositoryClient` connect to it, so the real `MxEventStream`/`DeployEventStream` queue-draining and `GalaxyRepositoryClient` paging are exercised end-to-end (highest fidelity; no reflection, no package hacks). - **Production change (CLI module only, not the library):** add a `GalaxyClientFactory` seam to `GalaxyCommand` mirroring the existing `MxGatewayCliClientFactory`, so galaxy commands can target the in-process channel. - `stream-events`: server streams a scripted `MxEvent` sequence → assert CLI render, including the unsigned-uint64 worker-sequence regression. - `galaxy-watch`: server streams scripted deploy events → assert CLI feed output. - `galaxy-discover`: server returns a paged `GalaxyObject` hierarchy → assert CLI JSON. The 7 synchronous commands stay on the lightweight `FakeSession` seam (YAGNI — no reason to route them through a server). **Verify:** `gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests`. --- ## WS-C — Worker-Ready bounded ready-wait **Problem:** `GetReadyWorkerClient` (`GatewaySession.cs:1665`) fails fast when the session is `Ready` but the worker client's `WorkerClientState` has diverged (`Handshaking` after a heartbeat blip, etc.). The both-states diagnostic exists; a brief wait does not. **Constraint:** the check runs inside the `_syncRoot` lock — we cannot sleep/poll there. **Design (pinned decisions):** - New `GetReadyWorkerClientAsync`: read state under `_syncRoot`; **if** session is `Ready` but worker is **transient** (`Handshaking`/`Created`), release the lock, poll at a short interval (e.g. 25 ms) until the worker reaches `Ready` or a bounded timeout elapses, then re-check under the lock. - **Terminal worker states (`Faulted`/`Closing`/`Closed`/null) fail fast immediately** — never wait; retrying a faulted worker is pointless and would mask the fault. - New config `MxGateway:Sessions:WorkerReadyWaitTimeout` on `GatewaySessionOptions`, **default `0` = disabled** (preserves today's exact fail-fast behavior unless opted in), validated `>= 0` by the options validator. Document in `docs/GatewayConfiguration.md`. - The both-states diagnostic is preserved for the final failure. Callers at `GatewaySession.cs:918` and `:1263` become `await`. **Tests:** - Handshaking→Ready within the timeout succeeds (worker invoked once). - Faulted fails fast with both states in the message, zero waiting. - Timeout elapses → fails with both states. - Default `0` → unchanged fail-fast (no wait, no behavior change). **Verify:** `dotnet test src/ZB.MOM.WW.MxGateway.Tests --filter "FullyQualifiedName~SessionManager"` (plus the options-validator test class). --- ## WS-D — `FakeWorkerProcess` consolidation **What:** Replace the private nested `FakeWorkerProcess` in `SessionWorkerClientFactoryFakeWorkerTests`, `WorkerProcessLauncherTests`, and `WorkerClientTests` with the canonical `TestSupport/FakeWorkerProcess(int)` (which already has `MarkExited`/`Kill`/TCS-backed `WaitForExitAsync`). Where a nested copy carries extra behavior the canonical lacks, fold that into the canonical first, then delete the copies. **Verify:** `dotnet test src/ZB.MOM.WW.MxGateway.Tests --filter "FullyQualifiedName~WorkerClient | FullyQualifiedName~WorkerProcessLauncher | FullyQualifiedName~SessionWorkerClientFactory"`. --- ## Testing & sequencing Per the targeted-test rule in `CLAUDE.md` (Source Update Workflow): each task runs only its own filtered tests. Run the full gateway suite at most once, after WS-C + WS-D land. Out-of-scope items remain recorded in `stillpending.md` (vendor/rig-gated) and the session-resilience epic (`oldtasks.md`).