# Batch 35 JS Cluster Remaining Design

**Date:** 2026-02-27
**Batch:** 35 (`JS Cluster Remaining`)
**Scope:** 57 features + 49 unit tests
**Dependency:** batch `32` (`JS Cluster Meta`)
**Go source:** `golang/nats-server/server/jetstream_cluster.go`

## Problem

Batch 35 covers the remaining JetStream cluster behavior in `server/jetstream_cluster.go` (roughly lines `8766-10866`): delete-range and assignment encoding/decoding, stream snapshot/catchup processing, cluster info assembly, catchup throttling counters, and sync subject helpers. It also includes 49 tests, mostly `RaftNodeTests`, that validate catchup/truncation and snapshot correctness.

The plan must prevent false progress: no placeholder feature ports and no fake-pass tests.

## Context Findings

### Required command outputs

- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 35 --db porting.db`
  - Status: `pending`
  - Features: `57` (all `deferred`)
  - Tests: `49` (all `deferred`)
  - Depends on: `32`
  - Go file: `server/jetstream_cluster.go`
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db`
  - Confirms chain around this area: `32 -> 33/34/35`.
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db`
  - Overall progress: `1924/6942 (27.7%)`

Environment note: `dotnet` was not on `PATH` in this shell; use `/usr/local/share/dotnet/dotnet`.

### Feature ownership distribution (from `porting.db`)

- `NatsStream`: 27
- `JetStreamCluster`: 19
- `NatsServer`: 8
- `JetStreamEngine`: 3

### Test distribution (from `porting.db`)

- `RaftNodeTests`: 42
- `JetStreamClusterTests1`: 6
- `JetStreamBatchingTests`: 1

## Constraints and Success Criteria

- Planning only; no implementation execution in this session.
- Reuse Batch 0 rigor, but for **features + tests**.
- Feature tasks must be grouped in chunks of max ~20 features.
- Status updates must use batch-update chunks of at most 15 IDs.
- Blocked work must be marked `deferred` with explicit reason, never stubbed.

Success means all 57 feature IDs and 49 test IDs are either:

- promoted with verification evidence (`complete/verified`), or
- kept `deferred` with specific blocker notes.

## Approaches

### Approach A: Monolithic pass (all features, then all tests)

- Pros: simple sequencing.
- Cons: high risk, poor auditability, hard to isolate regressions.

### Approach B (Recommended): Three feature groups plus four test waves with hard gates

- Pros: bounded scope, clearer rollback points, aligns with max-20 feature grouping and max-15 status updates.
- Cons: more command overhead.

### Approach C: Test-first for all 49 tests before feature completion

- Pros: immediate behavior pressure.
- Cons: high churn because most tests depend on unported catchup/snapshot internals.

**Decision:** Approach B.

## Proposed Design

### 1. Code Organization

Use behavior-focused files while keeping existing class ownership:

- `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs`
  - assignment/sync subject encode/decode helpers and cluster utility functions.
- `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs`
  - snapshot and catchup state transitions plus inbound cluster message processing.
- `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamTypes.cs`
  - offline/online cluster info and alternate-stream assembly.
- `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.cs` and/or new partial `NatsServer.JetStreamClusterRemaining.cs`
  - clustered consumer request path and gcb accounting helpers.

### 2. Feature Grouping (max ~20)

- **Group A (20 IDs):** `1694-1713`
  Delete-range, consumer assignment, stream/batch encode-decode, snapshot support/state capture, clustered inbound message entry point.
- **Group B (20 IDs):** `1714-1733`
  Delete trace, sync request calculation, snapshot delete handling, catchup peer lifecycle, snapshot/catchup processing, stream sync handler, cluster info base methods.
- **Group C (17 IDs):** `1734-1750`
  Cluster info checks, stream alternates/info request handling, gcb accounting/kick channel, run-catchup loop, sync subject helper family.

### 3. Test Wave Grouping

- **Wave T1 (7 IDs):** `730,846,847,848,890,891,893` (`JetStreamBatchingTests` + `JetStreamClusterTests1`)
- **Wave T2 (14 IDs):** `2640,2641,2643,2644,2645,2646,2647,2648,2649,2653,2655,2656,2658,2659`
- **Wave T3 (14 IDs):** `2660,2661,2662,2665,2666,2668,2669,2673,2676,2677,2678,2679,2680,2681`
- **Wave T4 (14 IDs):** `2682,2683,2684,2685,2686,2688,2691,2696,2697,2703,2715,2716,2717,2719`

### 4. Verification Strategy (Design-Level)

- Every feature follows a per-feature loop with focused test evidence.
- Every test follows method-level then class-level verification.
- Stub detection runs after each loop and before status promotions.
- Build and targeted/full test gates are mandatory before checkpoint status updates.
- Checkpoints occur at every task boundary.

### 5. Deferral Strategy

If infrastructure or dependency behavior blocks a feature/test:

1. Stop work on that ID.
2. Do not add placeholder implementation/assertions.
3. Mark `deferred` with explicit reason via `--override`.
4. Continue with next unblocked item.

## Risks and Mitigations

- **Dependency readiness risk (Batch 32):** enforce preflight before `batch start 35`.
- **Raft-heavy test concentration:** split the 42 `RaftNodeTests` into three equal waves (T2-T4, 14 IDs each) and checkpoint each wave.
- **Stub regression under volume:** hard anti-stub scans and strict status chunking (`<=15`).
- **Class ownership drift:** keep methods in mapped classes only (`JetStreamCluster`, `NatsStream`, `JetStreamEngine`, `NatsServer`).

## Non-Goals

- Executing the implementation in this session.
- Expanding scope beyond Batch 35 mappings.
- Changing batch dependencies/order in PortTracker.
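
The grouping rules from the Constraints and Proposed Design sections (feature groups of at most ~20 IDs, status updates in chunks of at most 15 IDs, 57 features and 49 tests in total) can be sanity-checked with a short script. The following is a minimal sketch, not part of PortTracker: the names `chunk_ids`, `FEATURE_GROUPS`, `TEST_WAVES`, and `check_plan` are hypothetical, and the script only prints comma-separated ID chunks because the exact batch-update argument syntax is not specified in this document.

```python
from typing import Dict, Iterable, List

MAX_FEATURE_GROUP = 20   # plan constraint: feature groups of max ~20 IDs
MAX_STATUS_CHUNK = 15    # plan constraint: status updates of at most 15 IDs

# ID ranges and lists copied from the Feature Grouping and Test Wave sections.
FEATURE_GROUPS: Dict[str, List[int]] = {
    "A": list(range(1694, 1714)),  # 1694-1713
    "B": list(range(1714, 1734)),  # 1714-1733
    "C": list(range(1734, 1751)),  # 1734-1750
}

TEST_WAVES: Dict[str, List[int]] = {
    "T1": [730, 846, 847, 848, 890, 891, 893],
    "T2": [2640, 2641, 2643, 2644, 2645, 2646, 2647,
           2648, 2649, 2653, 2655, 2656, 2658, 2659],
    "T3": [2660, 2661, 2662, 2665, 2666, 2668, 2669,
           2673, 2676, 2677, 2678, 2679, 2680, 2681],
    "T4": [2682, 2683, 2684, 2685, 2686, 2688, 2691,
           2696, 2697, 2703, 2715, 2716, 2717, 2719],
}


def chunk_ids(ids: Iterable[int], size: int = MAX_STATUS_CHUNK) -> List[List[int]]:
    """Split IDs into consecutive chunks of at most `size` elements."""
    if size < 1:
        raise ValueError("chunk size must be at least 1")
    items = list(ids)
    return [items[i:i + size] for i in range(0, len(items), size)]


def check_plan() -> None:
    """Raise AssertionError if the grouping contradicts the stated totals."""
    features = [i for group in FEATURE_GROUPS.values() for i in group]
    tests = [i for wave in TEST_WAVES.values() for i in wave]
    assert len(features) == 57 and len(set(features)) == 57
    assert len(tests) == 49 and len(set(tests)) == 49
    assert all(len(g) <= MAX_FEATURE_GROUP for g in FEATURE_GROUPS.values())


if __name__ == "__main__":
    check_plan()
    for name, ids in {**FEATURE_GROUPS, **TEST_WAVES}.items():
        for chunk in chunk_ids(ids):
            # One line per status update; each holds at most 15 IDs.
            print(name, ",".join(str(i) for i in chunk))
```

With these inputs, Groups A and B each split into status chunks of 15 and 5 IDs, Group C into 15 and 2, and every test wave fits in a single chunk.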