natsnet/docs/plans/2026-02-27-batch-33-js-cluster-streams-design.md

# Batch 33 JS Cluster Streams Design

**Date:** 2026-02-27
**Batch:** 33 (`JS Cluster Streams`)
**Scope:** 58 features + 22 unit tests
**Dependency:** batch `32` (`JS Cluster Meta`)
**Go source:** `golang/nats-server/server/jetstream_cluster.go`

## Problem

Batch 33 ports JetStream cluster stream/consumer assignment execution paths from `server/jetstream_cluster.go`, covering cluster monitoring loops, metadata snapshots, raft-group creation, stream-entry application, leader-change advisories, and stream/consumer create-update-delete flows.

The mapped tests are spread across the JetStream cluster, monitor, JWT, concurrency, and raft suites. The design objective is to define a strict, auditable implementation path that avoids placeholder code and advances tracker statuses only when build/test evidence supports the change.

## Context Findings

### Required command outputs

- `batch show 33 --db porting.db`
  - Status: `pending`
  - Features: `58` (all `deferred`)
  - Tests: `22` (all `deferred`)
  - Depends on: `32`
  - Go file: `server/jetstream_cluster.go`
- `batch list --db porting.db`
  - Batch chain includes `32 -> 33 -> 34` for JS cluster progression.
- `report summary --db porting.db`
  - Overall progress: `1924/6942 (27.7%)`

Note: in this environment, `dotnet` is not on `PATH`; use `/usr/local/share/dotnet/dotnet` when needed.

### Current .NET state relevant to Batch 33

- Cluster data structures exist in `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs`.
- Core types exist in:
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamTypes.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.*.cs` (partial server files)
- Backlog test coverage is mostly placeholder-level today; `JetStreamClusterTests2.Impltests.cs` is present, while several mapped classes (for example `JetStreamClusterTests3`, `JetStreamClusterLongTests`, `RaftNodeTests`) still need concrete batch coverage.

## Clarified Constraints

- Planning only in this session: no implementation execution.
- Mandatory guardrails from Batch 0 must be carried forward and adapted to features + tests.
- Feature work must be chunked into groups of at most 20 features.
- Status updates must use `batch-update` chunks of max 15 IDs.
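
The 15-ID limit on `batch-update` calls can be illustrated with a small chunking helper. This is a minimal sketch, not part of the porting tool: `chunkIDs` is a hypothetical name, and the example IDs are the Feature Group A range listed later in this document.

```go
package main

import "fmt"

// chunkIDs splits ids into consecutive slices of at most size elements.
// It returns nil when size is not positive or ids is empty.
func chunkIDs(ids []int, size int) [][]int {
	if size <= 0 {
		return nil
	}
	var chunks [][]int
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	// Build the 20 IDs of one feature group (1578 through 1597).
	ids := make([]int, 0, 20)
	for id := 1578; id <= 1597; id++ {
		ids = append(ids, id)
	}
	// Each chunk would map to one status-update call of at most 15 IDs.
	for i, c := range chunkIDs(ids, 15) {
		fmt.Printf("chunk %d: %d IDs (%d..%d)\n", i+1, len(c), c[0], c[len(c)-1])
	}
}
```

For a 20-ID group this yields two chunks, one of 15 IDs and one of 5, so every feature group in this batch needs at least two status-update calls.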

## Approaches

### Approach A: Monolithic pass (all 58 features + 22 tests)

- Pros: fewer task boundaries.
- Cons: weak traceability and high risk of hidden stubs/regressions.

### Approach B (Recommended): Three feature groups + three test waves with hard checkpoints

- Pros: bounded scope per task, stronger verification evidence, easier rollback/debug.
- Cons: more command overhead and checkpoint ceremony.

### Approach C: Test-heavy-first before major feature porting

- Pros: early behavior signal.
- Cons: high churn because many mapped tests depend on stream/consumer cluster plumbing not yet ported.

**Decision:** Approach B.

## Proposed Design

### 1. File ownership model

- `JetStream` cluster stream orchestration methods live in `JetStreamTypes.cs` or a new focused partial file (`JetStream.ClusterStreams.cs`).
- `NatsStream` raft/cluster helpers live in `NatsStream.cs` or `NatsStream.Cluster.cs`.
- `RaftGroup`, `StreamAssignment`, `ConsumerAssignment`, and cluster helpers stay in `JetStreamClusterTypes.cs` (or in focused partials if a split improves reviewability).
- Server-facing operations and advisories live in a new or updated server partial (`NatsServer.JetStreamClusterStreams.cs`).

### 2. Feature slicing (max 20 each)

- **Feature Group A (20 IDs):** cluster monitor + snapshot/recovery primitives
  IDs: `1578-1597`
- **Feature Group B (20 IDs):** meta-entry application + raft-group/stream monitoring + leader-change core
  IDs: `1598-1617`
- **Feature Group C (18 IDs):** advisory + stream assignment/process lifecycle + consumer assignment/process lifecycle
  IDs: `1618-1635`
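
The slicing above can be sanity-checked mechanically. The following is a sketch under the assumption that each listed range is inclusive and contains every ID in between; `idRange` and `validateGroups` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// idRange is an inclusive range of tracker feature IDs.
type idRange struct {
	name        string
	first, last int
}

// size returns the number of IDs in the inclusive range.
func (r idRange) size() int { return r.last - r.first + 1 }

// validateGroups checks that every group is non-empty and within
// maxPerGroup, that consecutive groups are adjacent (no gaps, no overlap),
// and that the total number of IDs equals wantTotal.
func validateGroups(groups []idRange, maxPerGroup, wantTotal int) error {
	total := 0
	for i, g := range groups {
		if g.size() <= 0 {
			return fmt.Errorf("group %s is empty", g.name)
		}
		if g.size() > maxPerGroup {
			return fmt.Errorf("group %s has %d IDs, max is %d", g.name, g.size(), maxPerGroup)
		}
		if i > 0 && g.first != groups[i-1].last+1 {
			return fmt.Errorf("group %s does not follow group %s directly", g.name, groups[i-1].name)
		}
		total += g.size()
	}
	if total != wantTotal {
		return fmt.Errorf("groups cover %d IDs, want %d", total, wantTotal)
	}
	return nil
}

func main() {
	groups := []idRange{
		{"A", 1578, 1597},
		{"B", 1598, 1617},
		{"C", 1618, 1635},
	}
	if err := validateGroups(groups, 20, 58); err != nil {
		fmt.Println("invalid slicing:", err)
		return
	}
	fmt.Println("slicing OK: 58 features in 3 groups")
}
```

With the ranges as listed, the group sizes are 20, 20, and 18, which matches the batch scope of 58 features.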

### 3. Test slicing

- **Test Wave T1 (5 IDs):** cluster long-path + JWT/monitor/concurrency anchors
  IDs: `1118,1214,1402,2144,2504`
- **Test Wave T2 (9 IDs):** raft elections and term behavior (early raft set)
  IDs: `2616,2620,2622,2624,2627,2628,2630,2631,2634`
- **Test Wave T3 (8 IDs):** raft replay/catchup/chain-of-blocks paths
  IDs: `2637,2638,2652,2657,2670,2671,2698,2699`

### 4. Verification architecture

- Per-feature loop: `feature show` -> focused failing test -> minimal implementation -> stub scan -> build gate -> targeted test gate -> status transition.
- Per-test loop: `test show` -> Go behavioral port -> single-test run evidence -> class-level run -> status transition.
- Checkpoint after every feature group and test wave, including a full unit-suite run.
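
The per-feature loop is an ordered sequence of gates in which a failure stops everything after it, so a status transition can never happen without the earlier evidence. A minimal sketch of that control flow follows; the `gate` type, `runGates`, and `errPlaceholder` are hypothetical, and each gate body is a stand-in for the real command or check.

```go
package main

import (
	"errors"
	"fmt"
)

// errPlaceholder simulates a stub-scan finding in this example.
var errPlaceholder = errors.New("placeholder found")

// gate is one named verification step; run returns nil when the gate passes.
type gate struct {
	name string
	run  func() error
}

// runGates executes gates in order and stops at the first failure.
// It returns the names of the gates that passed and the wrapped error, if any.
func runGates(gates []gate) ([]string, error) {
	var passed []string
	for _, g := range gates {
		if err := g.run(); err != nil {
			return passed, fmt.Errorf("gate %q failed: %w", g.name, err)
		}
		passed = append(passed, g.name)
	}
	return passed, nil
}

func main() {
	ok := func() error { return nil }
	// Gate order mirrors the per-feature loop; the stub scan fails here
	// to show that later gates, including the status transition, never run.
	gates := []gate{
		{"feature show", ok},
		{"focused failing test", ok},
		{"minimal implementation", ok},
		{"stub scan", func() error { return errPlaceholder }},
		{"build gate", ok},
		{"targeted test gate", ok},
		{"status transition", ok},
	}
	passed, err := runGates(gates)
	fmt.Println("passed:", passed)
	fmt.Println("error:", err)
}
```

Placing the status transition last in the list is the design point: it is reachable only when every earlier gate has returned without error.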

### 5. Deferred handling model

If an item is blocked by missing dependency behavior or infrastructure, immediately mark it `deferred` with an explicit reason via `--override`; do not leave stubs in source or tests.

## Risks and Mitigations

- **Dependency risk:** Batch 32 is prerequisite.
  **Mitigation:** block all Batch 33 status transitions until dependency preflight confirms readiness.
- **Stub-risk in backlog tests:** existing placeholder-style tests can produce false progress.
  **Mitigation:** required stub scan + assertion-quality checks + single-test execution evidence.
- **Ownership ambiguity risk:** methods span `JetStream`, `NatsStream`, `JetStreamCluster`, and `NatsServer`.
  **Mitigation:** explicit file ownership map and grouped tasking by domain.

## Success Criteria

- All 58 features are either `verified` with evidence or `deferred` with explicit blocker reason.
- All 22 tests are either `verified` with evidence or `deferred` with explicit blocker reason.
- No forbidden stub patterns in touched files.
- Batch progress is auditable from command outputs and chunked status updates.
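
The first two criteria amount to a simple predicate over tracker items. The sketch below assumes a hypothetical in-memory `item` record with `evidence` and `reason` fields; the real tracker schema in `porting.db` may differ, and the sample rows are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// item is a hypothetical view of one tracked feature or test.
type item struct {
	id       int
	status   string // expected to be "verified" or "deferred" at batch close
	evidence string // build/test evidence backing a "verified" status
	reason   string // blocker reason backing a "deferred" status
}

// unmetItems returns the IDs that violate the success criteria:
// "verified" without evidence, "deferred" without a reason,
// or any other status.
func unmetItems(items []item) []int {
	var bad []int
	for _, it := range items {
		switch it.status {
		case "verified":
			if strings.TrimSpace(it.evidence) == "" {
				bad = append(bad, it.id)
			}
		case "deferred":
			if strings.TrimSpace(it.reason) == "" {
				bad = append(bad, it.id)
			}
		default:
			bad = append(bad, it.id)
		}
	}
	return bad
}

func main() {
	// Illustrative rows only; they are not real tracker data.
	items := []item{
		{1578, "verified", "targeted test run passed", ""},
		{1579, "deferred", "", "blocked on Batch 32 behavior"},
		{1580, "deferred", "", ""},
	}
	fmt.Println("items failing the criteria:", unmetItems(items))
}
```

An empty result means every item is either `verified` with evidence or `deferred` with a reason; anything else points at an item that still needs attention before the batch can close.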

## Non-Goals

- Executing the implementation in this document.
- Extending scope into Batch 34/35.
- Building a full distributed integration harness beyond mapped unit/backlog verification needs.