# Batch 35 JS Cluster Remaining Design

**Date:** 2026-02-27
**Batch:** 35 (`JS Cluster Remaining`)
**Scope:** 57 features + 49 unit tests
**Dependency:** batch `32` (`JS Cluster Meta`)
**Go source:** `golang/nats-server/server/jetstream_cluster.go`

## Problem

Batch 35 covers the remaining JetStream cluster behavior in `server/jetstream_cluster.go` (roughly lines `8766-10866`): delete-range and assignment encoding/decoding, stream snapshot/catchup processing, cluster info assembly, catchup throttling counters, and sync subject helpers. It also includes 49 tests, mostly `RaftNodeTests`, that validate catchup/truncation and snapshot correctness.

The plan must prevent false progress: no placeholder feature ports and no fake-pass tests.

## Context Findings

### Required command outputs

- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 35 --db porting.db`
  - Status: `pending`
  - Features: `57` (all `deferred`)
  - Tests: `49` (all `deferred`)
  - Depends on: `32`
  - Go file: `server/jetstream_cluster.go`
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db`
  - Confirms chain around this area: `32 -> 33/34/35`.
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db`
  - Overall progress: `1924/6942 (27.7%)`

Environment note: `dotnet` was not on `PATH` in this shell; use `/usr/local/share/dotnet/dotnet`.

### Feature ownership distribution (from `porting.db`)

- `NatsStream`: 27
- `JetStreamCluster`: 19
- `NatsServer`: 8
- `JetStreamEngine`: 3

### Test distribution (from `porting.db`)

- `RaftNodeTests`: 42
- `JetStreamClusterTests1`: 6
- `JetStreamBatchingTests`: 1

## Constraints and Success Criteria

- Planning only; no implementation execution in this session.
- Reuse Batch 0 rigor, but for **features + tests**.
- Feature tasks must be grouped in chunks of max ~20 features.
- Status updates must use batch-update chunks of at most 15 IDs.
- Blocked work must be marked `deferred` with explicit reason, never stubbed.

Success means all 57 feature IDs and 49 test IDs are either:

- promoted with verification evidence (`complete/verified`), or
- kept `deferred` with specific blocker notes.

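The at-most-15 rule for status updates amounts to slicing an ID list into fixed-size chunks. The following is a minimal Python sketch of that slicing, not part of PortTracker; `chunk_ids` and `MAX_STATUS_CHUNK` are hypothetical names chosen for illustration, and the sample IDs are the 20 IDs of feature Group A from this design.

```python
from typing import Iterator, Sequence

MAX_STATUS_CHUNK = 15  # limit taken from the constraint on status updates


def chunk_ids(ids: Sequence[int], size: int = MAX_STATUS_CHUNK) -> Iterator[list[int]]:
    """Yield consecutive slices of `ids`, each holding at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# A 20-ID feature group needs two status updates: 15 IDs, then 5.
group = list(range(1694, 1714))
chunks = list(chunk_ids(group))
for number, chunk in enumerate(chunks, start=1):
    print(f"update {number}: {len(chunk)} IDs ({chunk[0]}-{chunk[-1]})")
```

Under this scheme a 20-feature group always costs two status updates, which is the "more command overhead" trade-off that comes with bounded chunks.
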
## Approaches

### Approach A: Monolithic pass (all features, then all tests)

- Pros: simple sequencing.
- Cons: high risk, poor auditability, hard to isolate regressions.

### Approach B (Recommended): Three feature groups plus four test waves with hard gates

- Pros: bounded scope, clearer rollback points, aligns with max-20 feature grouping and max-15 status updates.
- Cons: more command overhead.

### Approach C: Test-first for all 49 tests before feature completion

- Pros: immediate behavior pressure.
- Cons: high churn because most tests depend on unported catchup/snapshot internals.

**Decision:** Approach B.

## Proposed Design

### 1. Code Organization

Use behavior-focused files while keeping existing class ownership:

- `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs`
  - assignment/sync subject encode/decode helpers and cluster utility functions.
- `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs`
  - snapshot and catchup state transitions plus inbound cluster message processing.
- `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamTypes.cs`
  - offline/online cluster info and alternate-stream assembly.
- `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.cs` and/or new partial `NatsServer.JetStreamClusterRemaining.cs`
  - clustered consumer request path and gcb accounting helpers.

### 2. Feature Grouping (max ~20)

- **Group A (20 IDs):** `1694-1713`
  Delete-range, consumer assignment, stream/batch encode-decode, snapshot support/state capture, clustered inbound message entry point.
- **Group B (20 IDs):** `1714-1733`
  Delete trace, sync request calculation, snapshot delete handling, catchup peer lifecycle, snapshot/catchup processing, stream sync handler, cluster info base methods.
- **Group C (17 IDs):** `1734-1750`
  Cluster info checks, stream alternates/info request handling, gcb accounting/kick channel, run-catchup loop, sync subject helper family.

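The group boundaries can be sanity-checked mechanically before any work starts. The sketch below is illustrative Python, not PortTracker code: `validate_groups` is a hypothetical helper, and it treats the "max ~20" guideline as a hard cap of 20, which is an assumption. The ranges themselves are copied from the grouping above.

```python
MAX_GROUP_SIZE = 20      # assumption: "max ~20" treated as a hard cap
EXPECTED_FEATURES = 57   # batch scope stated in this design

# Inclusive feature-ID ranges for each group.
FEATURE_GROUPS = {
    "A": (1694, 1713),
    "B": (1714, 1733),
    "C": (1734, 1750),
}


def validate_groups(groups: dict[str, tuple[int, int]]) -> list[int]:
    """Expand each inclusive range and reject oversized or overlapping groups."""
    seen: set[int] = set()
    ordered: list[int] = []
    for name, (first, last) in groups.items():
        ids = list(range(first, last + 1))
        if len(ids) > MAX_GROUP_SIZE:
            raise ValueError(f"group {name} has {len(ids)} IDs (max {MAX_GROUP_SIZE})")
        overlap = seen.intersection(ids)
        if overlap:
            raise ValueError(f"group {name} repeats IDs {sorted(overlap)}")
        seen.update(ids)
        ordered.extend(ids)
    return ordered


all_feature_ids = validate_groups(FEATURE_GROUPS)
print(len(all_feature_ids) == EXPECTED_FEATURES)
```

The three ranges expand to 20, 20, and 17 IDs with no overlap, which accounts for all 57 features in the batch.
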
### 3. Test Wave Grouping

- **Wave T1 (7 IDs):** `730,846,847,848,890,891,893` (`JetStreamBatchingTests` + `JetStreamClusterTests1`)
- **Wave T2 (14 IDs):** `2640,2641,2643,2644,2645,2646,2647,2648,2649,2653,2655,2656,2658,2659`
- **Wave T3 (14 IDs):** `2660,2661,2662,2665,2666,2668,2669,2673,2676,2677,2678,2679,2680,2681`
- **Wave T4 (14 IDs):** `2682,2683,2684,2685,2686,2688,2691,2696,2697,2703,2715,2716,2717,2719`

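Because the wave lists are not contiguous ranges, a quick tally helps confirm that no ID is duplicated and that each wave fits the 15-ID status-update limit from the constraints. This is a hedged Python sketch; `wave_summary` is a hypothetical helper, and the ID lists are copied from the waves above.

```python
from collections import Counter

MAX_STATUS_CHUNK = 15  # status-update limit from the constraints

TEST_WAVES = {
    "T1": [730, 846, 847, 848, 890, 891, 893],
    "T2": [2640, 2641, 2643, 2644, 2645, 2646, 2647,
           2648, 2649, 2653, 2655, 2656, 2658, 2659],
    "T3": [2660, 2661, 2662, 2665, 2666, 2668, 2669,
           2673, 2676, 2677, 2678, 2679, 2680, 2681],
    "T4": [2682, 2683, 2684, 2685, 2686, 2688, 2691,
           2696, 2697, 2703, 2715, 2716, 2717, 2719],
}


def wave_summary(waves: dict[str, list[int]]) -> dict:
    """Report per-wave sizes, the overall total, duplicate IDs, and oversized waves."""
    all_ids = [test_id for ids in waves.values() for test_id in ids]
    counts = Counter(all_ids)
    return {
        "sizes": {name: len(ids) for name, ids in waves.items()},
        "total": len(all_ids),
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
        "oversized": [name for name, ids in waves.items() if len(ids) > MAX_STATUS_CHUNK],
    }


summary = wave_summary(TEST_WAVES)
print(summary)
```

Each wave holds at most 14 IDs, so a whole wave can be promoted in a single status update without further splitting.
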
### 4. Verification Strategy (Design-Level)

- Every feature follows a per-feature loop with focused test evidence.
- Every test follows method-level then class-level verification.
- Stub detection runs after each loop and before status promotions.
- Build and targeted/full test gates are mandatory before checkpoint status updates.
- Checkpoints occur at every task boundary.

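As an illustration of what a stub-detection pass could look like, the Python sketch below scans C# source text for a few placeholder markers. The patterns (`NotImplementedException`, TODO-style comments, `Assert.True(true)`) are assumptions about what counts as a stub, not the scan defined by the implementation plan, and `find_stubs` is a hypothetical helper.

```python
import re

# Assumed stub markers; the real scan may use a different or longer list.
STUB_PATTERNS = {
    "not-implemented": re.compile(r"throw\s+new\s+NotImplementedException\b"),
    "todo-marker": re.compile(r"//\s*(TODO|FIXME|PLACEHOLDER)\b", re.IGNORECASE),
    "trivial-assert": re.compile(r"Assert\.True\(\s*true\s*\)"),
}


def find_stubs(source: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) for every suspected stub marker."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in STUB_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


# Made-up C# fragment containing one placeholder feature and one fake-pass test.
sample = """\
public void ProcessSnapshot()
{
    throw new NotImplementedException();
}
[Fact]
public void Snapshot_ShouldApply()
{
    Assert.True(true); // TODO: real assertion
}
"""

hits = find_stubs(sample)
for lineno, label in hits:
    print(f"line {lineno}: {label}")
```

A non-empty result would block the status promotion for the affected IDs until the placeholder is replaced or the ID is deferred.
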
### 5. Deferral Strategy

If infrastructure or dependency behavior blocks a feature/test:

1. Stop work on that ID.
2. Do not add placeholder implementation/assertions.
3. Mark `deferred` with explicit reason via `--override`.
4. Continue with next unblocked item.

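The four steps above describe a loop that records a reason and moves on instead of stubbing. A minimal Python sketch of that control flow follows; `Blocked`, `Outcome`, `run_items`, and `demo_work` are hypothetical names, the blocker text is invented for the example, and the actual status change would still go through PortTracker rather than this code.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


class Blocked(Exception):
    """Raised by a work step when an ID cannot be completed without stubbing."""


@dataclass
class Outcome:
    verified: list[int] = field(default_factory=list)
    deferred: dict[int, str] = field(default_factory=dict)


def run_items(ids: Iterable[int], work: Callable[[int], None]) -> Outcome:
    """Process IDs in order; a blocked ID is deferred with its reason, never faked."""
    outcome = Outcome()
    for item_id in ids:
        try:
            work(item_id)
        except Blocked as exc:
            reason = str(exc).strip()
            if not reason:
                raise ValueError(f"ID {item_id}: deferral needs an explicit reason")
            outcome.deferred[item_id] = reason  # steps 1-3: stop, no stub, record why
            continue                            # step 4: next unblocked item
        outcome.verified.append(item_id)
    return outcome


def demo_work(item_id: int) -> None:
    # Hypothetical blocker used only to exercise the loop.
    if item_id == 2643:
        raise Blocked("needs a multi-node cluster harness that is not ported yet")


result = run_items([2640, 2641, 2643, 2644], demo_work)
print(result.verified, result.deferred)
```

Rejecting an empty reason in the loop mirrors the requirement that every deferral carries a specific blocker note.
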
## Risks and Mitigations

- **Dependency readiness risk (Batch 32):** enforce preflight before `batch start 35`.
- **Raft-heavy test concentration:** split `RaftNodeTests` into three equal waves and checkpoint each wave.
- **Stub regression under volume:** hard anti-stub scans and strict status chunking (`<=15`).
- **Class ownership drift:** keep methods in mapped classes only (`JetStreamCluster`, `NatsStream`, `JetStreamEngine`, `NatsServer`).

## Non-Goals

- Executing the implementation in this session.
- Expanding scope beyond Batch 35 mappings.
- Changing batch dependencies/order in PortTracker.