Batch 33 JS Cluster Streams Design
Date: 2026-02-27
Batch: 33 (JS Cluster Streams)
Scope: 58 features + 22 unit tests
Dependency: batch 32 (JS Cluster Meta)
Go source: golang/nats-server/server/jetstream_cluster.go
Problem
Batch 33 ports JetStream cluster stream/consumer assignment execution paths from server/jetstream_cluster.go, covering cluster monitoring loops, metadata snapshots, raft-group creation, stream-entry application, leader-change advisories, and stream/consumer create-update-delete flows.
The mapped tests are spread across JetStream cluster, monitor, JWT, concurrency, and raft suites. The design objective is to define a strict, auditable implementation path that avoids placeholder code and only advances tracker statuses with build/test evidence.
Context Findings
Required command outputs
- batch show 33 --db porting.db
  - Status: pending
  - Features: 58 (all deferred)
  - Tests: 22 (all deferred)
  - Depends on: 32
  - Go file: server/jetstream_cluster.go
- batch list --db porting.db
  - Batch chain includes 32 -> 33 -> 34 for JS cluster progression.
- report summary --db porting.db
  - Overall progress: 1924/6942 (27.7%)
Note: in this environment, dotnet is not on PATH; use /usr/local/share/dotnet/dotnet when needed.
Current .NET state relevant to Batch 33
- Cluster data structures exist in dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs.
- Core types exist in:
  - dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamTypes.cs
  - dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs
  - dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.cs
  - dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.*.cs (partial server files)
- Backlog test coverage is mostly placeholder-level today; JetStreamClusterTests2.Impltests.cs is present, while several mapped classes (for example JetStreamClusterTests3, JetStreamClusterLongTests, RaftNodeTests) still need concrete batch coverage.
Clarified Constraints
- Planning only in this session: no implementation execution.
- Mandatory guardrails from Batch 0 must be carried forward and adapted to features + tests.
- Feature work must be chunked into groups of at most ~20 features.
- Status updates must use batch-update chunks of max 15 IDs.
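The 15-ID limit on status updates can be illustrated with a small helper. This is a minimal sketch rather than part of the porting tool: the function name chunk_ids is hypothetical, and only the limit of 15 and the ID range 1578-1597 come from this document.

```python
from typing import Iterator, List


def chunk_ids(ids: List[int], max_size: int = 15) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each holding at most max_size items."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    for start in range(0, len(ids), max_size):
        yield ids[start:start + max_size]


# A 20-ID feature group (1578..1597) therefore needs two status updates.
group_a = list(range(1578, 1598))
chunks = list(chunk_ids(group_a))
chunk_sizes = [len(chunk) for chunk in chunks]
```

Under these assumptions a 20-feature group splits into one update of 15 IDs and one of 5, so no single update exceeds the limit.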
Approaches
Approach A: Monolithic pass (all 58 features + 22 tests)
- Pros: fewer task boundaries.
- Cons: weak traceability and high risk of hidden stubs/regressions.
Approach B (Recommended): Three feature groups + three test waves with hard checkpoints
- Pros: bounded scope per task, stronger verification evidence, easier rollback/debug.
- Cons: more command overhead and checkpoint ceremony.
Approach C: Test-heavy-first before major feature porting
- Pros: early behavior signal.
- Cons: high churn because many mapped tests depend on stream/consumer cluster plumbing not yet ported.
Decision: Approach B.
Proposed Design
1. File ownership model
- JetStream cluster stream orchestration methods in JetStreamTypes.cs or a new focused partial file (JetStream.ClusterStreams.cs).
- NatsStream raft/cluster helpers in NatsStream.cs or NatsStream.Cluster.cs.
- RaftGroup, StreamAssignment, ConsumerAssignment, and cluster helpers in JetStreamClusterTypes.cs (or focused partials if split improves reviewability).
- Server-facing operations and advisories in a new/updated server partial (NatsServer.JetStreamClusterStreams.cs).
2. Feature slicing (max ~20 each)
- Feature Group A (20 IDs): cluster monitor + snapshot/recovery primitives
  IDs: 1578-1597
- Feature Group B (20 IDs): meta-entry application + raft-group/stream monitoring + leader-change core
  IDs: 1598-1617
- Feature Group C (18 IDs): advisory + stream assignment/process lifecycle + consumer assignment/process lifecycle
  IDs: 1618-1635
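The slicing above can be sanity-checked mechanically: every group should stay within the 20-feature cap, the ranges should be contiguous with no gaps or overlap, and the total should equal the batch's 58 features. The sketch below assumes the ID ranges are inclusive; the helper names are hypothetical.

```python
from typing import Dict, List, Tuple

# Inclusive (first_id, last_id) ranges copied from the feature slicing.
FEATURE_GROUPS: Dict[str, Tuple[int, int]] = {
    "A": (1578, 1597),
    "B": (1598, 1617),
    "C": (1618, 1635),
}


def group_size(bounds: Tuple[int, int]) -> int:
    """Number of IDs in an inclusive range."""
    first, last = bounds
    return last - first + 1


def check_slicing(groups: Dict[str, Tuple[int, int]],
                  max_size: int, expected_total: int) -> List[str]:
    """Return a list of problems; an empty list means the slicing is consistent."""
    problems: List[str] = []
    ordered = sorted(groups.items(), key=lambda kv: kv[1][0])
    for name, bounds in ordered:
        if group_size(bounds) > max_size:
            problems.append(f"group {name} exceeds {max_size} features")
    # Each group must start right after the previous one ends.
    for (prev_name, prev), (name, cur) in zip(ordered, ordered[1:]):
        if cur[0] != prev[1] + 1:
            problems.append(f"gap or overlap between {prev_name} and {name}")
    total = sum(group_size(bounds) for _, bounds in ordered)
    if total != expected_total:
        problems.append(f"total is {total}, expected {expected_total}")
    return problems


sizes = {name: group_size(bounds) for name, bounds in FEATURE_GROUPS.items()}
problems = check_slicing(FEATURE_GROUPS, max_size=20, expected_total=58)
```

With the ranges as listed, the groups hold 20, 20, and 18 IDs and the check reports no problems.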
3. Test slicing
- Test Wave T1 (5 IDs): cluster long-path + JWT/monitor/concurrency anchors
  IDs: 1118, 1214, 1402, 2144, 2504
- Test Wave T2 (9 IDs): raft elections and term behavior (early raft set)
  IDs: 2616, 2620, 2622, 2624, 2627, 2628, 2630, 2631, 2634
- Test Wave T3 (8 IDs): raft replay/catchup/chain-of-blocks paths
  IDs: 2637, 2638, 2652, 2657, 2670, 2671, 2698, 2699
4. Verification architecture
- Per-feature loop: feature show -> focused failing test -> minimal implementation -> stub scan -> build gate -> targeted test gate -> status transition.
- Per-test loop: test show -> Go behavioral port -> single-test run evidence -> class-level run -> status transition.
- Checkpoint after every feature group and test wave, including full unit suite run.
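The gating idea behind the per-feature loop is that a status transition is allowed only when every preceding gate has passed, and evaluation stops at the first failure. The following is a minimal sketch of that ordering. The gate names mirror the loop above, but the runner, its function names, and the lambda stand-ins are hypothetical; real gates would invoke the stub scan, build, and test commands.

```python
from typing import Callable, List, Optional, Tuple

Gate = Tuple[str, Callable[[], bool]]


def run_gates(gates: List[Gate]) -> Optional[str]:
    """Run gates in order; return the first failing gate's name, or None if all pass."""
    for name, check in gates:
        if not check():
            return name  # stop here: later gates are not evaluated
    return None


def may_transition(gates: List[Gate]) -> bool:
    """A status transition is permitted only when no gate fails."""
    return run_gates(gates) is None


executed: List[str] = []


def record(name: str, result: bool) -> Callable[[], bool]:
    """Build a stand-in gate that logs its execution and returns a fixed result."""
    def gate() -> bool:
        executed.append(name)
        return result
    return gate


# Hypothetical run in which the build gate fails.
feature_gates: List[Gate] = [
    ("stub scan", record("stub scan", True)),
    ("build gate", record("build gate", False)),
    ("targeted test gate", record("targeted test gate", True)),
]
first_failure = run_gates(feature_gates)
```

In this run the targeted test gate never executes, and the feature's status stays where it was because the build gate failed.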
5. Deferred handling model
If blocked by missing dependency behavior/infrastructure, immediately mark item deferred with explicit reason via --override; do not leave stubs in source or tests.
Risks and Mitigations
- Dependency risk: Batch 32 is prerequisite.
  Mitigation: block all Batch 33 status transitions until dependency preflight confirms readiness.
- Stub-risk in backlog tests: existing placeholder-style tests can produce false progress.
  Mitigation: required stub scan + assertion-quality checks + single-test execution evidence.
- Ownership ambiguity risk: methods span JetStream, NatsStream, JetStreamCluster, NatsServer.
  Mitigation: explicit file ownership map and grouped tasking by domain.
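The stub-scan mitigation could look roughly like the sketch below. The forbidden patterns are not enumerated in this document, so the three regular expressions are illustrative assumptions only, and the C# method name in the sample is hypothetical. The authoritative pattern list is whatever the Batch 0 guardrails define.

```python
import re
from typing import List, Tuple

# ASSUMPTION: illustrative patterns, not the guardrails' actual list.
STUB_PATTERNS = [
    re.compile(r"NotImplementedException"),
    re.compile(r"//\s*TODO", re.IGNORECASE),
    re.compile(r"Assert\.True\(\s*true\s*\)"),
]


def scan_for_stubs(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_line) for each line matching a stub pattern."""
    hits: List[Tuple[int, str]] = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in STUB_PATTERNS):
            hits.append((line_no, line.strip()))
    return hits


# Hypothetical C# fragment containing a placeholder body.
sample = """public void ProcessStreamAssignment()
{
    throw new NotImplementedException();
}
"""
hits = scan_for_stubs(sample)
```

A non-empty result would fail the stub-scan gate for the touched file. A line-based regex scan is deliberately crude: it can flag comments or strings that merely mention a pattern, which is acceptable for a gate that errs toward blocking.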
Success Criteria
- All 58 features are either verified with evidence or deferred with explicit blocker reason.
- All 22 tests are either verified with evidence or deferred with explicit blocker reason.
- No forbidden stub patterns in touched files.
- Batch progress is auditable from command outputs and chunked status updates.
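The first two criteria amount to a simple invariant over tracker items: each one is either verified with evidence attached or deferred with a stated reason. The sketch below assumes a hypothetical in-memory record shape. The field names and sample values are illustrative and do not reflect the schema of porting.db.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    item_id: int
    status: str         # e.g. "verified", "deferred", "pending"
    evidence: str = ""  # reference to build/test output for verified items
    reason: str = ""    # blocker description for deferred items


def unmet(items: List[Item]) -> List[int]:
    """Return IDs of items that satisfy neither success condition."""
    failing: List[int] = []
    for item in items:
        verified_ok = item.status == "verified" and bool(item.evidence.strip())
        deferred_ok = item.status == "deferred" and bool(item.reason.strip())
        if not (verified_ok or deferred_ok):
            failing.append(item.item_id)
    return failing


# Hypothetical tracker snapshot.
items = [
    Item(1578, "verified", evidence="targeted test run passed"),
    Item(1579, "deferred", reason="blocked on Batch 32 behavior"),
    Item(1580, "deferred"),   # deferred without a reason: not acceptable
    Item(1581, "pending"),    # never resolved: not acceptable
]
failing_ids = unmet(items)
```

The batch would meet these criteria only when the returned list is empty for all 58 features and all 22 tests.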
Non-Goals
- Executing the implementation in this document.
- Extending scope into Batch 34/35.
- Building full distributed integration harness beyond mapped unit/backlog verification needs.