Joseph Doherty f8dce79ac0 Add batch plans for batches 31-36 (rounds 16-18)

Generated design docs and implementation plans via Codex for:
- Batch 31: Raft Part 2
- Batch 32: JS Cluster Meta
- Batch 33: JS Cluster Streams
- Batch 34: JS Cluster Consumers
- Batch 35: JS Cluster Remaining
- Batch 36: Stream Lifecycle

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.

2026-02-27 17:01:31 -05:00

5.9 KiB

Batch 33 JS Cluster Streams Design

Date: 2026-02-27
Batch: 33 (JS Cluster Streams)
Scope: 58 features + 22 unit tests
Dependency: batch 32 (JS Cluster Meta)
Go source: golang/nats-server/server/jetstream_cluster.go

Problem

Batch 33 ports JetStream cluster stream/consumer assignment execution paths from server/jetstream_cluster.go, covering cluster monitoring loops, metadata snapshots, raft-group creation, stream-entry application, leader-change advisories, and stream/consumer create-update-delete flows.

The mapped tests are spread across JetStream cluster, monitor, JWT, concurrency, and raft suites. The design objective is to define a strict, auditable implementation path that avoids placeholder code and only advances tracker statuses with build/test evidence.

Context Findings

Required command outputs

batch show 33 --db porting.db
- Status: pending
- Features: 58 (all deferred)
- Tests: 22 (all deferred)
- Depends on: 32
- Go file: server/jetstream_cluster.go
batch list --db porting.db
- Batch chain includes 32 -> 33 -> 34 for JS cluster progression.
report summary --db porting.db
- Overall progress: 1924/6942 (27.7%)

Note: in this environment, dotnet is not on PATH; use /usr/local/share/dotnet/dotnet when needed.

Current .NET state relevant to Batch 33

Cluster data structures exist in dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs.
Core types exist in:
- dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamTypes.cs
- dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs
- dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.cs
- dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.*.cs (partial server files)
Backlog test coverage is mostly placeholder-level today; JetStreamClusterTests2.Impltests.cs is present, while several mapped classes (for example JetStreamClusterTests3, JetStreamClusterLongTests, RaftNodeTests) still need concrete batch coverage.

Clarified Constraints

Planning only in this session: no implementation execution.
Mandatory guardrails from Batch 0 must be carried forward and adapted to features + tests.
Feature work must be chunked into groups of at most ~20 features.
Status updates must use batch-update chunks of max 15 IDs.
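
The 15-ID limit on status updates amounts to plain slice chunking. The following is a minimal sketch in Go (the language of the reference source), assuming a hypothetical helper named chunkIDs that is not part of the porting tool. It is applied to Feature Group A's IDs (1578-1597) from this design.

```go
package main

import "fmt"

// chunkIDs splits ids into consecutive slices of at most size elements.
// Hypothetical helper for illustration; the porting tool may do this differently.
func chunkIDs(ids []int, size int) [][]int {
	if size <= 0 {
		return nil
	}
	var chunks [][]int
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	// Feature Group A: IDs 1578-1597 (20 IDs).
	var ids []int
	for id := 1578; id <= 1597; id++ {
		ids = append(ids, id)
	}
	// 15 is the maximum chunk size stated in the constraints.
	for i, c := range chunkIDs(ids, 15) {
		fmt.Printf("chunk %d: %d IDs (%d..%d)\n", i+1, len(c), c[0], c[len(c)-1])
	}
}
```

With 20 IDs and a limit of 15, this yields one chunk of 15 IDs (1578..1592) and one of 5 (1593..1597).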

Approaches

Approach A: Monolithic pass (all 58 features + 22 tests)

Pros: fewer task boundaries.
Cons: weak traceability and high risk of hidden stubs/regressions.

Approach B (Recommended): Three feature groups + three test waves with hard checkpoints

Pros: bounded scope per task, stronger verification evidence, easier rollback/debug.
Cons: more command overhead and checkpoint ceremony.

Approach C: Test-heavy-first before major feature porting

Pros: early behavior signal.
Cons: high churn because many mapped tests depend on stream/consumer cluster plumbing not yet ported.

Decision: Approach B.

Proposed Design

1. File ownership model

JetStream cluster stream orchestration methods in JetStreamTypes.cs or a new focused partial file (JetStream.ClusterStreams.cs).
NatsStream raft/cluster helpers in NatsStream.cs or NatsStream.Cluster.cs.
RaftGroup, StreamAssignment, ConsumerAssignment, and cluster helpers in JetStreamClusterTypes.cs (or focused partials if split improves reviewability).
Server-facing operations and advisories in a new/updated server partial (NatsServer.JetStreamClusterStreams.cs).

2. Feature slicing (max ~20 each)

Feature Group A (20 IDs): cluster monitor + snapshot/recovery primitives
IDs: 1578-1597
Feature Group B (20 IDs): meta-entry application + raft-group/stream monitoring + leader-change core
IDs: 1598-1617
Feature Group C (18 IDs): advisory + stream assignment/process lifecycle + consumer assignment/process lifecycle
IDs: 1618-1635
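
The group arithmetic above (20 + 20 + 18 = 58) can be checked mechanically. This is a small sketch, not part of the plan's tooling: the idRange type and checkSlicing function are hypothetical names, and the requirement that groups be contiguous is an assumption drawn from the ranges listed here.

```go
package main

import "fmt"

// idRange is an inclusive range of feature IDs (hypothetical type).
type idRange struct {
	Name     string
	From, To int
}

// count returns the number of IDs in the inclusive range.
func (r idRange) count() int { return r.To - r.From + 1 }

// checkSlicing verifies that no group exceeds maxPerGroup, that each group
// starts right after the previous one (assumed contiguity), and that the
// group sizes sum to wantTotal.
func checkSlicing(groups []idRange, maxPerGroup, wantTotal int) error {
	total := 0
	for i, g := range groups {
		if g.count() < 1 || g.count() > maxPerGroup {
			return fmt.Errorf("group %s has %d IDs, want 1..%d", g.Name, g.count(), maxPerGroup)
		}
		if i > 0 && g.From != groups[i-1].To+1 {
			return fmt.Errorf("group %s does not start right after group %s", g.Name, groups[i-1].Name)
		}
		total += g.count()
	}
	if total != wantTotal {
		return fmt.Errorf("groups cover %d IDs, want %d", total, wantTotal)
	}
	return nil
}

func main() {
	groups := []idRange{
		{"A", 1578, 1597},
		{"B", 1598, 1617},
		{"C", 1618, 1635},
	}
	if err := checkSlicing(groups, 20, 58); err != nil {
		fmt.Println("slicing invalid:", err)
		return
	}
	fmt.Println("slicing OK")
}
```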

3. Test slicing

Test Wave T1 (5 IDs): cluster long-path + JWT/monitor/concurrency anchors
IDs: 1118,1214,1402,2144,2504
Test Wave T2 (9 IDs): raft elections and term behavior (early raft set)
IDs: 2616,2620,2622,2624,2627,2628,2630,2631,2634
Test Wave T3 (8 IDs): raft replay/catchup/chain-of-blocks paths
IDs: 2637,2638,2652,2657,2670,2671,2698,2699

4. Verification architecture

Per-feature loop: feature show -> focused failing test -> minimal implementation -> stub scan -> build gate -> targeted test gate -> status transition.
Per-test loop: test show -> Go behavioral port -> single-test run evidence -> class-level run -> status transition.
Checkpoint after every feature group and test wave, including full unit suite run.
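
The automated part of the per-feature loop (stub scan, build gate, targeted test gate) behaves like an ordered pipeline that stops at the first failure, so a status transition is only reached when every earlier gate has passed. This is a rough sketch under that reading. The gate type, runGates function, and the placeholder gate bodies are all hypothetical, and real gates would invoke the actual scan, build, and test commands.

```go
package main

import (
	"errors"
	"fmt"
)

// gate is one named verification step (hypothetical type).
type gate struct {
	Name string
	Run  func() error
}

// errTestsFailed is a stand-in failure used by the demo below.
var errTestsFailed = errors.New("1 test failed")

// runGates executes gates in order and stops at the first failure.
// It returns the names of the gates that passed before any failure.
func runGates(gates []gate) ([]string, error) {
	var passed []string
	for _, g := range gates {
		if err := g.Run(); err != nil {
			return passed, fmt.Errorf("gate %q failed: %w", g.Name, err)
		}
		passed = append(passed, g.Name)
	}
	return passed, nil
}

func main() {
	ok := func() error { return nil }
	gates := []gate{
		{"stub scan", ok},
		{"build gate", ok},
		{"targeted test gate", func() error { return errTestsFailed }},
		{"status transition", ok}, // never reached in this demo
	}
	passed, err := runGates(gates)
	fmt.Println("passed:", passed)
	fmt.Println("error:", err)
}
```

In the demo, the failing test gate stops the pipeline, so the status transition step never runs.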

5. Deferred handling model

If an item is blocked by missing dependency behavior or infrastructure, immediately mark it deferred with an explicit reason via --override; do not leave stubs in source or tests.

Risks and Mitigations

Dependency risk: Batch 32 is prerequisite.
Mitigation: block all Batch 33 status transitions until dependency preflight confirms readiness.
Stub risk in backlog tests: existing placeholder-style tests can produce false progress.
Mitigation: required stub scan + assertion-quality checks + single-test execution evidence.
Ownership ambiguity risk: methods span JetStream, NatsStream, JetStreamCluster, NatsServer.
Mitigation: explicit file ownership map and grouped tasking by domain.
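
The stub-scan mitigation can be pictured as a line-by-line search for forbidden patterns. The sketch below is only an illustration: the pattern list is hypothetical (the actual forbidden patterns come from the Batch 0 guardrails and are not listed in this document), and scanForStubs is an invented name.

```go
package main

import (
	"fmt"
	"strings"
)

// stubPatterns is a HYPOTHETICAL list of placeholder markers in C# sources.
// The real list is defined by the Batch 0 guardrails, not by this sketch.
var stubPatterns = []string{
	"NotImplementedException",
	"Assert.True(true)",
	"// TODO",
}

// finding records a 1-based line number and the pattern matched on it.
type finding struct {
	Line    int
	Pattern string
}

// scanForStubs reports every line of source containing a stub pattern.
func scanForStubs(source string) []finding {
	var found []finding
	for i, line := range strings.Split(source, "\n") {
		for _, p := range stubPatterns {
			if strings.Contains(line, p) {
				found = append(found, finding{Line: i + 1, Pattern: p})
			}
		}
	}
	return found
}

func main() {
	src := "public void Foo()\n{\n    throw new NotImplementedException();\n}\n"
	for _, f := range scanForStubs(src) {
		fmt.Printf("line %d: forbidden pattern %q\n", f.Line, f.Pattern)
	}
}
```

A substring scan like this cannot judge assertion quality, which is why the mitigation also calls for assertion-quality checks and single-test execution evidence.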

Success Criteria

All 58 features are either verified with evidence or deferred with explicit blocker reason.
All 22 tests are either verified with evidence or deferred with explicit blocker reason.
No forbidden stub patterns in touched files.
Batch progress is auditable from command outputs and chunked status updates.
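
The first two criteria reduce to a per-item rule: a verified item needs evidence, a deferred item needs a blocker reason, and any other status means the batch is not finished. This is a minimal sketch of such an audit, assuming a hypothetical item record. The field names and the audit function are illustrative and do not reflect the tracker's real schema.

```go
package main

import "fmt"

// item is a hypothetical record for one tracked feature or test.
type item struct {
	ID       int
	Status   string // "verified" or "deferred"; anything else is unfinished
	Evidence string // reference to captured build/test output
	Reason   string // blocker reason, required when deferred
}

// audit returns one message per item that violates the success criteria.
func audit(items []item) []string {
	var problems []string
	for _, it := range items {
		switch it.Status {
		case "verified":
			if it.Evidence == "" {
				problems = append(problems, fmt.Sprintf("%d: verified without evidence", it.ID))
			}
		case "deferred":
			if it.Reason == "" {
				problems = append(problems, fmt.Sprintf("%d: deferred without blocker reason", it.ID))
			}
		default:
			problems = append(problems, fmt.Sprintf("%d: unfinished status %q", it.ID, it.Status))
		}
	}
	return problems
}

func main() {
	items := []item{
		{ID: 1578, Status: "verified", Evidence: "build and test output captured"},
		{ID: 1579, Status: "deferred"},
		{ID: 1580, Status: "pending"},
	}
	for _, p := range audit(items) {
		fmt.Println(p)
	}
}
```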

Non-Goals

Executing the implementation as part of this design document.
Extending scope into Batches 34 and 35.
Building a full distributed integration harness beyond mapped unit/backlog verification needs.
