
Joseph Doherty f8dce79ac0 Add batch plans for batches 31-36 (rounds 16-18)

Generated design docs and implementation plans via Codex for:
- Batch 31: Raft Part 2
- Batch 32: JS Cluster Meta
- Batch 33: JS Cluster Streams
- Batch 34: JS Cluster Consumers
- Batch 35: JS Cluster Remaining
- Batch 36: Stream Lifecycle

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.

2026-02-27 17:01:31 -05:00

Batch 32 JS Cluster Meta Design

Date: 2026-02-27
Batch: 32 (JS Cluster Meta)
Scope: 58 features + 36 unit tests
Dependencies: batches 27 (JetStream Core), 31 (Raft Part 2)
Go source: golang/nats-server/server/jetstream_cluster.go

Problem

Batch 32 ports JetStream cluster metadata and control-plane behaviors from jetstream_cluster.go, including unsupported assignment handling, cluster leadership/currentness queries, meta-group setup hooks, assignment/health checks, inflight proposal tracking, meta-recovery flags, and orphan detection.

The mapped tests are distributed across cluster/super-cluster/leaf/mqtt/concurrency/raft Go suites and require non-placeholder behavioral coverage. The design goal is to provide an implementation strategy that prevents fake progress and enforces evidence-based status transitions for both features and tests.

Context Findings

Required command outputs

batch show 32 --db porting.db
- Status: pending
- Features: 58 (deferred)
- Tests: 36 (deferred)
- Depends on: 27,31
- Go file: server/jetstream_cluster.go
batch list --db porting.db
- Batch 32 sits in the dependency chain ... -> 31 -> 32 -> 33/35.
report summary --db porting.db
- Overall progress: 1924/6942 (27.7%)

Dependency state

batch show 27 status: pending
batch show 31 status: pending
batch ready does not currently list Batch 32.

Design implication: Batch 32 execution must start with an explicit dependency gate; no status changes before dependencies are ready.
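A minimal sketch of such a dependency gate, in Python for illustration only. The status values for batches 27, 31, and 32 mirror the command outputs above; the "complete" ready status, the dictionary layout, and the function names are assumptions, not part of the porting tool.

```python
# Hypothetical snapshot of batch statuses; the values mirror the
# `batch show` outputs recorded above.
STATUSES = {27: "pending", 31: "pending", 32: "pending"}
DEPENDS_ON = {32: [27, 31]}

# Assumption: "complete" is the status that marks a dependency as ready.
READY_STATUS = "complete"


def blocking_dependencies(batch, statuses, depends_on, ready=READY_STATUS):
    """Return the dependencies of `batch` that are not yet in the ready status."""
    return [dep for dep in depends_on.get(batch, []) if statuses.get(dep) != ready]


def gate_open(batch, statuses, depends_on):
    """True only when every dependency of `batch` is ready."""
    return not blocking_dependencies(batch, statuses, depends_on)


if __name__ == "__main__":
    blockers = blocking_dependencies(32, STATUSES, DEPENDS_ON)
    if blockers:
        print(f"Batch 32 gate closed; waiting on batches {blockers}")
    else:
        print("Batch 32 gate open")
```

With the statuses recorded above, the gate stays closed and reports both batch 27 and batch 31 as blockers.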

Current .NET codebase state

Cluster data types already exist in dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs.
Most methods mapped to Batch 32 are not yet implemented in the .NET source.
Backlog tests exist in ImplBacklog, but only one cluster class file currently exists:
- present: JetStreamClusterTests2.Impltests.cs
- missing and expected to be created during implementation:
  - JetStreamClusterTests1.Impltests.cs
  - JetStreamClusterTests3.Impltests.cs
  - JetStreamClusterTests4.Impltests.cs
  - JetStreamClusterLongTests.Impltests.cs
  - JetStreamSuperClusterTests.Impltests.cs

Approaches

Approach A: Monolithic one-pass implementation (all 58 features + 36 tests together)

Pros: single pass, less planning overhead.
Cons: high risk of regressions and undetected stubs; weak traceability for status updates.

Approach B (Recommended): Three feature groups (<=20 each) + three test waves

Implement feature groups in source-order clusters, each with strict build/test/stub gates before status updates.
Port tests in behavior-based waves aligned to cluster-domain breadth.
Pros: bounded risk, clear checkpoints, strong auditability.
Cons: more checkpoint ceremony.

Approach C: Test-first all 36 tests before feature porting

Pros: surfaces missing behavior early.
Cons: creates heavy thrash because many tests depend on broad feature slices not yet implemented.

Decision: Approach B.

Proposed Design

1. Code architecture and file strategy

Keep cluster model + cluster helper logic in JetStream cluster files:
- dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs
Keep JetStream/JsAccount cluster-meta behaviors in JetStream engine file(s):
- dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamTypes.cs
- optional split if needed for reviewability: JetStream/JetStreamClusterMeta.cs
Keep server-facing API entry points in a server partial:
- create/modify dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.JetStreamClusterMeta.cs
Keep account-facing query helpers in account class:
- dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.cs

2. Feature slicing (max ~20 each)

Feature Group A (20): unsupported assignment + server cluster state basics
IDs: 1520-1539
Feature Group B (20): health/leadership queries + clustering enable + assignment checks
IDs: 1540-1559
Feature Group C (18): leader/consumer assignment internals + inflight tracking + recovery/orphans
IDs: 1560-1577

This keeps each feature group at or under the required ~20 cap (20, 20, and 18 features).
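The grouping can be reproduced with a short slicing sketch. It assumes the feature IDs form one contiguous range (1520 through 1577), which matches the three ranges listed above; the function name is hypothetical.

```python
def slice_ids(first, last, cap=20):
    """Split the inclusive ID range [first, last] into consecutive groups of at most `cap`."""
    ids = list(range(first, last + 1))
    return [ids[i:i + cap] for i in range(0, len(ids), cap)]


groups = slice_ids(1520, 1577)

if __name__ == "__main__":
    for name, group in zip("ABC", groups):
        print(f"Group {name}: {group[0]}-{group[-1]} ({len(group)} features)")
    # Group A: 1520-1539 (20 features)
    # Group B: 1540-1559 (20 features)
    # Group C: 1560-1577 (18 features)
```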

3. Test slicing

Wave T1 (13): core cluster behavior (cluster tests 1/2)
IDs: 772,774,775,791,809,810,811,817,853,914,993,1014,1028
Wave T2 (12): advanced cluster behavior (cluster tests 3/4 + long)
IDs: 1060,1088,1098,1106,1109,1122,1128,1136,1194,1211,1212,1217
Wave T3 (11): cross-domain coverage (leaf/super/mqtt/concurrency/raft hooks)
IDs: 1406,1453,1454,1457,1465,1528,2225,2390,2459,2489,2689
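As a sanity check on the wave lists, the following sketch confirms that the three waves contain 13, 12, and 11 IDs, add up to the 36 tests in scope, and share no ID. The IDs are copied from the lists above; the helper name and result layout are illustrative assumptions.

```python
WAVES = {
    "T1": [772, 774, 775, 791, 809, 810, 811, 817, 853, 914, 993, 1014, 1028],
    "T2": [1060, 1088, 1098, 1106, 1109, 1122, 1128, 1136, 1194, 1211, 1212, 1217],
    "T3": [1406, 1453, 1454, 1457, 1465, 1528, 2225, 2390, 2459, 2489, 2689],
}


def check_waves(waves, expected_total):
    """Summarize wave sizes and flag duplicate IDs or a wrong total."""
    all_ids = [test_id for ids in waves.values() for test_id in ids]
    duplicates = sorted({i for i in all_ids if all_ids.count(i) > 1})
    return {
        "sizes": {name: len(ids) for name, ids in waves.items()},
        "total": len(all_ids),
        "duplicates": duplicates,
        "ok": len(all_ids) == expected_total and not duplicates,
    }


if __name__ == "__main__":
    print(check_waves(WAVES, expected_total=36))
```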

4. Verification model (features + tests)

Mandatory per-feature red/green loop with build + focused tests + stub scan before promotion.
Mandatory per-test loop (single-test pass evidence + class/wave pass evidence).
Status updates are applied only in chunks of at most 15 IDs per feature or test batch-update command.
Task checkpoints between groups/waves with full suite verification.
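One way to picture the promotion rule is the sketch below: an item becomes eligible for a status update only when its build, focused tests, and stub scan all pass, and eligible IDs are then emitted in chunks of at most 15. The `Evidence` record and `update_chunks` helper are hypothetical names for illustration; the real tool's data model is not described in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Per-item verification evidence (hypothetical structure)."""
    item_id: int
    build_ok: bool
    tests_ok: bool
    stub_scan_clean: bool

    def promotable(self):
        # All three gates must pass before a status update is allowed.
        return self.build_ok and self.tests_ok and self.stub_scan_clean


def update_chunks(evidence, max_ids=15):
    """Return promotable IDs, sorted and split into chunks of at most `max_ids`."""
    ready = sorted(e.item_id for e in evidence if e.promotable())
    return [ready[i:i + max_ids] for i in range(0, len(ready), max_ids)]


if __name__ == "__main__":
    # Feature Group A, with one item held back by a failed stub scan.
    records = [Evidence(i, True, True, i != 1525) for i in range(1520, 1540)]
    for chunk in update_chunks(records):
        print(",".join(str(i) for i in chunk))
```

In this example the 19 promotable IDs of Group A split into one chunk of 15 and one of 4, while 1525 stays out of every update until its stub scan is clean.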

5. Deferred policy

If an item is blocked by missing infrastructure or unresolved dependency behavior, explicitly set it to deferred with --override "blocked: <specific reason>". Do not leave stubs or fake-pass tests.

Risks and Mitigations

Risk: dependency batches 27/31 still pending.
Mitigation: enforce preflight dependency gate before any Batch 32 status transitions.
Risk: broad cluster tests encourage placeholder assertions.
Mitigation: anti-stub guardrails + required assertion quality and per-test evidence.
Risk: cross-file method ownership ambiguity (NatsServer, JetStream, Account).
Mitigation: fixed ownership map in plan and grouped implementation order by type.
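The anti-stub guardrail could be backed by a simple text scan such as the sketch below. The three patterns are assumptions chosen for illustration (a thrown NotImplementedException, a trivially true assertion, and a TODO comment); this document does not enumerate the plan's forbidden patterns, so the list should be replaced with the real one.

```python
import re

# Assumed patterns only; the plan's actual forbidden list is not given here.
STUB_PATTERNS = [
    re.compile(r"throw\s+new\s+NotImplementedException"),
    re.compile(r"Assert\.True\(\s*true\s*\)"),
    re.compile(r"//\s*TODO", re.IGNORECASE),
]


def scan_source(text):
    """Return (line_number, stripped_line) pairs that match any stub pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in STUB_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "public void A()\n"
        "{\n"
        "    throw new NotImplementedException();\n"
        "}\n"
        "[Fact]\n"
        "public void T() { Assert.True(true); }\n"
    )
    for number, line in scan_source(sample):
        print(f"line {number}: {line}")
```

A scan like this only catches textual markers; it does not replace the per-test evidence and assertion-quality review named in the mitigation.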

Success Criteria

All 58 features are either verified with evidence or deferred with explicit blocker reason.
All 36 tests are either verified with run evidence or deferred with explicit blocker reason.
No forbidden stub patterns in touched production/test files.
Batch 32 completion is auditable through build/test outputs and chunked status updates.

Non-Goals

Executing the Batch 32 implementation as part of this document.
Porting Batch 33/34/35 behaviors.
Building new distributed integration infrastructure beyond what is needed for deterministic unit/backlog verification.
