natsnet/docs/plans/2026-02-27-batch-32-js-cluster-meta-design.md
Joseph Doherty f8dce79ac0 Add batch plans for batches 31-36 (rounds 16-18)
Generated design docs and implementation plans via Codex for:
- Batch 31: Raft Part 2
- Batch 32: JS Cluster Meta
- Batch 33: JS Cluster Streams
- Batch 34: JS Cluster Consumers
- Batch 35: JS Cluster Remaining
- Batch 36: Stream Lifecycle

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.
2026-02-27 17:01:31 -05:00

Batch 32 JS Cluster Meta Design

Date: 2026-02-27
Batch: 32 (JS Cluster Meta)
Scope: 58 features + 36 unit tests
Dependencies: batches 27 (JetStream Core), 31 (Raft Part 2)
Go source: golang/nats-server/server/jetstream_cluster.go

Problem

Batch 32 ports JetStream cluster metadata and control-plane behaviors from jetstream_cluster.go, including unsupported assignment handling, cluster leadership/currentness queries, meta-group setup hooks, assignment/health checks, inflight proposal tracking, meta-recovery flags, and orphan detection.

The mapped tests are distributed across cluster/super-cluster/leaf/mqtt/concurrency/raft Go suites and require non-placeholder behavioral coverage. The design goal is to provide an implementation strategy that prevents fake progress and enforces evidence-based status transitions for both features and tests.

Context Findings

Required command outputs

  • batch show 32 --db porting.db
    • Status: pending
    • Features: 58 (deferred)
    • Tests: 36 (deferred)
    • Depends on: 27,31
    • Go file: server/jetstream_cluster.go
  • batch list --db porting.db
    • Batch 32 is in dependency chain ... -> 31 -> 32 -> 33/35.
  • report summary --db porting.db
    • Overall progress: 1924/6942 (27.7%)

Dependency state

  • batch show 27 status: pending
  • batch show 31 status: pending
  • batch ready does not include Batch 32 currently.

Design implication: Batch 32 execution must start with an explicit dependency gate; no status changes before dependencies are ready.
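The dependency gate could be expressed as a small pre-flight check. The sketch below is illustrative only and is not the project's tooling: `READY_STATUSES`, `blocked_dependencies`, and `may_update_status` are hypothetical names, and the `"complete"` status label is an assumption about how a finished batch is recorded. In practice the statuses would come from the `batch show` output listed above.

```python
# Hypothetical pre-flight gate. The helper names and the "complete" status
# label are assumptions made for illustration, not part of the porting tool.
READY_STATUSES = {"complete"}


def blocked_dependencies(statuses, depends_on):
    """Return the dependency batch IDs that are not yet ready, sorted."""
    return sorted(
        dep for dep in depends_on
        if statuses.get(dep, "unknown") not in READY_STATUSES
    )


def may_update_status(statuses, depends_on):
    """Status transitions are allowed only when no dependency is blocked."""
    return not blocked_dependencies(statuses, depends_on)


# Statuses as recorded in this document: batches 27 and 31 are both pending.
current = {27: "pending", 31: "pending"}
print(blocked_dependencies(current, [27, 31]))  # [27, 31]
print(may_update_status(current, [27, 31]))     # False
```

A missing status is treated as blocked rather than ready, so an incomplete lookup cannot open the gate by accident.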

Current .NET codebase state

  • Cluster data types already exist in dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs.
  • Most Batch 32 mapped method names are not yet implemented in .NET source.
  • Backlog tests exist in ImplBacklog, but only one cluster class file currently exists:
    • present: JetStreamClusterTests2.Impltests.cs
    • missing and expected to be created during implementation:
      • JetStreamClusterTests1.Impltests.cs
      • JetStreamClusterTests3.Impltests.cs
      • JetStreamClusterTests4.Impltests.cs
      • JetStreamClusterLongTests.Impltests.cs
      • JetStreamSuperClusterTests.Impltests.cs

Approaches

Approach A: Monolithic one-pass implementation (all 58 features + 36 tests together)

  • Pros: single pass, less planning overhead.
  • Cons: high risk of regressions and undetected stubs; weak traceability for status updates.

Approach B: Incremental feature groups and test waves with verification gates

  • Implement feature groups in source-order clusters, each with strict build/test/stub gates before status updates.
  • Port tests in behavior-based waves aligned to cluster-domain breadth.
  • Pros: bounded risk, clear checkpoints, strong auditability.
  • Cons: more checkpoint ceremony.

Approach C: Test-first all 36 tests before feature porting

  • Pros: surfaces missing behavior early.
  • Cons: creates heavy thrash because many tests depend on broad feature slices not yet implemented.

Decision: Approach B.

Proposed Design

1. Code architecture and file strategy

  • Keep cluster model + cluster helper logic in JetStream cluster files:
    • dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamClusterTypes.cs
  • Keep JetStream/JsAccount cluster-meta behaviors in JetStream engine file(s):
    • dotnet/src/ZB.MOM.NatsNet.Server/JetStream/JetStreamTypes.cs
    • optional split if needed for reviewability: JetStream/JetStreamClusterMeta.cs
  • Keep server-facing API entry points in a server partial:
    • create/modify dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.JetStreamClusterMeta.cs
  • Keep account-facing query helpers in account class:
    • dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.cs

2. Feature slicing (max ~20 each)

  • Feature Group A (20): unsupported assignment + server cluster state basics
    IDs: 1520-1539
  • Feature Group B (20): health/leadership queries + clustering enable + assignment checks
    IDs: 1540-1559
  • Feature Group C (18): leader/consumer assignment internals + inflight tracking + recovery/orphans
    IDs: 1560-1577

This keeps each feature task at or under the required ~20 cap: 20 + 20 + 18 = 58 features.
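The group sizes can be checked mechanically from the ID ranges. The following is a minimal sketch; the ranges are copied from the groups above, while `MAX_GROUP_SIZE`, `group_ids`, and `check_feature_groups` are hypothetical names, and treating the ~20 cap as a hard limit of 20 is an assumption.

```python
# Assumption: the "~20" cap is treated as a hard limit of 20 for this check.
MAX_GROUP_SIZE = 20

# Inclusive feature ID ranges copied from Feature Groups A, B, and C.
FEATURE_GROUPS = {
    "A": (1520, 1539),
    "B": (1540, 1559),
    "C": (1560, 1577),
}


def group_ids(first, last):
    """Expand an inclusive ID range into a list of IDs."""
    return list(range(first, last + 1))


def check_feature_groups(groups, expected_total):
    """Return per-group sizes, raising if a group is oversized or the total is off."""
    sizes = {name: len(group_ids(*bounds)) for name, bounds in groups.items()}
    oversized = [name for name, size in sizes.items() if size > MAX_GROUP_SIZE]
    if oversized:
        raise ValueError(f"groups over the cap: {oversized}")
    total = sum(sizes.values())
    if total != expected_total:
        raise ValueError(f"expected {expected_total} features, got {total}")
    return sizes


print(check_feature_groups(FEATURE_GROUPS, expected_total=58))
# {'A': 20, 'B': 20, 'C': 18}
```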

3. Test slicing

  • Wave T1 (13): core cluster behavior (cluster tests 1/2)
    IDs: 772,774,775,791,809,810,811,817,853,914,993,1014,1028
  • Wave T2 (12): advanced cluster behavior (cluster tests 3/4 + long)
    IDs: 1060,1088,1098,1106,1109,1122,1128,1136,1194,1211,1212,1217
  • Wave T3 (11): cross-domain coverage (leaf/super/mqtt/concurrency/raft hooks)
    IDs: 1406,1453,1454,1457,1465,1528,2225,2390,2459,2489,2689
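A quick consistency check on the waves confirms that the sizes match the stated counts and that no test ID is assigned to more than one wave. This is a sketch under the assumption that the ID lists above are complete; `duplicated_ids` and `wave_sizes` are hypothetical helper names.

```python
from collections import Counter

# Test IDs copied from Waves T1, T2, and T3 above.
TEST_WAVES = {
    "T1": [772, 774, 775, 791, 809, 810, 811, 817, 853, 914, 993, 1014, 1028],
    "T2": [1060, 1088, 1098, 1106, 1109, 1122, 1128, 1136, 1194, 1211, 1212, 1217],
    "T3": [1406, 1453, 1454, 1457, 1465, 1528, 2225, 2390, 2459, 2489, 2689],
}


def wave_sizes(waves):
    """Return the number of test IDs in each wave."""
    return {name: len(ids) for name, ids in waves.items()}


def duplicated_ids(waves):
    """Return test IDs that appear more than once across all waves, sorted."""
    counts = Counter(test_id for ids in waves.values() for test_id in ids)
    return sorted(test_id for test_id, n in counts.items() if n > 1)


print(wave_sizes(TEST_WAVES))      # {'T1': 13, 'T2': 12, 'T3': 11}
print(duplicated_ids(TEST_WAVES))  # []
```

The three sizes sum to 36, matching the batch scope.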

4. Verification model (features + tests)

  • Mandatory per-feature red/green loop with build + focused tests + stub scan before promotion.
  • Mandatory per-test loop (single-test pass evidence + class/wave pass evidence).
  • Status updates are applied only in chunks of at most 15 IDs per feature or test batch-update command.
  • Task checkpoints between groups/waves with full suite verification.
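The 15-ID limit on status updates amounts to splitting each group or wave into fixed-size chunks. Here is a minimal sketch of that splitting; `chunk_ids` and `MAX_IDS_PER_UPDATE` are hypothetical names, and the sketch deliberately stops at producing comma-separated ID strings rather than assuming any batch-update command syntax.

```python
MAX_IDS_PER_UPDATE = 15  # chunk limit stated in the verification model


def chunk_ids(ids, size=MAX_IDS_PER_UPDATE):
    """Split a list of IDs into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]


# Feature Group A (IDs 1520-1539) needs two update commands: 15 IDs, then 5.
group_a = list(range(1520, 1540))
for chunk in chunk_ids(group_a):
    print(len(chunk), ",".join(str(i) for i in chunk))
```

Every 20-ID feature group therefore needs two updates, which gives a natural point to capture evidence between them.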

5. Deferred policy

If an item is blocked by missing infrastructure or unresolved dependency behavior, explicitly set it to deferred with --override "blocked: <specific reason>". Do not leave stubs or fake-pass tests.

Risks and Mitigations

  • Risk: dependency batches 27/31 still pending.
    Mitigation: enforce preflight dependency gate before any Batch 32 status transitions.
  • Risk: broad cluster tests encourage placeholder assertions.
    Mitigation: anti-stub guardrails + required assertion quality and per-test evidence.
  • Risk: cross-file method ownership ambiguity (NatsServer, JetStream, Account).
    Mitigation: fixed ownership map in plan and grouped implementation order by type.

Success Criteria

  • All 58 features are either verified with evidence or deferred with explicit blocker reason.
  • All 36 tests are either verified with run evidence or deferred with explicit blocker reason.
  • No forbidden stub patterns in touched production/test files.
  • Batch 32 completion is auditable through build/test outputs and chunked status updates.
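The first two criteria can be audited with a simple pass over per-item records. The sketch below is an assumption about how such records might look: the `ItemRecord` fields and the `audit` helper are hypothetical, and only the `verified` and `deferred` end states come from this document.

```python
from dataclasses import dataclass


@dataclass
class ItemRecord:
    """Hypothetical record for one feature or test; field names are assumptions."""
    item_id: int
    status: str         # accepted end states: "verified" or "deferred"
    evidence: str = ""  # e.g. a pointer to captured build/test output
    blocker: str = ""   # explicit reason, required when status is "deferred"


def audit(records):
    """Return (item_id, problem) pairs for records that fail the criteria."""
    problems = []
    for r in records:
        if r.status == "verified":
            if not r.evidence.strip():
                problems.append((r.item_id, "verified without evidence"))
        elif r.status == "deferred":
            if not r.blocker.strip():
                problems.append((r.item_id, "deferred without blocker reason"))
        else:
            problems.append((r.item_id, f"unfinished status: {r.status}"))
    return problems


sample = [
    ItemRecord(1520, "verified", evidence="build and focused test output"),
    ItemRecord(1521, "deferred"),
    ItemRecord(1522, "stub"),
]
print(audit(sample))
# [(1521, 'deferred without blocker reason'), (1522, 'unfinished status: stub')]
```

An empty result from `audit` over all 58 features and 36 tests would correspond to the first two criteria being met; the stub-pattern and auditability criteria still need their own checks.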

Non-Goals

  • Executing Batch 32 implementation in this document.
  • Porting Batch 33/34/35 behaviors.
  • Building new distributed integration infrastructure beyond what is needed for deterministic unit/backlog verification.