natsnet/docs/plans/2026-02-27-batch-37-stream-messages-design.md
Joseph Doherty 8a126c4932 Add batch plans for batches 37-41 (rounds 19-21)
2026-02-27 17:27:51 -05:00

Batch 37 Stream Messages Design

Date: 2026-02-27
Batch: 37 (Stream Messages)
Scope: 86 features + 13 unit tests
Dependencies: Batch 36
Go source: golang/nats-server/server/stream.go (focus from ~line 4616 onward)

Problem

Batch 37 is the stream message data plane: dedupe/message-id handling, direct-get APIs, inbound publish pipeline, consumer signaling, interest/pre-ack tracking, snapshot/restore, and monitor/replication accounting. It is a high-risk batch because most mapped methods are currently absent in .NET, while many mapped tests are still template placeholders.

If this batch is implemented with stubs, tests will pass without exercising real behavior, and these false greens will block Batch 38 (Consumer Lifecycle) and later stream/consumer correctness work.

Context Findings

Required command outputs

  • /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 37 --db porting.db
    • Status: pending
    • Features: 86 (IDs within 3294-3387; 8 IDs in that range are not mapped to this batch)
    • Tests: 13
    • Depends on: 36
    • Go file: server/stream.go
  • /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db
    • Confirms dependency chain includes 36 -> 37
  • /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db
    • Overall progress: 1924/6942 (27.7%)

Environment note: dotnet is not on PATH in this shell; use /usr/local/share/dotnet/dotnet.
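
Because dotnet is not on PATH, the three commands above can be wrapped in a small shell helper. This is a minimal sketch, assuming a POSIX shell started from the repository root; the `DOTNET` variable and the `tracker` function are illustrative names, not existing repository tooling.

```shell
#!/bin/sh
# Hypothetical helper: DOTNET and tracker() are illustrative names.
# Default to the absolute path from the environment note; allow an override.
DOTNET="${DOTNET:-/usr/local/share/dotnet/dotnet}"

# Forward any PortTracker subcommand and always append the database argument.
tracker() {
  "$DOTNET" run --project tools/NatsNet.PortTracker -- "$@" --db porting.db
}

# Usage:
#   tracker batch show 37
#   tracker batch list
#   tracker report summary
```

Keeping the path in one variable means the commands keep working unchanged if dotnet is later added to PATH (set `DOTNET=dotnet`).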

Feature ownership split (from porting.db)

  • NatsStream: 78
  • JsOutQ: 3
  • JsPubMsg: 2
  • Account: 1
  • CMsg: 1
  • InMsg: 1

Test ownership split (from porting.db)

  • RaftNodeTests: 3
  • JetStreamClusterTests2: 2
  • JetStreamEngineTests: 2
  • ConcurrencyTests1: 1
  • GatewayHandlerTests: 1
  • JetStreamClusterTests1: 1
  • JetStreamFileStoreTests: 1
  • JwtProcessorTests: 1
  • LeafNodeHandlerTests: 1

Additional findings:

  • dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs is currently minimal (357 lines) and does not contain mapped Batch 37 methods.
  • JsOutQ does not currently exist in source and must be introduced.
  • JetStreamClusterTests1.Impltests.cs and RaftNodeTests.Impltests.cs do not currently exist and must be created.
  • Many ImplBacklog classes still use placeholder patterns (var goFile = ..., string-literal assertions), which must be replaced with behavioral tests.
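
The placeholder pattern named in the last finding can be located mechanically. The sketch below lists test files that still contain `var goFile =`. It assumes a grep that supports `-r` and `--include` (GNU and BSD grep do); `scan_placeholders` is a hypothetical helper name. String-literal assertions are not covered and would need their own pattern.

```shell
#!/bin/sh
# Sketch: list ImplBacklog test files that still contain the template
# placeholder "var goFile =". scan_placeholders is a hypothetical helper.
scan_placeholders() {
  dir="${1:-dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog}"
  # -r: recurse, -l: print matching file names only, -E: extended regex.
  grep -rlE --include='*.Impltests.cs' 'var[[:space:]]+goFile[[:space:]]*=' "$dir"
}
```

grep exits with status 0 when at least one file matches, so a zero exit status here means placeholders remain.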

Approaches

Approach A: Monolithic NatsStream.cs implementation

Port all 86 features directly in NatsStream.cs and append helper logic in existing type files.

  • Pros: fewer files.
  • Cons: poor reviewability, hard to isolate regressions, high merge conflict risk.

Approach B: Partial NatsStream split by concern

Convert NatsStream to a partial class and split message functionality by concern (headers/dedupe, direct-get, inbound pipeline, consumers/interest, snapshot/monitor), add targeted type helpers (InMsg, CMsg, JsPubMsg, JsOutQ), and add the account restore path.

  • Pros: bounded review units, clearer ownership, easier per-group verification, aligns with anti-stub enforcement.
  • Cons: requires class splitting and extra file setup.

Approach C: Test-wave-first before feature groups

Port all 13 mapped tests first and drive production code entirely from failing tests.

  • Pros: stronger behavior pressure.
  • Cons: high churn because many mapped APIs/types do not yet exist (JsOutQ, many stream methods), so red state will be noisy.

Decision: Approach B.

Proposed Design

1. Code organization strategy

Keep mapped class ownership intact while splitting by concern:

  • Modify: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs (convert to partial, keep core state/lifecycle)
  • Create: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.MessageHeaders.cs
  • Create: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.DirectGet.cs
  • Create: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.MessagePipeline.cs
  • Create: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.Consumers.cs
  • Create: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.SnapshotMonitor.cs
  • Modify: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/StreamTypes.cs (convert mapped message carrier types to partial if needed)
  • Create: dotnet/src/ZB.MOM.NatsNet.Server/JetStream/StreamTypes.MessageCarriers.cs (InMsg.ReturnToPool, CMsg.ReturnToPool, JsPubMsg helpers, JsOutQ)
  • Modify/Create: dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.StreamRestore.cs (Account.RestoreStream)

Design intent: avoid duplicating existing JetStreamHeaderHelpers behavior; mapped NatsStream header methods should delegate where appropriate and preserve mapped method surface.
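
Before implementation starts, the Modify/Create plan above can be checked against the working tree. This is a minimal sketch under stated assumptions: `check_plan` is a hypothetical helper, it reads `Modify <path>` or `Create <path>` lines on stdin, and it ignores the combined `Modify/Create` entry.

```shell
#!/bin/sh
# Sketch: preflight check for the planned file list.
# Reports entries whose on-disk state contradicts the plan:
#   Modify -> the file must already exist
#   Create -> the path must not exist yet
# check_plan is a hypothetical helper name.
check_plan() {
  status=0
  while read -r action path; do
    case "$action" in
      Modify) [ -f "$path" ] || { echo "missing: $path"; status=1; } ;;
      Create) [ ! -e "$path" ] || { echo "already exists: $path"; status=1; } ;;
    esac
  done
  return "$status"
}

# Usage (from the repository root):
#   check_plan <<'EOF'
#   Modify dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs
#   Create dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.DirectGet.cs
#   EOF
```

A non-zero exit status means the plan and the tree disagree, for example when a file marked Create was already added by earlier work.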

2. Functional decomposition

  • Headers + dedupe + scheduling metadata: unsubscribe, setupStore, dedupe tables, msg-id checks, expected headers, TTL/schedule/batch header parsing, clustered checks.
  • Direct get + ingress path: queueInbound, direct-get handlers, inbound publish entry points, processJetStreamMsg, processJetStreamBatchMsg.
  • Message carrier/outbound queue primitives: newCMsg, pooled return methods, js pub message sizing/pool access, JsOutQ.Send*, unregister.
  • Consumers + interest + pre-ack: signaling loop, consumer registry/listing, filtered-interest checks, pre-ack register/clear/ack.
  • Snapshot + restore + monitor traffic: snapshot, Account.RestoreStream, orphan/replication checks, monitor running flags, replication traffic accounting.

3. Test strategy

Port mapped tests in three waves, replacing placeholders with behavior assertions:

  • Wave T1 (5): direct-get + rollup/mirror dedupe (828,954,987,1642,1643)
  • Wave T2 (4): snapshot/restore/raft catchup (383,2617,2672,2695)
  • Wave T3 (4): gateway/jwt/leaf/perf interaction (635,1892,1961,2426)

Test files:

  • Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamEngineTests.Impltests.cs
  • Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamClusterTests2.Impltests.cs
  • Create: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamClusterTests1.Impltests.cs
  • Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamFileStoreTests.Impltests.cs
  • Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/GatewayHandlerTests.Impltests.cs
  • Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JwtProcessorTests.Impltests.cs
  • Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeHandlerTests.Impltests.cs
  • Modify: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests1.Impltests.cs
  • Create: dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/RaftNodeTests.Impltests.cs

4. Verification architecture

  • Mandatory per-feature and per-test loops with evidence before status promotion.
  • Mandatory stub scan, build gate, and targeted/full test gates.
  • Status changes chunked to max 15 IDs per batch-update.
  • Mandatory checkpoints between every task.
  • If blocked, keep item deferred with specific reason; never create fake-pass stubs.
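
The 15-ID limit on status changes can be applied with a small splitter. The sketch below only splits the list; `chunk_ids` is a hypothetical helper, and the status update command itself is deliberately left out because its exact syntax is not given here.

```shell
#!/bin/sh
# Sketch: split a comma-separated ID list into chunks of at most 15 IDs,
# one chunk per output line. chunk_ids is a hypothetical helper name.
chunk_ids() {
  size="${2:-15}"
  # One ID per line, regroup with xargs (default command: echo),
  # then turn the spaces between IDs back into commas.
  printf '%s\n' "$1" | tr ',' '\n' | xargs -n "$size" | tr ' ' ','
}

# Usage:
#   chunk_ids "3294,3295,3296"      # default chunk size of 15
#   chunk_ids "3294,3295,3296" 2    # explicit chunk size
```

Applied to an 18-ID group, this yields one line of 15 IDs followed by one line of 3 IDs, so each line can be promoted and verified separately.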

Feature Grouping (for implementation plan)

  • Group A (18): 3294,3295,3296,3297,3298,3299,3300,3301,3302,3303,3304,3305,3306,3307,3308,3310,3311,3312
  • Group B (16): 3314,3315,3316,3318,3319,3320,3321,3322,3323,3324,3325,3326,3327,3328,3329,3330
  • Group C (16): 3331,3332,3333,3334,3335,3336,3337,3338,3339,3340,3341,3342,3343,3345,3346,3347
  • Group D (17): 3350,3351,3352,3353,3354,3355,3356,3357,3358,3359,3360,3361,3362,3364,3366,3367,3368
  • Group E (19): 3369,3370,3371,3372,3373,3374,3375,3376,3377,3378,3379,3380,3381,3382,3383,3384,3385,3386,3387
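
The stated group sizes add up to the batch total (18 + 16 + 16 + 17 + 19 = 86). To re-check an individual group after editing its ID list, a counting helper is enough. This is a minimal sketch; `count_ids` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count the IDs in a comma-separated list so a group's stated size
# can be compared with its ID list. count_ids is a hypothetical helper name.
count_ids() {
  # One ID per line, then count the non-empty lines.
  printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Usage:
#   count_ids "3314,3315,3316"
```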

Constraints

  • Planning only in this session; no implementation execution.
  • Must follow docs/standards/dotnet-standards.md (xUnit 3, Shouldly, NSubstitute, nullable, naming conventions).
  • Batch 37 work must not start unless Batch 36 is complete/ready.

Non-Goals

  • Executing Batch 37 code changes in this session.
  • Expanding beyond mapped Batch 37 IDs.
  • Introducing large integration harnesses unrelated to mapped tests.