natsnet/docs/plans/2026-02-27-batch-37-stream-messages-design.md
Joseph Doherty 8a126c4932 Add batch plans for batches 37-41 (rounds 19-21)
Generated design docs and implementation plans via Codex for:
- Batch 37: Stream Messages
- Batch 38: Consumer Lifecycle
- Batch 39: Consumer Dispatch
- Batch 40: MQTT Server/JSA
- Batch 41: MQTT Client/IO

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.
All 42 batches (0-41) now have design docs and implementation plans.
2026-02-27 17:27:51 -05:00


# Batch 37 Stream Messages Design
**Date:** 2026-02-27
**Batch:** 37 (`Stream Messages`)
**Scope:** 86 features + 13 unit tests
**Dependencies:** Batch `36`
**Go source:** `golang/nats-server/server/stream.go` (focus from ~line 4616 onward)
## Problem
Batch 37 covers the stream message data plane: dedupe/message-id handling, direct-get APIs, the inbound publish pipeline, consumer signaling, interest/pre-ack tracking, snapshot/restore, and monitor/replication accounting. It is a high-risk batch because most mapped methods are currently absent in .NET and many mapped tests are still template placeholders.
If this batch is implemented with stubs, its tests will pass without exercising real behavior (false greens), and Batch 38 (`Consumer Lifecycle`) and later stream/consumer correctness work will rest on unverified code.
## Context Findings
### Required command outputs
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 37 --db porting.db`
  - Status: `pending`
  - Features: `86` (IDs `3294-3387` with mapped gaps)
  - Tests: `13`
  - Depends on: `36`
  - Go file: `server/stream.go`
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db`
  - Confirms dependency chain includes `36 -> 37`
- `/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db`
  - Overall progress: `1924/6942 (27.7%)`
Environment note: `dotnet` is not on `PATH` in this shell; use `/usr/local/share/dotnet/dotnet`.
### Feature ownership split (from `porting.db`)
- `NatsStream`: 78
- `JsOutQ`: 3
- `JsPubMsg`: 2
- `Account`: 1
- `CMsg`: 1
- `InMsg`: 1
### Test ownership split (from `porting.db`)
- `RaftNodeTests`: 3
- `JetStreamClusterTests2`: 2
- `JetStreamEngineTests`: 2
- `ConcurrencyTests1`: 1
- `GatewayHandlerTests`: 1
- `JetStreamClusterTests1`: 1
- `JetStreamFileStoreTests`: 1
- `JwtProcessorTests`: 1
- `LeafNodeHandlerTests`: 1
Additional findings:
- `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs` is currently minimal (357 lines) and does not contain mapped Batch 37 methods.
- `JsOutQ` does not currently exist in source and must be introduced.
- `JetStreamClusterTests1.Impltests.cs` and `RaftNodeTests.Impltests.cs` do not currently exist and must be created.
- Many `ImplBacklog` classes still use placeholder patterns (`var goFile = ...`, string-literal assertions), which must be replaced with behavioral tests.
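The placeholder patterns noted above can be detected mechanically. The following is a minimal Go sketch of such a scan, not the project's actual tooling. The function name `findPlaceholders`, the marker list, and the sample C# text are hypothetical. Only the `var goFile =` pattern comes from the findings above; `NotImplementedException` is an assumed extra marker.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholderMarkers are substrings treated as evidence of a template test.
// "var goFile =" is taken from the findings; the second entry is an assumption.
var placeholderMarkers = []string{
	"var goFile =",
	"NotImplementedException",
}

// findPlaceholders returns the 1-based numbers of the lines in src
// that contain at least one marker.
func findPlaceholders(src string) []int {
	var hits []int
	for i, line := range strings.Split(src, "\n") {
		for _, m := range placeholderMarkers {
			if strings.Contains(line, m) {
				hits = append(hits, i+1)
				break // one hit per line is enough
			}
		}
	}
	return hits
}

func main() {
	// Hypothetical placeholder-style test body, for illustration only.
	sample := "[Fact]\npublic void Example()\n{\n" +
		"    var goFile = \"server/stream.go\";\n" +
		"    goFile.ShouldBe(\"server/stream.go\");\n}\n"
	fmt.Println("placeholder lines:", findPlaceholders(sample))
}
```

A substring scan like this only flags candidates; each hit still needs review to confirm the test is a placeholder and not a legitimate use of the same text.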
## Approaches
### Approach A: Monolithic `NatsStream.cs` implementation
Port all 86 features directly in `NatsStream.cs` and append helper logic in existing type files.
- Pros: fewer files.
- Cons: poor reviewability, hard to isolate regressions, high merge conflict risk.
### Approach B (Recommended): Message-domain partial decomposition with strict verification gates
Convert `NatsStream` to partial and split message functionality by concern (headers/dedupe, direct-get, inbound pipeline, consumers/interest, snapshot/monitor), plus targeted type helpers (`InMsg`, `CMsg`, `JsPubMsg`, `JsOutQ`) and account restore path.
- Pros: bounded review units, clearer ownership, easier per-group verification, aligns with anti-stub enforcement.
- Cons: requires class splitting and extra file setup.
### Approach C: Test-wave-first before feature groups
Port all 13 mapped tests first and drive production code entirely from failing tests.
- Pros: stronger behavior pressure.
- Cons: high churn because many mapped APIs/types do not yet exist (`JsOutQ`, many stream methods), so red state will be noisy.
**Decision:** Approach B.
## Proposed Design
### 1. Code organization strategy
Keep mapped class ownership intact while splitting by concern:
- Modify: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.cs` (convert to `partial`, keep core state/lifecycle)
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.MessageHeaders.cs`
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.DirectGet.cs`
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.MessagePipeline.cs`
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.Consumers.cs`
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/NatsStream.SnapshotMonitor.cs`
- Modify: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/StreamTypes.cs` (convert mapped message carrier types to partial if needed)
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/StreamTypes.MessageCarriers.cs` (`InMsg.ReturnToPool`, `CMsg.ReturnToPool`, `JsPubMsg` helpers, `JsOutQ`)
- Modify/Create: `dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.StreamRestore.cs` (`Account.RestoreStream`)
Design intent: avoid duplicating existing `JetStreamHeaderHelpers` behavior. Mapped `NatsStream` header methods should delegate to those helpers where appropriate while preserving the mapped method surface.
### 2. Functional decomposition
- **Headers + dedupe + scheduling metadata:** `unsubscribe`, `setupStore`, dedupe tables, msg-id checks, expected headers, TTL/schedule/batch header parsing, clustered checks.
- **Direct get + ingress path:** `queueInbound`, direct-get handlers, inbound publish entry points, `processJetStreamMsg`, `processJetStreamBatchMsg`.
- **Message carrier/outbound queue primitives:** `newCMsg`, pooled return methods, js pub message sizing/pool access, `JsOutQ.Send*`, unregister.
- **Consumers + interest + pre-ack:** signaling loop, consumer registry/listing, filtered-interest checks, pre-ack register/clear/ack.
- **Snapshot + restore + monitor traffic:** `snapshot`, `Account.RestoreStream`, orphan/replication checks, monitor running flags, replication traffic accounting.
### 3. Test strategy
Port mapped tests in three waves, replacing placeholders with behavior assertions:
- **Wave T1 (5):** direct-get + rollup/mirror dedupe (`828,954,987,1642,1643`)
- **Wave T2 (4):** snapshot/restore/raft catchup (`383,2617,2672,2695`)
- **Wave T3 (4):** gateway/jwt/leaf/perf interaction (`635,1892,1961,2426`)
Test files:
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamEngineTests.Impltests.cs`
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamClusterTests2.Impltests.cs`
- Create: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamClusterTests1.Impltests.cs`
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamFileStoreTests.Impltests.cs`
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/GatewayHandlerTests.Impltests.cs`
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JwtProcessorTests.Impltests.cs`
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeHandlerTests.Impltests.cs`
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests1.Impltests.cs`
- Create: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/RaftNodeTests.Impltests.cs`
### 4. Verification architecture
- Mandatory per-feature and per-test loops with evidence before status promotion.
- Mandatory stub scan, build gate, and targeted/full test gates.
- Status changes chunked to max 15 IDs per `batch-update`.
- Mandatory checkpoint after every task.
- If blocked, keep item `deferred` with specific reason; never create fake-pass stubs.
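The 15-ID limit on status changes amounts to simple slice chunking. The Go sketch below shows one way to split a feature group into update-sized chunks. `chunkIDs` is a hypothetical helper, not part of the PortTracker tool. The ID list is Group A from the feature grouping in this document.

```go
package main

import "fmt"

// chunkIDs splits ids into consecutive slices of at most size elements.
// It returns nil when size is not positive.
func chunkIDs(ids []int, size int) [][]int {
	if size <= 0 {
		return nil
	}
	var out [][]int
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		out = append(out, ids[start:end])
	}
	return out
}

func main() {
	// Group A feature IDs as listed in this document.
	groupA := []int{3294, 3295, 3296, 3297, 3298, 3299, 3300, 3301, 3302,
		3303, 3304, 3305, 3306, 3307, 3308, 3310, 3311, 3312}
	for i, c := range chunkIDs(groupA, 15) {
		fmt.Printf("update %d: %d IDs (%d..%d)\n", i+1, len(c), c[0], c[len(c)-1])
	}
}
```

With 18 IDs and a limit of 15, Group A needs two status updates: one of 15 IDs and one of 3.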
## Feature Grouping (for implementation plan)
- Group A (18): `3294,3295,3296,3297,3298,3299,3300,3301,3302,3303,3304,3305,3306,3307,3308,3310,3311,3312`
- Group B (16): `3314,3315,3316,3318,3319,3320,3321,3322,3323,3324,3325,3326,3327,3328,3329,3330`
- Group C (16): `3331,3332,3333,3334,3335,3336,3337,3338,3339,3340,3341,3342,3343,3345,3346,3347`
- Group D (17): `3350,3351,3352,3353,3354,3355,3356,3357,3358,3359,3360,3361,3362,3364,3366,3367,3368`
- Group E (19): `3369,3370,3371,3372,3373,3374,3375,3376,3377,3378,3379,3380,3381,3382,3383,3384,3385,3386,3387`
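The grouping can be sanity-checked against the stated scope of 86 features. The following Go sketch counts the listed IDs, looks for duplicates, and reports which IDs in `3294-3387` are not assigned to any group. `checkGroups` is a hypothetical helper written for this check. The ID lists are copied from the groups above.

```go
package main

import "fmt"

// groups mirrors the Group A-E lists in this document.
var groups = map[string][]int{
	"A": {3294, 3295, 3296, 3297, 3298, 3299, 3300, 3301, 3302, 3303, 3304, 3305, 3306, 3307, 3308, 3310, 3311, 3312},
	"B": {3314, 3315, 3316, 3318, 3319, 3320, 3321, 3322, 3323, 3324, 3325, 3326, 3327, 3328, 3329, 3330},
	"C": {3331, 3332, 3333, 3334, 3335, 3336, 3337, 3338, 3339, 3340, 3341, 3342, 3343, 3345, 3346, 3347},
	"D": {3350, 3351, 3352, 3353, 3354, 3355, 3356, 3357, 3358, 3359, 3360, 3361, 3362, 3364, 3366, 3367, 3368},
	"E": {3369, 3370, 3371, 3372, 3373, 3374, 3375, 3376, 3377, 3378, 3379, 3380, 3381, 3382, 3383, 3384, 3385, 3386, 3387},
}

// checkGroups returns the total number of listed IDs, any ID listed more
// than once, and every ID in [lo, hi] that no group claims.
func checkGroups(g map[string][]int, lo, hi int) (total int, dups, gaps []int) {
	seen := make(map[int]bool)
	for _, ids := range g {
		for _, id := range ids {
			if seen[id] {
				dups = append(dups, id)
			}
			seen[id] = true
			total++
		}
	}
	for id := lo; id <= hi; id++ {
		if !seen[id] {
			gaps = append(gaps, id)
		}
	}
	return total, dups, gaps
}

func main() {
	total, dups, gaps := checkGroups(groups, 3294, 3387)
	fmt.Println("total:", total)
	fmt.Println("duplicates:", dups)
	fmt.Println("unassigned in range:", gaps)
}
```

For the lists as written, the group sizes sum to 18 + 16 + 16 + 17 + 19 = 86 with no duplicates, and eight IDs in the range are unassigned (`3309`, `3313`, `3317`, `3344`, `3348`, `3349`, `3363`, `3365`), which is consistent with the "mapped gaps" noted in the batch summary.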
## Constraints
- Planning only in this session; no implementation execution.
- Must follow `docs/standards/dotnet-standards.md` (`xUnit 3`, `Shouldly`, `NSubstitute`, nullable reference types, naming conventions).
- Batch 37 work must not start unless Batch 36 is complete/ready.
## Non-Goals
- Executing Batch 37 code changes in this session.
- Expanding beyond mapped Batch 37 IDs.
- Introducing large integration harnesses unrelated to mapped tests.