natsnet/docs/plans/2026-02-27-batch-24-leaf-nodes-design.md
Joseph Doherty c05d93618e Add batch plans for batches 23-30 (rounds 12-15)
Generated design docs and implementation plans via Codex for:
- Batch 23: Routes
- Batch 24: Leaf Nodes
- Batch 25: Gateways
- Batch 26: WebSocket
- Batch 27: JetStream Core
- Batch 28: JetStream API
- Batch 29: JetStream Batching
- Batch 30: Raft Part 1

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.
2026-02-27 16:33:10 -05:00

# Batch 24 Leaf Nodes Design

**Date:** 2026-02-27

**Batch:** 24 (`Leaf Nodes`)

**Scope:** Design only. No implementation in this document.

## Problem
Batch 24 ports leaf-node connection handling from `golang/nats-server/server/leafnode.go` into the .NET server.
- Features: `67` (all currently `deferred`)
- Tests: `2` (both currently `deferred`)
- Dependencies: batches `19` and `23`
- Go source: `server/leafnode.go`
- Batch status: `pending`

Leaf-node behavior touches client protocol parsing, solicited/accepted leaf lifecycle, account interest propagation, and cross-cluster subscription semantics.
## Context Findings
Collected with:
- `dotnet run --project tools/NatsNet.PortTracker -- batch show 24 --db porting.db`
- `dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db`
- `dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db`

Additional repository findings:
- .NET already has leaf data models in `dotnet/src/ZB.MOM.NatsNet.Server/LeafNode/LeafNodeTypes.cs` (`Leaf`, `LeafNodeCfg`, `LeafConnectInfo`) but the Batch 24 behavior surface is largely unimplemented.
- `ClientConnection` currently has explicit leaf placeholders (`IsHubLeafNode`, `RemoteCluster`) and parser-level leaf argument helpers still marked as stubs/delegates.
- `LeafNodeHandlerTests.Impltests.cs` and `RouteHandlerTests.Impltests.cs` still contain placeholder-style tests and do not currently include the two Batch 24 mapped methods.
- Batch 24 mapped tests are:
  - `1966` (`LeafNodeHandlerTests.LeafNodeRoutedSubKeyDifferentBetweenLeafSubAndRoutedSub_ShouldSucceed`)
  - `2825` (`RouteHandlerTests.ClusterQueueGroupWeightTrackingLeak_ShouldSucceed`)
## Constraints and Success Criteria
Constraints:
- Follow project standards in `docs/standards/dotnet-standards.md`.
- Keep behavior equivalent to Go intent, but idiomatic C# (.NET 10, nullable enabled).
- No placeholder implementations, no fake-pass tests, no status promotion without evidence.
- Feature work must be grouped into chunks of roughly 20 features or fewer.
- Batch 24 execution must wait for dependency readiness (batches 19 and 23).

Success criteria:
- All 67 features are either implemented and verified, or explicitly deferred with a concrete blocker reason.
- Both mapped tests are implemented as real behavioral tests and pass.
- Leaf-related parser and account regression tests pass before any feature moves from `complete` to `verified`.
- Batch 24 can be completed through PortTracker without overrides for unverifiable work.
## Approaches
### Approach A: Single monolithic leaf file
Implement all 67 methods in one large `NatsServer.LeafNodes.cs` and keep most `ClientConnection` work in `ClientConnection.cs`.

Trade-offs:
- Pros: fewer files to navigate.
- Cons: very high review risk, difficult evidence tracking, poor checkpoint isolation.
### Approach B (Recommended): Domain-segmented partials by leaf lifecycle stage
Split implementation by responsibility:
1. Leaf validation/config/bootstrap
2. Connect/handshake/lifecycle
3. Interest propagation/account/subscription maps
4. Message/sub/unsub processing and websocket solicit path

Trade-offs:
- Pros: aligns to `leafnode.go` sections, cleaner verification gates, easier status chunking and rollback.
- Cons: more file coordination.
### Approach C: Test-first only on mapped tests, then fill remaining features
Start with the two mapped tests and then backfill features.

Trade-offs:
- Pros: immediate test signal.
- Cons: two tests cover only a small fraction of the 67 features, so false confidence is likely unless strict feature-level verification is enforced.
## Recommended Design
Use **Approach B** with four feature groups of 18, 18, 14, and 17 feature IDs (67 in total), each mapped to Go line ranges and .NET class ownership.
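
The grouping arithmetic can be checked mechanically. The following is a minimal sketch in Go, not part of PortTracker: `checkGroups` and `maxGroupSize` are hypothetical names, and the ceiling of 20 is an assumption taken from the chunk-size constraint above.

```go
package main

import "fmt"

// maxGroupSize is an assumed hard ceiling derived from the
// "roughly 20 features per chunk" constraint in this design.
const maxGroupSize = 20

// checkGroups (hypothetical helper) returns an error if any group is empty
// or larger than maxGroupSize, or if the sizes do not sum to total.
func checkGroups(sizes []int, total int) error {
	sum := 0
	for i, n := range sizes {
		if n <= 0 || n > maxGroupSize {
			return fmt.Errorf("group %d has %d features, want 1..%d", i+1, n, maxGroupSize)
		}
		sum += n
	}
	if sum != total {
		return fmt.Errorf("groups cover %d features, want %d", sum, total)
	}
	return nil
}

func main() {
	// Sizes and total come from this design: 18/18/14/17 over 67 features.
	if err := checkGroups([]int{18, 18, 14, 17}, 67); err != nil {
		fmt.Println("invalid grouping:", err)
		return
	}
	fmt.Println("grouping ok")
}
```

A check like this could run before any group is started, so that a later re-split of the feature IDs cannot silently drop or duplicate features.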
### Architecture
- `ClientConnection` partials handle leaf protocol parsing, leaf connect payload processing, sub/unsub deltas, permission violations, and websocket solicitation.
- `NatsServer` partials handle remote solicitation, outbound reconnect/connect loops, listener accept loop, leaf info dissemination, and connection registration lifecycle.
- `LeafNode` helper surface (`LeafNodeHandler` + `LeafNodeCfg` methods) centralizes validation and key generation utilities used by both server and client paths.
- `Account` updates cover `UpdateLeafNodesEx` semantics and gateway/leaf interest propagation coupling.
### Proposed File Map
Primary production files:
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/ClientConnection.LeafNodes.cs`
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.LeafNodes.ConfigAndConnect.cs`
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.LeafNodes.Subscriptions.cs`
- Create: `dotnet/src/ZB.MOM.NatsNet.Server/LeafNode/LeafNodeHandler.cs`
- Modify: `dotnet/src/ZB.MOM.NatsNet.Server/LeafNode/LeafNodeTypes.cs` (for `LeafNodeCfg` method surface)
- Modify: `dotnet/src/ZB.MOM.NatsNet.Server/Accounts/Account.cs`
- Modify: `dotnet/src/ZB.MOM.NatsNet.Server/Protocol/ProtocolParser.cs`
- Modify (as needed): `dotnet/src/ZB.MOM.NatsNet.Server/NatsServer.Lifecycle.cs`

Mapped test files:
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeHandlerTests.Impltests.cs`
- Modify: `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/RouteHandlerTests.Impltests.cs`

Related regression tests:
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/Protocol/ProtocolParserTests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/Accounts/ResolverDefaultsOpsTests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ClientTests.cs`
### Data and Control Flow
- Outbound flow: configured remote leaf nodes -> validate -> pick URL -> connect/proxy/ws path -> leaf CONNECT/INFO handshake -> register connection -> initialize subscription map.
- Inbound flow: accept loop -> create leaf client -> process INFO/CONNECT -> negotiate compression/perms -> account registration and interest sync.
- Interest propagation: account sub changes -> leaf smap deltas -> leaf SUB/UNSUB protocol emission -> inbound leaf message/sub updates -> permission enforcement.
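
The interest-propagation step can be illustrated with a reference-counted subscription map. This is a simplified sketch in Go under stated assumptions, not the code in `leafnode.go`: `subMap` and `applyDelta` are hypothetical names, and the `SUB`/`UNSUB` strings are placeholders for whatever the leaf protocol actually emits on the wire.

```go
package main

import "fmt"

// subMap is a hypothetical stand-in for a leaf connection's subscription
// map: it keeps a reference count of local interest per subject key.
type subMap map[string]int

// applyDelta adjusts the count for key and returns the protocol line to
// emit, or "" when the change does not cross the zero boundary.
// "SUB"/"UNSUB" are placeholder op names, not the real wire tokens.
func (m subMap) applyDelta(key string, delta int) string {
	before := m[key]
	after := before + delta
	if after <= 0 {
		delete(m, key)
		if before > 0 {
			return "UNSUB " + key // last interest went away
		}
		return "" // nothing was registered, nothing to retract
	}
	m[key] = after
	if before == 0 {
		return "SUB " + key // first interest appeared
	}
	return "" // count changed but interest already known remotely
}

func main() {
	m := subMap{}
	deltas := []int{+1, +1, -1, -1}
	for _, d := range deltas {
		if op := m.applyDelta("orders.*", d); op != "" {
			fmt.Println(op)
		}
	}
}
```

In this sketch only the first subscriber and the last unsubscriber produce protocol traffic, which is the property the .NET port would need to preserve if the Go behavior works this way.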
### Error Handling Strategy
- Preserve Go failure modes for malformed protocol args, invalid remote config, and loop/duplicate detection paths.
- Prefer explicit guard clauses with clear exceptions over silent default returns.
- On infrastructure blockers, defer with reason rather than introducing placeholders.
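
The guard-clause preference can be sketched as follows. This is an illustration in Go, where explicit errors play the role that exceptions would play in the C# port; `remoteLeaf`, `validateRemote`, and both sentinel errors are hypothetical names, and the two checks shown are assumptions rather than the full set of rules in `leafnode.go`.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Hypothetical sentinel errors so callers can distinguish failure modes.
var (
	errNoURLs = errors.New("remote leaf node has no URLs")
	errBadURL = errors.New("remote leaf node URL is invalid")
)

// remoteLeaf is a hypothetical, trimmed-down remote configuration.
type remoteLeaf struct {
	URLs []string
}

// validateRemote fails fast with a descriptive error instead of silently
// returning a default. The checks here are illustrative assumptions.
func validateRemote(r remoteLeaf) error {
	if len(r.URLs) == 0 {
		return errNoURLs
	}
	for _, raw := range r.URLs {
		u, err := url.Parse(raw)
		if err != nil {
			return fmt.Errorf("%w: %q: %v", errBadURL, raw, err)
		}
		if u.Host == "" {
			return fmt.Errorf("%w: %q has no host", errBadURL, raw)
		}
	}
	return nil
}

func main() {
	cfgs := []remoteLeaf{
		{URLs: []string{"nats-leaf://hub.example.com:7422"}},
		{URLs: nil},
		{URLs: []string{""}},
	}
	for _, c := range cfgs {
		fmt.Printf("%v -> %v\n", c.URLs, validateRemote(c))
	}
}
```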
### Verification Strategy
- Enforce a per-feature verification loop (Go read, C# implementation, build, related test run).
- Enforce group-level stub scans and a build gate before any `stub -> complete` promotion.
- Enforce the full leaf-related test gate before any `complete -> verified` promotion.
- Enforce the checkpoint protocol (full build + full tests + commit) between every task.
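
The status gates above amount to a small state machine. A minimal sketch in Go, assuming only the two promotions named in this section; `evidence` and `canPromote` are hypothetical names and do not reflect how PortTracker stores or checks status.

```go
package main

import (
	"errors"
	"fmt"
)

// evidence is a hypothetical record of which gates have passed.
type evidence struct {
	stubScanClean   bool // group-level stub scan found nothing
	buildPassed     bool // build gate is green
	leafTestsPassed bool // full leaf-related test gate is green
}

// canPromote returns nil only when the requested transition is one of the
// two described in this design and its gates are satisfied.
func canPromote(from, to string, ev evidence) error {
	switch {
	case from == "stub" && to == "complete":
		if !ev.stubScanClean || !ev.buildPassed {
			return errors.New("stub -> complete requires a clean stub scan and a passing build")
		}
	case from == "complete" && to == "verified":
		if !ev.leafTestsPassed {
			return errors.New("complete -> verified requires the leaf-related test gate")
		}
	default:
		return fmt.Errorf("unsupported transition %s -> %s", from, to)
	}
	return nil
}

func main() {
	ev := evidence{stubScanClean: true, buildPassed: true}
	fmt.Println(canPromote("stub", "complete", ev))     // gates satisfied
	fmt.Println(canPromote("complete", "verified", ev)) // test gate missing
}
```

Rejecting every transition that is not explicitly listed keeps a feature from skipping straight from `stub` to `verified`.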
### Risks and Mitigations
- Risk: race-prone account/smap updates regress existing account behavior.
  - Mitigation: isolate account update logic and require `ResolverDefaultsOpsTests` in every verification wave.
- Risk: parser and runtime leaf arg handling diverge.
  - Mitigation: keep parser and client leaf arg updates in the same feature group and verify via `ProtocolParserTests`.
- Risk: backlog tests remain placeholders and create false green.
  - Mitigation: mandatory anti-stub scans and method-level filters for mapped tests.
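
An anti-stub scan can be as simple as a line-by-line search for placeholder markers. The sketch below is in Go and is not the project's actual scan: `scanForStubs` is a hypothetical helper, and the marker strings in `main` are assumed examples of what a placeholder might look like in the C# sources.

```go
package main

import (
	"fmt"
	"strings"
)

// scanForStubs (hypothetical helper) returns the 1-based line numbers in
// src that contain any of the given marker substrings.
func scanForStubs(src string, markers []string) []int {
	var hits []int
	for i, line := range strings.Split(src, "\n") {
		for _, m := range markers {
			if strings.Contains(line, m) {
				hits = append(hits, i+1)
				break // one hit per line is enough
			}
		}
	}
	return hits
}

func main() {
	// Assumed marker list; the real guardrails may look for other patterns.
	markers := []string{"NotImplementedException", "TODO", "Assert.True(true)"}
	src := strings.Join([]string{
		"public void ProcessLeafSub() {",
		"    throw new NotImplementedException();",
		"}",
		"// TODO: port from leafnode.go",
	}, "\n")
	fmt.Println(scanForStubs(src, markers))
}
```

A substring scan is deliberately crude. It will flag markers inside comments or string literals, which errs on the side of a false alarm rather than a false green.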
## Design Approval Basis
This design is based on Batch 24 metadata from PortTracker, existing Batch 0 verification standards, and explicit requirements for feature+test anti-stub guardrails and grouped execution.