Bottom-up dependency ordering: FileStore/RAFT → Cluster/API → Consumer/Stream → Client/MQTT → Config/Gateway → Route/LeafNode → Account → Monitoring/WebSocket. Full test suite every 2 phases.
Remaining Gap Closure Implementation Plan (93 Gaps, 8 Phases)
For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
Goal: Close all 93 remaining implementation gaps between the Go NATS server and the .NET port, bringing the port to feature parity across the FileStore, RAFT, JetStream Cluster, API, Consumer, Stream, Client, MQTT, Config, Gateway, Route, LeafNode, Account, Monitoring, and WebSocket subsystems.
Architecture: Bottom-up dependency approach — Phase 1 builds storage durability and RAFT completion, Phase 2 adds cluster coordination and API layer, Phase 3 adds consumer engines and stream lifecycle, Phase 4 adds client protocol and MQTT, Phase 5 adds config reload and gateway, Phase 6 adds route clustering and leaf nodes, Phase 7 adds account management, Phase 8 adds monitoring and WebSocket.
Tech Stack: .NET 10 / C# 14, xUnit 3, Shouldly, NSubstitute, System.IO.Hashing (XxHash64), System.IO.Pipelines, IronSnappy (S2), ChaCha20-Poly1305/AES-GCM, SQLite (test parity DB)
Test strategy: Only run targeted unit tests during implementation (dotnet test --filter). Run full test suite every 2 phases (after Phase 2, 4, 6, 8). Update docs/test_parity.db per phase.
Parity DB update pattern:
sqlite3 docs/test_parity.db "UPDATE go_tests SET status='mapped', dotnet_test='DotNetTestName', dotnet_file='TestFile.cs' WHERE go_test='GoTestName';"
sqlite3 docs/test_parity.db "INSERT INTO dotnet_tests (test_name, test_file, category) VALUES ('TestName', 'TestFile.cs', 'category');"
Phase 1: FileStore Durability + RAFT Completion (11 gaps)
Dependencies: None (pure infrastructure)
Exit gate: Checksums validated on read, atomic writes verified, tombstones persisted/recovered, cache expires correctly, filtered queries use SubjectTree, RAFT streams snapshots in chunks, transfers leadership, compacts with policies, checks quorum, jitters elections
Task 1: FileStore Checksum Validation (Gap 1.5)
Add per-block last-checksum tracking and read-path validation using existing XxHash64 in MessageRecord.
Files:
- Modify: src/NATS.Server/JetStream/Storage/MsgBlock.cs:25-44 (add _lastChecksum field)
- Modify: src/NATS.Server/JetStream/Storage/MsgBlock.cs:298 (Read method: add validation)
- Modify: src/NATS.Server/JetStream/Storage/MessageRecord.cs:115 (Decode: expose checksum)
- Test: tests/NATS.Server.Tests/JetStream/Storage/FileStoreChecksumTests.cs (create)
- Go ref: filestore.go:2204 (lastChecksum), filestore.go:8180 (validation in msgFromBufEx)
Step 1: Write failing tests
Create tests/NATS.Server.Tests/JetStream/Storage/FileStoreChecksumTests.cs:
using NATS.Server.JetStream.Storage;

namespace NATS.Server.Tests.JetStream.Storage;

public class FileStoreChecksumTests : IDisposable
{
    private readonly DirectoryInfo _dir = Directory.CreateTempSubdirectory("checksum-");

    public void Dispose() => _dir.Delete(true);

    [Fact]
    public void MsgBlock_tracks_last_checksum()
    {
        using var block = MsgBlock.Create(1, _dir.FullName, 1024 * 1024);
        block.Write("test", ReadOnlyMemory<byte>.Empty, "hello"u8.ToArray());

        block.LastChecksum.ShouldNotBeNull();
        block.LastChecksum.Length.ShouldBe(8); // XxHash64 = 8 bytes
    }

    [Fact]
    public void MsgBlock_validates_checksum_on_read()
    {
        using var block = MsgBlock.Create(1, _dir.FullName, 1024 * 1024);
        block.Write("test", ReadOnlyMemory<byte>.Empty, "hello"u8.ToArray());
        block.Flush();

        // Read should succeed with valid checksum
        var record = block.Read(1);
        record.ShouldNotBeNull();
        record!.Subject.ShouldBe("test");
    }

    [Fact]
    public void MsgBlock_detects_corrupted_record()
    {
        using var block = MsgBlock.Create(1, _dir.FullName, 1024 * 1024);
        block.Write("test", ReadOnlyMemory<byte>.Empty, "hello"u8.ToArray());
        block.Flush();
        block.ClearCache();

        // Corrupt a byte in the block file
        var files = Directory.GetFiles(_dir.FullName, "*.blk");
        var bytes = File.ReadAllBytes(files[0]);
        bytes[^10] ^= 0xFF;
        File.WriteAllBytes(files[0], bytes);

        Should.Throw<InvalidDataException>(() => block.Read(1));
    }

    [Fact]
    public void MsgBlock_validates_checksum_flag_controls_behavior()
    {
        using var block = MsgBlock.Create(1, _dir.FullName, 1024 * 1024, validateOnRead: false);
        block.Write("test", ReadOnlyMemory<byte>.Empty, "hello"u8.ToArray());
        block.Flush();
        block.ClearCache();

        // Even with corruption, no exception when validation disabled
        var files = Directory.GetFiles(_dir.FullName, "*.blk");
        var bytes = File.ReadAllBytes(files[0]);
        bytes[^10] ^= 0xFF;
        File.WriteAllBytes(files[0], bytes);

        // May return null or corrupted data, but should not throw
        Should.NotThrow(() => block.Read(1));
    }
}
Step 2: Run tests to verify they fail
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~FileStoreChecksumTests" -v normal
Expected: Compilation errors (missing LastChecksum, validateOnRead parameter)
Step 3: Implement
- Add _lastChecksum: byte[]? field and LastChecksum property to MsgBlock.cs
- Add validateOnRead: bool parameter (stored in _validateOnRead) to the Create and Recover factory methods
- Update Write to capture the checksum from the MessageRecord.Encode result
- Update Read to validate the checksum when _validateOnRead is true and the record is loaded from disk (not cache)
- Expose Checksum property on MessageRecord from the decoded trailer
Step 4: Run tests to verify they pass
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~FileStoreChecksumTests" -v normal
Step 5: Commit
git add src/NATS.Server/JetStream/Storage/MsgBlock.cs src/NATS.Server/JetStream/Storage/MessageRecord.cs tests/NATS.Server.Tests/JetStream/Storage/FileStoreChecksumTests.cs
git commit -m "feat: add checksum validation on MsgBlock read path (Gap 1.5)"
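The read-path check in Step 3 can be pictured with a small standalone sketch. The record layout used here (payload followed by an 8-byte XxHash64 trailer) and the ChecksumSketch name are assumptions made for illustration; the real MessageRecord encoding in the port may differ.

```csharp
using System;
using System.IO;
using System.IO.Hashing;

// Hypothetical layout for illustration only: [payload][8-byte XxHash64 of payload].
public static class ChecksumSketch
{
    public const int ChecksumSize = 8; // XxHash64 digest length in bytes

    // Appends an XxHash64 trailer to the payload.
    public static byte[] Encode(ReadOnlySpan<byte> payload)
    {
        var buffer = new byte[payload.Length + ChecksumSize];
        payload.CopyTo(buffer);
        XxHash64.Hash(payload, buffer.AsSpan(payload.Length));
        return buffer;
    }

    // Returns the payload. When validateOnRead is true, a trailer mismatch
    // raises InvalidDataException; when false, the trailer is ignored.
    public static byte[] Decode(ReadOnlySpan<byte> record, bool validateOnRead = true)
    {
        if (record.Length < ChecksumSize)
            throw new InvalidDataException("Record is shorter than the checksum trailer.");

        var payload = record[..^ChecksumSize];
        var stored = record[^ChecksumSize..];

        if (validateOnRead)
        {
            Span<byte> computed = stackalloc byte[ChecksumSize];
            XxHash64.Hash(payload, computed);
            if (!computed.SequenceEqual(stored))
                throw new InvalidDataException("Checksum mismatch: record is corrupted.");
        }

        return payload.ToArray();
    }
}
```

Skipping validation for cached records, as the plan describes, keeps the hash cost off the hot path while still catching on-disk corruption.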
Task 2: Atomic File Overwrites (Gap 1.6)
Add AtomicFileWriter helper and SemaphoreSlim write lock to FileStore for crash-safe state persistence.
Files:
- Create: src/NATS.Server/JetStream/Storage/AtomicFileWriter.cs
- Modify: src/NATS.Server/JetStream/Storage/FileStore.cs:1827 (WriteStreamState: use atomic writer)
- Test: tests/NATS.Server.Tests/JetStream/Storage/AtomicFileWriterTests.cs (create)
- Go ref: filestore.go:10599 (_writeFullState)
Step 1: Write failing tests
using NATS.Server.JetStream.Storage;

namespace NATS.Server.Tests.JetStream.Storage;

public class AtomicFileWriterTests : IDisposable
{
    private readonly DirectoryInfo _dir = Directory.CreateTempSubdirectory("atomic-");

    public void Dispose() => _dir.Delete(true);

    [Fact]
    public async Task WriteAtomicallyAsync_creates_file()
    {
        var path = Path.Combine(_dir.FullName, "state.json");
        await AtomicFileWriter.WriteAtomicallyAsync(path, "hello"u8.ToArray());

        File.Exists(path).ShouldBeTrue();
        (await File.ReadAllTextAsync(path)).ShouldBe("hello");
    }

    [Fact]
    public async Task WriteAtomicallyAsync_no_temp_file_remains()
    {
        var path = Path.Combine(_dir.FullName, "state.json");
        await AtomicFileWriter.WriteAtomicallyAsync(path, "data"u8.ToArray());

        Directory.GetFiles(_dir.FullName, "*.tmp").ShouldBeEmpty();
    }

    [Fact]
    public async Task WriteAtomicallyAsync_overwrites_existing()
    {
        var path = Path.Combine(_dir.FullName, "state.json");
        await AtomicFileWriter.WriteAtomicallyAsync(path, "old"u8.ToArray());
        await AtomicFileWriter.WriteAtomicallyAsync(path, "new"u8.ToArray());

        (await File.ReadAllTextAsync(path)).ShouldBe("new");
    }
}
Step 2: Run tests — expect compilation failure
Step 3: Implement
- Create AtomicFileWriter.cs with static WriteAtomicallyAsync(string path, byte[] data): write to {path}.tmp, flush to disk, File.Move(overwrite: true)
- Add SemaphoreSlim _stateWriteLock = new(1, 1) to FileStore
- Update WriteStreamState() to use _stateWriteLock and AtomicFileWriter
Step 4: Run tests — expect pass
Step 5: Commit
git add src/NATS.Server/JetStream/Storage/AtomicFileWriter.cs src/NATS.Server/JetStream/Storage/FileStore.cs tests/NATS.Server.Tests/JetStream/Storage/AtomicFileWriterTests.cs
git commit -m "feat: add atomic file writer with SemaphoreSlim for crash-safe state writes (Gap 1.6)"
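The write-to-temp, flush, then rename sequence from Step 3 could look roughly like the sketch below. The AtomicFileWriterSketch name and the optional CancellationToken parameter are illustrative additions, not the signature the plan prescribes.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch of the temp-file-then-rename pattern described above.
public static class AtomicFileWriterSketch
{
    public static async Task WriteAtomicallyAsync(string path, byte[] data, CancellationToken ct = default)
    {
        var tempPath = path + ".tmp";

        await using (var stream = new FileStream(
            tempPath, FileMode.Create, FileAccess.Write, FileShare.None,
            bufferSize: 4096, useAsync: true))
        {
            await stream.WriteAsync(data.AsMemory(), ct);
            // Ask the OS to persist the bytes before the rename makes them visible.
            stream.Flush(flushToDisk: true);
        }

        // Replaces the destination in one step, so readers see either the old or the new file.
        File.Move(tempPath, path, overwrite: true);
    }
}
```

Because every caller shares the same {path}.tmp name, concurrent writers to one path still need external serialization, which is the role the plan gives to _stateWriteLock.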
Task 3: Tombstone & Deletion Tracking (Gap 1.7)
Replace HashSet<ulong> _deleted with sparse SequenceSet and add secure erase support.
Files:
- Create: src/NATS.Server/JetStream/Storage/SequenceSet.cs
- Modify: src/NATS.Server/JetStream/Storage/MsgBlock.cs:30 (replace _deleted)
- Modify: src/NATS.Server/JetStream/Storage/MsgBlock.cs:332 (Delete: add secure erase)
- Modify: src/NATS.Server/JetStream/Storage/FileStore.cs (recover tombstones)
- Test: tests/NATS.Server.Tests/JetStream/Storage/SequenceSetTests.cs (create)
- Test: tests/NATS.Server.Tests/JetStream/Storage/FileStoreTombstoneTrackingTests.cs (create)
- Go ref: filestore.go:5267 (removeMsg), filestore.go:5890 (eraseMsg)
Step 1: Write failing tests
SequenceSetTests.cs — tests for range-compressed sorted set: Add, Remove, Contains, Count, ranges collapse (e.g., adding 1,2,3 stores as range [1-3]).
FileStoreTombstoneTrackingTests.cs — tests for: tombstones survive MsgBlock recovery, secure erase overwrites data with random bytes, SequenceSet used instead of HashSet.
Step 2: Run tests — expect compilation failure
Step 3: Implement
- SequenceSet.cs: sorted list of (ulong Start, ulong End) ranges with range compression, binary search for Contains/Add/Remove
- Replace _deleted: HashSet<ulong> with _deleted: SequenceSet in MsgBlock.cs
- Add secureErase parameter to MsgBlock.Delete(); when true, overwrite payload bytes with RandomNumberGenerator.Fill
- Persist tombstone records using existing DeletedFlag = 0x80 in MessageRecord
- Recover tombstones during RebuildIndex() by checking the deleted flag
Step 4: Run tests — expect pass
Step 5: Commit
git add src/NATS.Server/JetStream/Storage/SequenceSet.cs src/NATS.Server/JetStream/Storage/MsgBlock.cs src/NATS.Server/JetStream/Storage/FileStore.cs tests/NATS.Server.Tests/JetStream/Storage/SequenceSetTests.cs tests/NATS.Server.Tests/JetStream/Storage/FileStoreTombstoneTrackingTests.cs
git commit -m "feat: add SequenceSet for sparse deletion tracking with secure erase (Gap 1.7)"
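A range-compressed set of the kind described for SequenceSet.cs might be structured as follows. This is a minimal sketch under the plan's stated design (sorted inclusive ranges plus binary search); the SequenceSetSketch name and the RangeCount property are illustrative, and Count would overflow for a single range covering the entire ulong domain.

```csharp
using System.Collections.Generic;

// Sketch of a sorted, range-compressed set of sequence numbers.
public sealed class SequenceSetSketch
{
    private readonly List<(ulong Start, ulong End)> _ranges = new();

    public int RangeCount => _ranges.Count;

    public ulong Count
    {
        get
        {
            ulong total = 0;
            foreach (var (start, end) in _ranges) total += end - start + 1;
            return total;
        }
    }

    public bool Contains(ulong seq) => FindRange(seq) >= 0;

    public bool Add(ulong seq)
    {
        var i = FindRange(seq);
        if (i >= 0) return false; // already present

        var at = ~i; // index of the first range whose Start is greater than seq
        var mergeLeft = at > 0 && _ranges[at - 1].End + 1 == seq;
        var mergeRight = at < _ranges.Count && seq + 1 == _ranges[at].Start;

        if (mergeLeft && mergeRight)
        {
            // seq bridges two neighbours: collapse them into one range.
            _ranges[at - 1] = (_ranges[at - 1].Start, _ranges[at].End);
            _ranges.RemoveAt(at);
        }
        else if (mergeLeft) _ranges[at - 1] = (_ranges[at - 1].Start, seq);
        else if (mergeRight) _ranges[at] = (seq, _ranges[at].End);
        else _ranges.Insert(at, (seq, seq));
        return true;
    }

    public bool Remove(ulong seq)
    {
        var i = FindRange(seq);
        if (i < 0) return false;

        var (start, end) = _ranges[i];
        if (start == end) _ranges.RemoveAt(i);
        else if (seq == start) _ranges[i] = (start + 1, end);
        else if (seq == end) _ranges[i] = (start, end - 1);
        else
        {
            // Removing from the middle splits the range in two.
            _ranges[i] = (start, seq - 1);
            _ranges.Insert(i + 1, (seq + 1, end));
        }
        return true;
    }

    // Binary search: index of the range containing seq, or the bitwise
    // complement of the insertion index when no range contains it.
    private int FindRange(ulong seq)
    {
        int lo = 0, hi = _ranges.Count - 1;
        while (lo <= hi)
        {
            var mid = lo + (hi - lo) / 2;
            var (start, end) = _ranges[mid];
            if (seq < start) hi = mid - 1;
            else if (seq > end) lo = mid + 1;
            else return mid;
        }
        return ~lo;
    }
}
```

For deletion patterns that are mostly contiguous (purges, expiry from the head of a block), this stores a handful of ranges where a HashSet<ulong> would store one entry per sequence.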
Task 4: Multi-Block Write Cache (Gap 1.8)
Add WriteCacheManager to FileStore with bounded strong-reference cache, TTL eviction, and background flush.
Files:
- Modify: src/NATS.Server/JetStream/Storage/FileStore.cs (add WriteCacheManager inner class)
- Modify: src/NATS.Server/JetStream/Storage/MsgBlock.cs (integrate cache manager)
- Test: tests/NATS.Server.Tests/JetStream/Storage/WriteCacheTests.cs (create)
- Go ref: filestore.go:4443 (setupWriteCache), filestore.go:6148 (expireCache)
Step 1: Write failing tests
Tests for: cache entries evicted after TTL (2s default), cache bounded by size (64MB default), FlushPendingMsgsAsync flushes all cached entries, PeriodicTimer background worker.
Step 2-5: Implement, test, commit
Implement WriteCacheManager as inner class with PeriodicTimer (500ms tick), bounded Dictionary<int, CacheEntry> keyed by block ID, size tracking, TTL eviction. Integrate into FileStore.StoreMsg and FileStore.RotateBlock.
git commit -m "feat: add bounded write cache with TTL eviction and background flush (Gap 1.8)"
Task 5: Query/Filter Operations (Gap 1.10)
Optimize FilteredState, LoadMsg, and NumFiltered with block-aware binary search.
Files:
- Modify: src/NATS.Server/JetStream/Storage/FileStore.cs:471 (FilteredState: use block index ranges)
- Modify: src/NATS.Server/JetStream/Storage/FileStore.cs:1459 (LoadMsg: block-aware binary search)
- Test: tests/NATS.Server.Tests/JetStream/Storage/FileStoreFilterQueryTests.cs (create)
- Go ref: filestore.go:3191 (FilteredState), filestore.go:8308 (LoadMsg)
Step 1-5: TDD cycle
Tests for: FilteredState with wildcard subjects, LoadMsg with block boundary crossing, CheckSkipFirstBlock optimization for range queries, NumFiltered with caching. Use SubjectMatch.IsMatch() for token-based filter matching.
git commit -m "feat: optimize FilteredState and LoadMsg with block-aware search (Gap 1.10)"
Task 6: RAFT InstallSnapshot Streaming (Gap 8.3)
Add chunk-based snapshot streaming with CRC32 validation.
Files:
- Create: src/NATS.Server/Raft/SnapshotChunkEnumerator.cs
- Modify: src/NATS.Server/Raft/RaftNode.cs:414 (InstallSnapshotFromChunksAsync: add CRC validation)
- Modify: src/NATS.Server/Raft/RaftWireFormat.cs (add RaftInstallSnapshotChunkWire)
- Test: tests/NATS.Server.Tests/Raft/RaftSnapshotStreamingTests.cs (create)
- Go ref: raft.go snapshot install with chunks
Step 1: Write failing tests
[Fact]
public void SnapshotChunkEnumerator_yields_fixed_size_chunks()
{
    var data = new byte[200_000]; // 3 full 64 KiB chunks plus a 3,392-byte remainder
    Random.Shared.NextBytes(data);

    var chunks = new SnapshotChunkEnumerator(data, chunkSize: 65536).ToList();

    chunks.Count.ShouldBe(4); // 64K + 64K + 64K + remainder
    chunks.Sum(c => c.Length).ShouldBe(200_000);
}

[Fact]
public async Task InstallSnapshot_validates_crc32_over_assembled_content()
{
    var node = new RaftNode("n1");
    var data = "snapshot-data"u8.ToArray();
    var chunks = new SnapshotChunkEnumerator(data, 8).ToList();

    // Corrupt one chunk
    chunks[0][0] ^= 0xFF;

    await Should.ThrowAsync<InvalidDataException>(
        () => node.InstallSnapshotFromChunksAsync(chunks, 1, 1, default));
}
Step 2-5: Implement, test, commit
git commit -m "feat: add chunk-based snapshot streaming with CRC32 validation (Gap 8.3)"
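Chunking and CRC32 validation over the reassembled snapshot can be sketched as below. The SnapshotChunkingSketch type and its Split/Assemble methods are hypothetical, and the sketch assumes the sender's CRC32 travels alongside the chunks; how the real InstallSnapshotFromChunksAsync obtains the expected value is not specified in this plan. It also assumes a System.IO.Hashing version that provides Crc32.HashToUInt32 and GetCurrentHashAsUInt32 (8.0 or later).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Hashing;

// Illustrative only: fixed-size chunking plus CRC32 verification on reassembly.
public static class SnapshotChunkingSketch
{
    // Yields chunkSize-byte copies of data; the final chunk may be shorter.
    public static IEnumerable<byte[]> Split(byte[] data, int chunkSize = 65536)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        for (var offset = 0; offset < data.Length; offset += chunkSize)
        {
            var length = Math.Min(chunkSize, data.Length - offset);
            yield return data[offset..(offset + length)];
        }
    }

    public static uint Crc32Of(byte[] data) => Crc32.HashToUInt32(data);

    // Concatenates the chunks and verifies the running CRC32 against the expected value.
    public static byte[] Assemble(IEnumerable<byte[]> chunks, uint expectedCrc32)
    {
        using var buffer = new MemoryStream();
        var crc = new Crc32();

        foreach (var chunk in chunks)
        {
            buffer.Write(chunk, 0, chunk.Length);
            crc.Append(chunk);
        }

        if (crc.GetCurrentHashAsUInt32() != expectedCrc32)
            throw new InvalidDataException("Snapshot CRC32 mismatch.");

        return buffer.ToArray();
    }
}
```

Computing the CRC incrementally while appending avoids a second pass over a potentially large snapshot.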
Task 7: RAFT Leadership Transfer (Gap 8.4)
Add TransferLeadership with TimeoutNowRpc message type.
Files:
- Modify: src/NATS.Server/Raft/RaftNode.cs (add TransferLeadershipAsync)
- Modify: src/NATS.Server/Raft/RaftWireFormat.cs (add TimeoutNowRpc wire type)
- Test: tests/NATS.Server.Tests/Raft/RaftLeadershipTransferTests.cs (create)
- Go ref: raft.go leadership transfer
Step 1-5: TDD cycle
Tests for: target receives TimeoutNow and starts election immediately, leader stops accepting proposals during transfer, transfer times out after 2x election timeout if not completed. Implement TransferLeadershipAsync(string targetId) on RaftNode.
git commit -m "feat: add leadership transfer via TimeoutNow RPC (Gap 8.4)"
Task 8: RAFT Log Compaction Policies (Gap 8.5)
Add CompactionPolicy enum and configurable compaction thresholds.
Files:
- Create: src/NATS.Server/Raft/CompactionPolicy.cs
- Modify: src/NATS.Server/Raft/RaftNode.cs:400 (CompactLogAsync: use policy)
- Test: tests/NATS.Server.Tests/Raft/RaftCompactionPolicyTests.cs (create)
Step 1-5: TDD cycle
Tests for: ByCount policy compacts when log exceeds N entries, BySize compacts when total size exceeds threshold, ByAge compacts entries older than duration. Implement CompactionPolicy enum with ByCount, BySize, ByAge variants and RaftOptions.CompactionPolicy/thresholds.
git commit -m "feat: add configurable log compaction policies (Gap 8.5)"
Task 9: RAFT Quorum Check Before Proposing (Gap 8.6)
Add HasQuorum() check using peer heartbeat timestamps.
Files:
- Modify: src/NATS.Server/Raft/RaftNode.cs (add HasQuorum, update ProposeAsync)
- Modify: src/NATS.Server/Raft/RaftPeerState.cs (ensure LastContact tracked)
- Test: tests/NATS.Server.Tests/Raft/RaftQuorumCheckTests.cs (create)
Step 1-5: TDD cycle
Tests for: HasQuorum returns false when majority of peers have stale heartbeats, ProposeAsync returns ProposalResult.NoQuorum when check fails, heartbeat responses update LastContact.
git commit -m "feat: add quorum check before proposing entries (Gap 8.6)"
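One way to express the heartbeat-based quorum check is sketched below. QuorumSketch, its parameters, and the staleness window are assumptions for illustration; the plan does not state how RaftNode stores peer contact times or what threshold counts as stale.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch: the leader counts itself plus every peer heard from recently,
// and compares that against a strict majority of the cluster.
public static class QuorumSketch
{
    public static bool HasQuorum(
        IReadOnlyCollection<DateTime> peerLastContactUtc, // one entry per remote peer
        DateTime nowUtc,
        TimeSpan staleAfter)
    {
        var clusterSize = peerLastContactUtc.Count + 1; // remote peers plus this node
        var required = clusterSize / 2 + 1;
        var reachable = 1 + peerLastContactUtc.Count(t => nowUtc - t <= staleAfter);
        return reachable >= required;
    }
}
```

Passing the current time in, rather than reading the clock inside the method, keeps the stale-heartbeat tests deterministic.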
Task 10: RAFT ReadIndex Optimization (Gap 8.7)
Add ReadIndex() for linearizable reads without log growth.
Files:
- Modify: src/NATS.Server/Raft/RaftNode.cs (add ReadIndexAsync)
- Test: tests/NATS.Server.Tests/Raft/RaftReadIndexTests.cs (create)
Step 1-5: TDD cycle
Tests for: ReadIndexAsync returns current commit index after quorum heartbeat round, deposed leader's ReadIndexAsync fails (no quorum), follower can serve read when appliedIndex >= readIndex.
git commit -m "feat: add ReadIndex for linearizable reads via quorum confirmation (Gap 8.7)"
Task 11: RAFT Election Timeout Jitter (Gap 8.8)
Add RandomizedElectionTimeout() using TotalMilliseconds.
Files:
- Modify: src/NATS.Server/Raft/RaftNode.cs:500 (ResetElectionTimeout: add jitter)
- Test: tests/NATS.Server.Tests/Raft/RaftElectionJitterTests.cs (create)
Step 1-5: TDD cycle
Tests for: randomized timeout is within [base, 2*base) range, uses TotalMilliseconds (not Milliseconds), different nodes get different timeouts.
git commit -m "feat: add randomized election timeout jitter (Gap 8.8)"
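The distinction between TotalMilliseconds and Milliseconds matters because Milliseconds is only the millisecond component of the value: for a 1.5 second timeout it returns 500, while TotalMilliseconds returns 1500. A minimal sketch of the jitter calculation, with a hypothetical ElectionTimeoutSketch helper and an injectable Random for tests, might be:

```csharp
#nullable enable
using System;

// Sketch: spread election timeouts over roughly [base, 2 * base) so that
// nodes are unlikely to time out at the same instant.
public static class ElectionTimeoutSketch
{
    public static TimeSpan Randomized(TimeSpan baseTimeout, Random? rng = null)
    {
        rng ??= Random.Shared;

        // TotalMilliseconds is the whole duration; Milliseconds would drop the seconds part.
        var baseMs = baseTimeout.TotalMilliseconds;

        // NextDouble() returns a value in [0, 1), so the jitter stays below one extra base interval.
        return TimeSpan.FromMilliseconds(baseMs + rng.NextDouble() * baseMs);
    }
}
```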
Phase 1 Exit Gate
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~FileStoreChecksum|FullyQualifiedName~AtomicFileWriter|FullyQualifiedName~SequenceSet|FullyQualifiedName~Tombstone|FullyQualifiedName~WriteCache|FullyQualifiedName~FilterQuery|FullyQualifiedName~SnapshotStreaming|FullyQualifiedName~LeadershipTransfer|FullyQualifiedName~CompactionPolicy|FullyQualifiedName~QuorumCheck|FullyQualifiedName~ReadIndex|FullyQualifiedName~ElectionJitter" -v normal
Update test parity DB for all Phase 1 tests.
Phase 2: JetStream Cluster Coordination + API (13 gaps)
Dependencies: Phase 1 (RAFT completion)
Exit gate: Peers added/removed, RAFT entries applied to state machine, assignments encoded/decoded with golden fixtures, API requests forwarded to leader, rate limiting active, advisory events published
Task 12: Peer Management & Stream Moves (Gap 2.4)
Add ProcessAddPeer/ProcessRemovePeer to JetStreamMetaGroup for peer-driven stream reassignment.
Files:
- Modify: src/NATS.Server/JetStream/Cluster/JetStreamMetaGroup.cs
- Modify: src/NATS.Server/JetStream/Cluster/StreamReplicaGroup.cs
- Test: tests/NATS.Server.Tests/JetStream/Cluster/PeerManagementTests.cs (create)
- Go ref: jetstream_cluster.go:2290-2439
Step 1-5: TDD cycle
Tests for: ProcessAddPeer triggers re-replication of under-replicated streams, ProcessRemovePeer triggers reassignment away from removed peer, RemovePeerFromStream removes specific peer from replica group.
git commit -m "feat: add peer management with stream reassignment (Gap 2.4)"
Task 13: Entry Application Pipeline (Gap 2.7)
Add ApplyMetaEntries and ApplyStreamEntries dispatchers.
Files:
- Modify: src/NATS.Server/JetStream/Cluster/JetStreamMetaGroup.cs:579 (extend ApplyEntry)
- Modify: src/NATS.Server/JetStream/Cluster/StreamReplicaGroup.cs:193 (extend ApplyCommittedEntriesAsync)
- Test: tests/NATS.Server.Tests/JetStream/Cluster/EntryApplicationTests.cs (create)
- Go ref: jetstream_cluster.go:2474-4261
Step 1-5: TDD cycle
Tests for: meta entry dispatch (StreamCreate, StreamUpdate, StreamDelete, ConsumerCreate, ConsumerDelete, PeerAdd, PeerRemove), stream entry dispatch (store, remove, purge), consumer entry dispatch (ack, nak, deliver).
git commit -m "feat: add entry application pipeline for meta and stream RAFT groups (Gap 2.7)"
Task 14: Topology-Aware Placement (Gap 2.8)
Extend PlacementEngine.SelectPeerGroup with tag enforcement, HA limits, and weighted selection.
Files:
- Modify: src/NATS.Server/JetStream/Cluster/PlacementEngine.cs:14
- Modify: src/NATS.Server/JetStream/Cluster/PlacementEngine.cs:62 (PeerInfo: add tags, storage)
- Test: tests/NATS.Server.Tests/JetStream/Cluster/TopologyPlacementTests.cs (create)
- Go ref: jetstream_cluster.go:7524-7618 (selectPeerGroup)
Step 1-5: TDD cycle
Tests for: JetStreamUniqueTag enforcement (no two replicas on same-tagged node), HA asset limits per peer, tag include/exclude with prefix matching, weighted selection by available resources.
git commit -m "feat: add topology-aware placement with tag enforcement (Gap 2.8)"
Task 15: RAFT Group Creation & Lifecycle (Gap 2.9)
Flesh out RaftGroup with factory method and member helpers.
Files:
- Modify: src/NATS.Server/JetStream/Cluster/ClusterAssignmentTypes.cs:9 (RaftGroup)
- Test: tests/NATS.Server.Tests/JetStream/Cluster/RaftGroupLifecycleTests.cs (create)
Step 1-5: TDD cycle
Tests for: IsMember(peerId), SetPreferred(peerId), CreateRaftGroup factory uses PlacementEngine.
git commit -m "feat: add RaftGroup lifecycle methods (Gap 2.9)"
Task 16: Assignment Encoding/Decoding (Gap 2.10)
Add AssignmentCodec with binary serialization matching Go wire format, golden fixture tests.
Files:
- Create: src/NATS.Server/JetStream/Cluster/AssignmentCodec.cs
- Test: tests/NATS.Server.Tests/JetStream/Cluster/AssignmentCodecTests.cs (create)
- Go ref: jetstream_cluster.go encode/decode functions
Step 1-5: TDD cycle
Tests for: encode/decode round-trip for stream assignments, consumer assignments, S2 compression for configs > 1KB, golden fixture tests with captured Go output bytes for format compatibility.
git commit -m "feat: add binary assignment codec with golden fixture tests (Gap 2.10)"
Task 17: Unsupported Asset Handling (Gap 2.11)
Add graceful handling for version-incompatible stream/consumer assignments.
Files:
- Modify: src/NATS.Server/JetStream/Cluster/JetStreamMetaGroup.cs
- Test: tests/NATS.Server.Tests/JetStream/Cluster/UnsupportedAssetTests.cs (create)
Step 1-5: TDD cycle
Tests for: unknown version assignment logged as warning and skipped, does not crash cluster.
git commit -m "feat: add unsupported asset handling for mixed-version clusters (Gap 2.11)"
Task 18: Clustered API Handlers (Gap 2.12)
Add JsClusteredStreamRequest and JsClusteredStreamUpdateRequest.
Files:
- Modify: src/NATS.Server/JetStream/Api/Handlers/StreamApiHandlers.cs
- Modify: src/NATS.Server/JetStream/Api/Handlers/ConsumerApiHandlers.cs
- Test: tests/NATS.Server.Tests/JetStream/Api/ClusteredApiTests.cs (create)
- Go ref: jetstream_cluster.go:7620-8265
Step 1-5: TDD cycle
Tests for: stream create proposes to meta RAFT, stream update proposes with validation, system subscriptions for result processing.
git commit -m "feat: add clustered stream/consumer API handlers (Gap 2.12)"
Task 19: Leader Forwarding (Gap 7.1)
Implement ForwardToLeader middleware in JetStreamApiRouter.
Files:
- Modify: src/NATS.Server/JetStream/Api/JetStreamApiRouter.cs:97 (ForwardToLeader: implement)
- Test: tests/NATS.Server.Tests/JetStream/Api/LeaderForwardingTests.cs (create)
Step 1-5: TDD cycle
Tests for: request forwarded when not meta leader, forwarded response returned to client, timeout after 5 seconds.
git commit -m "feat: implement leader forwarding for JetStream API (Gap 7.1)"
Task 20: Clustered API Request Handlers (Gap 7.2)
Add cluster-aware create/update/delete that propose to RAFT.
Files:
- Modify: src/NATS.Server/JetStream/Api/Handlers/StreamApiHandlers.cs
- Modify: src/NATS.Server/JetStream/Api/Handlers/ConsumerApiHandlers.cs
- Test: tests/NATS.Server.Tests/JetStream/Api/ClusteredRequestTests.cs (create)
- Go ref: jetstream_cluster.go:7620-7701
Step 1-5: TDD cycle
Tests for: cluster-aware create proposes to RAFT, waits for proposal result, returns error on proposal failure.
git commit -m "feat: add cluster-aware API request handlers (Gap 7.2)"
Task 21: API Rate Limiting & Deduplication (Gap 7.3)
Add ApiRateLimiter with SemaphoreSlim-based concurrency limiting and request deduplication.
Files:
- Create: src/NATS.Server/JetStream/Api/ApiRateLimiter.cs
- Modify: src/NATS.Server/JetStream/Api/JetStreamApiRouter.cs (integrate limiter)
- Test: tests/NATS.Server.Tests/JetStream/Api/ApiRateLimiterTests.cs (create)
Step 1-5: TDD cycle
Tests for: concurrent requests capped at max (default 256), duplicate Nats-Request-Id returns cached response, TTL expiration of dedup entries (5 seconds).
git commit -m "feat: add API rate limiting and request deduplication (Gap 7.3)"
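A rough shape for the limiter, combining a SemaphoreSlim concurrency cap with a TTL-bounded response cache keyed by request ID, is sketched below. The ApiRateLimiterSketch type, its method names, and the choice to reject rather than queue when all slots are busy are assumptions for illustration, not the API the plan prescribes.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;

// Sketch: SemaphoreSlim caps in-flight requests; a small cache replays
// responses for duplicate request IDs until their TTL expires.
public sealed class ApiRateLimiterSketch : IDisposable
{
    private readonly SemaphoreSlim _slots;
    private readonly TimeSpan _dedupTtl;
    private readonly ConcurrentDictionary<string, (byte[] Response, DateTime ExpiresUtc)> _recent = new();

    public ApiRateLimiterSketch(int maxConcurrent = 256, TimeSpan? dedupTtl = null)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
        _dedupTtl = dedupTtl ?? TimeSpan.FromSeconds(5);
    }

    // Non-blocking: returns false when every slot is already in use.
    public bool TryAcquire() => _slots.Wait(0);

    public void Release() => _slots.Release();

    // Returns a cached response for a request ID seen within the TTL.
    public bool TryGetCached(string requestId, DateTime nowUtc, out byte[]? response)
    {
        if (_recent.TryGetValue(requestId, out var entry))
        {
            if (entry.ExpiresUtc > nowUtc)
            {
                response = entry.Response;
                return true;
            }
            _recent.TryRemove(requestId, out _); // expired: drop lazily
        }

        response = null;
        return false;
    }

    public void Remember(string requestId, byte[] response, DateTime nowUtc) =>
        _recent[requestId] = (response, nowUtc + _dedupTtl);

    public void Dispose() => _slots.Dispose();
}
```

Expired entries are only removed when they are looked up again, so a production version would likely also need a periodic sweep to bound memory.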
Task 22: Snapshot & Restore API Stub (Gap 7.4)
Wire $JS.API.STREAM.SNAPSHOT and $JS.API.STREAM.RESTORE endpoints.
Files:
- Modify: src/NATS.Server/JetStream/Api/Handlers/StreamApiHandlers.cs:157 (complete HandleSnapshot)
- Modify: src/NATS.Server/JetStream/Api/Handlers/StreamApiHandlers.cs:176 (complete HandleRestore)
- Test: tests/NATS.Server.Tests/JetStream/Api/SnapshotApiTests.cs (create)
Note: Full snapshot behavior is completed in Phase 3, Task 35 (Gap 4.7).
Step 1-5: TDD cycle
Tests for: endpoint responds to correct subject, validates stream exists, calls StreamSnapshotService.
git commit -m "feat: wire snapshot/restore API endpoints (Gap 7.4 stub)"
Task 23: Consumer Pause/Resume API (Gap 7.5)
Wire $JS.API.CONSUMER.PAUSE endpoint to existing ConsumerManager.Pause.
Files:
- Modify: src/NATS.Server/JetStream/Api/Handlers/ConsumerApiHandlers.cs:79
- Test: tests/NATS.Server.Tests/JetStream/Api/ConsumerPauseApiTests.cs (create)
Step 1-5: TDD cycle
Tests for: pause endpoint calls ConsumerManager.Pause with pauseUntil, returns pause state.
git commit -m "feat: wire consumer pause/resume API endpoint (Gap 7.5)"
Task 24: Advisory Event Publication (Gap 7.6)
Add PublishAdvisory calls in API handlers for stream/consumer lifecycle events.
Files:
- Modify: src/NATS.Server/JetStream/Api/Handlers/StreamApiHandlers.cs
- Modify: src/NATS.Server/JetStream/Api/Handlers/ConsumerApiHandlers.cs
- Modify: src/NATS.Server/Events/InternalEventSystem.cs (add PublishAdvisoryAsync)
- Modify: src/NATS.Server/Events/EventSubjects.cs (add $JS.EVENT.ADVISORY.* subjects)
- Test: tests/NATS.Server.Tests/JetStream/Api/AdvisoryEventTests.cs (create)
Step 1-5: TDD cycle
Tests for: stream create publishes advisory, consumer delete publishes advisory, advisory includes correct event type and payload.
git commit -m "feat: add advisory event publication for API operations (Gap 7.6)"
Phase 2 Exit Gate
# Targeted tests for Phase 2
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~PeerManagement|FullyQualifiedName~EntryApplication|FullyQualifiedName~TopologyPlacement|FullyQualifiedName~RaftGroupLifecycle|FullyQualifiedName~AssignmentCodec|FullyQualifiedName~UnsupportedAsset|FullyQualifiedName~ClusteredApi|FullyQualifiedName~LeaderForwarding|FullyQualifiedName~ClusteredRequest|FullyQualifiedName~ApiRateLimiter|FullyQualifiedName~SnapshotApi|FullyQualifiedName~ConsumerPauseApi|FullyQualifiedName~AdvisoryEvent" -v normal
# FULL TEST SUITE CHECKPOINT (Phase 2 complete)
dotnet test
Update test parity DB for all Phase 2 tests.
Phase 3: Consumer Engines + Stream Lifecycle (13 gaps)
Dependencies: Phase 2 (cluster coordination)
Exit gate: Consumer delivery loop dispatches messages, heartbeats sent, interest tracked, max deliveries enforced, filter skipping works, rate limiting via token bucket, source consumers configured, snapshot/restore operational
Task 25: Core Message Delivery Loop (Gap 3.1 — CRITICAL)
Implement LoopAndGatherMsgs in PushConsumerEngine.
Files:
- Modify: src/NATS.Server/JetStream/Consumers/PushConsumerEngine.cs
- Test: tests/NATS.Server.Tests/JetStream/Consumers/DeliveryLoopTests.cs (create)
- Go ref: consumer.go:1400-1700 (loopAndGatherMsgs)
Step 1: Write failing tests
Tests for: delivery loop polls store for new messages, checks redelivery tracker for expired entries, calculates num_pending from store state, dispatches messages via client write path, uses Channel<ConsumerSignal> for wake-up.
Step 2-5: Implement, test, commit
Add ConsumerSignal enum (NewMessage, AckEvent, ConfigChange). Add Channel<ConsumerSignal> _signalChannel to PushConsumerEngine. Implement background LoopAndGatherMsgs task that: polls IStreamStore.LoadNextMsg, checks RedeliveryTracker.GetDue(), dispatches via SendMessage.
git commit -m "feat: implement core message delivery loop for push consumers (Gap 3.1)"
Task 26: Idle Heartbeat & Flow Control (Gap 3.5)
Add SendIdleHeartbeat and SendFlowControl with pending count headers.
Files:
- Modify: src/NATS.Server/JetStream/Consumers/PushConsumerEngine.cs:220 (extend heartbeat)
- Test: tests/NATS.Server.Tests/JetStream/Consumers/IdleHeartbeatTests.cs (create)
- Go ref: consumer.go:5222 (sendIdleHeartbeat), consumer.go:5495 (sendFlowControl)
Step 1-5: TDD cycle
Tests for: heartbeat sent with Nats-Pending-Messages/Nats-Pending-Bytes headers when no delivery within interval, flow control reply with stall detection.
git commit -m "feat: add idle heartbeat with pending count headers and flow control (Gap 3.5)"
Task 27: Delivery Interest Tracking (Gap 3.8)
Add DeliveryInterestTracker monitoring subscribe/unsubscribe events.
Files:
- Create: src/NATS.Server/JetStream/Consumers/DeliveryInterestTracker.cs
- Modify: src/NATS.Server/JetStream/Consumers/PushConsumerEngine.cs (integrate tracker)
- Test: tests/NATS.Server.Tests/JetStream/Consumers/DeliveryInterestTests.cs (create)
Step 1-5: TDD cycle
Tests for: HasInterest reflects subscription state, DeleteNotActive cleanup after timeout, gateway interest checking.
git commit -m "feat: add delivery interest tracking with auto-cleanup (Gap 3.8)"
Task 28: Max Deliveries Enforcement (Gap 3.9)
Add advisory generation and delivery exceeded policy.
Files:
- Modify: src/NATS.Server/JetStream/Consumers/AckProcessor.cs
- Modify: src/NATS.Server/Events/InternalEventSystem.cs (add NotifyDeliveryExceeded event type)
- Test: tests/NATS.Server.Tests/JetStream/Consumers/MaxDeliveriesTests.cs (create)
Step 1-5: TDD cycle
Tests for: advisory generated when delivery count exceeds MaxDeliver, DeliveryExceededPolicy enum (Drop, DeadLetter).
git commit -m "feat: add max delivery enforcement with advisory generation (Gap 3.9)"
Task 29: Filter Subject Skip Tracking (Gap 3.10)
Add FilterSkipTracker using SubjectMatch.IsMatch() for token-based filter matching.
Files:
- Create: src/NATS.Server/JetStream/Consumers/FilterSkipTracker.cs
- Test: tests/NATS.Server.Tests/JetStream/Consumers/FilterSkipTests.cs (create)
Step 1-5: TDD cycle
Tests for: filter matching uses SubjectMatch.IsMatch() (NOT Regex), SortedSet<ulong> tracks unmatched sequences, ShouldSkip returns whether message matches filter.
git commit -m "feat: add filter skip tracking using SubjectMatch (Gap 3.10)"
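Token-based matching compares subjects one dot-separated token at a time, where * matches exactly one token and > matches one or more trailing tokens. The sketch below uses a stand-in SubjectFilterSketch.IsMatch helper so that it is self-contained; the repository's SubjectMatch.IsMatch() is the intended implementation, and the tracker shape shown here (including the reading that ShouldSkip returns true for non-matching subjects) is an assumption.

```csharp
using System.Collections.Generic;

// Stand-in for SubjectMatch.IsMatch(), written out for illustration only.
public static class SubjectFilterSketch
{
    public static bool IsMatch(string subject, string filter)
    {
        var s = subject.Split('.');
        var f = filter.Split('.');

        for (var i = 0; i < f.Length; i++)
        {
            // '>' must be the last filter token and needs at least one subject token to consume.
            if (f[i] == ">") return i == f.Length - 1 && s.Length > i;
            if (i >= s.Length) return false;
            if (f[i] != "*" && f[i] != s[i]) return false;
        }

        return s.Length == f.Length;
    }
}

// Hypothetical tracker: records sequences whose subject does not match the filter.
public sealed class FilterSkipTrackerSketch
{
    private readonly string _filter;
    private readonly SortedSet<ulong> _skipped = new();

    public FilterSkipTrackerSketch(string filter) => _filter = filter;

    public IReadOnlyCollection<ulong> Skipped => _skipped;

    // Returns true (and remembers the sequence) when the subject fails the filter.
    public bool ShouldSkip(ulong sequence, string subject)
    {
        if (SubjectFilterSketch.IsMatch(subject, _filter)) return false;
        _skipped.Add(sequence);
        return true;
    }
}
```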
Task 30: Sample/Observe Mode (Gap 3.11)
Add sample frequency parsing and stochastic latency sampling.
Files:
- Modify: src/NATS.Server/JetStream/Consumers/PushConsumerEngine.cs
- Test: tests/NATS.Server.Tests/JetStream/Consumers/SampleModeTests.cs (create)
Step 1-5: TDD cycle
Tests for: "1%" → 0.01 parsing, ShouldSample() uses Random.Shared, latency measurement and advisory.
git commit -m "feat: add sample/observe mode with latency measurement (Gap 3.11)"
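The "1%" to 0.01 conversion and the stochastic sampling decision could be sketched as follows. SampleFrequencySketch is a hypothetical helper, and treating empty or unparsable input as "sampling disabled" (a rate of 0) is an assumption made for this sketch; the Go server's exact handling of invalid values should be checked against consumer.go.

```csharp
#nullable enable
using System;
using System.Globalization;

// Sketch: parse a percentage such as "1%" or "25" into a probability in [0, 1].
public static class SampleFrequencySketch
{
    public static double Parse(string? value)
    {
        if (string.IsNullOrWhiteSpace(value)) return 0; // assumption: empty means disabled

        var trimmed = value.Trim().TrimEnd('%');
        if (!double.TryParse(trimmed, NumberStyles.Float, CultureInfo.InvariantCulture, out var percent))
            return 0; // assumption: invalid input disables sampling

        return Math.Clamp(percent, 0, 100) / 100.0;
    }

    // Samples a message with the given probability; a rate of 0 never samples.
    public static bool ShouldSample(double rate, Random? rng = null) =>
        rate > 0 && (rng ?? Random.Shared).NextDouble() < rate;
}
```

Parsing with CultureInfo.InvariantCulture avoids surprises on hosts whose locale uses a comma as the decimal separator.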
Task 31: Reset to Sequence (Gap 3.12)
Add ProcessResetRequest to ConsumerManager.
Files:
- Modify: src/NATS.Server/JetStream/ConsumerManager.cs:218 (extend Reset)
- Test: tests/NATS.Server.Tests/JetStream/Consumers/ConsumerResetTests.cs (create)
- Go ref: consumer.go:4241 (processResetReq)
Step 1-5: TDD cycle
Tests for: reset to specific sequence updates NextSequence, clears pending acks, clears redelivery tracker, publishes advisory.
git commit -m "feat: add consumer reset to specific sequence (Gap 3.12)"
Task 32: Token Bucket Rate Limiting (Gap 3.13)
Add TokenBucketRateLimiter for accurate rate limiting.
Files:
- Create: src/NATS.Server/JetStream/Consumers/TokenBucketRateLimiter.cs
- Modify: src/NATS.Server/JetStream/Consumers/PushConsumerEngine.cs (integrate)
- Test: tests/NATS.Server.Tests/JetStream/Consumers/TokenBucketTests.cs (create)
Step 1-5: TDD cycle
Tests for: configurable rate (bytes/sec) and burst size, WaitForTokenAsync blocks until tokens available, dynamic rate updates.
git commit -m "feat: add token bucket rate limiter for consumers (Gap 3.13)"
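A token bucket refills at a fixed rate up to a burst capacity and makes callers wait when the bucket runs dry. The sketch below is one possible shape; TokenBucketSketch, TryConsume, and the injected clock values are illustrative assumptions, with only WaitForTokenAsync named after the method the tests in this task mention. A request larger than the burst size would wait forever in this version.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sketch of a token bucket with a configurable rate (tokens per second) and burst size.
public sealed class TokenBucketSketch
{
    private readonly object _gate = new();
    private readonly double _burst;
    private double _ratePerSecond;
    private double _tokens;
    private DateTime _lastRefillUtc;

    public TokenBucketSketch(double ratePerSecond, double burst, DateTime nowUtc)
    {
        if (ratePerSecond <= 0) throw new ArgumentOutOfRangeException(nameof(ratePerSecond));
        if (burst <= 0) throw new ArgumentOutOfRangeException(nameof(burst));
        _ratePerSecond = ratePerSecond;
        _burst = burst;
        _tokens = burst; // start full so an initial burst is allowed
        _lastRefillUtc = nowUtc;
    }

    // Dynamic rate update; takes effect from the next refill.
    public void UpdateRate(double ratePerSecond)
    {
        if (ratePerSecond <= 0) throw new ArgumentOutOfRangeException(nameof(ratePerSecond));
        lock (_gate) _ratePerSecond = ratePerSecond;
    }

    // Consumes the tokens and returns TimeSpan.Zero when enough are available;
    // otherwise consumes nothing and returns the estimated wait.
    public TimeSpan TryConsume(double amount, DateTime nowUtc)
    {
        lock (_gate)
        {
            var elapsed = (nowUtc - _lastRefillUtc).TotalSeconds;
            if (elapsed > 0)
            {
                _tokens = Math.Min(_burst, _tokens + elapsed * _ratePerSecond);
                _lastRefillUtc = nowUtc;
            }

            if (_tokens >= amount)
            {
                _tokens -= amount;
                return TimeSpan.Zero;
            }

            return TimeSpan.FromSeconds((amount - _tokens) / _ratePerSecond);
        }
    }

    public async Task WaitForTokenAsync(double amount, CancellationToken ct = default)
    {
        while (true)
        {
            var wait = TryConsume(amount, DateTime.UtcNow);
            if (wait == TimeSpan.Zero) return;

            // Avoid a tight loop when the remaining wait rounds down to zero milliseconds.
            if (wait < TimeSpan.FromMilliseconds(1)) wait = TimeSpan.FromMilliseconds(1);
            await Task.Delay(wait, ct);
        }
    }
}
```

Separating the pure TryConsume step from the awaiting wrapper lets the refill arithmetic be tested with fixed timestamps instead of real delays.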
Task 33: Cluster-Aware Pending Requests (Gap 3.14)
Add ProposeWaitingRequest for pull requests through consumer RAFT group.
Files:
- Modify: src/NATS.Server/JetStream/Consumers/PullConsumerEngine.cs
- Test: tests/NATS.Server.Tests/JetStream/Consumers/ClusterPendingRequestTests.cs (create)
Step 1-5: TDD cycle
Tests for: pull requests proposed through consumer RAFT group, cluster-wide pending tracking.
git commit -m "feat: add cluster-aware pending request tracking for pull consumers (Gap 3.14)"
Task 34: Source Consumer Setup (Gap 4.3)
Complete API request generation for source consumers.
Files:
- Modify: src/NATS.Server/JetStream/Streams/SourceCoordinator.cs
- Test: tests/NATS.Server.Tests/JetStream/Streams/SourceConsumerSetupTests.cs (create)
Step 1-5: TDD cycle
Tests for: consumer create request with FilterSubject, SubjectTransforms, OptStartSeq, flow control, account isolation verification.
git commit -m "feat: complete source consumer API request generation (Gap 4.3)"
Task 35: Stream Snapshot & Restore (Gap 4.7)
Implement TAR-based snapshot with S2 compression and deadline enforcement.
Files:
- Modify: src/NATS.Server/JetStream/Snapshots/StreamSnapshotService.cs
- Test: tests/NATS.Server.Tests/JetStream/Snapshots/StreamSnapshotTests.cs (create)
Step 1-5: TDD cycle
Tests for: TAR snapshot includes stream config, message blocks, consumer state. S2 compression applied. Deadline enforcement via CancellationTokenSource. Consumer inclusion/exclusion. Restore validates TAR, decompresses, rebuilds indices.
git commit -m "feat: implement TAR-based stream snapshot with S2 compression (Gap 4.7)"
Task 36: Stream Config Update Validation (Gap 4.8)
Add ValidateConfigUpdate with change restriction rules.
Files:
- Modify: src/NATS.Server/JetStream/StreamManager.cs
- Test: tests/NATS.Server.Tests/JetStream/Streams/ConfigUpdateValidationTests.cs (create)
Step 1-5: TDD cycle
Tests for: subjects overlap detection, mirror/source immutability, retention policy restrictions, MaxMsgs/MaxBytes/MaxAge monotonic decrease, discard policy compatibility.
git commit -m "feat: add stream config update validation (Gap 4.8)"
Task 37: Source/Mirror Info Reporting (Gap 4.10)
Add GetMirrorInfo/GetSourceInfo for monitoring.
Files:
- Modify: src/NATS.Server/JetStream/Streams/MirrorCoordinator.cs
- Modify: src/NATS.Server/JetStream/Streams/SourceCoordinator.cs
- Test: tests/NATS.Server.Tests/JetStream/Streams/SourceMirrorInfoTests.cs (create)
Step 1-5: TDD cycle
Tests for: MirrorInfoResponse with lag/active/error, SourceInfoResponse[], wired into stream info API.
git commit -m "feat: add source/mirror info reporting for monitoring (Gap 4.10)"
Phase 3 Exit Gate
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~DeliveryLoop|FullyQualifiedName~IdleHeartbeat|FullyQualifiedName~DeliveryInterest|FullyQualifiedName~MaxDeliveries|FullyQualifiedName~FilterSkip|FullyQualifiedName~SampleMode|FullyQualifiedName~ConsumerReset|FullyQualifiedName~TokenBucket|FullyQualifiedName~ClusterPendingRequest|FullyQualifiedName~SourceConsumerSetup|FullyQualifiedName~StreamSnapshot|FullyQualifiedName~ConfigUpdateValidation|FullyQualifiedName~SourceMirrorInfo" -v normal
Update test parity DB for all Phase 3 tests.
Phase 4: Client Protocol + MQTT (12 gaps)
Dependencies: Phase 3 (consumer engines)
Exit gate: Route result cache active, slow consumers tracked, trace delivery works, SUB permission cached, will messages published, QoS 1/2 tracked, retained delivered on subscribe
Task 38: Per-Account Subscription Result Cache (Gap 5.4)
Add RouteResultCache with LRU eviction and generation-based invalidation.
Files:
- Create: src/NATS.Server/Subscriptions/RouteResultCache.cs
- Modify: src/NATS.Server/NatsClient.cs (integrate cache into message dispatch)
- Test: tests/NATS.Server.Tests/Subscriptions/RouteResultCacheTests.cs (create)
Step 1-5: TDD cycle
Tests for: LRU eviction at 8192 entries, per-account partitioning, atomic generation invalidation, cache hit avoids SubList.Match call.
git commit -m "feat: add per-account subscription result cache with LRU (Gap 5.4)"
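One way to combine LRU eviction with generation-based invalidation is sketched below. It is an illustration under assumptions, not the plan's RouteResultCache: the type name LruResultCacheSketch and its members are hypothetical, and per-account partitioning is assumed to mean one cache instance per account. Entries remember the generation they were stored under, so bumping the counter invalidates everything without walking the map.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: LRU cache whose entries expire when the generation changes.
public sealed class LruResultCacheSketch<TValue>
{
    private readonly int _capacity;
    private readonly object _gate = new();
    private readonly Dictionary<string, LinkedListNode<(string Key, long Gen, TValue Value)>> _map = new();
    private readonly LinkedList<(string Key, long Gen, TValue Value)> _order = new(); // most recent first
    private long _generation;

    public LruResultCacheSketch(int capacity = 8192) => _capacity = capacity;

    public int Count { get { lock (_gate) return _map.Count; } }

    // Call whenever the subscription set changes; stale entries are dropped lazily.
    public void Invalidate() => Interlocked.Increment(ref _generation);

    public bool TryGet(string subject, out TValue value)
    {
        var gen = Interlocked.Read(ref _generation);
        lock (_gate)
        {
            if (_map.TryGetValue(subject, out var node))
            {
                if (node.Value.Gen == gen)
                {
                    _order.Remove(node);   // move to front: most recently used
                    _order.AddFirst(node);
                    value = node.Value.Value;
                    return true;
                }
                _order.Remove(node);       // stale generation: evict
                _map.Remove(subject);
            }
        }
        value = default!;
        return false;
    }

    public void Set(string subject, TValue value)
    {
        var gen = Interlocked.Read(ref _generation);
        lock (_gate)
        {
            if (_map.Remove(subject, out var existing)) _order.Remove(existing);
            _map[subject] = _order.AddFirst((subject, gen, value));
            if (_map.Count > _capacity)
            {
                var last = _order.Last!;   // least recently used
                _order.RemoveLast();
                _map.Remove(last.Value.Key);
            }
        }
    }
}
```

A hit returns the cached match result, which is what lets the dispatch path skip the SubList.Match call named in the tests. The lock keeps the sketch simple; a production cache on the hot path might prefer a lock-free or sharded design.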
Task 39: Slow Consumer Stall Gate (Gap 5.5)
Extend StallGate with per-kind slow consumer statistics.
Files:
- Modify: src/NATS.Server/NatsClient.cs:1017 (StallGate: add SlowConsumerCount)
- Modify: src/NATS.Server/Auth/Account.cs (add IncrementSlowConsumers)
- Test: tests/NATS.Server.Tests/SlowConsumerStallGateTests.cs (create)
Step 1-5: TDD cycle
Tests for: SlowConsumerCount per ClientKind, account-level tracking, SlowConsumerEvent fired at threshold.
git commit -m "feat: add slow consumer per-kind tracking with account counters (Gap 5.5)"
Task 40: Dynamic Write Buffer Pooling (Gap 5.6)
Integrate OutboundBufferPool with flush coalescing and broadcast drain.
Files:
- Modify: src/NATS.Server/NatsClient.cs:785 (RunWriteLoopAsync: add broadcast drain)
- Modify: src/NATS.Server/IO/OutboundBufferPool.cs
- Test: tests/NATS.Server.Tests/IO/DynamicBufferPoolTests.cs (create)
Step 1-5: TDD cycle
Tests for: broadcast flush drains multiple pending clients, reduces syscall count for fan-out.
git commit -m "feat: add dynamic write buffer pooling with broadcast drain (Gap 5.6)"
Task 41: Per-Client Trace Level (Gap 5.7)
Add TraceMsgDelivery and per-client echo control.
Files:
- Modify: src/NATS.Server/NatsClient.cs (add TraceMsgDelivery, EchoSupported)
- Test: tests/NATS.Server.Tests/ClientTraceTests.cs (create)
Step 1-5: TDD cycle
Tests for: trace message delivery logged at Trace level with subject/destination/size, echo flag controls routed message behavior.
git commit -m "feat: add per-client trace delivery and echo control (Gap 5.7)"
Task 42: Subscribe Permission Caching (Gap 5.8)
Extend PermissionLruCache with SUB permission entries and generation-based invalidation.
Files:
- Modify: src/NATS.Server/Auth/PermissionLruCache.cs
- Modify: src/NATS.Server/Auth/Account.cs (add GenerationId)
- Test: tests/NATS.Server.Tests/Auth/SubPermissionCacheTests.cs (create)
Step 1-5: TDD cycle
Tests for: SUB permission cached alongside PUB, generation ID invalidation on permission changes.
git commit -m "feat: add SUB permission caching with generation invalidation (Gap 5.8)"
Task 43: Internal Client Kinds (Gap 5.9)
Already implemented — ClientKind.cs has System, JetStream, Account and IsInternal() extension. Verify and add tests if missing.
Files:
- Verify: src/NATS.Server/ClientKind.cs
- Test: tests/NATS.Server.Tests/ClientKindTests.cs (create if missing)
git commit -m "test: verify internal client kinds (Gap 5.9)"
Task 44: Adaptive Read Buffer Short-Read Counter (Gap 5.10)
Add _consecutiveShortReads counter with 4-read threshold.
Files:
- Modify: src/NATS.Server/IO/AdaptiveReadBuffer.cs
- Test: tests/NATS.Server.Tests/IO/AdaptiveReadBufferShortReadTests.cs (create)
Step 1-5: TDD cycle
Tests for: shrink only after 4 consecutive short reads, counter resets on full-buffer read.
git commit -m "feat: add consecutive short-read counter to prevent buffer oscillation (Gap 5.10)"
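The short-read rule can be sketched as a small sizing policy. The type ShortReadTrackerSketch, the doubling and halving steps, and the definition of a short read as less than half the buffer are assumptions made for illustration; the plan only fixes the 4-read threshold and the reset on a full-buffer read.

```csharp
using System;

// Hypothetical sketch of the buffer sizing policy only; no I/O is performed here.
public sealed class ShortReadTrackerSketch
{
    private const int ShrinkThreshold = 4; // from the plan: shrink after 4 consecutive short reads
    private readonly int _min;
    private readonly int _max;
    private int _consecutiveShortReads;

    public int Size { get; private set; }

    public ShortReadTrackerSketch(int initial, int min, int max)
    {
        if (min <= 0 || max < min || initial < min || initial > max)
            throw new ArgumentOutOfRangeException(nameof(initial));
        Size = initial;
        _min = min;
        _max = max;
    }

    // Call after every socket read with the number of bytes received.
    public void OnRead(int bytesRead)
    {
        if (bytesRead >= Size)
        {
            // Full-buffer read: reset the counter and grow.
            _consecutiveShortReads = 0;
            Size = Math.Min(Size * 2, _max);
        }
        else if (bytesRead < Size / 2)
        {
            // Short read (assumed: under half the buffer). Shrink only at the threshold.
            if (++_consecutiveShortReads >= ShrinkThreshold)
            {
                _consecutiveShortReads = 0;
                Size = Math.Max(Size / 2, _min);
            }
        }
        // Reads between half and full leave the counter unchanged (assumption).
    }
}
```

Requiring several short reads in a row is what damps oscillation: a single small read between large ones no longer halves the buffer.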
Task 45: MQTT Will Message Delivery (Gap 6.2)
Add PublishWillMessage triggered on abnormal disconnection.
Files:
- Modify: src/NATS.Server/Mqtt/MqttSessionStore.cs
- Modify: src/NATS.Server/Mqtt/MqttConnection.cs (trigger on disconnect)
- Test: tests/NATS.Server.Tests/Mqtt/MqttWillMessageTests.cs (create)
Step 1-5: TDD cycle
Tests for: will message published on abnormal disconnect, NOT published on clean DISCONNECT, will delay interval support.
git commit -m "feat: add MQTT will message delivery on abnormal disconnect (Gap 6.2)"
Task 46: MQTT QoS 1/2 Tracking (Gap 6.3)
Add MqttQoS1Tracker with JetStream-backed ack tracking.
Files:
- Create: src/NATS.Server/Mqtt/MqttQoS1Tracker.cs
- Modify: src/NATS.Server/Mqtt/MqttRetainedStore.cs (extend MqttQoS2StateMachine)
- Test: tests/NATS.Server.Tests/Mqtt/MqttQoSTrackingTests.cs (create)
Step 1-5: TDD cycle
Tests for: QoS 1 outgoing messages stored in $MQTT_out, removed on PUBACK, redelivered on reconnect. QoS 2 PUBREL delivery stream.
git commit -m "feat: add JetStream-backed QoS 1/2 tracking (Gap 6.3)"
Task 47: MQTT MaxAckPending Enforcement (Gap 6.4)
Add MqttFlowController with per-subscription ack pending limits.
Files:
- Create: src/NATS.Server/Mqtt/MqttFlowController.cs
- Test: tests/NATS.Server.Tests/Mqtt/MqttFlowControllerTests.cs (create)
Step 1-5: TDD cycle
Tests for: SemaphoreSlim-based blocking at limit, release on PUBACK/PUBCOMP, config reload updates limits.
git commit -m "feat: add MQTT MaxAckPending flow control (Gap 6.4)"
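A SemaphoreSlim-based gate for a single subscription might look like the sketch below. AckPendingGateSketch and its members are hypothetical names, and the sketch deliberately leaves out the config-reload case: SemaphoreSlim cannot be resized in place, so a real controller would need to swap or wrap the semaphore when the limit changes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: one gate per subscription, one slot per unacknowledged message.
public sealed class AckPendingGateSketch : IDisposable
{
    private readonly SemaphoreSlim _slots;

    public AckPendingGateSketch(int maxAckPending)
    {
        if (maxAckPending <= 0) throw new ArgumentOutOfRangeException(nameof(maxAckPending));
        _slots = new SemaphoreSlim(maxAckPending, maxAckPending);
    }

    // Slots currently free for new deliveries.
    public int Available => _slots.CurrentCount;

    // Waits asynchronously until a slot is free, then claims it before delivery.
    public Task AcquireAsync(CancellationToken ct = default) => _slots.WaitAsync(ct);

    // Non-blocking variant: returns false when the limit is reached.
    public bool TryAcquire() => _slots.Wait(0);

    // Call on PUBACK (QoS 1) or PUBCOMP (QoS 2). A duplicate ack with no slot in use
    // would throw SemaphoreFullException, so callers should ack each packet id once.
    public void OnAck() => _slots.Release();

    public void Dispose() => _slots.Dispose();
}
```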
Task 48: MQTT Retained Message Delivery on Subscribe (Gap 6.5)
Deliver matching retained messages on SUBSCRIBE.
Files:
- Modify: src/NATS.Server/Mqtt/MqttConnection.cs (integrate retained delivery on SUB)
- Modify: src/NATS.Server/Mqtt/MqttRetainedStore.cs:80 (GetMatchingRetained)
- Test: tests/NATS.Server.Tests/Mqtt/MqttRetainedDeliveryTests.cs (create)
Step 1-5: TDD cycle
Tests for: retained messages delivered on subscribe with Retain flag, wildcard subscriptions scan all retained topics.
git commit -m "feat: deliver retained messages on MQTT SUBSCRIBE (Gap 6.5)"
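Scanning retained topics for a wildcard subscription comes down to MQTT topic-filter matching. The helper below is a sketch of the standard + and # rules, including the rule that a leading wildcard does not match topics beginning with $. The name MqttTopicMatchSketch is hypothetical, and how GetMatchingRetained actually performs the match is not specified in the plan.

```csharp
using System;

// Hypothetical sketch of MQTT topic-filter matching ('+' single level, '#' multi level).
public static class MqttTopicMatchSketch
{
    public static bool Matches(string filter, string topic)
    {
        var f = filter.Split('/');
        var t = topic.Split('/');

        // Wildcards at the first level must not match '$'-prefixed topics.
        if (topic.StartsWith('$') && (f[0] == "#" || f[0] == "+")) return false;

        for (var i = 0; i < f.Length; i++)
        {
            // '#' is only valid as the last level; it also matches the parent level.
            if (f[i] == "#") return i == f.Length - 1;
            if (i >= t.Length) return false;
            if (f[i] != "+" && f[i] != t[i]) return false;
        }
        return f.Length == t.Length;
    }
}
```

For example, the filter sport/# matches both sport and sport/tennis/player1, while sport/+ matches sport/tennis but not sport. A linear scan with this predicate is the simplest way to satisfy the wildcard test; a trie over topic levels would be the obvious refinement if the retained set grows large.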
Task 49: MQTT Session Flapper Detection (Gap 6.6)
Complete flapper detection with exponential backoff.
Files:
- Modify: src/NATS.Server/Mqtt/MqttSessionStore.cs:122 (complete TrackConnectDisconnect)
- Test: tests/NATS.Server.Tests/Mqtt/MqttFlapperDetectionTests.cs (create)
Step 1-5: TDD cycle
Tests for: 3+ connect/disconnect cycles within 10s triggers flapper, exponential backoff on CONNACK, clear after 60s stable.
git commit -m "feat: complete MQTT session flapper detection (Gap 6.6)"
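The thresholds in the tests above (3 cycles within 10 s, clear after 60 s stable) can be sketched as a small per-client state machine. FlapperDetectorSketch, the 1-second base delay, and the cap of 5 doublings are assumptions for illustration; the plan does not state the backoff base or ceiling, and "stable" is interpreted here as 60 s without a new cycle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: tracks connect/disconnect cycles for one client id.
public sealed class FlapperDetectorSketch
{
    private static readonly TimeSpan Window = TimeSpan.FromSeconds(10);
    private static readonly TimeSpan StablePeriod = TimeSpan.FromSeconds(60);
    private const int CycleThreshold = 3;
    private const int MaxLevel = 5;          // assumed cap: 1s, 2s, 4s, 8s, 16s
    private const int BaseDelaySeconds = 1;  // assumed base delay

    private readonly Queue<DateTimeOffset> _cycles = new();
    private DateTimeOffset? _last;
    private int _level;

    public bool IsFlapping => _level > 0;

    // Delay to apply before sending CONNACK.
    public TimeSpan CurrentDelay =>
        _level == 0 ? TimeSpan.Zero : TimeSpan.FromSeconds(BaseDelaySeconds * (1 << (_level - 1)));

    // Record a completed connect/disconnect cycle and return the resulting delay.
    public TimeSpan RecordCycle(DateTimeOffset now)
    {
        // A quiet period of 60 s clears the flapper state.
        if (_last is { } last && now - last >= StablePeriod)
        {
            _cycles.Clear();
            _level = 0;
        }
        _last = now;

        _cycles.Enqueue(now);
        while (_cycles.Count > 0 && now - _cycles.Peek() > Window) _cycles.Dequeue();

        // Each cycle at or above the threshold doubles the delay, up to the cap.
        if (_cycles.Count >= CycleThreshold) _level = Math.Min(_level + 1, MaxLevel);
        return CurrentDelay;
    }
}
```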
Phase 4 Exit Gate
# Targeted tests
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~RouteResultCache|FullyQualifiedName~SlowConsumer|FullyQualifiedName~DynamicBuffer|FullyQualifiedName~ClientTrace|FullyQualifiedName~SubPermissionCache|FullyQualifiedName~ClientKind|FullyQualifiedName~AdaptiveReadBuffer|FullyQualifiedName~MqttWill|FullyQualifiedName~MqttQoS|FullyQualifiedName~MqttFlowController|FullyQualifiedName~MqttRetained|FullyQualifiedName~MqttFlapper" -v normal
# FULL TEST SUITE CHECKPOINT (Phase 4 complete)
dotnet test
Update test parity DB for all Phase 4 tests.
Phase 5: Configuration Reload + Gateway (11 gaps)
Dependencies: Phase 4 (client protocol)
Exit gate: Auth changes propagated, TLS reloaded, cluster config hot-reloaded, gateways reconnect with backoff, account-specific routing works, queue groups propagated, reply mapping cached
Task 50: Auth Change Propagation (Gap 14.2)
Add PropagateAuthChanges to ConfigReloader.
Files:
- Modify: src/NATS.Server/Configuration/ConfigReloader.cs
- Modify: src/NATS.Server/NatsServer.cs (hook auth changes to connections)
- Test: tests/NATS.Server.Tests/Configuration/AuthChangePropagationTests.cs (create)
git commit -m "feat: add auth change propagation to existing connections (Gap 14.2)"
Task 51: TLS Certificate Reload (Gap 14.3)
Add ReloadTlsCertificates for hot-swapping certificates.
Files:
- Modify: src/NATS.Server/Configuration/ConfigReloader.cs:496 (extend ReloadTlsCertificate)
- Test: tests/NATS.Server.Tests/Configuration/TlsReloadTests.cs (create)
git commit -m "feat: add TLS certificate hot-reload for new connections (Gap 14.3)"
Task 52: Cluster Config Hot Reload (Gap 14.4)
Add ApplyClusterConfigChanges for route/gateway/leaf URL changes.
Files:
- Modify: src/NATS.Server/Configuration/ConfigReloader.cs
- Test: tests/NATS.Server.Tests/Configuration/ClusterConfigReloadTests.cs (create)
git commit -m "feat: add cluster config hot reload (Gap 14.4)"
Task 53: Logging Level Changes (Gap 14.5)
Add ApplyLoggingChanges to update Serilog LoggingLevelSwitch.
Files:
- Modify: src/NATS.Server/Configuration/ConfigReloader.cs
- Test: tests/NATS.Server.Tests/Configuration/LoggingReloadTests.cs (create)
git commit -m "feat: add runtime logging level changes (Gap 14.5)"
Task 54: JetStream Config Changes (Gap 14.6)
Add ApplyJetStreamConfigChanges for MaxMemory/MaxStore/Domain.
Files:
- Modify: src/NATS.Server/Configuration/ConfigReloader.cs
- Test: tests/NATS.Server.Tests/Configuration/JetStreamConfigReloadTests.cs (create)
git commit -m "feat: add JetStream config change reload (Gap 14.6)"
Task 55: Gateway Reconnection with Backoff (Gap 11.2)
Add ReconnectGatewayAsync with exponential backoff and jitter.
Files:
- Modify: src/NATS.Server/Gateways/GatewayManager.cs
- Test: tests/NATS.Server.Tests/Gateways/GatewayReconnectionTests.cs (create)
git commit -m "feat: add gateway reconnection with exponential backoff (Gap 11.2)"
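The delay calculation behind exponential backoff with jitter can be sketched as a pure function, which keeps it easy to test separately from ReconnectGatewayAsync. The name BackoffSketch, the "equal jitter" variant (half fixed, half random), and the parameter choices are assumptions; the plan does not say which jitter scheme or limits to use.

```csharp
using System;

// Hypothetical sketch: computes the delay before reconnect attempt N (0-based).
public static class BackoffSketch
{
    public static TimeSpan NextDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay, Random rng)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Cap the exponent so the multiplication cannot overflow to infinity.
        var exponent = Math.Min(attempt, 30);
        var ceilingMs = Math.Min(maxDelay.TotalMilliseconds,
                                 baseDelay.TotalMilliseconds * Math.Pow(2, exponent));

        // Equal jitter: keep half of the ceiling, randomize the other half.
        var half = ceilingMs / 2;
        return TimeSpan.FromMilliseconds(half + rng.NextDouble() * half);
    }
}
```

A reconnect loop would await Task.Delay(BackoffSketch.NextDelay(attempt, ...)) between dial attempts and reset attempt to zero after a successful handshake. The jitter spreads reconnects from many gateways so they do not all retry at the same instant.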
Task 56: Account-Specific Gateway Routes (Gap 11.3)
Add per-account subscription sending to gateways.
Files:
- Modify: src/NATS.Server/Gateways/GatewayManager.cs
- Modify: src/NATS.Server/Gateways/GatewayConnection.cs
- Test: tests/NATS.Server.Tests/Gateways/AccountGatewayRoutesTests.cs (create)
git commit -m "feat: add account-specific gateway routes (Gap 11.3)"
Task 57: Queue Group Propagation (Gap 11.4)
Add SendQueueSubsToGateway for queue group load balancing across gateways.
Files:
- Modify: src/NATS.Server/Gateways/GatewayManager.cs
- Modify: src/NATS.Server/Gateways/GatewayConnection.cs
- Test: tests/NATS.Server.Tests/Gateways/QueueGroupPropagationTests.cs (create)
git commit -m "feat: add queue group propagation to gateways (Gap 11.4)"
Task 58: Reply Subject Mapping Cache (Gap 11.5)
Add ReplyMapCache with LRU and TTL expiration.
Files:
- Modify: src/NATS.Server/Gateways/ReplyMapper.cs
- Test: tests/NATS.Server.Tests/Gateways/ReplyMapCacheTests.cs (create)
git commit -m "feat: add reply subject mapping cache with TTL (Gap 11.5)"
Task 59: Gateway Command Protocol (Gap 11.6)
Add GatewayCommand enum with exact Go wire format byte sequences.
Files:
- Modify: src/NATS.Server/Gateways/GatewayManager.cs
- Modify: src/NATS.Server/Gateways/GatewayConnection.cs
- Test: tests/NATS.Server.Tests/Gateways/GatewayCommandTests.cs (create)
git commit -m "feat: add gateway command protocol with Go-compatible wire format (Gap 11.6)"
Task 60: Gateway Connection Registration (Gap 11.7)
Add full gateway connection registry with state tracking.
Files:
- Modify: src/NATS.Server/Gateways/GatewayManager.cs
- Test: tests/NATS.Server.Tests/Gateways/GatewayRegistrationTests.cs (create)
git commit -m "feat: add gateway connection registration with state tracking (Gap 11.7)"
Phase 5 Exit Gate
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~AuthChangePropagation|FullyQualifiedName~TlsReload|FullyQualifiedName~ClusterConfigReload|FullyQualifiedName~LoggingReload|FullyQualifiedName~JetStreamConfigReload|FullyQualifiedName~GatewayReconnection|FullyQualifiedName~AccountGatewayRoutes|FullyQualifiedName~QueueGroupPropagation|FullyQualifiedName~ReplyMapCache|FullyQualifiedName~GatewayCommand|FullyQualifiedName~GatewayRegistration" -v normal
Update test parity DB for all Phase 5 tests.
Phase 6: Route Clustering + LeafNode (12 gaps)
Dependencies: Phase 5 (config reload, gateway infrastructure)
Exit gate: Account-specific routes work, pool sizes negotiated, routes stored by hash, cluster splits handled, leaf TLS hot-reloads, permissions synced, leaf connections validated, WebSocket transport works
Task 61: Account-Specific Dedicated Routes (Gap 13.2)
Add AccountRouteMap for per-account route connections.
Files:
- Modify: src/NATS.Server/Routes/RouteManager.cs
- Modify: src/NATS.Server/Routes/RouteConnection.cs
- Test: tests/NATS.Server.Tests/Routes/AccountRouteTests.cs (create)
git commit -m "feat: add account-specific dedicated routes (Gap 13.2)"
Task 62: Route Pool Size Negotiation (Gap 13.3)
Add NegotiatePoolSize during route handshake.
Files:
- Modify: src/NATS.Server/Routes/RouteManager.cs
- Modify: src/NATS.Server/Routes/RouteConnection.cs
- Test: tests/NATS.Server.Tests/Routes/PoolSizeNegotiationTests.cs (create)
git commit -m "feat: add route pool size negotiation (Gap 13.3)"
Task 63: Route Hash Storage (Gap 13.4)
Add ConcurrentDictionary<ulong, RouteConnection> for O(1) route lookup.
Files:
- Modify: src/NATS.Server/Routes/RouteManager.cs
- Test: tests/NATS.Server.Tests/Routes/RouteHashStorageTests.cs (create)
git commit -m "feat: add route hash storage for O(1) lookup (Gap 13.4)"
Task 64: Cluster Split Handling (Gap 13.5)
Add RemoveAllRoutesExcept and RemoveRoute for partition handling.
Files:
- Modify: src/NATS.Server/Routes/RouteManager.cs
- Test: tests/NATS.Server.Tests/Routes/ClusterSplitTests.cs (create)
git commit -m "feat: add cluster split handling (Gap 13.5)"
Task 65: No-Pool Route Fallback (Gap 13.6)
Add backward compatibility with pre-pool servers.
Files:
- Modify: src/NATS.Server/Routes/RouteManager.cs
- Modify: src/NATS.Server/Routes/RouteConnection.cs
- Test: tests/NATS.Server.Tests/Routes/NoPoolFallbackTests.cs (create)
git commit -m "feat: add no-pool route fallback for backward compatibility (Gap 13.6)"
Task 66: Leaf Node TLS Certificate Hot-Reload (Gap 12.1)
Add UpdateTlsConfig for leaf node connections.
Files:
- Modify: src/NATS.Server/LeafNodes/LeafNodeManager.cs
- Test: tests/NATS.Server.Tests/LeafNodes/LeafTlsReloadTests.cs (create)
git commit -m "feat: add leaf node TLS certificate hot-reload (Gap 12.1)"
Task 67: Permission & Account Syncing (Gap 12.2)
Add SendPermsAndAccountInfo and InitLeafNodeSmapAndSendSubs.
Files:
- Modify: src/NATS.Server/LeafNodes/LeafNodeManager.cs
- Modify: src/NATS.Server/LeafNodes/LeafConnection.cs
- Test: tests/NATS.Server.Tests/LeafNodes/LeafPermissionSyncTests.cs (create)
git commit -m "feat: add leaf node permission and account syncing (Gap 12.2)"
Task 68: Leaf Connection State Validation (Gap 12.3)
Add ValidateRemoteLeafNode on reconnect.
Files:
- Modify: src/NATS.Server/LeafNodes/LeafNodeManager.cs
- Test: tests/NATS.Server.Tests/LeafNodes/LeafValidationTests.cs (create)
git commit -m "feat: add leaf connection state validation on reconnect (Gap 12.3)"
Task 69: JetStream Migration Checks (Gap 12.4)
Add CheckJetStreamMigrate validation.
Files:
- Modify: src/NATS.Server/LeafNodes/LeafNodeManager.cs
- Test: tests/NATS.Server.Tests/LeafNodes/LeafJetStreamMigrationTests.cs (create)
git commit -m "feat: add leaf node JetStream migration checks (Gap 12.4)"
Task 70: Leaf Node WebSocket Support (Gap 12.5)
Add WebSocketStreamAdapter for message-framed WebSocket → stream conversion.
Files:
- Create: src/NATS.Server/LeafNodes/WebSocketStreamAdapter.cs
- Modify: src/NATS.Server/LeafNodes/LeafConnection.cs
- Test: tests/NATS.Server.Tests/LeafNodes/LeafWebSocketTests.cs (create)
git commit -m "feat: add leaf node WebSocket support with stream adapter (Gap 12.5)"
Task 71: Leaf Cluster Registration (Gap 12.6)
Add RegisterLeafNodeCluster and HasLeafNodeCluster.
Files:
- Modify: src/NATS.Server/LeafNodes/LeafNodeManager.cs
- Test: tests/NATS.Server.Tests/LeafNodes/LeafClusterRegistrationTests.cs (create)
git commit -m "feat: add leaf cluster registration and topology tracking (Gap 12.6)"
Task 72: Leaf Connection Disable Flag (Gap 12.7)
Add IsLeafConnectDisabled with per-remote disable flag.
Files:
- Modify: src/NATS.Server/LeafNodes/LeafNodeManager.cs
- Test: tests/NATS.Server.Tests/LeafNodes/LeafDisableTests.cs (create)
git commit -m "feat: add leaf connection disable flag (Gap 12.7)"
Phase 6 Exit Gate
# Targeted tests
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~AccountRoute|FullyQualifiedName~PoolSizeNegotiation|FullyQualifiedName~RouteHash|FullyQualifiedName~ClusterSplit|FullyQualifiedName~NoPoolFallback|FullyQualifiedName~LeafTlsReload|FullyQualifiedName~LeafPermissionSync|FullyQualifiedName~LeafValidation|FullyQualifiedName~LeafJetStreamMigration|FullyQualifiedName~LeafWebSocket|FullyQualifiedName~LeafClusterRegistration|FullyQualifiedName~LeafDisable" -v normal
# FULL TEST SUITE CHECKPOINT (Phase 6 complete)
dotnet test
Update test parity DB for all Phase 6 tests.
Phase 7: Account Management & Multi-Tenancy (10 gaps)
Dependencies: Phase 4 (client protocol), Phase 5 (config reload), Phase 6 (leaf/route for claim propagation)
Exit gate: Service latency tracked, response thresholds enforced, import cycles detected, wildcard exports work, accounts expire, claims hot-reloaded, NKey revocation enforced
Task 73: Service Export Latency Tracking (Gap 9.1)
Add ServiceLatencyTracker with p50/p90/p99 histogram.
Files:
- Create: src/NATS.Server/Auth/ServiceLatencyTracker.cs
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/ServiceLatencyTrackerTests.cs (create)
git commit -m "feat: add service export latency tracking with p50/p90/p99 (Gap 9.1)"
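As a reference point for the p50/p90/p99 assertions, the sketch below computes nearest-rank percentiles over raw samples. It is not a histogram and not the plan's ServiceLatencyTracker: the name LatencyPercentilesSketch is hypothetical, and a real tracker would likely use bucketed counts or a bounded window instead of keeping every sample.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: exact nearest-rank percentiles over all recorded samples.
public sealed class LatencyPercentilesSketch
{
    private readonly List<double> _samplesMs = new();
    private readonly object _gate = new();

    public void Record(TimeSpan latency)
    {
        lock (_gate) _samplesMs.Add(latency.TotalMilliseconds);
    }

    // Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
    public double Percentile(double p)
    {
        if (p <= 0 || p > 100) throw new ArgumentOutOfRangeException(nameof(p));
        double[] sorted;
        lock (_gate) sorted = _samplesMs.ToArray();
        if (sorted.Length == 0) return 0;

        Array.Sort(sorted);
        var rank = (int)Math.Ceiling(p * sorted.Length / 100.0);
        return sorted[Math.Clamp(rank, 1, sorted.Length) - 1];
    }
}
```

An exact calculation like this is useful as a test oracle: a bucketed histogram implementation can be checked against it within the histogram's bucket resolution.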
Task 74: Service Export Response Threshold (Gap 9.2)
Add ResponseThreshold to service export configuration.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/ResponseThresholdTests.cs (create)
git commit -m "feat: add service export response threshold enforcement (Gap 9.2)"
Task 75: Stream Import Cycle Detection (Gap 9.3)
Add StreamImportFormsCycle using DFS.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/StreamImportCycleTests.cs (create)
git commit -m "feat: add stream import cycle detection (Gap 9.3)"
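The DFS check reduces to a reachability question: adding an import from importer to exporter closes a cycle exactly when the importer is already reachable from the exporter through existing imports. The sketch below shows that idea on a plain adjacency map; ImportCycleSketch and the dictionary shape are assumptions, since the plan does not describe how Account stores its imports.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: imports[a] lists the accounts that account a imports streams from.
public static class ImportCycleSketch
{
    public static bool FormsCycle(Dictionary<string, string[]> imports, string importer, string exporter)
    {
        var visited = new HashSet<string>();
        var stack = new Stack<string>();
        stack.Push(exporter);

        // Iterative DFS from the exporter, following existing import edges.
        while (stack.Count > 0)
        {
            var current = stack.Pop();
            if (current == importer) return true; // importer reachable: new edge closes a cycle
            if (!visited.Add(current)) continue;  // already explored
            if (imports.TryGetValue(current, out var next))
            {
                foreach (var account in next) stack.Push(account);
            }
        }
        return false;
    }
}
```

The iterative form avoids deep recursion on long import chains, and the visited set keeps the walk linear in the number of accounts and import edges even if the existing graph already contains a cycle.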
Task 76: Wildcard Service Exports (Gap 9.4)
Add GetWildcardServiceExport using SubjectMatch.IsMatch.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/WildcardExportTests.cs (create)
git commit -m "feat: add wildcard service export matching (Gap 9.4)"
Task 77: Account Expiration & TTL (Gap 9.5)
Add ExpiresAt, IsExpired, and SetExpirationTimer.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/AccountExpirationTests.cs (create)
git commit -m "feat: add account expiration with TTL-based cleanup (Gap 9.5)"
Task 78: Account Claim Hot-Reload (Gap 9.6)
Add UpdateAccountClaims with diff-based update.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Modify: src/NATS.Server/NatsServer.cs
- Test: tests/NATS.Server.Tests/Auth/AccountClaimReloadTests.cs (create)
git commit -m "feat: add account claim hot-reload with diff-based update (Gap 9.6)"
Task 79: Service/Stream Activation Expiration (Gap 9.7)
Add CheckActivationExpiry for JWT activation claims.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/ActivationExpirationTests.cs (create)
git commit -m "feat: add JWT activation claim expiration checking (Gap 9.7)"
Task 80: User NKey Revocation (Gap 9.8)
Wire _revokedUsers into active connection validation.
Files:
- Modify: src/NATS.Server/Auth/Account.cs:38 (extend IsUserRevoked)
- Modify: src/NATS.Server/NatsClient.cs (check revocation on operations)
- Test: tests/NATS.Server.Tests/Auth/NKeyRevocationTests.cs (create)
git commit -m "feat: wire user NKey revocation into active connections (Gap 9.8)"
Task 81: Response Service Import (Gap 9.9)
Add AddReverseRespMapEntry and CheckForReverseEntries.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/ReverseResponseMapTests.cs (create)
git commit -m "feat: add reverse response mapping for cross-account request-reply (Gap 9.9)"
Task 82: Service Import Shadowing Detection (Gap 9.10)
Add ServiceImportShadowed checking for local subscription overlap.
Files:
- Modify: src/NATS.Server/Auth/Account.cs
- Test: tests/NATS.Server.Tests/Auth/ImportShadowingTests.cs (create)
git commit -m "feat: add service import shadowing detection (Gap 9.10)"
Phase 7 Exit Gate
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~ServiceLatency|FullyQualifiedName~ResponseThreshold|FullyQualifiedName~StreamImportCycle|FullyQualifiedName~WildcardExport|FullyQualifiedName~AccountExpiration|FullyQualifiedName~AccountClaimReload|FullyQualifiedName~ActivationExpiration|FullyQualifiedName~NKeyRevocation|FullyQualifiedName~ReverseResponseMap|FullyQualifiedName~ImportShadowing" -v normal
Update test parity DB for all Phase 7 tests.
Phase 8: Monitoring, Events & WebSocket (11 gaps)
Dependencies: Phase 7 (account management)
Exit gate: Closed connections queryable, account filtering works, sort options work, auth events published, trace propagated, event payloads complete, WebSocket TLS configured
Task 83: Closed Connections Ring Buffer (Gap 10.1)
Add ClosedConnectionRingBuffer and wire into ConnzHandler.
Files:
- Create: src/NATS.Server/Monitoring/ClosedConnectionRingBuffer.cs
- Modify: src/NATS.Server/NatsServer.cs (record closed connections)
- Modify: src/NATS.Server/Monitoring/ConnzHandler.cs:12 (support state=closed)
- Test: tests/NATS.Server.Tests/Monitoring/ClosedConnectionRingBufferTests.cs (create)
git commit -m "feat: add closed connection ring buffer for /connz?state=closed (Gap 10.1)"
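A fixed-capacity ring buffer that overwrites the oldest entry is the core of this task. The generic sketch below is illustrative only: RingBufferSketch, Snapshot, and TotalAdded are hypothetical names, and the plan does not specify the capacity or the entry type that ClosedConnectionRingBuffer stores.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: keeps the most recent N items, overwriting the oldest.
public sealed class RingBufferSketch<T>
{
    private readonly T[] _items;
    private readonly object _gate = new();
    private long _total; // items ever added, including overwritten ones

    public RingBufferSketch(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _items = new T[capacity];
    }

    public long TotalAdded { get { lock (_gate) return _total; } }

    public void Add(T item)
    {
        lock (_gate)
        {
            _items[(int)(_total % _items.Length)] = item; // wraps around when full
            _total++;
        }
    }

    // Returns a copy ordered oldest to newest, safe to enumerate without the lock.
    public IReadOnlyList<T> Snapshot()
    {
        lock (_gate)
        {
            var count = (int)Math.Min(_total, _items.Length);
            var result = new T[count];
            var start = _total - count;
            for (var i = 0; i < count; i++)
            {
                result[i] = _items[(int)((start + i) % _items.Length)];
            }
            return result;
        }
    }
}
```

Returning a snapshot copy lets a monitoring handler sort, filter, and paginate closed connections without holding the lock while it serializes the response.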
Task 84: Account-Scoped Filtering (Gap 10.2)
Add acc query parameter to ConnzHandler.
Files:
- Modify: src/NATS.Server/Monitoring/ConnzHandler.cs:239 (add acc param parsing)
- Test: tests/NATS.Server.Tests/Monitoring/ConnzAccountFilterTests.cs (create)
git commit -m "feat: add account-scoped filtering to /connz (Gap 10.2)"
Task 85: Sort Options (Gap 10.3)
Add SortBy enum and sort query parameter to ConnzHandler.
Files:
- Modify: src/NATS.Server/Monitoring/ConnzHandler.cs
- Test: tests/NATS.Server.Tests/Monitoring/ConnzSortTests.cs (create)
git commit -m "feat: add sort options to /connz (Gap 10.3)"
Task 86: Message Trace Propagation (Gap 10.4)
Add trace context header propagation across servers.
Files:
- Modify: src/NATS.Server/Internal/MessageTraceContext.cs
- Modify: src/NATS.Server/Events/InternalEventSystem.cs
- Test: tests/NATS.Server.Tests/Internal/TraceContextPropagationTests.cs (create)
git commit -m "feat: add message trace propagation across servers (Gap 10.4)"
Task 87: Auth Error Events (Gap 10.5)
Add SendAuthErrorEvent publishing to $SYS.SERVER.{id}.CLIENT.AUTH.ERR.
Files:
- Modify: src/NATS.Server/NatsServer.cs
- Modify: src/NATS.Server/Events/InternalEventSystem.cs
- Test: tests/NATS.Server.Tests/Events/AuthErrorEventTests.cs (create)
git commit -m "feat: add auth error event publication (Gap 10.5)"
Task 88: Full System Event Payloads (Gap 10.6)
Audit and complete all event type fields.
Files:
- Modify: src/NATS.Server/Events/EventTypes.cs
- Modify: src/NATS.Server/Events/InternalEventSystem.cs
- Test: tests/NATS.Server.Tests/Events/FullEventPayloadTests.cs (create)
git commit -m "feat: complete system event payload fields (Gap 10.6)"
Task 89: Closed Connection Reason Tracking (Gap 10.7)
Populate ClosedClient.Reason consistently across all disconnect paths.
Files:
- Modify: src/NATS.Server/NatsClient.cs:902 (MarkClosed: ensure reason set)
- Modify: src/NATS.Server/NatsServer.cs
- Test: tests/NATS.Server.Tests/Monitoring/ClosedReasonTests.cs (create)
git commit -m "feat: consistently populate closed connection reasons (Gap 10.7)"
Task 90: Remote Server Events (Gap 10.8)
Add RemoteServerShutdown, RemoteServerUpdate, LeafNodeConnected events.
Files:
- Modify: src/NATS.Server/NatsServer.cs
- Modify: src/NATS.Server/Events/InternalEventSystem.cs
- Modify: src/NATS.Server/Events/EventSubjects.cs
- Test: tests/NATS.Server.Tests/Events/RemoteServerEventTests.cs (create)
git commit -m "feat: add remote server events for cluster visibility (Gap 10.8)"
Task 91: Event Compression (Gap 10.9)
Add S2 compression for system events when subscriber supports it.
Files:
- Modify: src/NATS.Server/Events/InternalEventSystem.cs
- Test: tests/NATS.Server.Tests/Events/EventCompressionTests.cs (create)
git commit -m "feat: add S2 compression for system events (Gap 10.9)"
Task 92: OCSP Peer Events (Gap 10.10)
Add OCSP peer reject and chain validation events.
Files:
- Modify: src/NATS.Server/NatsServer.cs
- Modify: src/NATS.Server/Events/InternalEventSystem.cs
- Test: tests/NATS.Server.Tests/Events/OcspEventTests.cs (create)
git commit -m "feat: add OCSP peer reject and chain validation events (Gap 10.10)"
Task 93: WebSocket-Specific TLS (Gap 15.1)
Add separate TLS configuration for WebSocket listener.
Files:
- Modify: src/NATS.Server/WebSocket/WsUpgrade.cs
- Modify: src/NATS.Server/NatsServer.cs
- Test: tests/NATS.Server.Tests/WebSocket/WebSocketTlsTests.cs (create)
git commit -m "feat: add WebSocket-specific TLS configuration (Gap 15.1)"
Phase 8 Exit Gate
# Targeted tests
dotnet test tests/NATS.Server.Tests --filter "FullyQualifiedName~ClosedConnectionRingBuffer|FullyQualifiedName~ConnzAccountFilter|FullyQualifiedName~ConnzSort|FullyQualifiedName~TraceContextPropagation|FullyQualifiedName~AuthErrorEvent|FullyQualifiedName~FullEventPayload|FullyQualifiedName~ClosedReason|FullyQualifiedName~RemoteServerEvent|FullyQualifiedName~EventCompression|FullyQualifiedName~OcspEvent|FullyQualifiedName~WebSocketTls" -v normal
# FULL TEST SUITE CHECKPOINT (Phase 8 complete — FINAL)
dotnet test
Update test parity DB for all Phase 8 tests.
Post-Implementation
After all 8 phases:
- Update docs/structuregaps.md: mark all 93 gaps as IMPLEMENTED with commit references
- Final test parity DB update: ensure all new tests are registered
- Commit final updates
git add docs/structuregaps.md docs/test_parity.db
git commit -m "docs: mark all 93 remaining gaps as IMPLEMENTED"