Commit Graph

431 Commits

Author SHA1 Message Date
Joseph Doherty
d817d6f7a2 feat: implement leader forwarding for JetStream API (Gap 7.1)
Add ILeaderForwarder interface and DefaultLeaderForwarder, update
JetStreamApiRouter with RouteAsync + ForwardedCount so non-leader nodes
can attempt to forward mutating requests to the meta-group leader before
falling back to a NotLeader error response.
2026-02-25 09:53:50 -05:00
Joseph Doherty
aeeb2e6929 feat: add binary assignment codec with golden fixture tests (Gap 2.10)
Implements AssignmentCodec with JSON serialization for StreamAssignment
and ConsumerAssignment, plus Snappy compression helpers for large payloads.
Adds 12 tests covering round-trips, error handling, compression, and a
golden fixture for format stability.
2026-02-25 09:35:19 -05:00
Joseph Doherty
a7c094d6c1 feat: add unsupported asset handling for mixed-version clusters (Gap 2.11)
Add Version property to StreamAssignment and ConsumerAssignment so future-version
entries from newer cluster peers are gracefully skipped rather than applied incorrectly.
ProcessStreamAssignment and ProcessConsumerAssignment now reject entries with
Version > CurrentVersion (1), incrementing SkippedUnsupportedEntries for operator
visibility. Add MetaEntryType.Unknown with a no-op ApplyEntry handler so unrecognised
RAFT entry types never crash the cluster. Version 0 is treated as current for
backward compatibility with pre-versioned assignments.
2026-02-25 09:22:18 -05:00
Joseph Doherty
f69f9b3220 feat: add RaftGroup lifecycle methods (Gap 2.9)
Implement IsMember, SetPreferred, RemovePeer, AddPeer, CreateRaftGroup
factory, IsUnderReplicated, and IsOverReplicated on RaftGroup. Add 22
RaftGroupLifecycleTests covering all helpers and quorum size calculation.
2026-02-25 08:59:36 -05:00
Joseph Doherty
38ae1f6bea feat: add topology-aware placement with tag enforcement (Gap 2.8)
Extends PlacementEngine.SelectPeerGroup with three new capabilities ported
from jetstream_cluster.go:7212 selectPeerGroup:

- JetStreamUniqueTag enforcement: PlacementPolicy.UniqueTag (e.g. "az")
  ensures no two replicas share the same value for a tag with that prefix,
  matching Go's uniqueTagPrefix / checkUniqueTag logic.
- MaxAssetsPerPeer HA limit: peers at or above their asset ceiling are
  deprioritised (moved to fallback), not hard-excluded, so selection still
  succeeds when no preferred peers remain.
- Weighted scoring: candidates sorted by
  score = AvailableStorage - (CurrentAssets * assetCostWeight)
  (DefaultAssetCostWeight = 1 GiB) replacing the raw-storage sort, with a
  custom weight parameter for testing.

10 new tests in TopologyPlacementTests.cs cover all three features and their
edge cases. All 30 PlacementEngine tests continue to pass.
2026-02-25 08:59:18 -05:00
Joseph Doherty
e5f599f770 feat: add peer management with stream reassignment (Gap 2.4)
Add ProcessAddPeer/ProcessRemovePeer to JetStreamMetaGroup for
peer-driven stream reassignment. Includes AddKnownPeer/RemoveKnownPeer
tracked in a HashSet, RemovePeerFromStream, RemapStreamAssignment with
replacement-peer selection from the known pool, and DesiredReplicas on
RaftGroup for under-replication detection. Go ref: jetstream_cluster.go:2290-2439.
2026-02-25 08:53:05 -05:00
Joseph Doherty
0f8f34afaa feat: add entry application pipeline for meta and stream RAFT groups (Gap 2.7)
Extend ApplyEntry in JetStreamMetaGroup with PeerAdd/PeerRemove dispatch
to the existing Task 12 peer management methods. Add StreamMsgOp (Store,
Remove, Purge) and ConsumerOp (Ack, Nak, Deliver, Term, Progress) enums
plus ApplyStreamMsgOp and ApplyConsumerEntry methods to StreamReplicaGroup.
Extend ApplyCommittedEntriesAsync to parse smsg:/centry: command prefixes
and route to the new apply methods. Add MetaEntryType.PeerAdd/PeerRemove
enum values. 35 new tests in EntryApplicationTests.cs, all passing.
2026-02-25 08:38:21 -05:00
Joseph Doherty
f1f7bfbb51 feat: add ReadIndex for linearizable reads via quorum confirmation (Gap 8.7)
Add ReadIndexAsync() to RaftNode that confirms leader quorum by sending
heartbeat RPCs before returning CommitIndex, avoiding log growth for reads.
Add RaftReadIndexTests.cs with 13 tests covering normal path, non-leader
rejection, partitioned leader timeout, single-node fast path, log stability,
and peer LastContact refresh after heartbeat round.
2026-02-25 08:31:17 -05:00
Joseph Doherty
ae4cc6d613 feat: add randomized election timeout jitter (Gap 8.8)
Add RandomizedElectionTimeout() method to RaftNode returning TimeSpan in
[ElectionTimeoutMinMs, ElectionTimeoutMaxMs) using TotalMilliseconds (not
.Milliseconds component) to prevent synchronized elections after partitions.
Make Random injectable for deterministic testing. Fix SendHeartbeatAsync stub
in NatsRaftTransport and test-local transport implementations to satisfy the
IRaftTransport interface added in Gap 8.7.
2026-02-25 08:30:38 -05:00
Joseph Doherty
5a62100397 feat: add quorum check before proposing entries (Gap 8.6)
Add HasQuorum() to RaftNode that counts peers with LastContact within
2 × ElectionTimeoutMaxMs and returns true only when self + current peers
reaches majority. ProposeAsync now throws InvalidOperationException with
"no quorum" when HasQuorum() returns false, preventing a partitioned
leader from diverging the log. Add 14 tests in RaftQuorumCheckTests.cs
covering single-node, 3-node, 5-node, boundary window, and heartbeat
restore scenarios. Update RaftHealthTests.LastContact_updates_on_successful_replication
to avoid triggering the new quorum guard.
2026-02-25 08:26:37 -05:00
Joseph Doherty
5d3a3c73e9 feat: add leadership transfer via TimeoutNow RPC (Gap 8.4)
- Add RaftTimeoutNowWire: 16-byte wire type [8:term][8:leaderId] with
  Encode/Decode roundtrip, matching Go's sendTimeoutNow wire layout
- Add TimeoutNow(group) subject "$NRG.TN.{group}" to RaftSubjects
- Add SendTimeoutNowAsync to IRaftTransport; implement in both
  InMemoryRaftTransport (synchronous delivery) and NatsRaftTransport
  (publishes to $NRG.TN.{group})
- Add TransferLeadershipAsync(targetId, ct) to RaftNode: leader sends
  TimeoutNow RPC, blocks proposals via _transferInProgress flag, polls
  until target becomes leader or 2x election timeout elapses
- Add ReceiveTimeoutNow(term) to RaftNode: target immediately starts
  election bypassing pre-vote, updates term if sender's term is higher
- Block ProposeAsync with InvalidOperationException during transfer
- 15 tests in RaftLeadershipTransferTests covering wire roundtrip,
  ReceiveTimeoutNow behaviour, proposal blocking, target leadership,
  timeout on unreachable peer, and transfer flag lifecycle
2026-02-25 08:22:39 -05:00
Joseph Doherty
7e0bed2447 feat: add chunk-based snapshot streaming with CRC32 validation (Gap 8.3)
- Add SnapshotChunkEnumerator: IEnumerable<byte[]> that splits snapshot data
  into fixed-size chunks (default 65536 bytes) and computes CRC32 over the
  full payload for integrity validation during streaming transfer
- Add RaftInstallSnapshotChunkWire: 24-byte header + variable data wire type
  encoding [snapshotIndex:8][snapshotTerm:4][chunkIndex:4][totalChunks:4][crc32:4][data:N]
- Extend InstallSnapshotFromChunksAsync with optional expectedCrc32 parameter;
  validates assembled data against CRC32 before applying snapshot state, throwing
  InvalidDataException on mismatch to prevent corrupt state installation
- Fix stub IRaftTransport implementations in test files missing SendTimeoutNowAsync
- Fix incorrect role assertion in RaftLeadershipTransferTests (single-node quorum = 1)
- 15 new tests in RaftSnapshotStreamingTests covering enumeration, reassembly,
  CRC correctness, validation success/failure, and wire format roundtrip
2026-02-25 08:21:36 -05:00
Joseph Doherty
7434844a39 feat: optimize FilteredState and LoadMsg with block-aware search (Gap 1.10)
Add NumFiltered with generation-based result caching, block-range binary
search helpers (FindFirstBlockAtOrAfter, FindBlockForSequence), and
CheckSkipFirstBlock optimization. FilteredState uses the block index to
skip sealed blocks below the requested start sequence. LoadMsg gains an
O(log n) block fallback for cases where _messages does not hold the
sequence. Generation counter incremented on every write/delete/purge to
invalidate NumFiltered cache entries.

Add 18 tests in FileStoreFilterQueryTests covering literal and wildcard
FilteredState, start-sequence filtering across multiple blocks, NumFiltered
basic counts and cache invalidation, LoadMsg block search, and
CheckSkipFirstBlock behaviour.

Also fix pre-existing slopwatch violations in WriteCacheTests.cs: replace
Task.Delay(150)/Task.Delay(5) with TrackWriteAt using explicit past
timestamps, and restructure Dispose to avoid empty catch blocks.

Reference: golang/nats-server/server/filestore.go:3191 (FilteredState),
           golang/nats-server/server/filestore.go:8308 (LoadMsg).
2026-02-25 08:13:52 -05:00
Joseph Doherty
6d754635e7 feat: add bounded write cache with TTL eviction and background flush (Gap 1.8)
Add WriteCacheManager inner class to FileStore with ConcurrentDictionary-based
tracking of per-block write sizes and timestamps, a PeriodicTimer (500ms tick)
background eviction loop, TTL-based expiry, and size-cap enforcement. Add
TrackWrite/TrackWriteAt/EvictBlock/FlushAllAsync/DisposeAsync API. Integrate
into FileStore constructor, AppendAsync, StoreMsg, StoreRawMsg, and RotateBlock.
Add MaxCacheSize/CacheExpiry options to FileStoreOptions. 12 new tests cover
size tracking, TTL eviction, size-cap eviction, flush-all, and integration paths.

Reference: golang/nats-server/server/filestore.go:4443 (setupWriteCache),
           golang/nats-server/server/filestore.go:6148 (expireCache).
2026-02-25 08:12:06 -05:00
Joseph Doherty
cbe41d0efb feat: add SequenceSet for sparse deletion tracking with secure erase (Gap 1.7)
Replace HashSet<ulong> _deleted in MsgBlock with SequenceSet — a sorted-range
list that compresses contiguous deletions into (Start, End) intervals. Adds
O(log n) Contains/Add via binary search on range count, matching Go's avl.SequenceSet
semantics with a simpler implementation.

- Add SequenceSet.cs: sorted-range compressed set with Add/Remove/Contains/Count/Clear
  and IEnumerable<ulong> in ascending order. Binary search for all O(log n) ops.
- Replace HashSet<ulong> _deleted and _skipSequences in MsgBlock with SequenceSet.
- Add secureErase parameter (default false) to MsgBlock.Delete(): when true, payload
  bytes are overwritten with RandomNumberGenerator.Fill() before the delete record is
  written, making original content unrecoverable on disk.
- Update FileStore.DeleteInBlock() to propagate secureErase flag.
- Update FileStore.EraseMsg() to use secureErase: true via block layer instead of
  delegating to RemoveMsg().
- Add SequenceSetTests.cs: 25 tests covering Add, Remove, Contains, Count, range
  compression, gap filling, bridge merges, enumeration, boundary values, round-trip.
- Add FileStoreTombstoneTrackingTests.cs: 12 tests covering SequenceSet tracking in
  MsgBlock, tombstone persistence through RebuildIndex recovery, secure erase
  payload overwrite verification, and FileStore.EraseMsg integration.

Go reference: filestore.go:5267 (removeMsg), filestore.go:5890 (eraseMsg),
              avl/seqset.go (SequenceSet).
2026-02-25 08:02:44 -05:00
Joseph Doherty
646a5eb2ae feat: add atomic file writer with SemaphoreSlim for crash-safe state writes (Gap 1.6)
- Add AtomicFileWriter static helper: writes to {path}.{random}.tmp, flushes,
  then File.Move(overwrite:true) — concurrent-safe via unique temp path per call
- Add _stateWriteLock (SemaphoreSlim 1,1) to FileStore; dispose in both Dispose
  and DisposeAsync paths
- Promote WriteStreamState to async WriteStreamStateAsync using AtomicFileWriter
  under the write lock; FlushAllPending now returns Task
- Update IStreamStore.FlushAllPending signature to Task; fix MemStore no-op impl
- Fix FileStoreCrashRecoveryTests to await FlushAllPending (3 sync→async tests)
- Add 9 AtomicFileWriterTests covering create, no-tmp-remains, overwrite,
  concurrent safety, memory overload, empty data, and large payload
2026-02-25 07:55:33 -05:00
Joseph Doherty
5beeb1b3f6 feat: add checksum validation on MsgBlock read path (Gap 1.5)
Add _lastChecksum field and LastChecksum property to MsgBlock tracking
the XxHash64 checksum of the last written record (Go: msgBlock.lchk,
filestore.go:2204). Capture the checksum from the encoded record trailer
on every Write/WriteAt/WriteSkip call. Read-path validation happens
naturally through the existing MessageRecord.Decode checksum check.
2026-02-25 07:50:03 -05:00
Joseph Doherty
9ac29fc6f5 docs: add 93-gap implementation plan (8 phases, Tasks 1-93)
Bottom-up dependency ordering: FileStore/RAFT → Cluster/API → Consumer/Stream → Client/MQTT → Config/Gateway → Route/LeafNode → Account → Monitoring/WebSocket. Full test suite every 2 phases.
2026-02-25 07:47:11 -05:00
Joseph Doherty
8e6a53b7c0 docs: add design for remaining 93 gaps across 8 phases
Bottom-up dependency ordering: FileStore+RAFT → Cluster+API →
Consumer+Stream → Client+MQTT → Config+Gateway → Route+LeafNode →
Account → Monitoring+WebSocket. Codex-reviewed with fixes for
async locks, token-based filtering, linearizable ReadIndex, and
golden fixture tests for wire format compatibility.
2026-02-25 06:09:43 -05:00
Joseph Doherty
69785b191e docs: mark 29 production gaps as IMPLEMENTED in structuregaps.md
All Tier 1 (CRITICAL) and nearly all Tier 2 (HIGH) gaps closed by
the 27-task production gaps plan. Updated priority summary, test
counts (5,914 methods / ~6,532 parameterized), and overview.
2026-02-25 05:43:52 -05:00
Joseph Doherty
502481b6ba docs: update test_parity.db — add production gaps test mappings
Added 106 new entries to dotnet_tests table covering 15 new test files
created during the production gaps implementation phase:
- JetStreamInflightTrackingTests (9), JetStreamLeadershipTests (7)
- RedeliveryTrackerPriorityQueueTests (8), AckProcessorEnhancedTests (9)
- WaitingRequestQueueTests (8), ConsumerPauseResumeTests (8)
- PriorityGroupPinningTests (7), InterestRetentionTests (7)
- MirrorSourceRetryTests (5), FlushCoalescingTests (5)
- StallGateTests (6), WriteTimeoutTests (6)
- MqttPersistenceTests (4), SignalHandlerTests (5)
- ImplicitDiscoveryTests (12)

Total dotnet_tests rows: 5808 → 5914
2026-02-25 03:54:31 -05:00
Joseph Doherty
bfe7a71fcd feat(cluster): add implicit route and gateway discovery via INFO gossip
Implements ProcessImplicitRoute and ForwardNewRouteInfoToKnownServers on RouteManager,
and ProcessImplicitGateway on GatewayManager, mirroring Go server/route.go and
server/gateway.go INFO gossip-based peer discovery. Adds ConnectUrls to ServerInfo
and introduces GatewayInfo model. 12 new unit tests in ImplicitDiscoveryTests.
2026-02-25 03:05:35 -05:00
Joseph Doherty
e09835ca70 feat(config): add SIGHUP signal handler and config reload validation
Implements SignalHandler (PosixSignalRegistration for SIGHUP) and
ReloadFromOptionsAsync/ReloadFromOptionsResult on ConfigReloader for
in-memory options comparison without reading a config file.
Ports Go server/signal_unix.go handleSignals and server/reload.go Reload.
2026-02-25 02:54:13 -05:00
Joseph Doherty
b7bac8e68e feat(mqtt): add JetStream-backed session and retained message persistence
Add optional IStreamStore backing to MqttSessionStore and MqttRetainedStore,
enabling session and retained message state to survive process restarts via
JetStream persistence. Includes ConnectAsync/SaveSessionAsync for session
lifecycle, SetRetainedAsync/GetRetainedAsync with cleared-topic tombstone
tracking, and 4 new parity tests covering persist/restart/clear semantics.
2026-02-25 02:42:02 -05:00
Joseph Doherty
7468401bd0 feat(client): add write timeout recovery with per-kind policies
Add WriteTimeoutPolicy enum, FlushResult record struct, and
GetWriteTimeoutPolicy static method as nested types in NatsClient.
Models Go's client.go per-kind timeout handling: CLIENT kind closes
on timeout, ROUTER/GATEWAY/LEAF use TCP-level flush recovery.
2026-02-25 02:37:48 -05:00
Joseph Doherty
494d327282 feat(client): add stall gate backpressure for slow consumers
Adds StallGate nested class inside NatsClient that blocks producers when
the outbound buffer exceeds 75% of maxPending capacity, modelling Go's
stc channel and stalledRoute handling in server/client.go.
2026-02-25 02:35:57 -05:00
Joseph Doherty
36e23fa31d feat(client): add flush coalescing to reduce write syscalls
Adds MaxFlushPending constant (10), SignalFlushPending/ResetFlushPending
helpers, and ShouldCoalesceFlush property to NatsClient, matching Go's
maxFlushPending / fsp flush-signal coalescing in server/client.go.
2026-02-25 02:33:44 -05:00
Joseph Doherty
8fa16d59d2 feat(mirror): add exponential backoff retry, gap detection, and error tracking
Exposes public RecordFailure/RecordSuccess/GetRetryDelay (no-jitter, deterministic)
on MirrorCoordinator, plus RecordSourceSeq with HasGap/GapStart/GapEnd properties
and SetError/ClearError/HasError/ErrorMessage for error state. Makes IsDuplicate
and RecordMsgId public on SourceCoordinator and adds PruneDedupWindow(DateTimeOffset)
for explicit-cutoff dedup window pruning. Adds 5 unit tests in MirrorSourceRetryTests.
2026-02-25 02:30:55 -05:00
Joseph Doherty
955d568423 feat(stream): add InterestRetentionPolicy for per-consumer ack tracking
Implements Interest retention policy logic that tracks which consumers
are interested in each message subject and whether they have acknowledged
delivery, retaining messages until all interested consumers have acked.
Go reference: stream.go checkInterestState/noInterest.
2026-02-25 02:25:39 -05:00
Joseph Doherty
2eaa736b21 feat(consumer): add priority group pin ID management
Add AssignPinId, ValidatePinId, and UnassignPinId to PriorityGroupManager,
plus CurrentPinId tracking on PriorityGroup, porting Go consumer.go
(setPinnedTimer, assignNewPinId) pin ID semantics. Covered by 7 new tests.
2026-02-25 02:23:02 -05:00
Joseph Doherty
dcc3e4460e feat(consumer): add pause/resume with auto-resume timer
Adds PauseUntilUtc to ConsumerHandle, a new Pause(DateTime) overload,
Resume, IsPaused, and GetPauseUntil to ConsumerManager. A System.Threading.Timer
fires when the deadline passes and calls AutoResume, raising OnAutoResumed so
tests can synchronise via SemaphoreSlim instead of Task.Delay. ConsumerManager
now implements IDisposable to clean up outstanding timers. Timer is also
cancelled on explicit Resume and Delete.

Go reference: consumer.go (pauseConsumer / resumeConsumer / isPaused).
2026-02-25 02:21:08 -05:00
Joseph Doherty
8fb80acafe feat(consumer): add WaitingRequestQueue with expiry and batch/maxBytes tracking
Implements a FIFO pull-request queue (WaitingRequestQueue) with per-request
mutable batch countdown and byte budget tracking, plus RemoveExpired cleanup.
Complements the existing priority-based PullRequestWaitQueue for pull consumer delivery.
Go reference: consumer.go waitQueue / processNextMsgRequest (lines 4276-4450).
2026-02-25 02:18:01 -05:00
Joseph Doherty
3183fd2dc7 feat(consumer): enhance AckProcessor with NAK delay, TERM, WPI, and maxAckPending
Add RedeliveryTracker constructor overload, Register(ulong, string) for
tracker-based ack-wait, ProcessAck(ulong) payload-free overload, GetDeadline,
CanRegister for maxAckPending enforcement, ParseAckType static parser, and
AckType enum. All existing API signatures are preserved; 9 new tests added in
AckProcessorEnhancedTests.cs with no regressions to existing 10 tests.
2026-02-25 02:15:46 -05:00
Joseph Doherty
55de052009 feat(consumer): add PriorityQueue-based RedeliveryTracker overloads
Add new constructor, Schedule(DateTimeOffset), GetDue(DateTimeOffset),
IncrementDeliveryCount, IsMaxDeliveries(), and GetBackoffDelay() to
RedeliveryTracker without breaking existing API. Uses PriorityQueue<ulong,
DateTimeOffset> for deadline-ordered dispatch mirroring Go consumer.go rdq.
2026-02-25 02:12:37 -05:00
Joseph Doherty
7e4a23a0b7 feat(cluster): implement leadership transition with inflight cleanup
Add ProcessLeaderChange(bool) method and OnLeaderChange event to JetStreamMetaGroup.
Refactor StepDown() to delegate inflight clearing through ProcessLeaderChange,
enabling subscribers to react to leadership transitions.
Go reference: jetstream_cluster.go:7001-7074 processLeaderChange.
2026-02-25 02:08:37 -05:00
Joseph Doherty
63d4e43178 feat(cluster): add structured inflight proposal tracking with ops counting
Replace the simple string-keyed inflight dictionaries with account-scoped
ConcurrentDictionary<string, Dictionary<string, InflightInfo>> structures.
Adds InflightInfo record with OpsCount for duplicate proposal tracking,
TrackInflight/RemoveInflight/IsInflight methods for streams and consumers,
and ClearAllInflight(). Updates existing Propose* methods to use $G account.
Go reference: jetstream_cluster.go:1193-1278.
2026-02-25 02:05:53 -05:00
Joseph Doherty
1a9b6a9175 feat(cluster): add stream/consumer assignment processing with validation
Add 5 validated Process* methods to JetStreamMetaGroup for stream and
consumer assignment processing: ProcessStreamAssignment, ProcessUpdateStreamAssignment,
ProcessStreamRemoval, ProcessConsumerAssignment, and ProcessConsumerRemoval.
Each returns a bool indicating success, with validation guards matching
Go reference jetstream_cluster.go:4541-5925. Includes 12 new unit tests.
2026-02-25 01:57:43 -05:00
Joseph Doherty
7f9ee493b6 feat(cluster): add JetStreamClusterMonitor for meta RAFT entry processing
Implements the background loop (Go: monitorCluster) that reads RaftLogEntry
items from a channel and dispatches assignStream, removeStream, assignConsumer,
removeConsumer, and snapshot operations to JetStreamMetaGroup.

- Add five public mutation helpers to JetStreamMetaGroup (AddStreamAssignment,
  RemoveStreamAssignment, AddConsumerAssignment, RemoveConsumerAssignment,
  ReplaceAllAssignments) used by the monitor path
- JetStreamClusterMonitor: async loop over ChannelReader<RaftLogEntry>, JSON
  dispatch, ILogger injection with NullLogger default, malformed entries logged
  and skipped rather than aborting the loop
- WaitForProcessedAsync uses Monitor.PulseAll for race-free test synchronisation
  with no Task.Delay — satisfies slopwatch SW003/SW004 rules
- 10 new targeted tests all pass; 1101 cluster regression tests unchanged
2026-02-25 01:54:04 -05:00
Joseph Doherty
9fdc931ff5 feat(cluster): add MetaSnapshotCodec with S2 compression and versioned format
Implements binary codec for meta-group snapshots: 2-byte little-endian version
header followed by S2-compressed JSON of the stream assignment map. Adds
[JsonObjectCreationHandling(Populate)] to StreamAssignment.Consumers so the
getter-only dictionary is populated in-place during deserialization. 8 tests
covering round-trip, compression ratio, field fidelity, multi-consumer restore,
version rejection, and truncation guard.

Go reference: jetstream_cluster.go:2075-2145 (encodeMetaSnapshot/decodeMetaSnapshot)
2026-02-25 01:46:32 -05:00
Joseph Doherty
e0f5fe7150 feat(raft): implement joint consensus for safe two-phase membership changes
Adds BeginJointConsensus, CommitJointConsensus, and CalculateJointQuorum to
RaftNode per Raft paper Section 4. During a joint configuration transition
quorum requires majority from BOTH Cold and Cnew; CommitJointConsensus
finalizes Cnew as the sole active configuration. The existing single-phase
ProposeAddPeerAsync/ProposeRemovePeerAsync are unchanged. Includes 16 new
tests covering flag behaviour, quorum boundaries, idempotent commit, and
backward-compatibility with the existing membership API.
2026-02-25 01:40:56 -05:00
Joseph Doherty
a0894e7321 fix(raft): address WAL code quality issues — CRC perf, DRY, safety, assertions
- SyncAsync: remove redundant FlushAsync, use single Flush(flushToDisk:true)
- ComputeCrc: use incremental Crc32.Append to avoid contiguous buffer heap allocation
- Load: cast pos+length to long to guard against int overflow in bounds check
- AppendAsync: delegate to WriteEntryTo (DRY — eliminates duplicated record-building logic)
- Load: extract ParseEntries static helper to eliminate goto pattern with early returns
- Entries: change return type from IEnumerable to IReadOnlyList for index access and Count property
- RaftNode.PersistAsync: remove redundant term.txt write (meta.json now owns term+votedFor)
- RaftWalTests: tighten ShouldBeGreaterThanOrEqualTo(1) -> ShouldBe(1) in truncation/CRC tests;
  use .Count property directly on IReadOnlyList instead of .Count() LINQ extension
2026-02-25 01:36:41 -05:00
Joseph Doherty
c9ac4b9918 feat(raft): add binary WAL and VotedFor persistence
Implements a binary write-ahead log (RaftWal) for durable RAFT entry
storage, replacing in-memory-only semantics. The WAL uses a magic header
("NWAL" + version), length-prefixed records with per-record CRC32
integrity checking, and CompactAsync with atomic temp-file rename.
Load() tolerates truncated or corrupt tail records for crash safety.

Also fixes RaftNode to persist and reload TermState.VotedFor via a
meta.json file alongside term.txt, ensuring vote durability across
restarts. Falls back gracefully to legacy term.txt when meta.json is
absent.

6 new tests in RaftWalTests: persist/recover, compact, truncation
tolerance, VotedFor round-trip, empty WAL, and CRC corruption.
All 458 Raft tests pass.
2026-02-25 01:31:23 -05:00
Joseph Doherty
3ab683489e feat(filestore): implement EncodedStreamState, UpdateConfig, ResetState, fix Delete signature
Complete IStreamStore Batch 2 — all remaining interface methods now have
FileStore implementations:
- EncodedStreamState: placeholder returning empty (full codec in Task 9)
- UpdateConfig: no-op placeholder for config hot-reload
- ResetState: no-op (state derived from blocks on construction)
- Delete(bool): now calls Stop() first and matches interface signature
2026-02-25 01:22:43 -05:00
Joseph Doherty
be432c3224 feat(filestore): implement StoreRawMsg, LoadPrevMsg, Type, Stop on FileStore
Complete IStreamStore Batch 1 — all core operations now have FileStore
implementations instead of throwing NotSupportedException:
- StoreRawMsg: caller-specified seq/ts for replication/mirroring
- LoadPrevMsg: backward scan for message before given sequence
- Type: returns StorageType.File
- Stop: flush + dispose blocks, reject further writes

14 new tests in FileStoreStreamStoreTests.
2026-02-25 01:20:03 -05:00
Joseph Doherty
f402fd364f feat(filestore): implement FlushAllPending with atomic stream state writes
Add FlushAllPending() to FileStore — flushes active MsgBlock to disk and
writes a stream.state checkpoint atomically (write-to-temp + rename).
Go ref: filestore.go:5783 (flushPendingWritesUnlocked / writeFullState).

5 new tests in FileStoreCrashRecoveryTests: flush, state file, idempotent,
TTL recovery, truncated block handling.
2026-02-25 01:14:34 -05:00
Joseph Doherty
f031edb97e feat(filestore): implement FlushAllPending with atomic stream state writes
Add FlushAllPending() to FileStore, fulfilling the IStreamStore interface
contract. The method flushes the active MsgBlock to disk and atomically
writes a stream.state checkpoint using write-to-temp + rename, matching
Go's flushPendingWritesUnlocked / writeFullState pattern.

Add FileStoreCrashRecoveryTests with 5 tests covering:
- FlushAllPending flushes block data to .blk file
- FlushAllPending writes a valid atomic stream.state JSON checkpoint
- FlushAllPending is idempotent (second call overwrites with latest state)
- Recovery prunes messages backdated past the MaxAgeMs cutoff
- Recovery handles a tail-truncated block without throwing

Reference: golang/nats-server/server/filestore.go:5783-5842
2026-02-25 01:07:32 -05:00
Joseph Doherty
10e2c4ef22 fix(test): use precise InvalidDataException assertion in wrong-key encryption test
Code quality review feedback: Should.Throw<Exception> was too broad,
changed to Should.Throw<InvalidDataException> to match the actual
exception type thrown when AEAD decryption fails with wrong key.
2026-02-25 00:51:37 -05:00
Joseph Doherty
f143295392 feat(filestore): wire AeadEncryptor into MsgBlock for at-rest encryption
Add FileStoreEncryptionTests covering ChaCha20-Poly1305 and AES-GCM
round-trips and wrong-key rejection for the FSV2 AEAD path. Fix
RestorePayload to wrap CryptographicException from AEAD decryption as
InvalidDataException so RecoverBlocks correctly propagates key-mismatch
failures instead of silently swallowing them.
2026-02-25 00:43:57 -05:00
Joseph Doherty
6c268c4143 docs: remove inline mapping columns from go_tests and dotnet_tests
Drop redundant columns now that test_mappings is the single source of truth:
- go_tests: drop dotnet_test, dotnet_file
- dotnet_tests: drop go_test
- VACUUM to reclaim space
2026-02-25 00:25:12 -05:00
Joseph Doherty
8c8dab735c docs: map 1,765 additional dotnet tests to Go counterparts
Multi-strategy matching:
- Source comment references (// Go: TestXxx): +1,477 mappings
- In-body Go test name references: +249 mappings
- Name similarity with keyword overlap: +368 mappings
- File-context + lower threshold for parity files: +133 mappings

Final state:
- 4,250 / 5,808 dotnet tests mapped (73.2%), up from 2,485 (42.8%)
- 1,558 remaining are genuinely original .NET tests with no Go equivalent
- 59,516 total test_mappings rows
2026-02-25 00:22:21 -05:00