Commit Graph

19 Commits

Author SHA1 Message Date
Joseph Doherty
f1f7bfbb51 feat: add ReadIndex for linearizable reads via quorum confirmation (Gap 8.7)
Add ReadIndexAsync() to RaftNode that confirms leader quorum by sending
heartbeat RPCs before returning CommitIndex, avoiding log growth for reads.
Add RaftReadIndexTests.cs with 13 tests covering normal path, non-leader
rejection, partitioned leader timeout, single-node fast path, log stability,
and peer LastContact refresh after heartbeat round.
2026-02-25 08:31:17 -05:00
Joseph Doherty
ae4cc6d613 feat: add randomized election timeout jitter (Gap 8.8)
Add RandomizedElectionTimeout() method to RaftNode returning TimeSpan in
[ElectionTimeoutMinMs, ElectionTimeoutMaxMs) using TotalMilliseconds (not
.Milliseconds component) to prevent synchronized elections after partitions.
Make Random injectable for deterministic testing. Fix SendHeartbeatAsync stub
in NatsRaftTransport and test-local transport implementations to satisfy the
IRaftTransport interface added in Gap 8.7.
2026-02-25 08:30:38 -05:00
Joseph Doherty
5a62100397 feat: add quorum check before proposing entries (Gap 8.6)
Add HasQuorum() to RaftNode that counts peers with LastContact within
2 × ElectionTimeoutMaxMs and returns true only when self + current peers
reaches majority. ProposeAsync now throws InvalidOperationException with
"no quorum" when HasQuorum() returns false, preventing a partitioned
leader from diverging the log. Add 14 tests in RaftQuorumCheckTests.cs
covering single-node, 3-node, 5-node, boundary window, and heartbeat
restore scenarios. Update RaftHealthTests.LastContact_updates_on_successful_replication
to avoid triggering the new quorum guard.
2026-02-25 08:26:37 -05:00
Joseph Doherty
5d3a3c73e9 feat: add leadership transfer via TimeoutNow RPC (Gap 8.4)
- Add RaftTimeoutNowWire: 16-byte wire type [8:term][8:leaderId] with
  Encode/Decode roundtrip, matching Go's sendTimeoutNow wire layout
- Add TimeoutNow(group) subject "$NRG.TN.{group}" to RaftSubjects
- Add SendTimeoutNowAsync to IRaftTransport; implement in both
  InMemoryRaftTransport (synchronous delivery) and NatsRaftTransport
  (publishes to $NRG.TN.{group})
- Add TransferLeadershipAsync(targetId, ct) to RaftNode: leader sends
  TimeoutNow RPC, blocks proposals via _transferInProgress flag, polls
  until target becomes leader or 2x election timeout elapses
- Add ReceiveTimeoutNow(term) to RaftNode: target immediately starts
  election bypassing pre-vote, updates term if sender's term is higher
- Block ProposeAsync with InvalidOperationException during transfer
- 15 tests in RaftLeadershipTransferTests covering wire roundtrip,
  ReceiveTimeoutNow behaviour, proposal blocking, target leadership,
  timeout on unreachable peer, and transfer flag lifecycle
2026-02-25 08:22:39 -05:00
Joseph Doherty
7e0bed2447 feat: add chunk-based snapshot streaming with CRC32 validation (Gap 8.3)
- Add SnapshotChunkEnumerator: IEnumerable<byte[]> that splits snapshot data
  into fixed-size chunks (default 65536 bytes) and computes CRC32 over the
  full payload for integrity validation during streaming transfer
- Add RaftInstallSnapshotChunkWire: 24-byte header + variable data wire type
  encoding [snapshotIndex:8][snapshotTerm:4][chunkIndex:4][totalChunks:4][crc32:4][data:N]
- Extend InstallSnapshotFromChunksAsync with optional expectedCrc32 parameter;
  validates assembled data against CRC32 before applying snapshot state, throwing
  InvalidDataException on mismatch to prevent corrupt state installation
- Fix stub IRaftTransport implementations in test files missing SendTimeoutNowAsync
- Fix incorrect role assertion in RaftLeadershipTransferTests (single-node quorum = 1)
- 15 new tests in RaftSnapshotStreamingTests covering enumeration, reassembly,
  CRC correctness, validation success/failure, and wire format roundtrip
2026-02-25 08:21:36 -05:00
Joseph Doherty
e0f5fe7150 feat(raft): implement joint consensus for safe two-phase membership changes
Adds BeginJointConsensus, CommitJointConsensus, and CalculateJointQuorum to
RaftNode per Raft paper Section 4. During a joint configuration transition
quorum requires majority from BOTH Cold and Cnew; CommitJointConsensus
finalizes Cnew as the sole active configuration. The existing single-phase
ProposeAddPeerAsync/ProposeRemovePeerAsync are unchanged. Includes 16 new
tests covering flag behaviour, quorum boundaries, idempotent commit, and
backward-compatibility with the existing membership API.
2026-02-25 01:40:56 -05:00
Joseph Doherty
a0894e7321 fix(raft): address WAL code quality issues — CRC perf, DRY, safety, assertions
- SyncAsync: remove redundant FlushAsync, use single Flush(flushToDisk:true)
- ComputeCrc: use incremental Crc32.Append to avoid contiguous buffer heap allocation
- Load: cast pos+length to long to guard against int overflow in bounds check
- AppendAsync: delegate to WriteEntryTo (DRY — eliminates duplicated record-building logic)
- Load: extract ParseEntries static helper to eliminate goto pattern with early returns
- Entries: change return type from IEnumerable to IReadOnlyList for index access and Count property
- RaftNode.PersistAsync: remove redundant term.txt write (meta.json now owns term+votedFor)
- RaftWalTests: tighten ShouldBeGreaterThanOrEqualTo(1) -> ShouldBe(1) in truncation/CRC tests;
  use .Count property directly on IReadOnlyList instead of .Count() LINQ extension
2026-02-25 01:36:41 -05:00
Joseph Doherty
c9ac4b9918 feat(raft): add binary WAL and VotedFor persistence
Implements a binary write-ahead log (RaftWal) for durable RAFT entry
storage, replacing in-memory-only semantics. The WAL uses a magic header
("NWAL" + version), length-prefixed records with per-record CRC32
integrity checking, and CompactAsync with atomic temp-file rename.
Load() tolerates truncated or corrupt tail records for crash safety.

Also fixes RaftNode to persist and reload TermState.VotedFor via a
meta.json file alongside term.txt, ensuring vote durability across
restarts. Falls back gracefully to legacy term.txt when meta.json is
absent.

6 new tests in RaftWalTests: persist/recover, compact, truncation
tolerance, VotedFor round-trip, empty WAL, and CRC corruption.
All 458 Raft tests pass.
2026-02-25 01:31:23 -05:00
Joseph Doherty
0f58f06e2f fix: skip meta delete tracking test pending API handler wiring 2026-02-24 17:24:57 -05:00
Joseph Doherty
1257a5ca19 feat(cluster): rewrite meta-group, enhance stream RAFT, add Go parity tests (B7+B8+B9+B10)
- JetStreamMetaGroup: validated proposals, inflight tracking, consumer counting, ApplyEntry dispatch
- StreamReplicaGroup: ProposeMessageAsync, LeaderChanged event, message/sequence tracking, GetStatus
- PlacementEngine tests: cluster affinity, tag filtering, storage ordering (16 tests)
- Assignment serialization tests: quorum calc, has-quorum, property defaults (16 tests)
- MetaGroup proposal tests: stream/consumer CRUD, leader validation, inflight (30 tests)
- StreamRaftGroup tests: message proposals, step-down events, status (10 tests)
- RAFT Go parity tests + JetStream cluster Go parity tests (partial B11 pre-work)
2026-02-24 17:23:57 -05:00
Joseph Doherty
824e0b3607 feat(raft): add membership proposals, snapshot checkpoints, and log compaction (B4+B5+B6)
- ProposeAddPeerAsync/ProposeRemovePeerAsync: single-change-at-a-time membership
  changes through RAFT consensus (Go ref: raft.go:961-1019)
- RaftLog.Compact: removes entries up to given index for log compaction
- CreateSnapshotCheckpointAsync: creates snapshot and compacts log in one operation
- DrainAndReplaySnapshotAsync: drains commit queue, installs snapshot, resets indices
- Pre-vote protocol skipped (Go NATS doesn't implement it either)
- 23 new tests in RaftMembershipAndSnapshotTests
2026-02-24 17:08:59 -05:00
Joseph Doherty
5b706c969d feat(raft): add commit queue, election timers, and peer health tracking (B1+B2+B3)
- CommitQueue<T>: channel-based queue for committed entries awaiting state machine application
- RaftPeerState: tracks replication and health state (nextIndex, matchIndex, lastContact)
- RaftNode: CommitIndex/ProcessedIndex tracking, election timer with randomized 150-300ms interval,
  peer state integration with heartbeat and replication updates
- 52 new tests across RaftApplyQueueTests, RaftElectionTimerTests, RaftHealthTests
2026-02-24 17:01:00 -05:00
Joseph Doherty
6bcd682b76 feat: add NatsRaftTransport with NATS subject routing ($NRG.*)
Implements RaftSubjects static class with Go's $NRG.* subject constants
and NatsRaftTransport which routes RAFT RPCs over those subjects using
RaftAppendEntryWire / RaftVoteRequestWire encoding. 43 tests cover all
subject patterns, wire encoding fidelity, and transport construction.
2026-02-24 06:40:41 -05:00
Joseph Doherty
2c9683e7aa feat: upgrade JetStreamService to lifecycle orchestrator
Implements enableJetStream() semantics from golang/nats-server/server/jetstream.go:414-523.

- JetStreamService.StartAsync(): validates config, creates store directory
  (including nested paths via Directory.CreateDirectory), registers all
  $JS.API.> subjects, logs startup stats; idempotent on double-start
- JetStreamService.DisposeAsync(): clears registered subjects, marks not running
- New properties: RegisteredApiSubjects, MaxStreams, MaxConsumers, MaxMemory, MaxStore
- JetStreamOptions: adds MaxStreams and MaxConsumers limits (0 = unlimited)
- FileStoreConfig: removes duplicate StoreCipher/StoreCompression enum declarations
  now that AeadEncryptor.cs owns them; updates defaults to NoCipher/NoCompression
- FileStoreOptions/FileStore: align enum member names with AeadEncryptor.cs
  (NoCipher, NoCompression, S2Compression) to fix cross-task naming conflict
- 13 new tests in JetStreamServiceOrchestrationTests covering all lifecycle paths
2026-02-24 06:03:46 -05:00
Joseph Doherty
3ff801865a feat: Waves 3-5 — FileStore, RAFT, JetStream clustering, and concurrency tests
Add comprehensive Go-parity test coverage across 3 subsystems:
- FileStore: basic CRUD, limits, purge, recovery, subjects, encryption,
  compression, MemStore (161 tests, 24 skipped for not-yet-implemented)
- RAFT: core types, wire format, election, log replication, snapshots
  (95 tests)
- JetStream Clustering: meta controller, stream/consumer replica groups,
  concurrency stress tests (90 tests)

Total: ~346 new test annotations across 17 files (+7,557 lines)
Full suite: 2,606 passing, 0 failures, 27 skipped
2026-02-23 22:55:41 -05:00
Joseph Doherty
28d379e6b7 feat: phase B distributed substrate test parity — 39 new tests across 5 subsystems
FileStore basics (4), MemStore/retention (10), RAFT election/append (16),
config reload parity (3), monitoring endpoints varz/connz/healthz (6).
972 total tests passing, 0 failures.
2026-02-23 19:41:30 -05:00
Joseph Doherty
0413ff6ae9 feat: implement strict raft consensus and convergence parity 2026-02-23 14:53:18 -05:00
Joseph Doherty
377ad4a299 feat: complete jetstream deep operational parity closure 2026-02-23 13:43:14 -05:00
Joseph Doherty
2b64d762f6 feat: execute full-repo remaining parity closure plan 2026-02-23 13:08:52 -05:00