# JetStream Deep Operational Parity Design

**Date:** 2026-02-23
**Status:** Approved
**Scope:** Identify and close remaining JetStream deep operational parity gaps versus Go, including behavior-level semantics, storage durability, RAFT/cluster behavior, and documentation drift reconciliation.

## 1. Architecture and Scope Boundary

### Scope definition
This cycle is JetStream-focused and targets deep operational parity:

1. Stream runtime semantics
2. Consumer runtime/state machine semantics
3. Storage durability semantics
4. RAFT/network and JetStream clustering semantics
5. Documentation/evidence reconciliation

`JETSTREAM (internal)` is treated as implemented behavior (code + tests present). Any stale doc line stating it is unimplemented is handled as documentation drift, not a re-implementation target.

### Parity control model
Each feature area is tracked with a truth matrix:

1. Behavior
   - Go-equivalent runtime behavior exists in observable server operation.

2. Tests
   - Contract-positive plus negative/edge tests validate behavior and detect regressions beyond hook-level checks.

3. Docs
   - `differences.md` and parity artifacts accurately reflect validated behavior.

A feature closes only when Behavior + Tests + Docs are all complete.

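The closing rule of the truth matrix can be sketched as a small data structure. This is a minimal illustration in Python, not code from the repository: `ParityRow` and `open_features` are hypothetical names, and the three boolean columns simply mirror Behavior, Tests, and Docs as described above.

```python
from dataclasses import dataclass


@dataclass
class ParityRow:
    """One row of the truth matrix (hypothetical shape, not taken from the repo)."""
    feature: str
    behavior: bool = False  # Go-equivalent runtime behavior observed
    tests: bool = False     # positive plus negative/edge tests in place
    docs: bool = False      # differences.md and parity artifacts updated

    def closed(self) -> bool:
        # A feature closes only when all three columns are complete.
        return self.behavior and self.tests and self.docs


def open_features(rows):
    """Return the names of features that still have at least one open column."""
    return [row.feature for row in rows if not row.closed()]
```

Tracking the columns separately keeps a feature with passing tests but stale docs visibly open, which is the situation the documentation-drift work targets.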
### Ordered implementation layers
1. Stream runtime semantics
2. Consumer state machine semantics
3. Storage durability semantics
4. RAFT and cluster governance semantics
5. Documentation synchronization

## 2. Component Plan

### A. Stream runtime semantics
Primary files:
- `src/NATS.Server/JetStream/StreamManager.cs`
- `src/NATS.Server/JetStream/Models/StreamConfig.cs`
- `src/NATS.Server/JetStream/Publish/JetStreamPublisher.cs`
- `src/NATS.Server/JetStream/Publish/PublishPreconditions.cs`
- `src/NATS.Server/JetStream/Api/Handlers/StreamApiHandlers.cs`
- `src/NATS.Server/JetStream/Validation/JetStreamConfigValidator.cs`

Focus:
- retention semantics (`Limits/Interest/WorkQueue`) under live publish/delete flows
- `MaxAge`, `MaxMsgsPer`, `MaxMsgSize`, dedupe-window semantics under mixed workloads
- guard behavior (`sealed`, `deny_delete`, `deny_purge`) with contract-accurate errors
- runtime (not parse-only) behavior for transform/republish/direct-related features

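The guard behavior in the focus list can be illustrated with a small model. The sketch below is written in Python as neutral notation and is not the implementation in `StreamManager.cs`. `GuardedStream` and `GuardError` are hypothetical names, the error messages are placeholders rather than the contract's real error codes, and the treatment of `sealed` as rejecting publishes, deletes, and purges is an assumption about the reference behavior.

```python
from dataclasses import dataclass, field


class GuardError(Exception):
    """Hypothetical error type; real codes come from the JetStream API contract."""


@dataclass
class GuardedStream:
    sealed: bool = False
    deny_delete: bool = False
    deny_purge: bool = False
    messages: dict = field(default_factory=dict)  # sequence -> payload
    last_seq: int = 0

    def publish(self, payload: bytes) -> int:
        # Assumption: a sealed stream accepts no new messages.
        if self.sealed:
            raise GuardError("stream is sealed")
        self.last_seq += 1
        self.messages[self.last_seq] = payload
        return self.last_seq

    def delete(self, seq: int) -> bool:
        # The guard is checked before any lookup, so a denied delete never mutates state.
        if self.sealed or self.deny_delete:
            raise GuardError("message delete not permitted")
        if seq not in self.messages:
            return False
        del self.messages[seq]
        return True

    def purge(self) -> int:
        if self.sealed or self.deny_purge:
            raise GuardError("purge not permitted")
        removed = len(self.messages)
        self.messages.clear()
        return removed
```

A parity test built on this idea would assert both the rejection and that the stored messages are unchanged afterwards, which is what separates a runtime check from a parse-only one.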
### B. Consumer runtime/state machine semantics
Primary files:
- `src/NATS.Server/JetStream/ConsumerManager.cs`
- `src/NATS.Server/JetStream/Consumers/AckProcessor.cs`
- `src/NATS.Server/JetStream/Consumers/PullConsumerEngine.cs`
- `src/NATS.Server/JetStream/Consumers/PushConsumerEngine.cs`
- `src/NATS.Server/JetStream/Models/ConsumerConfig.cs`
- `src/NATS.Server/JetStream/Api/Handlers/ConsumerApiHandlers.cs`

Focus:
- deliver-policy start resolution and cursor transitions
- ack floor and redelivery determinism (`AckPolicy.*`, backoff, max-deliver)
- flow control, rate limiting, replay timing semantics across longer scenarios

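Ack floor and redelivery determinism can be expressed as three pure functions, which makes them easy to test without timers. This is a hedged Python sketch, not the logic in `AckProcessor.cs`. All function names are hypothetical, and two behaviors are assumptions about the reference semantics: the last backoff entry is reused once the delivery count exceeds the list, and a non-positive max-deliver means unlimited.

```python
def redelivery_delay(backoff, ack_wait, delivery_count):
    """Delay before the next redelivery, given how many deliveries have happened.

    backoff: list of delays in seconds (may be empty); ack_wait: fallback delay.
    Assumption: once delivery_count exceeds the list, the last entry is reused.
    """
    if not backoff:
        return ack_wait
    idx = min(max(delivery_count, 1), len(backoff)) - 1
    return backoff[idx]


def should_redeliver(max_deliver, delivery_count):
    """Assumption: max_deliver <= 0 means unlimited delivery attempts."""
    return max_deliver <= 0 or delivery_count < max_deliver


def advance_ack_floor(floor, acked):
    """Move the floor across contiguous acked sequences.

    `acked` is a set of acked sequences above the floor; entries absorbed
    into the floor are removed from it (the set is mutated in place).
    """
    while floor + 1 in acked:
        floor += 1
        acked.discard(floor)
    return floor
```

For example, with sequences 1, 2, and 4 acked and a floor of 0, the floor advances to 2 and sequence 4 stays pending until 3 is acked. Keeping these functions free of clocks and I/O is one way to get the determinism the focus list asks for.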
### C. Storage durability semantics
Primary files:
- `src/NATS.Server/JetStream/Storage/FileStore.cs`
- `src/NATS.Server/JetStream/Storage/FileStoreBlock.cs`
- `src/NATS.Server/JetStream/Storage/FileStoreOptions.cs`
- `src/NATS.Server/JetStream/Storage/IStreamStore.cs`
- `src/NATS.Server/JetStream/Storage/MemStore.cs`

Focus:
- durable block/index invariants under restart and prune/rewrite cycles
- compression/encryption behavior from transform stubs to parity-meaningful persistence semantics
- TTL and index consistency guarantees for large and long-running data sets

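One way to make the block/index invariants testable is a checker that compares the reported stream state with the messages actually recoverable from storage. The Python sketch below is illustrative only. `StreamState` and `check_state` are hypothetical names, and the rule that the last sequence may exceed the highest live sequence (because deleting the newest message does not rewind it) is an assumption about the reference behavior.

```python
from dataclasses import dataclass


@dataclass
class StreamState:
    first_seq: int
    last_seq: int
    messages: int
    bytes: int


def check_state(state, stored):
    """Compare reported state with live messages; return the violated fields.

    stored: dict mapping sequence -> payload size for every live message.
    """
    problems = []
    if state.messages != len(stored):
        problems.append("messages")
    if state.bytes != sum(stored.values()):
        problems.append("bytes")
    if stored:
        if state.first_seq != min(stored):
            problems.append("first_seq")
        # Assumption: last_seq never moves backwards, so it may be greater
        # than the highest live sequence but never smaller.
        if state.last_seq < max(stored):
            problems.append("last_seq")
    return problems
```

A restart/recovery test could snapshot the state before shutdown, reload the store, and assert that `check_state` returns an empty list for both the pre-restart and post-restart views.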
### D. RAFT and JetStream cluster semantics
Primary files:
- `src/NATS.Server/Raft/RaftNode.cs`
- `src/NATS.Server/Raft/RaftReplicator.cs`
- `src/NATS.Server/Raft/RaftTransport.cs`
- `src/NATS.Server/Raft/RaftRpcContracts.cs`
- `src/NATS.Server/JetStream/Cluster/JetStreamMetaGroup.cs`
- `src/NATS.Server/JetStream/Cluster/StreamReplicaGroup.cs`
- `src/NATS.Server/JetStream/Cluster/AssetPlacementPlanner.cs`
- integration touchpoints in `src/NATS.Server/NatsServer.cs`

Focus:
- move from hook-level consensus behaviors to term/quorum-driven outcomes
- snapshot transfer and membership semantics affecting real commit/placement behavior
- cross-cluster JetStream behavior validated beyond counter-style forwarding checks

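To show what a term/quorum-driven outcome means in practice, the sketch below computes a leader's commit index from follower replication progress. It follows the textbook Raft rule (majority replication plus a current-term check) and is not a description of the Go server's or `RaftNode.cs`'s exact logic. The function names and the `match_index` and `log_terms` shapes are hypothetical.

```python
def quorum_size(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def commit_index(match_index, leader_last_index, log_terms, current_term, current_commit):
    """Return the new commit index for a leader.

    match_index: dict follower id -> highest log index known replicated there.
    log_terms:   dict log index -> term of the entry stored at that index.
    """
    # The leader counts itself: its own log reaches leader_last_index.
    indexes = sorted(list(match_index.values()) + [leader_last_index], reverse=True)
    # The quorum-th highest value is replicated on at least a majority.
    candidate = indexes[quorum_size(len(indexes)) - 1]
    # Textbook Raft only commits by counting replicas for current-term entries.
    if candidate > current_commit and log_terms.get(candidate) == current_term:
        return candidate
    return current_commit
```

In a five-node group where the leader is at index 10 and followers are at 10, 8, 7, and 4, the majority-replicated index is 8. A hook-level test would only check that an append was forwarded, while a behavior-real test would assert that the commit index stays put until this quorum condition holds.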
### E. Evidence and documentation reconciliation
Primary files:
- `differences.md`
- `docs/plans/2026-02-23-jetstream-remaining-parity-map.md`
- `docs/plans/2026-02-23-jetstream-remaining-parity-verification.md`

Focus:
- remove stale contradictory lines and align notes with verified implementation state
- keep all parity claims traceable to tests and behavior evidence

## 3. Data Flow and Behavioral Contracts

1. Publish path contract
   - precondition checks occur before persistence mutation
   - stream policy outcomes are atomic from the client's perspective
   - no partial state exposure on failed publish paths

2. Consumer path contract
   - deterministic cursor initialization and progression
   - ack/redelivery/backoff semantics form a single coherent state machine
   - push/pull engines preserve contract parity under sustained load and across restart boundaries

3. Storage contract
   - persisted data and indices round-trip across restarts without sequence/index drift
   - pruning, TTL, and limit enforcement preserve state invariants (`first/last/messages/bytes`)
   - compression/encryption boundaries are reversible and version-safe

4. RAFT/cluster contract
   - append/commit behavior is consensus-gated (term/quorum aware)
   - heartbeat and snapshot mechanics drive observable follower convergence
   - placement/governance decisions reflect committed cluster state

5. Documentation contract
   - JetStream table rows and summary notes in `differences.md` must agree
   - `JETSTREAM (internal)` status remains `Y` with explicit verification evidence

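The publish path contract above amounts to a validate-then-mutate structure: every precondition is evaluated before any state changes, so a rejected publish leaves nothing behind. The Python sketch below illustrates that shape under stated assumptions. `StreamSketch` and `PublishRejected` are hypothetical names, the error strings are placeholders, dedupe-window expiry is omitted, and returning the original sequence for a duplicate message ID is an assumption about the reference behavior.

```python
class PublishRejected(Exception):
    """Hypothetical error type standing in for contract-specific error codes."""


class StreamSketch:
    def __init__(self, max_msg_size=-1):
        self.max_msg_size = max_msg_size  # negative means no size limit
        self.last_seq = 0
        self.messages = {}   # sequence -> payload
        self.seen_ids = {}   # message id -> sequence (window expiry omitted)

    def publish(self, payload, msg_id=None, expected_last_seq=None):
        """Return (sequence, duplicate) or raise PublishRejected."""
        # Phase 1: checks only. Nothing below this comment mutates state
        # until every precondition has passed.
        if msg_id is not None and msg_id in self.seen_ids:
            return self.seen_ids[msg_id], True
        if expected_last_seq is not None and expected_last_seq != self.last_seq:
            raise PublishRejected("wrong last sequence")
        if self.max_msg_size >= 0 and len(payload) > self.max_msg_size:
            raise PublishRejected("message exceeds maximum size")

        # Phase 2: mutation. Reached only when the publish will succeed.
        seq = self.last_seq + 1
        self.messages[seq] = payload
        self.last_seq = seq
        if msg_id is not None:
            self.seen_ids[msg_id] = seq
        return seq, False
```

The property worth testing is that a failed publish with a new message ID does not record that ID, so a later valid retry with the same ID is stored instead of being treated as a duplicate.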
## 4. Error Handling, Testing Strategy, and Completion Gates

### Error handling
1. Keep JetStream-specific error semantics and codes intact.
2. Fail closed on durability/consensus invariant breaches.
3. Reject partial cluster mutations when consensus prerequisites fail.

### Test strategy
1. Per feature area: contract-positive plus edge/failure tests.
2. Persistence features: restart/recovery tests are mandatory.
3. Replace hook-level “counter” tests with behavior-real integration tests for deep semantics.
4. Keep targeted suites per layer plus cross-layer integration scenarios.

### Completion gates
1. Behavior gate: deep JetStream operational parity gaps closed or explicitly blocked with evidence.
2. Test gate: focused suites and full suite pass.
3. Docs gate: parity docs reflect actual validated behavior; stale contradictions removed.
4. Drift gate: explicit verification that internal JetStream client remains implemented and documented as `Y`.