docs: add post-baseline jetstream parity design
docs/plans/2026-02-23-jetstream-post-baseline-parity-design.md
@@ -0,0 +1,177 @@
# JetStream Post-Baseline Remaining Parity Design

**Date:** 2026-02-23
**Status:** Approved
**Scope:** Port all remaining Go JetStream functionality still marked `Baseline` or `N` in `differences.md`, including required transport prerequisites (gateway/leaf/account protocol) needed for full JetStream parity.

## 1. Architecture and Scope Boundary

### Parity closure target
The completion target is to eliminate JetStream and JetStream-required transport deltas from `differences.md` by moving remaining rows from `Baseline`/`N` to `Y` unless an explicit external blocker is documented with evidence.

### In scope (remaining parity inventory)
1. JetStream runtime stream semantics:
   - retention runtime behavior (`Limits`, `Interest`, `WorkQueue`)
   - `MaxAge` TTL pruning and `MaxMsgsPer` enforcement
   - `MaxMsgSize` reject path
   - dedupe-window semantics (bounded duplicate window, not unbounded dictionary)
   - stream config behavior for `Compression`, subject transform, republish, direct/KV toggles, sealed/delete/purge guards

2. JetStream consumer semantics:
   - full deliver-policy behavior (`All`, `Last`, `New`, `ByStartSequence`, `ByStartTime`, `LastPerSubject`)
   - `AckPolicy.All` wire/runtime semantics parity
   - `MaxDeliver` + backoff schedule + redelivery deadlines
   - flow control frames, idle heartbeats, and rate limiting
   - replay policy timing parity

3. Mirror/source advanced behavior:
   - mirror sync state tracking
   - source subject mapping
   - cross-account mirror/source behavior and auth checks

4. JetStream storage parity layers:
   - block-backed file layout
   - time-based expiry/TTL index integration
   - optional compression/encryption plumbing
   - deterministic sequence index behavior for recovery and lookup

5. RAFT/cluster semantics used by JetStream:
   - heartbeat / keepalive and election timeout behavior
   - `nextIndex` mismatch backtracking
   - snapshot transfer + install from leader
   - membership change semantics
   - durable meta/replica governance wiring for JetStream cluster control

6. JetStream-required transport prerequisites:
   - inter-server account interest protocol (`A+`/`A-`) with account-aware propagation
   - gateway advanced semantics (`_GR_.` reply remap + full interest-only behavior)
   - leaf advanced semantics (`$LDS.` loop detection + account remap rules)
   - cross-cluster JetStream forwarding path over gateway once interest semantics are correct
   - internal `JETSTREAM` client lifecycle parity (`ClientKind.JetStream` usage in runtime wiring)

### Out of scope
Gaps unrelated to JetStream parity closure (for example, route compression or non-JetStream auth callout features) remain out of scope for this plan.

## 2. Component Plan

### A. Transport/account prerequisite completion
Primary files:
- `src/NATS.Server/Gateways/GatewayConnection.cs`
- `src/NATS.Server/Gateways/GatewayManager.cs`
- `src/NATS.Server/LeafNodes/LeafConnection.cs`
- `src/NATS.Server/LeafNodes/LeafNodeManager.cs`
- `src/NATS.Server/Routes/RouteConnection.cs`
- `src/NATS.Server/Protocol/ClientCommandMatrix.cs`
- `src/NATS.Server/NatsServer.cs`
- `src/NATS.Server/Subscriptions/RemoteSubscription.cs`
- `src/NATS.Server/Subscriptions/SubList.cs`

Implementation intent:
- carry account-aware remote interest metadata end-to-end
- implement gateway reply remap contract and de-remap path
- implement leaf loop marker handling and account remap/validation

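The reply remap and de-remap path can be illustrated with a small Go sketch. It assumes a simplified subject layout of `_GR_.<cluster>.<server>.<original reply>` in which the two metadata tokens contain no dots; the exact wire layout should be taken from the Go server, and the function names here are hypothetical.

```go
package main

import (
	"errors"
	"strings"
)

const gwReplyPrefix = "_GR_."

// remapReply wraps a reply subject with origin metadata before it crosses
// a gateway. Assumed layout: _GR_.<cluster>.<server>.<original reply>.
func remapReply(cluster, server, reply string) string {
	return gwReplyPrefix + cluster + "." + server + "." + reply
}

// deRemapReply restores the original reply subject before local delivery,
// so the prefix never reaches end clients. Subjects without the prefix
// pass through unchanged; a prefixed subject with missing tokens is an error.
func deRemapReply(subject string) (string, error) {
	if !strings.HasPrefix(subject, gwReplyPrefix) {
		return subject, nil
	}
	parts := strings.SplitN(subject[len(gwReplyPrefix):], ".", 3)
	if len(parts) != 3 || parts[2] == "" {
		return "", errors.New("malformed gateway reply subject")
	}
	return parts[2], nil
}
```

Returning an error for a malformed prefixed subject, rather than delivering it as-is, keeps the "no remap leakage" contract testable at a single point.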
### B. JetStream runtime semantic completion
Primary files:
- `src/NATS.Server/JetStream/StreamManager.cs`
- `src/NATS.Server/JetStream/ConsumerManager.cs`
- `src/NATS.Server/JetStream/Consumers/PullConsumerEngine.cs`
- `src/NATS.Server/JetStream/Consumers/PushConsumerEngine.cs`
- `src/NATS.Server/JetStream/Consumers/AckProcessor.cs`
- `src/NATS.Server/JetStream/Publish/JetStreamPublisher.cs`
- `src/NATS.Server/JetStream/Publish/PublishPreconditions.cs`
- `src/NATS.Server/JetStream/Models/StreamConfig.cs`
- `src/NATS.Server/JetStream/Models/ConsumerConfig.cs`
- `src/NATS.Server/JetStream/Validation/JetStreamConfigValidator.cs`

Implementation intent:
- enforce configured policies at runtime, not just parse/model shape
- preserve Go-aligned API error codes and state transition behavior

### C. Storage and snapshot durability
Primary files:
- `src/NATS.Server/JetStream/Storage/FileStore.cs`
- `src/NATS.Server/JetStream/Storage/FileStoreBlock.cs`
- `src/NATS.Server/JetStream/Storage/FileStoreOptions.cs`
- `src/NATS.Server/JetStream/Storage/MemStore.cs`
- `src/NATS.Server/JetStream/Snapshots/StreamSnapshotService.cs`

Implementation intent:
- replace JSONL-only behavior with block-oriented store semantics
- enforce TTL pruning in store read/write paths

### D. RAFT and JetStream cluster governance
Primary files:
- `src/NATS.Server/Raft/RaftNode.cs`
- `src/NATS.Server/Raft/RaftReplicator.cs`
- `src/NATS.Server/Raft/RaftTransport.cs`
- `src/NATS.Server/Raft/RaftLog.cs`
- `src/NATS.Server/Raft/RaftSnapshotStore.cs`
- `src/NATS.Server/JetStream/Cluster/JetStreamMetaGroup.cs`
- `src/NATS.Server/JetStream/Cluster/StreamReplicaGroup.cs`
- `src/NATS.Server/JetStream/Cluster/AssetPlacementPlanner.cs`

Implementation intent:
- transition from in-memory baseline consensus behavior to networked state-machine semantics needed by cluster APIs.

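One piece of those state-machine semantics, `nextIndex` mismatch backtracking, can be sketched in Go as below. This is a simplified, in-process model of the standard Raft consistency check with hypothetical names; it uses 1-based log indices, backs off one entry per rejection, replaces the follower's whole suffix on a match, and ignores networking, terms of office, and commit tracking.

```go
package main

// logEntry carries only the term; payloads are omitted in this sketch.
type logEntry struct {
	term uint64
}

// follower holds a log addressed with 1-based indices.
type follower struct {
	log []logEntry
}

// termAt returns the term at index; index 0 is the empty prefix (term 0).
func (f *follower) termAt(index uint64) (uint64, bool) {
	if index == 0 {
		return 0, true
	}
	if index > uint64(len(f.log)) {
		return 0, false
	}
	return f.log[index-1].term, true
}

// appendEntries accepts entries only if the follower's log matches the
// leader at (prevIndex, prevTerm). On a match it drops everything after
// prevIndex and appends the leader's entries.
func (f *follower) appendEntries(prevIndex, prevTerm uint64, entries []logEntry) bool {
	t, ok := f.termAt(prevIndex)
	if !ok || t != prevTerm {
		return false
	}
	f.log = append(f.log[:prevIndex], entries...)
	return true
}

// replicate retries with a decremented nextIndex until the follower
// accepts, then returns the new nextIndex for that follower.
// nextIndex must be in the range 1..len(leaderLog)+1.
func replicate(leaderLog []logEntry, f *follower, nextIndex uint64) uint64 {
	for {
		prevIndex := nextIndex - 1
		var prevTerm uint64
		if prevIndex > 0 {
			prevTerm = leaderLog[prevIndex-1].term
		}
		if f.appendEntries(prevIndex, prevTerm, leaderLog[prevIndex:]) {
			return uint64(len(leaderLog)) + 1
		}
		nextIndex-- // mismatch: back up one entry and try again
	}
}
```

The loop always terminates because `prevIndex` 0 matches any follower; a production implementation would typically add a faster conflict hint instead of stepping back one entry per round trip.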
### E. Internal JetStream client and observability
Primary files:
- `src/NATS.Server/NatsServer.cs`
- `src/NATS.Server/InternalClient.cs`
- `src/NATS.Server/Monitoring/JszHandler.cs`
- `src/NATS.Server/Monitoring/VarzHandler.cs`
- `differences.md`

Implementation intent:
- wire internal `ClientKind.JetStream` client lifecycle where Go uses internal JS messaging paths
- ensure monitoring reflects newly enforced runtime behavior

## 3. Data Flow and Behavioral Contracts

1. Interest/account propagation:
   - local subscription updates publish account-scoped interest events to route/gateway/leaf peers
   - peers update per-account remote-interest state, not global-only state

2. Gateway reply remap:
   - outbound cross-cluster reply subjects are rewritten with `_GR_.` metadata
   - inbound responses are de-remapped before local delivery
   - no remap leakage to end clients

3. Leaf loop prevention:
   - loop marker (`$LDS.`) is injected/checked at leaf boundaries
   - looped deliveries are rejected before enqueue

4. Stream publish lifecycle:
   - validate stream policy + preconditions
   - apply dedupe-window logic
   - append to store, prune by policy, then trigger mirror/source + consumer fanout

5. Consumer delivery lifecycle:
   - compute start position from deliver policy
   - enforce max-ack-pending/rate/flow-control/backoff rules
   - track pending/acks/redelivery deterministically across pull/push engines

6. Cluster lifecycle:
   - RAFT heartbeat/election drives leader state
   - append mismatch uses next-index backtracking
   - snapshots transfer over transport and compact follower logs
   - meta-group and stream-groups use durable consensus outputs for control APIs

## 4. Error Handling, Testing, and Completion Gate

### Error handling principles
1. Keep JetStream API contract errors deterministic (validation vs state vs leadership vs storage).
2. Avoid silent downgrades from strict policy semantics to baseline fallback behavior.
3. Ensure cross-cluster remap/loop detection failures surface with protocol-safe errors and no partial state mutation.

### Test strategy
1. Unit tests for each runtime policy branch and protocol transformation.
2. Integration tests for gateway/leaf/account propagation and cross-cluster message contracts.
3. Contract tests for RAFT election, snapshot transfer, and membership transitions.
4. Parity-map tests tying Go feature inventory rows to concrete .NET tests.

### Strict completion criteria
1. Remaining JetStream/prerequisite rows in `differences.md` are either `Y` or explicitly blocked with linked evidence.
2. New behavior has deterministic test coverage at unit + integration level.
3. Focused and full suite gates pass.
4. `differences.md` and parity map are updated only after verified green evidence.