Files
natsdotnet/docs/plans/2026-02-24-structuregaps-full-parity-design.md
Joseph Doherty f1e42f1b5f docs: add full Go parity design for all 15 structure gaps
5-track parallel architecture (Storage, Consensus, Protocol,
Networking, Services) covering all CRITICAL/HIGH/MEDIUM gaps
identified in structuregaps.md. Feature-first approach with
test_parity.db updates. Targets ~1,194 additional Go test mappings.
2026-02-24 11:57:15 -05:00

322 lines
13 KiB
Markdown

# Full Go Parity: All 15 Structure Gaps — Design Document
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:writing-plans to create the implementation plan from this design.
**Goal:** Port all missing functionality identified in `docs/structuregaps.md` (15 gaps, CRITICAL through MEDIUM) from the Go NATS server to the .NET port, achieving full behavioral parity. Update `docs/test_parity.db` as each Go test is ported.
**Architecture:** 5 parallel implementation tracks organized by dependency. Tracks A (Storage), D (Networking), and E (Services) are independent and start immediately. Track C (Protocol) depends on Track A. Track B (Consensus) depends on Tracks A + C. Each track builds features first, then ports corresponding Go tests.
**Approach:** Feature-first — build each missing feature, then port its Go tests as validation. Bottom-up dependency ordering ensures foundations are solid before integration.
**Estimated impact:** ~1,194 additional Go tests mapped (859 → ~2,053, from 29% to ~70%).
---
## Track A: Storage (FileStore Block Management)
**Gap #1 — CRITICAL, 20.8x gap factor**
**Go**: `filestore.go` (12,593 lines) | **NET**: `FileStore.cs` (607 lines)
**Dependencies**: None (starts immediately)
**Tests to port**: ~159 from `filestore_test.go`
### Features
1. **Message Blocks** — 65KB+ blocks with per-block index files. Header: magic, version, first/last sequence, message count, byte count. New block on size limit.
2. **Block Index** — Per-block index mapping sequence number → (offset, length). Enables O(1) lookups.
3. **S2 Compression Integration** — Wire existing `S2Codec.cs` into block writes (compress on flush) and reads (decompress on load).
4. **AEAD Encryption Integration** — Wire existing `AeadEncryptor.cs` into block lifecycle. Per-block encryption keys with rotation on seal.
5. **Crash Recovery** — Scan block directory on startup, validate checksums, rebuild indexes from raw data.
6. **Tombstone/Deletion Tracking** — Sparse sequence sets (using `SequenceSet.cs`) for deleted messages. Purge by subject and sequence range.
7. **Write Cache** — In-memory buffer for active (unsealed) block. Configurable max block count.
8. **Atomic File Operations** — Write-to-temp + rename for crash-safe block sealing.
9. **TTL Scheduling Recovery** — Reconstruct pending TTL expirations from blocks on restart, register with `HashWheel`.
### Key Files
| Action | File | Notes |
|--------|------|-------|
| Rewrite | `src/NATS.Server/JetStream/Storage/FileStore.cs` | Block-based architecture |
| Create | `src/NATS.Server/JetStream/Storage/MessageBlock.cs` | Block abstraction |
| Create | `src/NATS.Server/JetStream/Storage/BlockIndex.cs` | Per-block index |
| Modify | `src/NATS.Server/JetStream/Storage/S2Codec.cs` | Wire into block lifecycle |
| Modify | `src/NATS.Server/JetStream/Storage/AeadEncryptor.cs` | Per-block key management |
| Tests | `tests/.../JetStream/Storage/FileStoreBlockTests.cs` | New + expanded |
---
## Track B: Consensus (RAFT + JetStream Cluster)
**Gap #8 — MEDIUM, 4.4x | Gap #2 — CRITICAL, 213x**
**Go**: `raft.go` (5,037) + `jetstream_cluster.go` (10,887) | **NET**: 1,136 + 51 lines
**Dependencies**: Tracks A and C must merge first
**Tests to port**: ~85 raft + ~358 cluster + ~47 super-cluster = ~490
### Phase B1: RAFT Enhancements
1. **InstallSnapshot** — Chunked streaming snapshot transfer. Follower receives chunks, applies partial state, catches up from log.
2. **Membership Changes**`ProposeAddPeer`/`ProposeRemovePeer` with single-server changes (matching Go's approach).
3. **Pre-vote Protocol** — Candidate must get pre-vote approval before incrementing term. Prevents disruptive elections from partitioned nodes.
4. **Log Compaction** — Truncate RAFT log after snapshot. Track last applied index.
5. **Healthy Node Classification** — Current/catching-up/leaderless states.
6. **Campaign Timeout Management** — Randomized election delays.
### Phase B2: JetStream Cluster Coordination
1. **Assignment Tracking**`StreamAssignment` and `ConsumerAssignment` types via RAFT proposals. Records: stream config, replica group, placement constraints.
2. **RAFT Proposal Workflow** — Leader validates → proposes to meta-group → on commit, all nodes apply → assigned nodes start stream/consumer.
3. **Placement Engine** — Unique nodes, tag matching, cluster affinity. Expands `AssetPlacementPlanner.cs`.
4. **Inflight Deduplication** — Track pending proposals to prevent duplicates during leader transitions.
5. **Peer Remove & Stream Move** — Data rebalancing when a peer is removed.
6. **Step-down & Leadership Transfer** — Graceful leader handoff.
7. **Per-Stream RAFT Groups** — Separate RAFT group per stream for message replication.
### Key Files
| Action | File | Notes |
|--------|------|-------|
| Modify | `src/NATS.Server/Raft/RaftNode.cs` | Snapshot, membership, pre-vote, compaction |
| Create | `src/NATS.Server/Raft/RaftSnapshot.cs` | Streaming snapshot |
| Create | `src/NATS.Server/Raft/RaftMembership.cs` | Peer add/remove |
| Rewrite | `src/.../JetStream/Cluster/JetStreamMetaGroup.cs` | 51 → ~2,000+ lines |
| Create | `src/.../JetStream/Cluster/StreamAssignment.cs` | Assignment type |
| Create | `src/.../JetStream/Cluster/ConsumerAssignment.cs` | Assignment type |
| Create | `src/.../JetStream/Cluster/PlacementEngine.cs` | Topology-aware placement |
| Modify | `src/.../JetStream/Cluster/StreamReplicaGroup.cs` | Coordination logic |
---
## Track C: Protocol (Client, Consumer, JetStream API, Mirrors/Sources)
**Gaps #5, #4, #7, #3 — all HIGH**
**Dependencies**: C1/C3 independent; C2 needs C3; C4 needs Track A
**Tests to port**: ~43 client + ~134 consumer + ~184 jetstream + ~30 mirror = ~391
### C1: Client Protocol Handling (Gap #5, 7.3x)
1. Adaptive read buffer tuning (512→65536 based on throughput)
2. Write buffer pooling with flush coalescing
3. Per-client trace level
4. Full CLIENT/ROUTER/GATEWAY/LEAF/SYSTEM protocol dispatch
5. Slow consumer detection and eviction
6. Max control line enforcement (4096 bytes)
7. Write timeout with partial flush recovery
### C2: Consumer Delivery Engines (Gap #4, 13.3x)
1. NAK and redelivery tracking with exponential backoff schedules
2. Pending request queue for pull consumers with flow control
3. Max-deliveries enforcement (drop/reject/dead-letter)
4. Priority group pinning (sticky consumer assignment)
5. Idle heartbeat generation
6. Pause/resume state with advisory events
7. Filter subject skip tracking
8. Per-message redelivery delay arrays (backoff schedules)
### C3: JetStream API Layer (Gap #7, 7.7x)
1. Leader forwarding for non-leader API requests
2. Stream/consumer info caching with generation invalidation
3. Snapshot/restore API endpoints
4. Purge with subject filter, keep-N, sequence-based
5. Consumer pause/resume API
6. Advisory event publication for API operations
7. Account resource tracking (storage, streams, consumers)
### C4: Stream Mirrors, Sources & Transforms (Gap #3, 16.3x)
1. Mirror synchronization loop (continuous pull, apply locally)
2. Source/mirror ephemeral consumer setup with position tracking
3. Retry with exponential backoff and jitter
4. Deduplication window (`Nats-Msg-Id` header tracking)
5. Purge operations (subject filter, sequence-based, keep-N)
6. Stream snapshot and restore
### Key Files
| Action | File | Notes |
|--------|------|-------|
| Modify | `NatsClient.cs` | Adaptive buffers, slow consumer, trace |
| Modify | `PushConsumerEngine.cs` | Major expansion |
| Modify | `PullConsumerEngine.cs` | Major expansion |
| Create | `.../Consumers/RedeliveryTracker.cs` | NAK/redelivery state |
| Create | `.../Consumers/PriorityGroupManager.cs` | Priority pinning |
| Modify | `JetStreamApiRouter.cs` | Leader forwarding |
| Modify | `StreamApiHandlers.cs` | Purge, snapshot |
| Modify | `ConsumerApiHandlers.cs` | Pause/resume |
| Rewrite | `MirrorCoordinator.cs` | 22 → ~500+ lines |
| Rewrite | `SourceCoordinator.cs` | 36 → ~500+ lines |
---
## Track D: Networking (Gateway, Leaf Node, Routes)
**Gaps #11, #12, #13 — all MEDIUM**
**Dependencies**: None (starts immediately)
**Tests to port**: ~61 gateway + ~59 leafnode + ~39 routes = ~159
### D1: Gateway Bridging (Gap #11, 6.7x)
1. Interest-only mode (flood → interest switch)
2. Account-specific gateway routes
3. Reply mapper expansion (`_GR_.` prefix)
4. Outbound connection pooling (default 3)
5. Gateway TLS mutual auth
6. Message trace through gateways
7. Reconnection with exponential backoff
### D2: Leaf Node Connections (Gap #12, 6.7x)
1. Solicited leaf connection management with retry/reconnect
2. Hub-spoke subject filtering
3. JetStream domain awareness
4. Account-scoped leaf connections
5. Leaf compression negotiation (S2)
6. Dynamic subscription interest updates
7. Loop detection refinement (`$LDS.` prefix)
### D3: Route Clustering (Gap #13, 5.7x)
1. Route pooling (configurable, default 3)
2. Account-specific dedicated routes
3. Route compression (wire `RouteCompressionCodec.cs`)
4. Solicited route connections with discovery
5. Route permission enforcement
6. Dynamic route add/remove without restart
7. Gossip-based topology discovery
### Key Files
| Action | File | Notes |
|--------|------|-------|
| Modify | `GatewayManager.cs`, `GatewayConnection.cs` | Interest-only, pooling |
| Create | `GatewayInterestTracker.cs` | Interest tracking |
| Modify | `ReplyMapper.cs` | Full `_GR_.` handling |
| Modify | `LeafNodeManager.cs`, `LeafConnection.cs` | Solicited, JetStream |
| Modify | `LeafLoopDetector.cs` | `$LDS.` refinement |
| Modify | `RouteManager.cs`, `RouteConnection.cs` | Pooling, permissions |
| Create | `RoutePool.cs` | Connection pool |
| Modify | `RouteCompressionCodec.cs` | Wire into connection |
---
## Track E: Services (MQTT, Accounts, Config, WebSocket, Monitoring)
**Gaps #6 (HIGH), #9, #10, #14, #15 (MEDIUM)**
**Dependencies**: None (starts immediately)
**Tests to port**: ~59 mqtt + ~34 accounts + ~105 config + ~53 websocket + ~118 monitoring = ~369
### E1: MQTT Protocol (Gap #6, 10.9x)
1. Session persistence with JetStream-backed ClientID mapping
2. Will message handling
3. QoS 1/2 tracking with packet ID mapping and retry
4. Retained messages (per-account JetStream stream)
5. MQTT wildcard translation (`+``*`, `#``>`, `/``.`)
6. Session flapper detection with backoff
7. MaxAckPending enforcement
8. CONNECT packet validation and version negotiation
### E2: Account Management (Gap #9, 13x)
1. Service/stream export whitelist enforcement
2. Service import with weighted destination selection
3. Cycle detection for import chains
4. Response tracking (request-reply latency)
5. Account-level JetStream limits
6. Client tracking per account with eviction
7. Weighted subject mappings for traffic shaping
8. System account with `$SYS.>` handling
### E3: Configuration & Hot Reload (Gap #14, 2.7x)
1. SIGHUP signal handling (`PosixSignalRegistration`)
2. Auth change propagation (disconnect invalidated clients)
3. TLS certificate reloading for rotation
4. JetStream config changes at runtime
5. Logger reconfiguration without restart
6. Account list updates with connection cleanup
### E4: WebSocket Support (Gap #15, 1.3x)
1. WebSocket-specific TLS configuration
2. Origin checking refinement
3. `permessage-deflate` compression negotiation
4. JWT auth through WebSocket upgrade
### E5: Monitoring & Events (Gap #10, 3.5x)
1. Full system event payloads (connect/disconnect/auth)
2. Message trace propagation through full pipeline
3. Closed connection tracking (ring buffer for `/connz`)
4. Account-scoped monitoring (`/connz?acc=ACCOUNT`)
5. Sort options for monitoring endpoints
### Key Files
| Action | File | Notes |
|--------|------|-------|
| Modify | All `Mqtt/` files | Major expansion |
| Modify | `Account.cs`, `AuthService.cs` | Import/export, limits |
| Create | `AccountImportExport.cs` | Import/export logic |
| Create | `AccountLimits.cs` | Per-account JetStream limits |
| Modify | `ConfigReloader.cs` | Signal handling, auth propagation |
| Modify | `WebSocket/` files | TLS, compression, JWT |
| Modify | Monitoring handlers | Events, trace, connz |
| Modify | `MessageTraceContext.cs` | 22 → ~200+ lines |
---
## DB Update Protocol
For every Go test ported:
```sql
UPDATE go_tests
SET status='mapped',
dotnet_test='<DotNetTestMethodName>',
dotnet_file='<DotNetTestFile.cs>',
notes='Ported from <GoFunctionName> in <go_file>:<line>'
WHERE go_file='<go_test_file>' AND go_test='<GoTestName>';
```
For Go tests that cannot be ported (e.g., `signal_test.go` on .NET):
```sql
UPDATE go_tests
SET status='not_applicable',
notes='<reason: e.g., Unix signal handling not applicable to .NET>'
WHERE go_file='<go_test_file>' AND go_test='<GoTestName>';
```
Batch DB updates at the end of each sub-phase to avoid per-test overhead.
---
## Execution Order
```
Week 1-2: Tracks A, D, E start in parallel (3 worktrees)
Week 2-3: Track C starts (after Track A merges for C4)
Week 3-4: Track B starts (after Tracks A + C merge)
Merge order: A → D → E → C → B → main
```
## Task Dependencies
| Task | Track | Blocked By |
|------|-------|------------|
| #3 | A: Storage | (none) |
| #6 | D: Networking | (none) |
| #7 | E: Services | (none) |
| #5 | C: Protocol | #3 (for C4 mirrors) |
| #4 | B: Consensus | #3, #5 |
## Success Criteria
- All 15 gaps from `structuregaps.md` addressed
- ~1,194 additional Go tests mapped in `test_parity.db`
- Mapped ratio: 29% → ~70%
- All new tests passing (`dotnet test` green)
- Feature-first: each feature validated by its corresponding Go tests