# Full Go Parity: All 15 Structure Gaps — Design Document

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:writing-plans to create the implementation plan from this design.

**Goal:** Port all missing functionality identified in `docs/structuregaps.md` (15 gaps, CRITICAL through MEDIUM) from the Go NATS server to the .NET port, achieving full behavioral parity. Update `docs/test_parity.db` as each Go test is ported.

**Architecture:** 5 parallel implementation tracks organized by dependency. Tracks A (Storage), D (Networking), and E (Services) are independent and start immediately. Track C (Protocol) depends on Track A. Track B (Consensus) depends on Tracks A + C. Each track builds features first, then ports the corresponding Go tests.

**Approach:** Feature-first — build each missing feature, then port its Go tests as validation. Bottom-up dependency ordering ensures foundations are solid before integration.

**Estimated impact:** ~1,194 additional Go tests mapped (859 → ~2,053, from 29% to ~70%).

---

## Track A: Storage (FileStore Block Management)

**Gap #1 — CRITICAL, 20.8x gap factor**
**Go**: `filestore.go` (12,593 lines) | **NET**: `FileStore.cs` (607 lines)
**Dependencies**: None (starts immediately)
**Tests to port**: ~159 from `filestore_test.go`

### Features

1. **Message Blocks** — 65KB+ blocks with per-block index files. Header: magic, version, first/last sequence, message count, byte count. New block on size limit.
2. **Block Index** — Per-block index mapping sequence number → (offset, length). Enables O(1) lookups.
3. **S2 Compression Integration** — Wire existing `S2Codec.cs` into block writes (compress on flush) and reads (decompress on load).
4. **AEAD Encryption Integration** — Wire existing `AeadEncryptor.cs` into block lifecycle. Per-block encryption keys with rotation on seal.
5. **Crash Recovery** — Scan block directory on startup, validate checksums, rebuild indexes from raw data.
6. **Tombstone/Deletion Tracking** — Sparse sequence sets (using `SequenceSet.cs`) for deleted messages. Purge by subject and sequence range.
7. **Write Cache** — In-memory buffer for active (unsealed) block. Configurable max block count.
8. **Atomic File Operations** — Write-to-temp + rename for crash-safe block sealing.
9. **TTL Scheduling Recovery** — Reconstruct pending TTL expirations from blocks on restart, register with `HashWheel`.

### Key Files

| Action | File | Notes |
|--------|------|-------|
| Rewrite | `src/NATS.Server/JetStream/Storage/FileStore.cs` | Block-based architecture |
| Create | `src/NATS.Server/JetStream/Storage/MessageBlock.cs` | Block abstraction |
| Create | `src/NATS.Server/JetStream/Storage/BlockIndex.cs` | Per-block index |
| Modify | `src/NATS.Server/JetStream/Storage/S2Codec.cs` | Wire into block lifecycle |
| Modify | `src/NATS.Server/JetStream/Storage/AeadEncryptor.cs` | Per-block key management |
| Tests | `tests/.../JetStream/Storage/FileStoreBlockTests.cs` | New + expanded |

---

## Track B: Consensus (RAFT + JetStream Cluster)

**Gap #8 — MEDIUM, 4.4x | Gap #2 — CRITICAL, 213x**
**Go**: `raft.go` (5,037) + `jetstream_cluster.go` (10,887) | **NET**: 1,136 + 51 lines
**Dependencies**: Tracks A and C must merge first
**Tests to port**: ~85 raft + ~358 cluster + ~47 super-cluster = ~490

### Phase B1: RAFT Enhancements

1. **InstallSnapshot** — Chunked streaming snapshot transfer. Follower receives chunks, applies partial state, catches up from log.
2. **Membership Changes** — `ProposeAddPeer`/`ProposeRemovePeer` with single-server changes (matching Go's approach).
3. **Pre-vote Protocol** — Candidate must get pre-vote approval before incrementing term. Prevents disruptive elections from partitioned nodes.
4. **Log Compaction** — Truncate RAFT log after snapshot. Track last applied index.
5. **Healthy Node Classification** — Current/catching-up/leaderless states.
6. **Campaign Timeout Management** — Randomized election delays.
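The pre-vote protocol above is the one piece of Phase B1 that changes election semantics rather than just adding machinery, so a sketch helps. The following is a minimal, self-contained Go model of the idea — written in Go to mirror the source server's language; the `peer`, `grantsPreVote`, and `runPreVote` names are illustrative assumptions, not the actual `raft.go` or `RaftNode.cs` API:

```go
package main

import "fmt"

// peer models the minimal state another node exposes to a pre-vote request.
type peer struct {
	term         uint64 // peer's current term
	lastLogIndex uint64 // peer's last log index
}

// grantsPreVote reports whether this peer would grant a pre-vote to a
// candidate with the given prospective term and last log index. It refuses
// if the peer is already on a newer term, or if the candidate's log is
// behind (electing it could discard committed entries).
func (p peer) grantsPreVote(candidateTerm, candidateLastIndex uint64) bool {
	return candidateTerm >= p.term && candidateLastIndex >= p.lastLogIndex
}

// runPreVote polls peers with a *prospective* term (currentTerm+1) without
// persisting or incrementing the real term. Only a majority of grants
// (self included) permits a real election — so a partitioned node that
// keeps campaigning cannot inflate terms on the healthy quorum.
func runPreVote(selfTerm, selfLastIndex uint64, peers []peer) bool {
	prospective := selfTerm + 1 // not written to stable storage yet
	granted := 1                // implicit vote for self
	for _, p := range peers {
		if p.grantsPreVote(prospective, selfLastIndex) {
			granted++
		}
	}
	return granted > (len(peers)+1)/2
}

func main() {
	// Up-to-date node in a 3-node cluster: pre-vote succeeds.
	fmt.Println(runPreVote(5, 100, []peer{{term: 5, lastLogIndex: 100}, {term: 5, lastLogIndex: 90}}))
	// Stale, partitioned node with an old log: pre-vote fails, so it
	// never bumps its term and cannot force the leader to step down.
	fmt.Println(runPreVote(5, 40, []peer{{term: 7, lastLogIndex: 100}, {term: 7, lastLogIndex: 100}}))
}
```

The key invariant to preserve in the port: the prospective term is never persisted during the pre-vote round, so a failed campaign leaves cluster state untouched.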
### Phase B2: JetStream Cluster Coordination

1. **Assignment Tracking** — `StreamAssignment` and `ConsumerAssignment` types via RAFT proposals. Records: stream config, replica group, placement constraints.
2. **RAFT Proposal Workflow** — Leader validates → proposes to meta-group → on commit, all nodes apply → assigned nodes start stream/consumer.
3. **Placement Engine** — Unique nodes, tag matching, cluster affinity. Expands `AssetPlacementPlanner.cs`.
4. **Inflight Deduplication** — Track pending proposals to prevent duplicates during leader transitions.
5. **Peer Remove & Stream Move** — Data rebalancing when a peer is removed.
6. **Step-down & Leadership Transfer** — Graceful leader handoff.
7. **Per-Stream RAFT Groups** — Separate RAFT group per stream for message replication.

### Key Files

| Action | File | Notes |
|--------|------|-------|
| Modify | `src/NATS.Server/Raft/RaftNode.cs` | Snapshot, membership, pre-vote, compaction |
| Create | `src/NATS.Server/Raft/RaftSnapshot.cs` | Streaming snapshot |
| Create | `src/NATS.Server/Raft/RaftMembership.cs` | Peer add/remove |
| Rewrite | `src/.../JetStream/Cluster/JetStreamMetaGroup.cs` | 51 → ~2,000+ lines |
| Create | `src/.../JetStream/Cluster/StreamAssignment.cs` | Assignment type |
| Create | `src/.../JetStream/Cluster/ConsumerAssignment.cs` | Assignment type |
| Create | `src/.../JetStream/Cluster/PlacementEngine.cs` | Topology-aware placement |
| Modify | `src/.../JetStream/Cluster/StreamReplicaGroup.cs` | Coordination logic |

---

## Track C: Protocol (Client, Consumer, JetStream API, Mirrors/Sources)

**Gaps #5, #4, #7, #3 — all HIGH**
**Dependencies**: C1/C3 independent; C2 needs C3; C4 needs Track A
**Tests to port**: ~43 client + ~134 consumer + ~184 jetstream + ~30 mirror = ~391

### C1: Client Protocol Handling (Gap #5, 7.3x)

1. Adaptive read buffer tuning (512 → 65536 bytes based on throughput)
2. Write buffer pooling with flush coalescing
3. Per-client trace level
4. Full CLIENT/ROUTER/GATEWAY/LEAF/SYSTEM protocol dispatch
5. Slow consumer detection and eviction
6. Max control line enforcement (4096 bytes)
7. Write timeout with partial flush recovery

### C2: Consumer Delivery Engines (Gap #4, 13.3x)

1. NAK and redelivery tracking with exponential backoff schedules
2. Pending request queue for pull consumers with flow control
3. Max-deliveries enforcement (drop/reject/dead-letter)
4. Priority group pinning (sticky consumer assignment)
5. Idle heartbeat generation
6. Pause/resume state with advisory events
7. Filter subject skip tracking
8. Per-message redelivery delay arrays (backoff schedules)

### C3: JetStream API Layer (Gap #7, 7.7x)

1. Leader forwarding for non-leader API requests
2. Stream/consumer info caching with generation invalidation
3. Snapshot/restore API endpoints
4. Purge with subject filter, keep-N, sequence-based
5. Consumer pause/resume API
6. Advisory event publication for API operations
7. Account resource tracking (storage, streams, consumers)

### C4: Stream Mirrors, Sources & Transforms (Gap #3, 16.3x)

1. Mirror synchronization loop (continuous pull, apply locally)
2. Source/mirror ephemeral consumer setup with position tracking
3. Retry with exponential backoff and jitter
4. Deduplication window (`Nats-Msg-Id` header tracking)
5. Purge operations (subject filter, sequence-based, keep-N)
6. Stream snapshot and restore

### Key Files

| Action | File | Notes |
|--------|------|-------|
| Modify | `NatsClient.cs` | Adaptive buffers, slow consumer, trace |
| Modify | `PushConsumerEngine.cs` | Major expansion |
| Modify | `PullConsumerEngine.cs` | Major expansion |
| Create | `.../Consumers/RedeliveryTracker.cs` | NAK/redelivery state |
| Create | `.../Consumers/PriorityGroupManager.cs` | Priority pinning |
| Modify | `JetStreamApiRouter.cs` | Leader forwarding |
| Modify | `StreamApiHandlers.cs` | Purge, snapshot |
| Modify | `ConsumerApiHandlers.cs` | Pause/resume |
| Rewrite | `MirrorCoordinator.cs` | 22 → ~500+ lines |
| Rewrite | `SourceCoordinator.cs` | 36 → ~500+ lines |

---

## Track D: Networking (Gateway, Leaf Node, Routes)

**Gaps #11, #12, #13 — all MEDIUM**
**Dependencies**: None (starts immediately)
**Tests to port**: ~61 gateway + ~59 leafnode + ~39 routes = ~159

### D1: Gateway Bridging (Gap #11, 6.7x)

1. Interest-only mode (flood → interest switch)
2. Account-specific gateway routes
3. Reply mapper expansion (`_GR_.` prefix)
4. Outbound connection pooling (default 3)
5. Gateway TLS mutual auth
6. Message trace through gateways
7. Reconnection with exponential backoff

### D2: Leaf Node Connections (Gap #12, 6.7x)

1. Solicited leaf connection management with retry/reconnect
2. Hub-spoke subject filtering
3. JetStream domain awareness
4. Account-scoped leaf connections
5. Leaf compression negotiation (S2)
6. Dynamic subscription interest updates
7. Loop detection refinement (`$LDS.` prefix)

### D3: Route Clustering (Gap #13, 5.7x)

1. Route pooling (configurable, default 3)
2. Account-specific dedicated routes
3. Route compression (wire `RouteCompressionCodec.cs`)
4. Solicited route connections with discovery
5. Route permission enforcement
6. Dynamic route add/remove without restart
7. Gossip-based topology discovery

### Key Files

| Action | File | Notes |
|--------|------|-------|
| Modify | `GatewayManager.cs`, `GatewayConnection.cs` | Interest-only, pooling |
| Create | `GatewayInterestTracker.cs` | Interest tracking |
| Modify | `ReplyMapper.cs` | Full `_GR_.` handling |
| Modify | `LeafNodeManager.cs`, `LeafConnection.cs` | Solicited, JetStream |
| Modify | `LeafLoopDetector.cs` | `$LDS.` refinement |
| Modify | `RouteManager.cs`, `RouteConnection.cs` | Pooling, permissions |
| Create | `RoutePool.cs` | Connection pool |
| Modify | `RouteCompressionCodec.cs` | Wire into connection |

---

## Track E: Services (MQTT, Accounts, Config, WebSocket, Monitoring)

**Gaps #6 (HIGH), #9, #10, #14, #15 (MEDIUM)**
**Dependencies**: None (starts immediately)
**Tests to port**: ~59 mqtt + ~34 accounts + ~105 config + ~53 websocket + ~118 monitoring = ~369

### E1: MQTT Protocol (Gap #6, 10.9x)

1. Session persistence with JetStream-backed ClientID mapping
2. Will message handling
3. QoS 1/2 tracking with packet ID mapping and retry
4. Retained messages (per-account JetStream stream)
5. MQTT wildcard translation (`+` → `*`, `#` → `>`, `/` → `.`)
6. Session flapper detection with backoff
7. MaxAckPending enforcement
8. CONNECT packet validation and version negotiation

### E2: Account Management (Gap #9, 13x)

1. Service/stream export whitelist enforcement
2. Service import with weighted destination selection
3. Cycle detection for import chains
4. Response tracking (request-reply latency)
5. Account-level JetStream limits
6. Client tracking per account with eviction
7. Weighted subject mappings for traffic shaping
8. System account with `$SYS.>` handling

### E3: Configuration & Hot Reload (Gap #14, 2.7x)

1. SIGHUP signal handling (`PosixSignalRegistration`)
2. Auth change propagation (disconnect invalidated clients)
3. TLS certificate reloading for rotation
4. JetStream config changes at runtime
5. Logger reconfiguration without restart
6. Account list updates with connection cleanup

### E4: WebSocket Support (Gap #15, 1.3x)

1. WebSocket-specific TLS configuration
2. Origin checking refinement
3. `permessage-deflate` compression negotiation
4. JWT auth through WebSocket upgrade

### E5: Monitoring & Events (Gap #10, 3.5x)

1. Full system event payloads (connect/disconnect/auth)
2. Message trace propagation through full pipeline
3. Closed connection tracking (ring buffer for `/connz`)
4. Account-scoped monitoring (`/connz?acc=ACCOUNT`)
5. Sort options for monitoring endpoints

### Key Files

| Action | File | Notes |
|--------|------|-------|
| Modify | All `Mqtt/` files | Major expansion |
| Modify | `Account.cs`, `AuthService.cs` | Import/export, limits |
| Create | `AccountImportExport.cs` | Import/export logic |
| Create | `AccountLimits.cs` | Per-account JetStream limits |
| Modify | `ConfigReloader.cs` | Signal handling, auth propagation |
| Modify | `WebSocket/` files | TLS, compression, JWT |
| Modify | Monitoring handlers | Events, trace, connz |
| Modify | `MessageTraceContext.cs` | 22 → ~200+ lines |

---

## DB Update Protocol

For every Go test ported:

```sql
UPDATE go_tests
SET status='mapped', dotnet_test='', dotnet_file='', notes='Ported from in :'
WHERE go_file='' AND go_test='';
```

For Go tests that cannot be ported (e.g., `signal_test.go` on .NET):

```sql
UPDATE go_tests
SET status='not_applicable', notes=''
WHERE go_file='' AND go_test='';
```

Batch DB updates at the end of each sub-phase to avoid per-test overhead.
---

## Execution Order

```
Week 1-2: Tracks A, D, E start in parallel (3 worktrees)
Week 2-3: Track C starts (after Track A merges for C4)
Week 3-4: Track B starts (after Tracks A + C merge)

Merge order: A → D → E → C → B → main
```

## Task Dependencies

| Task | Track | Blocked By |
|------|-------|------------|
| #3 | A: Storage | (none) |
| #6 | D: Networking | (none) |
| #7 | E: Services | (none) |
| #5 | C: Protocol | #3 (for C4 mirrors) |
| #4 | B: Consensus | #3, #5 |

## Success Criteria

- All 15 gaps from `structuregaps.md` addressed
- ~1,194 additional Go tests mapped in `test_parity.db`
- Mapped ratio: 29% → ~70%
- All new tests passing (`dotnet test` green)
- Feature-first: each feature validated by its corresponding Go tests