Add batch plans for batches 13-15, 18-22 (rounds 8-11)
Generated design docs and implementation plans via Codex for:
- Batch 13: FileStore Read/Query
- Batch 14: FileStore Write/Lifecycle
- Batch 15: MsgBlock + ConsumerFileStore
- Batch 18: Server Core
- Batch 19: Accounts Core
- Batch 20: Accounts Resolvers
- Batch 21: Events + MsgTrace
- Batch 22: Monitoring

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.
@@ -0,0 +1,133 @@
# Batch 14 FileStore Write/Lifecycle Design

## Context

- Batch: `14` (`FileStore Write/Lifecycle`)
- Dependency: Batch `13` (`FileStore Read/Query`)
- Scope: `76` features + `64` tests
- Go reference: `golang/nats-server/server/filestore.go` (primarily lines `4394-12549`)
- .NET target surface:
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStore.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/MessageBlock.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStoreTypes.cs`

Current implementation state:

- `JetStreamFileStore` is still a delegation shell to `JetStreamMemStore` for most `IStreamStore` behavior.
- Batch 14 methods are mostly not present yet in .NET.
- Most mapped Batch 14 tests are not implemented in backlog files and need real behavioral coverage.

## Problem Statement

Batch 14 is the first large FileStore execution batch in which the write path, retention/compaction, purge/reset, state flush, snapshot, and shutdown lifecycle all need file-backed behavior instead of memory-store delegation. If this batch is implemented without strict lock discipline and anti-stub verification, downstream batches (`15`, `36`, `37`) will inherit brittle storage behavior and unreliable test evidence.

## Clarified Constraints

- Keep Batch 14 scoped to mapped methods/tests only; do not pull Batch 15 `MessageBlock + ConsumerFileStore` work forward.
- Use evidence-backed status updates in PortTracker (`max 15 IDs` per update).
- Keep tests real: no placeholders, no always-pass assertions, no non-behavioral smoke tests.
- If a test requires unavailable runtime infrastructure, keep it `deferred` with a concrete reason instead of stubbing.

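The `max 15 IDs` limit on PortTracker updates implies that a status change covering many features has to be split into several updates. A minimal Python sketch of that splitting follows. The ID values and the `chunk_ids` helper are hypothetical and only illustrate the arithmetic; they are not PortTracker's own interface.

```python
from typing import Iterator, List, Sequence

MAX_IDS_PER_UPDATE = 15  # limit stated in the constraints above


def chunk_ids(ids: Sequence[int], size: int = MAX_IDS_PER_UPDATE) -> Iterator[List[int]]:
    """Yield consecutive slices of `ids`, each holding at most `size` entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# Hypothetical feature IDs; real IDs would come from PortTracker.
feature_ids = list(range(1000, 1076))  # 76 entries, matching the batch's feature count
batches = list(chunk_ids(feature_ids))
print([len(b) for b in batches])  # sizes of the individual updates
```

With 76 features, this yields five updates of 15 IDs and a final update of 1 ID.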
## Approaches Considered

### Approach 1 (Recommended): Vertical implementation by four feature groups with dependency-resolved test waves

Implement Batch 14 in four functional feature groups (`18 + 18 + 20 + 20`), each followed by targeted test waves that only promote tests whose feature dependencies are already implemented and verified.

Pros:

- Keeps each cycle below the complexity threshold for file-store concurrency code.
- Makes failures local and debuggable.
- Aligns naturally with mandatory build/test/status checkpoints.

Cons:

- Requires careful bookkeeping of cross-group tests.
- More commits and checkpoint overhead.

### Approach 2: Implement all 76 features first, then all 64 tests

Complete the production surface in one pass, then backfill all tests at the end.

Pros:

- Fewer context switches.

Cons:

- High risk of broad regressions discovered late.
- Weak traceability between feature status and test evidence.
- Creates pressure near the end to mark stubs as complete.

### Approach 3: Test-first only with synthetic wrappers over memstore delegation

Attempt to satisfy mapped tests through wrapper behavior while delaying real file-backed implementation.

Pros:

- Fast initial green tests.

Cons:

- Violates batch intent (real FileStore write/lifecycle parity).
- Produces fragile tests that validate wrappers, not storage behavior.
- Increases later rework and hidden defects.

## Recommended Design

### 1) Implementation topology

Use four feature groups with bounded scope:

- Group 1 (`18`): write-path foundation, per-subject totals, limits/removal entrypoints.
- Group 2 (`18`): age/scheduling loops, record/tombstone writes, block sync/select helpers.
- Group 3 (`20`): seq/read helpers, cache/state counters, purge/compact/reset and block-list mutation.
- Group 4 (`20`): purge-block/global subject info, stream-state write loop, stop/snapshot/delete-map lifecycle.

### 2) Locking and lifecycle model

- Preserve `ReaderWriterLockSlim` ownership boundaries in `JetStreamFileStore`.
- Keep timer/background loop ownership explicit (`_ageChk`, `_syncTmr`, `_qch`, `_fsld`).
- Ensure stop/flush/snapshot/delete paths are idempotent and race-safe under repeated calls.
- Treat file writes and state writes as durability boundaries; keep optional-sync behavior in explicit parity with the Go reference.

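The requirement that stop paths be idempotent and race-safe under repeated calls can be illustrated with a language-neutral sketch. The Python below is not the `JetStreamFileStore` implementation; `StoreLifecycle`, `stop`, and `_flush_state` are hypothetical names, and a plain mutex stands in for whatever lock the .NET code uses. It shows one common pattern: flip a closed flag under the lock so that only the first caller performs the final flush.

```python
import threading


class StoreLifecycle:
    """Hypothetical stand-in for the stop path of a file-backed store."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._closed = False
        self.flush_count = 0  # counts how often the final flush actually ran

    def stop(self) -> bool:
        """Stop the store once; return True only for the call that did the work."""
        with self._lock:
            if self._closed:
                return False  # repeated call: nothing left to do
            self._closed = True
        # Only the first caller gets past the flag, so the flush runs once.
        self._flush_state()
        return True

    def _flush_state(self) -> None:
        # Placeholder for the real state write; here it only counts invocations.
        self.flush_count += 1


store = StoreLifecycle()
threads = [threading.Thread(target=store.stop) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(store.flush_count)  # number of flushes after eight concurrent stop calls
```

A behavioral test for the real store could follow the same shape: call stop concurrently and repeatedly, then assert that the final state write happened exactly once.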
### 3) Test model

Implement backlog tests as real behavioral tests in:

- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamFileStoreTests.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests1.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests2.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeHandlerTests.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/RouteHandlerTests.Impltests.cs`

Create missing backlog classes for mapped tests:

- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeProxyTests.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamClusterLongTests.Impltests.cs`

### 4) Status strategy

- Features: `deferred -> stub -> complete -> verified`.
- Tests: `deferred -> stub -> verified` or `deferred` with explicit blocker reason.
- Promote only IDs that have direct Go-read + build + targeted test evidence.

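The two status flows above can be read as small state machines. The Python sketch below encodes them under two assumptions that this document does not state outright: promotions move one step at a time, and the full evidence triple (Go-read, build, targeted tests) is required specifically for the step into `verified`. `Evidence` and `can_promote` are hypothetical names, not part of PortTracker.

```python
from dataclasses import dataclass
from typing import Sequence

FEATURE_FLOW = ("deferred", "stub", "complete", "verified")
TEST_FLOW = ("deferred", "stub", "verified")


@dataclass(frozen=True)
class Evidence:
    go_read: bool                # the Go source for this ID was read directly
    build_passed: bool           # the build gate succeeded
    targeted_tests_passed: bool  # the related tests ran and passed

    def complete(self) -> bool:
        return self.go_read and self.build_passed and self.targeted_tests_passed


def can_promote(flow: Sequence[str], current: str, target: str, evidence: Evidence) -> bool:
    """Allow only the next step in `flow`; require full evidence for `verified`."""
    if current not in flow or target not in flow:
        return False
    if flow.index(target) != flow.index(current) + 1:
        return False  # assumption: no skipping and no moving backwards
    if target == "verified" and not evidence.complete():
        return False
    return True


full = Evidence(go_read=True, build_passed=True, targeted_tests_passed=True)
partial = Evidence(go_read=True, build_passed=True, targeted_tests_passed=False)
print(can_promote(FEATURE_FLOW, "complete", "verified", full))     # allowed
print(can_promote(FEATURE_FLOW, "complete", "verified", partial))  # rejected: tests missing
```

Under these assumptions, a test that cannot run stays at `deferred`, with its blocker recorded separately.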
### 5) Risk controls

- Mandatory stub scans after each feature/test wave.
- Build gate after each feature group.
- Related test gate before any `verified` promotion.
- Full checkpoint (`build + full unit tests + commit`) between groups.

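A stub scan of the kind listed above could be as simple as a pattern search over the C# sources. The Python sketch below is illustrative only. The three patterns (`NotImplementedException`, `Assert.True(true)`, `// TODO`) are assumptions about what a placeholder or always-pass assertion looks like in this codebase; they are not the project's actual scan rules, and `scan_source` is a hypothetical helper.

```python
import re

# Assumed placeholder patterns; the real scan rules may differ.
STUB_PATTERNS = {
    "not-implemented": re.compile(r"throw\s+new\s+NotImplementedException"),
    "always-pass": re.compile(r"Assert\.True\(\s*true\s*\)"),
    "todo-marker": re.compile(r"//\s*TODO", re.IGNORECASE),
}


def scan_source(text):
    """Return (line_number, label) pairs for every line matching a stub pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in STUB_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


# Hypothetical C# snippet used only to exercise the scanner.
sample = """\
public void StoreMsg() { throw new NotImplementedException(); }
[Fact] public void Writes() { Assert.True(true); }
public int Count() => _count;
"""
findings = scan_source(sample)
print(findings)  # lines 1 and 2 are flagged; line 3 is clean
```

A non-empty result after a feature or test wave would block promotion until the flagged lines are replaced with real behavior.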
## Non-Goals

- Porting Batch 15 (`MessageBlock + ConsumerFileStore`) behaviors beyond what Batch 14 methods directly require.
- Converting integration-only tests into unit tests by weakening assertions.
- Marking blocked runtime-heavy tests `verified` without executable evidence.

## Acceptance Criteria

- All 76 Batch 14 features implemented with non-stub behavior and verified evidence.
- All implementable mapped tests in Batch 14 converted to real behavioral tests and verified.
- Runtime-blocked tests remain `deferred` with concrete blocker notes.
- Batch 14 can be completed with `batch complete 14` after status/audit validation.