Batch 14 FileStore Write/Lifecycle Design
Context
- Batch: `14` (FileStore Write/Lifecycle)
- Dependency: Batch `13` (FileStore Read/Query)
- Scope: `76` features + `64` tests
- Go reference: `golang/nats-server/server/filestore.go` (primarily lines `4394-12549`)
- .NET target surface:
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStore.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/MessageBlock.cs`
  - `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStoreTypes.cs`
Current implementation state:
- `JetStreamFileStore` is still a delegation shell to `JetStreamMemStore` for most `IStreamStore` behavior.
- Batch 14 methods are mostly not present yet in .NET.
- Most mapped Batch 14 tests are not implemented in backlog files and need real behavioral coverage.
Problem Statement
Batch 14 is the first large FileStore execution batch where write path, retention/compaction, purge/reset, state flush, snapshot, and shutdown lifecycle all need file-backed behavior instead of memory-store delegation. If this batch is implemented without strict lock discipline and anti-stub verification, downstream batches (15, 36, 37) will inherit brittle storage behavior and unreliable test evidence.
Clarified Constraints
- Keep Batch 14 scoped to mapped methods/tests only; do not pull Batch 15 `MessageBlock + ConsumerFileStore` work forward.
- Use evidence-backed status updates in PortTracker (`max 15 IDs` per update).
- Keep tests real: no placeholders, no always-pass assertions, no non-behavioral smoke tests.
- If a test requires unavailable runtime infrastructure, keep it `deferred` with a concrete reason instead of stubbing.
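The 15-ID cap on status updates amounts to splitting an ID list into fixed-size chunks. The following is a minimal sketch in Go (the language of the reference implementation), not part of the plan itself. The function name `chunkIDs` and the sample IDs are hypothetical, and how each chunk is actually submitted to PortTracker is not shown.

```go
package main

import "fmt"

// maxIDsPerUpdate mirrors the "max 15 IDs per update" constraint.
const maxIDsPerUpdate = 15

// chunkIDs splits ids into consecutive slices of at most size elements.
// It returns nil for an empty input or a non-positive size.
func chunkIDs(ids []int, size int) [][]int {
	if size <= 0 || len(ids) == 0 {
		return nil
	}
	chunks := make([][]int, 0, (len(ids)+size-1)/size)
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	// Hypothetical feature IDs; real IDs come from the tracker.
	ids := make([]int, 76)
	for i := range ids {
		ids[i] = 1000 + i
	}
	for i, c := range chunkIDs(ids, maxIDsPerUpdate) {
		fmt.Printf("update %d: %d IDs (first=%d)\n", i+1, len(c), c[0])
	}
}
```

With 76 features this yields six updates: five of 15 IDs and one of a single ID.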
Approaches Considered
Approach 1 (Recommended): Vertical implementation by four feature groups with dependency-resolved test waves
Implement Batch 14 in four functional feature groups (18 + 18 + 20 + 20), each followed by targeted test waves that only promote tests whose feature dependencies are already implemented and verified.
Pros:
- Keeps each cycle below the complexity threshold for file-store concurrency code.
- Makes failures local and debuggable.
- Aligns naturally with mandatory build/test/status checkpoints.
Cons:
- Requires careful bookkeeping of cross-group tests.
- More commits and checkpoint overhead.
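One way to picture a dependency-resolved test wave is as a filter: a test is eligible for promotion only when every feature it depends on is already verified. The sketch below is in Go and assumes a simple test-to-feature mapping. The function `promotable`, the test names, and the feature IDs are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// promotable returns, in sorted order, the tests whose feature
// dependencies are all verified. deps maps a test name to the feature
// IDs it exercises. A test with no listed dependencies is returned too.
func promotable(deps map[string][]int, verified map[int]bool) []string {
	var ready []string
	for test, features := range deps {
		ok := true
		for _, id := range features {
			if !verified[id] {
				ok = false
				break
			}
		}
		if ok {
			ready = append(ready, test)
		}
	}
	sort.Strings(ready)
	return ready
}

func main() {
	// Hypothetical mapping; real mappings come from the batch tracker.
	deps := map[string][]int{
		"WriteThenPurge":  {101, 205},
		"CompactAfterAge": {205, 310},
		"StopIsRepeatable": {402},
	}
	verified := map[int]bool{101: true, 205: true, 402: true}
	fmt.Println(promotable(deps, verified))
}
```

Tests that span groups, such as the hypothetical `CompactAfterAge` here, simply wait for a later wave, which is the bookkeeping cost listed under Cons.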
Approach 2: Implement all 76 features first, then all 64 tests
Complete production surface in one pass, then backfill all tests at the end.
Pros:
- Fewer context switches.
Cons:
- High risk of broad regressions discovered late.
- Weak traceability between feature status and test evidence.
- Creates pressure near the end to close out remaining work with stubs.
Approach 3: Test-first only with synthetic wrappers over memstore delegation
Attempt to satisfy mapped tests through wrapper behavior while delaying real file-backed implementation.
Pros:
- Fast initial green tests.
Cons:
- Violates batch intent (real FileStore write/lifecycle parity).
- Produces fragile tests that validate wrappers, not storage behavior.
- Increases later rework and hidden defects.
Recommended Design
1) Implementation topology
Use four feature groups with bounded scope:
- Group 1 (`18`): write-path foundation, per-subject totals, limits/removal entrypoints.
- Group 2 (`18`): age/scheduling loops, record/tombstone writes, block sync/select helpers.
- Group 3 (`20`): seq/read helpers, cache/state counters, purge/compact/reset and block-list mutation.
- Group 4 (`20`): purge-block/global subject info, stream-state write loop, stop/snapshot/delete-map lifecycle.
2) Locking and lifecycle model
- Preserve `ReaderWriterLockSlim` ownership boundaries in `JetStreamFileStore`.
- Keep timer/background loop ownership explicit (`_ageChk`, `_syncTmr`, `_qch`, `_fsld`).
- Ensure stop/flush/snapshot/delete paths are idempotent and race-safe under repeated calls.
- Treat file writes and state writes as durability boundaries; enforce explicit optional-sync behavior parity.
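The idempotent, race-safe stop requirement can be illustrated with a small Go sketch. It is an analogy only: the .NET port uses `ReaderWriterLockSlim`, whereas this sketch uses Go's `sync.RWMutex` and `sync.Once`. The `store` type, its fields, and the flush counter are hypothetical stand-ins, not the reference implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for a file-backed store.
type store struct {
	mu       sync.RWMutex
	stopOnce sync.Once
	quit     chan struct{} // closed once to signal background loops
	closed   bool
	flushes  int // counts final state flushes (stand-in for real I/O)
}

func newStore() *store {
	return &store{quit: make(chan struct{})}
}

// Stop is safe to call repeatedly and from multiple goroutines.
// Only the first call closes quit and performs the final flush.
func (s *store) Stop() {
	s.stopOnce.Do(func() {
		close(s.quit)
		s.mu.Lock()
		defer s.mu.Unlock()
		s.flushes++
		s.closed = true
	})
}

// Closed reports whether Stop has completed.
func (s *store) Closed() bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.closed
}

// Flushes reports how many final flushes ran.
func (s *store) Flushes() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.flushes
}

func main() {
	s := newStore()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Stop() // concurrent, repeated calls
		}()
	}
	wg.Wait()
	fmt.Println("closed:", s.Closed(), "final flushes:", s.Flushes())
}
```

A behavioral test for this property would call the stop path several times, ideally concurrently, and assert that the final flush ran exactly once.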
3) Test model
Implement backlog tests as real behavioral tests in:
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamFileStoreTests.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests1.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/ConcurrencyTests2.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeHandlerTests.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/RouteHandlerTests.Impltests.cs`
Create missing backlog classes for mapped tests:
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/LeafNodeProxyTests.Impltests.cs`
- `dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamClusterLongTests.Impltests.cs`
4) Status strategy
- Features: `deferred -> stub -> complete -> verified`.
- Tests: `deferred -> stub -> verified` or `deferred` with explicit blocker reason.
- Promote only IDs that have direct Go-read + build + targeted test evidence.
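The status flows above can be read as small transition tables. The Go sketch below encodes one such reading: features move strictly forward one step at a time, tests skip the `complete` step, and a test may stay `deferred` only with a non-empty reason. The names `advance` and `keepDeferred` are hypothetical, and PortTracker's actual validation rules may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type Status string

const (
	Deferred Status = "deferred"
	Stub     Status = "stub"
	Complete Status = "complete"
	Verified Status = "verified"
)

// Allowed single-step transitions, taken from the status strategy.
var (
	featureNext = map[Status]Status{Deferred: Stub, Stub: Complete, Complete: Verified}
	testNext    = map[Status]Status{Deferred: Stub, Stub: Verified}
)

// advance returns an error unless from -> to is the one allowed step.
func advance(next map[Status]Status, from, to Status) error {
	if want, ok := next[from]; !ok || want != to {
		return fmt.Errorf("illegal transition %s -> %s", from, to)
	}
	return nil
}

// keepDeferred accepts a deferral only with a non-blank blocker reason.
func keepDeferred(reason string) error {
	if strings.TrimSpace(reason) == "" {
		return errors.New("deferred status requires a concrete blocker reason")
	}
	return nil
}

func main() {
	fmt.Println(advance(featureNext, Stub, Complete))     // allowed
	fmt.Println(advance(featureNext, Deferred, Verified)) // skips steps
	fmt.Println(advance(testNext, Stub, Verified))        // allowed
	fmt.Println(keepDeferred(""))                         // missing reason
}
```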
5) Risk controls
- Mandatory stub scans after each feature/test wave.
- Build gate after each feature group.
- Related test gate before any `verified` promotion.
- Full checkpoint (`build + full unit tests + commit`) between groups.
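A stub scan can be as simple as a line-by-line search for known placeholder markers. The Go sketch below is illustrative only. The marker list is an assumption, built from common .NET placeholder idioms, and the actual patterns and tooling for the mandatory scan are defined by the implementation plan, not here.

```go
package main

import (
	"fmt"
	"strings"
)

// stubMarkers is an assumed, illustrative list of placeholder patterns.
var stubMarkers = []string{
	"NotImplementedException",
	"Assert.True(true)",
	"TODO",
}

// finding records one suspicious line (1-based line number).
type finding struct {
	Line   int
	Marker string
	Text   string
}

// scanForStubs reports the first matching marker on each line of source.
func scanForStubs(source string) []finding {
	var out []finding
	for i, line := range strings.Split(source, "\n") {
		for _, m := range stubMarkers {
			if strings.Contains(line, m) {
				out = append(out, finding{Line: i + 1, Marker: m, Text: strings.TrimSpace(line)})
				break
			}
		}
	}
	return out
}

func main() {
	// Hypothetical C# snippet standing in for a scanned file.
	src := "public void Purge()\n{\n    throw new NotImplementedException();\n}\n"
	for _, f := range scanForStubs(src) {
		fmt.Printf("line %d: %s (%s)\n", f.Line, f.Marker, f.Text)
	}
}
```

A non-empty result after a feature or test wave would block promotion until each flagged line is replaced with real behavior.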
Non-Goals
- Port Batch 15 (`MessageBlock + ConsumerFileStore`) behaviors beyond what Batch 14 methods directly require.
- Converting integration-only tests into unit tests by weakening assertions.
- Marking blocked runtime-heavy tests `verified` without executable evidence.
Acceptance Criteria
- All 76 Batch 14 features implemented with non-stub behavior and verified evidence.
- All implementable mapped tests in Batch 14 converted to real behavioral tests and verified.
- Runtime-blocked tests remain `deferred` with concrete blocker notes.
- Batch 14 can be completed with `batch complete 14` after status/audit validation.