Add batch plans for batches 6-7, 9-12, 16-17 (rounds 4-7)
Generated design docs and implementation plans via Codex for:

- Batch 6: Opts package-level functions
- Batch 7: Opts class methods + Reload
- Batch 9: Auth, DirStore, OCSP foundations
- Batch 10: OCSP Cache + JS Events
- Batch 11: FileStore Init
- Batch 12: FileStore Recovery
- Batch 16: Client Core (first half)
- Batch 17: Client Core (second half)

All plans include mandatory verification protocol and anti-stub guardrails.
Updated batches.md with file paths and planned status.
docs/plans/2026-02-27-batch-12-filestore-recovery-plan.md
@@ -0,0 +1,360 @@
# Batch 12 FileStore Recovery Implementation Plan

> **For Codex:** REQUIRED SUB-SKILL: Use `executeplan` to implement this plan task-by-task.

**Goal:** Implement and verify all Batch 12 FileStore Recovery features from `server/filestore.go` with no stub logic and evidence-backed status transitions.

**Architecture:** Execute Batch 12 in two vertical feature groups (5 + 3). Implement recovery logic directly in `JetStream/FileStore.cs`, touching supporting JetStream types only when required. After each group, run strict stub scans, build, and related test gates before any status updates.

**Tech Stack:** .NET 10, C# latest, xUnit 3, Shouldly, NSubstitute, PortTracker CLI, SQLite (`porting.db`)

**Design doc:** `docs/plans/2026-02-27-batch-12-filestore-recovery-design.md`

---

I'm using `writeplan` to create the implementation plan.

## Batch Inputs

- Batch: `12` (`FileStore Recovery`)
- Depends on: Batch `11`
- Features: `8`
- Tests: `0` (batch-owned), with known related reverse dependencies:
  - test `#519` (`FileStoreRecoverFullStateDetectCorruptState_ShouldSucceed`)
  - test `#545` (`FileStoreNoPanicOnRecoverTTLWithCorruptBlocks_ShouldSucceed`)
- Go source scope: `golang/nats-server/server/filestore.go` lines ~1708-2580

Feature groups (max ~20 features each):

- **Group 1 (5):** `987,988,991,992,993`
- **Group 2 (3):** `995,996,997`

---

## MANDATORY VERIFICATION PROTOCOL

> **NON-NEGOTIABLE:** Every task and every status update in this plan must follow this protocol.

### Per-Feature Verification Loop (REQUIRED for every feature ID)

For each feature ID in the active group:

1. Read feature mapping and exact Go intent:
   ```bash
   /usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- feature show <FEATURE_ID> --db porting.db
   ```
2. Read corresponding Go method span in `golang/nats-server/server/filestore.go`.
3. Implement minimal real C# behavior (no placeholders).
4. Build immediately:
   ```bash
   /usr/local/share/dotnet/dotnet build dotnet/
   ```
5. Run related tests for the touched behavior (see Test Gate below).
6. Record evidence (command + summary output) before adding the ID to status-update candidates.
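
Step 6 can be made mechanical with a small wrapper that runs a command and stores the command line, its output, and its exit code in the evidence folder. This is a minimal sketch, not part of the plan's tooling: `record_evidence` and the `EVIDENCE_DIR` variable are hypothetical names, and the default path simply reuses the example folder mentioned in the Status Update Protocol.

```bash
# Hypothetical helper: run a command and keep its output as evidence.
# Usage: record_evidence <label> <command> [args...]
record_evidence() {
  local label="$1"; shift
  local dir="${EVIDENCE_DIR:-/tmp/batch12-evidence}"
  local log="$dir/$label.log"
  local rc

  mkdir -p "$dir" || return 2
  # First line of the log is the command itself, then stdout and stderr.
  { printf '$ %s\n' "$*"; "$@"; } >"$log" 2>&1
  rc=$?
  printf 'exit=%d\n' "$rc" >>"$log"
  # Propagate the wrapped command's exit code so gates still fail loudly.
  return "$rc"
}
```

For example, `record_evidence group1-build /usr/local/share/dotnet/dotnet build dotnet/` would leave a `group1-build.log` file that can be cited when the feature IDs are promoted.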

### Stub Detection Check (REQUIRED after each feature group)

Run all scans below. Any match is a hard blocker:

```bash
# Production placeholder detection
rg -n "NotImplementedException|TODO|PLACEHOLDER" \
  dotnet/src/ZB.MOM.NatsNet.Server/JetStream -g '*.cs'

# Empty method bodies (FileStore recovery surface)
rg -n "^\s*(public|private|internal|protected).*(Warn|Debug|RecoverFullState|RecoverTTLState|RecoverMsgSchedulingState|CleanupOldMeta|RecoverMsgs|ExpireMsgsOnRecover)\s*\([^)]*\)\s*\{\s*\}$" \
  dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStore.cs

# Test placeholders in directly related classes
rg -n "NotImplementedException|Assert\.True\(true\)|Assert\.Pass|// TODO|// PLACEHOLDER" \
  dotnet/tests/ZB.MOM.NatsNet.Server.Tests/JetStream/JetStreamFileStoreTests.cs \
  dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ImplBacklog/JetStreamFileStoreTests.Impltests.cs
```
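
Because `rg` (like `grep`) exits with 0 when it finds a match and 1 when it finds none, a scan that "succeeds" is the failure case for this check. The sketch below inverts that convention so a scan can be used directly as a gate. `stub_gate` is a hypothetical name, not an existing script in the repository.

```bash
# Hypothetical helper: treat "scan found a match" as a blocker.
# Usage: stub_gate <scan command> [args...]
stub_gate() {
  local out rc
  out=$("$@")
  rc=$?
  case "$rc" in
    0)  # the scanner matched something: hard blocker
        printf 'BLOCKED: stub scan matched:\n%s\n' "$out" >&2
        return 1 ;;
    1)  # no matches: gate passes
        return 0 ;;
    *)  # scanner error (bad path, bad pattern): do not treat as clean
        printf 'stub scan failed with exit %d\n' "$rc" >&2
        return 2 ;;
  esac
}
```

Each of the three scans above can be passed through unchanged, for example `stub_gate rg -n "NotImplementedException|TODO|PLACEHOLDER" dotnet/src/ZB.MOM.NatsNet.Server/JetStream -g '*.cs'`. Returning a distinct code for scanner errors keeps a mistyped path from being read as "zero matches".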

### Build Gate (REQUIRED after each feature group)

This must pass before status updates and before moving to next group:

```bash
/usr/local/share/dotnet/dotnet build dotnet/
```

### Test Gate (REQUIRED before marking features `verified`)

All related tests must pass. Run at least:

```bash
# Existing JetStream FileStore coverage
/usr/local/share/dotnet/dotnet test dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ \
  --filter "FullyQualifiedName~ZB.MOM.NatsNet.Server.Tests.JetStream.JetStreamFileStoreTests" \
  --verbosity normal

# Backlog coverage for FileStore implementation
/usr/local/share/dotnet/dotnet test dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ \
  --filter "FullyQualifiedName~ZB.MOM.NatsNet.Server.Tests.ImplBacklog.JetStreamFileStoreTests" \
  --verbosity normal

# Feature-linked methods from reverse dependencies
/usr/local/share/dotnet/dotnet test dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ \
  --filter "FullyQualifiedName~FileStoreRecoverFullStateDetectCorruptState|FullyQualifiedName~FileStoreNoPanicOnRecoverTTLWithCorruptBlocks" \
  --verbosity normal
```

Gate rule:
- If related tests run and pass, eligible for `verified`.
- If related tests are unavailable/not yet implemented (0 discovered), feature may be set to `complete` only, with explicit note explaining why `verified` is deferred.
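
The gate rule is a small decision table, sketched here as a function. `gate_status` is a hypothetical helper, and `blocked` is an illustrative label for the failing case rather than a PortTracker status. The counts are assumed to be read by hand from the `dotnet test` summary.

```bash
# Hypothetical helper: highest status the gate rule allows for a feature.
# Usage: gate_status <tests discovered> <tests failed>
gate_status() {
  local discovered="$1" failed="$2"
  if [ "$failed" -gt 0 ]; then
    echo "blocked"     # related tests fail: no promotion at all
  elif [ "$discovered" -eq 0 ]; then
    echo "complete"    # nothing ran: verified is deferred, note the reason
  else
    echo "verified"    # related tests ran and all passed
  fi
}
```

With this shape, `gate_status 0 0` yields `complete`, which matches the rule that zero discovered tests is never enough evidence for `verified`.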

### Status Update Protocol (REQUIRED)

- Use max **15 IDs** per `feature batch-update` call.
- Required status progression: `deferred -> stub -> complete -> verified`.
- Do not mark `verified` without evidence from Build Gate + Test Gate.
- Keep an evidence log folder (example: `/tmp/batch12-evidence/`) with per-group command outputs.

Examples:

```bash
# Move active group to stub before editing
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- \
  feature batch-update --ids "987,988,991,992,993" --set-status stub --db porting.db --execute

# Move group to complete after successful implementation + build/test evidence
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- \
  feature batch-update --ids "987,988,991,992,993" --set-status complete --db porting.db --execute
```
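
Both groups in this batch fit within the 15-ID limit, but the limit is easier to respect if the split is automatic. The following is a minimal sketch of such a splitter. `chunk_ids` is a hypothetical helper that only formats ID lists and does not call PortTracker itself.

```bash
# Hypothetical helper: split a comma-separated ID list into chunks.
# Usage: chunk_ids "<id,id,...>" [max per chunk, default 15]
chunk_ids() {
  local size="${2:-15}" chunk="" count=0 id
  local IFS=','
  # Unquoted on purpose: IFS=',' splits the list into individual IDs.
  for id in $1; do
    chunk="${chunk:+$chunk,}$id"
    count=$((count + 1))
    if [ "$count" -eq "$size" ]; then
      printf '%s\n' "$chunk"
      chunk=""
      count=0
    fi
  done
  # Emit the final, possibly shorter, chunk.
  if [ -n "$chunk" ]; then
    printf '%s\n' "$chunk"
  fi
  return 0
}
```

Each output line is then suitable as the value of `--ids` in one `feature batch-update` call, so no single call exceeds the limit.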

### Checkpoint Protocol Between Tasks (REQUIRED)

At each group boundary:

1. Full build:
   ```bash
   /usr/local/share/dotnet/dotnet build dotnet/
   ```
2. Full unit test sweep (not just filtered):
   ```bash
   /usr/local/share/dotnet/dotnet test dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ --verbosity normal
   ```
3. Commit checkpoint before next task:
   ```bash
   git add dotnet/src/ZB.MOM.NatsNet.Server/JetStream \
     dotnet/tests/ZB.MOM.NatsNet.Server.Tests \
     porting.db
   git commit -m "feat(batch12): complete group <N> filestore recovery"
   ```
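
The ordering matters here: the commit should only happen if the build and the full test sweep both pass. One way to enforce that is to run the steps through a loop that stops at the first failure. This is a sketch only, and `run_gates` is a hypothetical name.

```bash
# Hypothetical helper: run steps in order, stop at the first failure.
# Usage: run_gates "<command 1>" "<command 2>" ...
run_gates() {
  local step
  for step in "$@"; do
    printf '==> %s\n' "$step"
    # Each step runs in its own shell so it cannot alter the caller's state.
    if ! bash -c "$step"; then
      printf 'gate failed, stopping: %s\n' "$step" >&2
      return 1
    fi
  done
  return 0
}
```

Passing the build command, the test command, and the `git add`/`git commit` pair as three arguments, in that order, means a failed build or test run never reaches the commit step.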

---

## ANTI-STUB GUARDRAILS (NON-NEGOTIABLE)

### Forbidden Patterns

The following are forbidden in Batch 12 feature or related test code:

- `throw new NotImplementedException(...)`
- Empty recovery method bodies (`{ }`)
- `// TODO` or `// PLACEHOLDER` in implemented recovery methods
- Fake test pass patterns (`Assert.True(true)`, `Assert.Pass()`, assertion-only smoke checks that do not exercise production behavior)
- Swallowing corruption/IO errors silently instead of preserving Go intent

### Hard Limits

- Max ~20 features per implementation group (fixed here as 5 and 3)
- Max 15 feature IDs per status-update command
- One feature group per verification/update cycle
- Zero stub-scan matches before `complete` or `verified` transitions
- No `verified` transition without explicit Build Gate + Test Gate evidence

### If You Get Stuck (MANDATORY)

1. Do **not** add a stub, placeholder, or no-op workaround.
2. Mark only blocked feature IDs as `deferred` with a concrete reason.
3. Continue with remaining IDs in the group.
4. Record blocker details in evidence log and PortTracker override reason.

Example:

```bash
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- \
  feature update <ID> --status deferred --db porting.db \
  --override "blocked: <specific technical reason>"
```

---

### Task 1: Batch Start and Group 1 Staging

**Files:**
- Modify: `porting.db`
- Create: `/tmp/batch12-evidence/` (evidence logs)

**Step 1: Confirm current batch state**

Run:
```bash
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 12 --db porting.db
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch list --db porting.db
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db
```
Expected: Batch 12 pending, dependency 11, 8 features, 0 tests.

**Step 2: Start batch**

Run:
```bash
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch start 12 --db porting.db
```
Expected: batch marked in-progress.

**Step 3: Stage Group 1 IDs to `stub`**

Run:
```bash
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- \
  feature batch-update --ids "987,988,991,992,993" --set-status stub --db porting.db --execute
```
Expected: only Group 1 IDs set to `stub`.

**Step 4: Commit checkpoint**

```bash
git add porting.db
git commit -m "chore(batch12): start batch and stage group1 recovery ids"
```

### Task 2: Implement Group 1 Recovery Features (5 IDs)

**Files:**
- Modify: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStore.cs`
- Modify (if needed): `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/MessageBlock.cs`
- Modify (if needed): `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStoreTypes.cs`

**Feature IDs:** `987,988,991,992,993`

**Step 1: Implement logging helpers**

- ID `987` (`Warn`) and ID `988` (`Debug`) with FileStore-context prefixing and no-op behavior when logger/server is unavailable.

**Step 2: Implement full-state recovery**

- ID `991` (`RecoverFullState`): stream state file load, length/checksum validation, decode, stale/corrupt fallback signaling.

**Step 3: Implement TTL and schedule recovery**

- ID `992` (`RecoverTTLState`)
- ID `993` (`RecoverMsgSchedulingState`)
- Include stale-state linear scan fallback over recovered message blocks.

**Step 4: Run mandatory verification protocol for Group 1**

- Per-feature loop for all 5 IDs.
- Stub Detection Check.
- Build Gate.
- Test Gate.

**Step 5: Status updates (chunk <=15)**

- Set Group 1 IDs to `complete` after successful evidence.
- Promote to `verified` only if Test Gate evidence is sufficient for each feature.

### Task 3: Group 1 Checkpoint

**Files:**
- Modify: `porting.db`

**Step 1: Run Checkpoint Protocol**

- Full build + full unit tests.

**Step 2: Commit**

```bash
git add dotnet/src/ZB.MOM.NatsNet.Server/JetStream \
  dotnet/tests/ZB.MOM.NatsNet.Server.Tests \
  porting.db
git commit -m "feat(batch12): complete group1 filestore recovery"
```

### Task 4: Implement Group 2 Recovery Features (3 IDs)

**Files:**
- Modify: `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStore.cs`
- Modify (if needed): `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/MessageBlock.cs`
- Modify (if needed): `dotnet/src/ZB.MOM.NatsNet.Server/JetStream/FileStoreTypes.cs`

**Feature IDs:** `995,996,997`

**Step 1: Implement metadata cleanup**

- ID `995` (`CleanupOldMeta`): remove stale metadata file types in message directory safely.

**Step 2: Implement ordered message block recovery**

- ID `996` (`RecoverMsgs`): enumerate/sort blocks, recover block state, reconcile stream accounting, prune orphan keys.
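
The "enumerate/sort" part of this step has one easy-to-miss detail: block files must be ordered by numeric index, not lexically, or block `10` sorts before block `2`. The sketch below shows the ordering only, in shell for quick inspection of a message directory. It assumes block files are named `<index>.blk`, which should be confirmed against `filestore.go` before relying on it. `list_blocks_in_order` is a hypothetical helper and not part of the C# port.

```bash
# Hypothetical helper: print block indexes in recovery order.
# Assumes (unverified here) that block files are named "<index>.blk".
# Usage: list_blocks_in_order <message directory>
list_blocks_in_order() {
  local dir="$1" f
  for f in "$dir"/*.blk; do
    [ -e "$f" ] || continue        # glob matched nothing
    f=${f##*/}                     # strip directory
    printf '%s\n' "${f%.blk}"      # strip extension
  done | grep -E '^[0-9]+$' | sort -n   # drop non-numeric names, sort by value
}
```

Running it against a directory containing `1.blk`, `2.blk`, and `10.blk` lists the indexes in ascending numeric order, which is the order the recovery loop needs in order to rebuild stream accounting block by block.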

**Step 3: Implement startup expiration path**

- ID `997` (`ExpireMsgsOnRecover`): max-age pass at startup, per-subject updates, empty-block cleanup, tombstone continuity.

**Step 4: Run mandatory verification protocol for Group 2**

- Per-feature loop for all 3 IDs.
- Stub Detection Check.
- Build Gate.
- Test Gate.

**Step 5: Status updates (chunk <=15)**

- Set Group 2 IDs to `complete`, then `verified` only when test evidence criteria are met.

### Task 5: Group 2 Checkpoint and Batch Closure

**Files:**
- Modify: `porting.db`
- Generate: `reports/current.md`

**Step 1: Final gates**

Run:
```bash
/usr/local/share/dotnet/dotnet build dotnet/
/usr/local/share/dotnet/dotnet test dotnet/tests/ZB.MOM.NatsNet.Server.Tests/ --verbosity normal
/usr/local/share/dotnet/dotnet test dotnet/tests/ZB.MOM.NatsNet.Server.IntegrationTests/ --verbosity normal
```
Expected: zero failures in executed suites.

**Step 2: Verify batch status and unblocked work**

Run:
```bash
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch show 12 --db porting.db
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- dependency ready --db porting.db
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- report summary --db porting.db
```

**Step 3: Complete batch**

Run:
```bash
/usr/local/share/dotnet/dotnet run --project tools/NatsNet.PortTracker -- batch complete 12 --db porting.db
```
Expected: completion succeeds only if all items meet allowed terminal states.

**Step 4: Generate report + commit**

```bash
./reports/generate-report.sh
git add dotnet/src/ZB.MOM.NatsNet.Server/JetStream \
  dotnet/tests/ZB.MOM.NatsNet.Server.Tests \
  porting.db reports/
git commit -m "feat(batch12): complete filestore recovery"
```

---

Plan complete and saved to `docs/plans/2026-02-27-batch-12-filestore-recovery-plan.md`. Two execution options:

**1. Subagent-Driven (this session)** - I dispatch a fresh subagent per task and review between tasks, for fast iteration

**2. Parallel Session (separate)** - Open a new session with `executeplan` for batch execution with checkpoints

Which approach?