Initialize CBDD solution and add a .NET-focused gitignore for generated artifacts.

compact_plan.md (new file, +498 lines)

# Compression + Compaction Implementation Plan

## 1. Objectives

Implement two major capabilities in CBDD:

1. Compression for stored document payloads (including overflow support, compatibility, safety, and telemetry).
2. Compaction/Vacuum for reclaiming fragmented space and shrinking database files safely.

This plan is grounded in the current architecture:
- Storage pages and free-list in `src/CBDD.Core/Storage/PageFile.cs`
- Slot/page layout in `src/CBDD.Core/Storage/SlottedPage.cs`
- Document CRUD + overflow paths in `src/CBDD.Core/Collections/DocumentCollection.cs`
- WAL/transaction/recovery semantics in `src/CBDD.Core/Storage/StorageEngine*.cs` and `src/CBDD.Core/Transactions/WriteAheadLog.cs`
- Collection metadata in `src/CBDD.Core/Storage/StorageEngine.Collections.cs`

## 2. Key Design Decisions

### 2.1 Compression unit: **Per-document before overflow**

Chosen model: compress the full serialized BSON document first, then apply existing overflow chunking to the stored bytes.

Why this choice:
- Single compression decision per document (simple threshold logic).
- Overflow logic remains generic over byte blobs.
- Update path can compare stored compressed size vs existing slot size directly.
- Better ratio than per-chunk for many document shapes.

Consequences:
- `HasOverflow` continues to represent storage chaining only.
- `Compressed` continues to represent payload encoding.
- Read path always reconstructs full stored blob (from primary + overflow), then decompresses if flagged.

### 2.2 Codec strategy

Implement a codec abstraction with initial built-in codecs backed by .NET primitives:
- `None`
- `Brotli`
- `Deflate`

Expose via config:
- `EnableCompression`
- `Codec`
- `Level`
- `MinSizeBytes`
- `MinSavingsPercent`

Add safety knobs:
- `MaxDecompressedSizeBytes` (guardrail)
- optional `MaxCompressionInputBytes` (defensive cap for write path)
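
A minimal sketch of what this configuration surface and codec abstraction could look like, using the `CompressionOptions`/`ICompressionCodec` names from Workstream A. The property defaults and the Brotli example are illustrative assumptions, not the final API.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public enum CompressionCodec : byte { None = 0, Brotli = 1, Deflate = 2 }

public sealed class CompressionOptions
{
    public bool EnableCompression { get; init; } = false;            // off by default
    public CompressionCodec Codec { get; init; } = CompressionCodec.Brotli;
    public CompressionLevel Level { get; init; } = CompressionLevel.Fastest;
    public int MinSizeBytes { get; init; } = 1024;                    // skip tiny documents
    public int MinSavingsPercent { get; init; } = 10;                 // keep only worthwhile wins
    public long MaxDecompressedSizeBytes { get; init; } = 64L * 1024 * 1024; // read-side guardrail
    public long? MaxCompressionInputBytes { get; init; }              // optional write-side cap
}

public interface ICompressionCodec
{
    CompressionCodec Id { get; }
    byte[] Compress(ReadOnlySpan<byte> data, CompressionLevel level);
    byte[] Decompress(ReadOnlySpan<byte> data, long maxDecompressedBytes);
}

public sealed class BrotliCodec : ICompressionCodec
{
    public CompressionCodec Id => CompressionCodec.Brotli;

    public byte[] Compress(ReadOnlySpan<byte> data, CompressionLevel level)
    {
        using var output = new MemoryStream();
        using (var brotli = new BrotliStream(output, level, leaveOpen: true))
            brotli.Write(data);
        return output.ToArray();
    }

    public byte[] Decompress(ReadOnlySpan<byte> data, long maxDecompressedBytes)
    {
        using var input = new MemoryStream(data.ToArray());
        using var brotli = new BrotliStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        var buffer = new byte[81920];
        int read;
        while ((read = brotli.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Enforce the guardrail while streaming so a hostile payload cannot balloon memory.
            if (output.Length + read > maxDecompressedBytes)
                throw new InvalidDataException("Decompressed size exceeds MaxDecompressedSizeBytes.");
            output.Write(buffer, 0, read);
        }
        return output.ToArray();
    }
}
```

A `Deflate` codec would mirror the Brotli one using `DeflateStream`; `None` is a pass-through.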

### 2.3 Metadata strategy

Use layered metadata for compatibility and decoding:

1. DB-level persistent metadata (Page 0 extension region):
   - DB format version
   - feature flags (compression enabled capability)
   - default codec id

2. Page-level format metadata:
   - page format version marker (for mixed old/new page parsing)
   - optional default codec hint (for diagnostics and future tuning)

3. Slot payload metadata for compressed entries (fixed header prefix in stored payload):
   - codec id
   - original length
   - compressed length
   - checksum (CRC32 of compressed payload bytes)

This avoids breaking old uncompressed pages while still satisfying “readers know how to decode” and checksum requirements.
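
A sketch of the fixed slot-payload header from item 3 above. The 13-byte layout and the use of `System.IO.Hashing`'s `Crc32` are assumptions for illustration; only the four fields come from this plan.

```csharp
using System;
using System.Buffers.Binary;
using System.IO.Hashing; // Crc32 (System.IO.Hashing package)

// Fixed prefix stored before the compressed bytes in a slot payload:
// [codecId:1][originalLength:4][compressedLength:4][crc32:4] = 13 bytes (illustrative layout).
public readonly record struct CompressedPayloadHeader(
    byte CodecId, int OriginalLength, int CompressedLength, uint Checksum)
{
    public const int Size = 13;

    public static CompressedPayloadHeader Create(byte codecId, int originalLength, ReadOnlySpan<byte> compressed)
        => new(codecId, originalLength, compressed.Length, Crc32.HashToUInt32(compressed));

    public void Write(Span<byte> destination)
    {
        destination[0] = CodecId;
        BinaryPrimitives.WriteInt32LittleEndian(destination.Slice(1, 4), OriginalLength);
        BinaryPrimitives.WriteInt32LittleEndian(destination.Slice(5, 4), CompressedLength);
        BinaryPrimitives.WriteUInt32LittleEndian(destination.Slice(9, 4), Checksum);
    }

    public static CompressedPayloadHeader Read(ReadOnlySpan<byte> source) => new(
        source[0],
        BinaryPrimitives.ReadInt32LittleEndian(source.Slice(1, 4)),
        BinaryPrimitives.ReadInt32LittleEndian(source.Slice(5, 4)),
        BinaryPrimitives.ReadUInt32LittleEndian(source.Slice(9, 4)));

    // The checksum is verified against the compressed bytes before any decompression is attempted.
    public bool Matches(ReadOnlySpan<byte> compressed)
        => compressed.Length == CompressedLength && Crc32.HashToUInt32(compressed) == Checksum;
}
```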

## 3. Workstreams

## 3.1 Workstream A: Compression Core + Config Surface

### Deliverables
- New compression options model and codec abstraction.
- Persistent file/page format metadata support.
- Telemetry primitives for compression counters.

### Changes
- Add `src/CBDD.Core/Compression/`:
  - `CompressionOptions.cs`
  - `CompressionCodec.cs`
  - `ICompressionCodec.cs`
  - `CompressionService.cs`
  - `CompressedPayloadHeader.cs`
  - `CompressionTelemetry.cs`
- Extend context/engine construction:
  - `src/CBDD.Core/DocumentDbContext.cs`
  - `src/CBDD.Core/Storage/StorageEngine.cs`
- Add DB metadata read/write helpers:
  - `src/CBDD.Core/Storage/PageFile.cs`
  - new `src/CBDD.Core/Storage/StorageEngine.Format.cs`

### Implementation tasks
1. Introduce `CompressionOptions` with defaults:
   - compression disabled by default
   - conservative thresholds (`MinSizeBytes`, `MinSavingsPercent`)
2. Add codec registry/factory.
3. Add DB format metadata block in page 0 extension with version + feature flags + default codec id.
4. Add page format marker to slotted pages on allocation path (new pages only).
5. Add telemetry counter container (thread-safe atomic counters).
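
A possible shape for the thread-safe counter container from item 5 (`CompressionTelemetry.cs`); the fields mirror the counters listed in section 6.1, everything else is illustrative.

```csharp
using System.Threading;

public sealed class CompressionTelemetry
{
    private long _compressedDocuments;
    private long _bytesBefore;
    private long _bytesAfter;
    private long _compressionFailures;
    private long _checksumFailures;

    public void RecordCompressed(int originalLength, int storedLength)
    {
        // Interlocked keeps counters consistent without taking a lock on hot write paths.
        Interlocked.Increment(ref _compressedDocuments);
        Interlocked.Add(ref _bytesBefore, originalLength);
        Interlocked.Add(ref _bytesAfter, storedLength);
    }

    public void RecordCompressionFailure() => Interlocked.Increment(ref _compressionFailures);
    public void RecordChecksumFailure() => Interlocked.Increment(ref _checksumFailures);

    // Snapshot for StorageEngine.GetCompressionStats(); the ratio is derived, not stored.
    public (long Documents, long BytesBefore, long BytesAfter, double Ratio) Snapshot()
    {
        long before = Interlocked.Read(ref _bytesBefore);
        long after = Interlocked.Read(ref _bytesAfter);
        return (Interlocked.Read(ref _compressedDocuments), before, after,
                before == 0 ? 1.0 : (double)after / before);
    }
}
```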

### Acceptance
- Existing DB files open unchanged.
- New DB files persist format metadata.
- Compression service can roundtrip payloads with selected codec.

## 3.2 Workstream B: Insert + Read Paths (new writes first)

### Deliverables
- Compression on insert with threshold logic.
- Safe decompression on reads with checksum and size validation.
- Fallback to uncompressed write on compression failure.

### Changes
- `src/CBDD.Core/Collections/DocumentCollection.cs`:
  - `InsertDataCore`
  - `InsertIntoPage`
  - `InsertWithOverflow`
  - `FindByLocation`
- `src/CBDD.Core/Storage/SlottedPage.cs` (if slot/page metadata helpers are added)

### Implementation tasks
1. Insert path:
   - Serialize BSON (existing behavior).
   - If compression enabled and `docData.Length >= MinSizeBytes`, try codec.
   - Compute savings and only set `SlotFlags.Compressed` if threshold met.
   - Build compressed payload as: `[CompressedPayloadHeader][compressed bytes]`.
   - On any compression exception, increment failure counter and store uncompressed.
2. Overflow path:
   - Apply overflow based on stored bytes length (compressed or uncompressed).
   - No separate compression of chunks/pages.
3. Read path:
   - Existing overflow reassembly first.
   - If `Compressed` flag present:
     - parse payload header
     - validate compressed length + original length bounds
     - validate checksum before decompression
     - decompress using codec id
     - enforce `MaxDecompressedSizeBytes`
4. Corruption handling:
   - throw deterministic `InvalidDataException` for header/checksum/size violations.
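
Putting the insert and read tasks above together, a hedged sketch of the write-side decision and the read-side validation. It reuses the `CompressionOptions`, `ICompressionCodec`, `CompressedPayloadHeader`, and `CompressionTelemetry` sketches from earlier sections; the actual `DocumentCollection` integration points will differ.

```csharp
using System;
using System.IO;

public static class CompressionPipelineSketch
{
    // Write side: returns the bytes to store and whether SlotFlags.Compressed should be set.
    public static (byte[] StoredBytes, bool Compressed) PrepareForStorage(
        byte[] docData, CompressionOptions options, ICompressionCodec codec, CompressionTelemetry telemetry)
    {
        if (!options.EnableCompression || docData.Length < options.MinSizeBytes)
            return (docData, false);

        try
        {
            byte[] compressed = codec.Compress(docData, options.Level);
            long savingsPercent = 100L - (compressed.Length + CompressedPayloadHeader.Size) * 100L / docData.Length;
            if (savingsPercent < options.MinSavingsPercent)
                return (docData, false);                  // not enough savings: store uncompressed

            var stored = new byte[CompressedPayloadHeader.Size + compressed.Length];
            CompressedPayloadHeader.Create((byte)codec.Id, docData.Length, compressed).Write(stored);
            compressed.CopyTo(stored, CompressedPayloadHeader.Size);
            telemetry.RecordCompressed(docData.Length, stored.Length);
            return (stored, true);
        }
        catch (Exception)
        {
            telemetry.RecordCompressionFailure();
            return (docData, false);                      // fall back to uncompressed on any codec failure
        }
    }

    // Read side: called after overflow reassembly, only when the Compressed flag is set.
    public static byte[] RestoreFromStorage(
        byte[] storedBytes, CompressionOptions options,
        Func<byte, ICompressionCodec?> resolveCodec, CompressionTelemetry telemetry)
    {
        if (storedBytes.Length < CompressedPayloadHeader.Size)
            throw new InvalidDataException("Compressed payload shorter than its header.");

        var header = CompressedPayloadHeader.Read(storedBytes);
        var compressed = storedBytes.AsSpan(CompressedPayloadHeader.Size);
        var codec = resolveCodec(header.CodecId);

        if (codec is null ||
            header.OriginalLength < 0 ||
            header.OriginalLength > options.MaxDecompressedSizeBytes ||
            !header.Matches(compressed))
        {
            telemetry.RecordChecksumFailure();
            throw new InvalidDataException("Compressed payload failed header/checksum validation.");
        }

        return codec.Decompress(compressed, options.MaxDecompressedSizeBytes);
    }
}
```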

### Acceptance
- Inserts/reads succeed for uncompressed docs (regression-safe).
- Mixed compressed/uncompressed documents in same collection read correctly.
- Corrupted compressed payload is detected and rejected predictably.

## 3.3 Workstream C: Update/Delete + Overflow Consistency

### Deliverables
- Compression-aware update decisions.
- Correct delete behavior for compressed+overflow combinations.

### Changes
- `src/CBDD.Core/Collections/DocumentCollection.cs`:
  - `UpdateDataCore`
  - `DeleteCore`
  - `FreeOverflowChain`

### Implementation tasks
1. Update path:
   - Recompute storage payload for new document using same compression decision logic as insert.
   - In-place update only when:
     - existing slot is non-overflow, and
     - new stored payload length <= old slot length, and
     - compression flag/metadata can be updated safely.
   - Otherwise relocate (existing delete+insert strategy).
2. Delete path:
   - Keep logical semantics unchanged.
   - Ensure overflow chain extraction still works when slot has both `Compressed` and `HasOverflow`.
3. Overflow consistency tests:
   - compressed small -> compressed overflow transitions on update
   - compressed overflow -> uncompressed small transitions
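
A small sketch of the in-place eligibility check from the update path above; `SlotInfo` and its members are hypothetical stand-ins for whatever slot metadata `DocumentCollection` actually exposes.

```csharp
// Hypothetical slot metadata; the real structure lives in SlottedPage/DocumentCollection.
public readonly record struct SlotInfo(int StoredLength, bool HasOverflow);

public static class UpdatePathSketch
{
    // In-place rewrite is allowed only when the existing slot can absorb the new stored payload
    // without touching overflow chains; the compression flag is rewritten together with the slot.
    // Anything else falls back to the existing delete + insert relocation strategy.
    public static bool CanUpdateInPlace(SlotInfo existing, int newStoredLength)
        => !existing.HasOverflow
        && newStoredLength <= existing.StoredLength;
}
```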

### Acceptance
- Update behavior preserves correctness for all compression/overflow combinations.
- Delete frees overflow pages for compressed and uncompressed overflow docs.

## 3.4 Workstream D: Compaction / Shrink (Offline first)

### Deliverables
- Public `Compact`/`Vacuum` maintenance API.
- Offline copy-and-swap compaction with crash-safe finalize.
- Exact pre/post space accounting.

### API surface
- Add to `DocumentDbContext`:
  - `Compact(CompactionOptions? options = null)`
  - `CompactAsync(...)`
  - alias `Vacuum(...)`
- Engine-level operation in new file:
  - `src/CBDD.Core/Storage/StorageEngine.Maintenance.cs`

### Offline mode algorithm (Phase 1)
1. Acquire exclusive maintenance gate (block writes).
2. Checkpoint WAL before start.
3. Build a temporary database file (`*.compact.tmp`) with same page config and compression config.
4. Copy logical contents collection-by-collection:
   - preserve collection metadata/index definitions
   - reinsert documents through collection APIs so locations are rewritten correctly
   - rebuild/update index roots in metadata
5. Checkpoint temp DB and fsync.
6. Atomic finalize (copy-and-swap):
   - write state marker (`*.compact.state`) for resumability
   - rename original -> backup, temp -> original
   - reset/remove WAL appropriately
   - remove marker
7. Produce `CompactionStats` with exact pre/post bytes and deltas.

### Crash safety
- Use explicit state-machine marker file with phases (`Started`, `Copied`, `Swapped`, `CleanupDone`).
- On startup, detect marker and resume/repair idempotently.
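
A sketch of the marker-file state machine and the startup resume logic, assuming each phase is persisted to the `*.compact.state` marker before the corresponding step runs; file handling details and recovery actions are illustrative only.

```csharp
using System;
using System.IO;

public enum CompactionPhase { Started, Copied, Swapped, CleanupDone }

public static class CompactionRecoverySketch
{
    // Called during startup when a compaction marker file is found.
    // Each branch must be idempotent: re-running it after another crash is safe.
    public static void Resume(string tempPath, string backupPath, string markerPath)
    {
        var phase = Enum.Parse<CompactionPhase>(File.ReadAllText(markerPath).Trim());
        switch (phase)
        {
            case CompactionPhase.Started:
            case CompactionPhase.Copied:
                // The swap never happened: the original file is intact, so discard the temp copy.
                if (File.Exists(tempPath)) File.Delete(tempPath);
                File.Delete(markerPath);
                break;

            case CompactionPhase.Swapped:
                // The swap completed but cleanup did not: the new file is authoritative,
                // so finish by removing the backup and then the marker.
                if (File.Exists(backupPath)) File.Delete(backupPath);
                File.Delete(markerPath);
                break;

            case CompactionPhase.CleanupDone:
                File.Delete(markerPath);
                break;
        }
    }
}
```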

### Acceptance
- File shrinks when free tail pages exist.
- No data/index loss after compaction.
- Crash during compaction is recoverable and deterministic.

## 3.5 Workstream E: Online Compaction + Scheduling

### Deliverables
- Online mode with throttled relocation.
- Scheduling hooks (manual/startup/threshold-based trigger).

### Online mode strategy (Phase 2)
1. Background scanner identifies fragmented pages and relocation candidates.
2. Move documents in bounded batches under short write exclusion windows.
3. Update primary and secondary index locations transactionally.
4. Periodic checkpoints to bound WAL growth.
5. Tail truncation pass when contiguous free pages reach EOF.

### Scheduling hooks
- `MaintenanceOptions`:
  - `RunAtStartup`
  - `MinFragmentationPercent`
  - `MinReclaimableBytes`
  - `MaxRunDuration`
  - `OnlineThrottle` (ops/sec or pages/batch)
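
One way the scheduling hooks could be evaluated; the `MaintenanceOptions` properties mirror the list above, while the defaults, the flattened throttle field, and `ShouldTrigger` are assumptions.

```csharp
using System;

public sealed class MaintenanceOptions
{
    public bool RunAtStartup { get; init; }
    public double MinFragmentationPercent { get; init; } = 30;
    public long MinReclaimableBytes { get; init; } = 16L * 1024 * 1024;
    public TimeSpan MaxRunDuration { get; init; } = TimeSpan.FromMinutes(10);
    public int OnlineThrottlePagesPerBatch { get; init; } = 64;
}

public static class CompactionSchedulerSketch
{
    // Threshold-based trigger: compact only when both the fragmentation percentage and the
    // reclaimable byte count cross their floors, so small or healthy files are left alone.
    public static bool ShouldTrigger(double fragmentationPercent, long reclaimableBytes, MaintenanceOptions options)
        => fragmentationPercent >= options.MinFragmentationPercent
        && reclaimableBytes >= options.MinReclaimableBytes;
}
```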

### Acceptance
- Writes continue during online mode, except during small lock windows.
- Recovery semantics remain valid with WAL and checkpoints.

## 4. Compaction Internals Required by Both Modes

### 4.1 Page defragmentation utilities
- Add slotted-page defrag helper:
  - rewrites active slots/data densely
  - recomputes `FreeSpaceStart/End`
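
A heavily simplified sketch of that defrag helper, assuming a slot directory of (offset, length, deleted) entries; the real `SlottedPage` layout dictates the actual field names, growth direction, and how `FreeSpaceStart/End` are derived from the returned write position.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical slot directory entry; not the real SlottedPage structure.
public sealed record SlotEntry(int Offset, int Length, bool Deleted);

public static class SlottedPageDefragSketch
{
    // Copies live slot payloads densely into a fresh page buffer and returns the rewritten
    // slot directory plus the new end of used data (from which FreeSpaceStart/End follow).
    public static (byte[] NewPage, List<SlotEntry> NewSlots, int EndOfData) Defragment(
        byte[] page, IReadOnlyList<SlotEntry> slots, int dataAreaStart)
    {
        var newPage = new byte[page.Length];
        var newSlots = new List<SlotEntry>(slots.Count);
        int writePos = dataAreaStart;

        foreach (var slot in slots)
        {
            if (slot.Deleted)
            {
                newSlots.Add(slot);                 // keep slot indexes stable for existing locations
                continue;
            }
            Array.Copy(page, slot.Offset, newPage, writePos, slot.Length);
            newSlots.Add(slot with { Offset = writePos });
            writePos += slot.Length;
        }
        return (newPage, newSlots, writePos);
    }
}
```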

### 4.2 Free-list consolidation + tail truncation
- Extend `PageFile` with:
  - free-page enumeration
  - free-list normalization
  - safe truncation when free pages are contiguous at end-of-file
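
A sketch of the tail-truncation rule, assuming page ids are contiguous from zero and the free set is already known; the real implementation would drive this from `PageFile`'s free-list and perform the shrink through its own I/O layer (for example via `FileStream.SetLength`).

```csharp
using System.Collections.Generic;

public static class TailTruncationSketch
{
    // Walks backwards from the last page and returns how many pages must be kept.
    // Every page at or beyond the returned count is free and contiguous with end-of-file,
    // so the file can be truncated to (keptPages * pageSize) bytes.
    public static long FindKeptPageCount(long totalPages, IReadOnlySet<long> freePages)
    {
        long kept = totalPages;
        while (kept > 1 && freePages.Contains(kept - 1))   // never drop page 0 (the header page)
            kept--;
        return kept;
    }
}
```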

### 4.3 Metadata/index pointer correctness
- Ensure collection metadata root IDs and index root IDs are rewritten/verified after relocation/copy.
- Add validation pass that checks all primary index locations resolve to non-deleted slots.

### 4.4 WAL coordination
- Explicit checkpoint before and after compaction.
- Ensure compaction writes follow normal WAL durability semantics.
- Keep `Recover()` behavior valid with compaction marker states.

## 5. Compatibility and Migration

### 5.1 Compatibility goals
- Read old uncompressed files unchanged.
- Support mixed pages/documents (compressed + uncompressed) in same DB.
- Preserve existing APIs unless explicitly extended.

### 5.2 Migration tool (optional one-time rewrite)
- Implement `MigrateCompression(...)` as an offline rewrite command using the same copy-and-swap machinery.
- Options:
  - target codec/level
  - per-collection include/exclude
  - dry-run estimation mode
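
A sketch of what the dry-run estimation mode could do: compress each serialized document with the target codec and accumulate before/after byte counts without writing anything back. The enumeration source, threshold parameter, and Brotli choice are assumptions; only the idea of a no-write estimate comes from the plan.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class MigrationDryRunSketch
{
    // Returns estimated (bytesBefore, bytesAfter) for the documents that would be compressed.
    public static (long Before, long After) Estimate(
        IEnumerable<byte[]> serializedDocs, int minSizeBytes, CompressionLevel level)
    {
        long before = 0, after = 0;
        foreach (var doc in serializedDocs)
        {
            if (doc.Length < minSizeBytes) continue;     // same threshold the write path would apply

            using var output = new MemoryStream();
            using (var brotli = new BrotliStream(output, level, leaveOpen: true))
                brotli.Write(doc, 0, doc.Length);

            before += doc.Length;
            after += output.Length;
        }
        return (before, after);
    }
}
```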

## 6. Telemetry + Admin Tooling

### 6.1 Compression telemetry counters
- compressed document count
- bytes before/after
- compression ratio aggregate
- compression CPU time
- decompression CPU time
- compression failure count
- checksum failure count
- safety-limit rejection count

Expose via:
- `StorageEngine.GetCompressionStats()`
- context-level forwarding method in `DocumentDbContext`

### 6.2 Compaction telemetry/stats
- pre/post file size
- live bytes
- free bytes
- fragmentation percentage
- documents/pages relocated
- runtime and throughput

### 6.3 Admin inspection APIs
Add diagnostics APIs (engine/context):
- page usage by collection/page type
- compression ratio by collection
- fragmentation map and free-list summary

## 7. Tests

Add focused test suites in `tests/CBDD.Tests/`:

1. `CompressionInsertReadTests.cs`
   - threshold on/off behavior
   - mixed compressed/uncompressed reads
   - fallback to uncompressed on forced codec error

2. `CompressionOverflowTests.cs`
   - compressed docs that span overflow pages
   - transitions across size thresholds

3. `CompressionCorruptionTests.cs`
   - bad checksum
   - bad original length
   - oversized decompression guardrail
   - invalid codec id

4. `CompressionCompatibilityTests.cs`
   - open existing uncompressed DB files
   - mixed-format pages after partial migration

5. `CompactionOfflineTests.cs`
   - logical equivalence pre/post compact
   - index correctness pre/post compact
   - tail truncation actually reduces file size

6. `CompactionCrashRecoveryTests.cs`
   - simulate crashes at each copy/swap state
   - resume/finalize behavior

7. `CompactionOnlineConcurrencyTests.cs`
   - concurrent writes + reads during online compact
   - correctness and no deadlock

8. `CompactionWalCoordinationTests.cs`
   - checkpoint before/after behavior
   - recoverability with in-flight WAL entries

Also update/extend existing tests:
- `tests/CBDD.Tests/DocumentOverflowTests.cs`
- `tests/CBDD.Tests/DbContextTests.cs`
- `tests/CBDD.Tests/WalIndexTests.cs`

## 8. Benchmark Additions

Extend `tests/CBDD.Tests.Benchmark/` with:

1. `CompressionBenchmarks.cs`
   - insert/update/read workloads with compression on/off
   - codec and level comparison

2. `CompactionBenchmarks.cs`
   - offline compact runtime
   - reclaimable bytes vs elapsed

3. `MixedWorkloadBenchmarks.cs`
   - insert/update/delete-heavy cache workload
   - periodic compact impact

4. Update `DatabaseSizeBenchmark.cs`
   - pre/post compact shrink delta reporting
   - compression ratio reporting

## 9. Suggested Implementation Order (Execution Plan)

### Phase 1 (as requested): Compression config + read/write path for new writes only
- Workstream A
- Workstream B (insert/read only)
- initial tests: insert/read + compatibility

### Phase 2 (as requested): Compression-aware updates/deletes + overflow handling
- Workstream C
- overflow-focused tests + corruption guards

### Phase 3 (as requested): Offline copy-and-swap compaction/shrink
- Workstream D + shared internals from section 4
- crash-safe finalize + space accounting

### Phase 4 (as requested): Online compaction + automation hooks
- Workstream E
- concurrency and scheduling tests

## 10. Subagent Execution Safety + Completion Verification

### 10.1 Subagent ownership model

Use explicit, non-overlapping ownership to avoid unsafe parallel edits:

1. Subagent A (Compression Core)
   - Owns `src/CBDD.Core/Compression/*`
   - Owns format/config plumbing in `src/CBDD.Core/Storage/StorageEngine.Format.cs`, `src/CBDD.Core/Storage/StorageEngine.cs`, `src/CBDD.Core/DocumentDbContext.cs`

2. Subagent B (CRUD + Overflow Compression Semantics)
   - Owns compression-aware CRUD changes in `src/CBDD.Core/Collections/DocumentCollection.cs`
   - Owns slot/payload metadata helpers in `src/CBDD.Core/Storage/SlottedPage.cs` (if needed)

3. Subagent C (Compaction/Vacuum Engine)
   - Owns `src/CBDD.Core/Storage/StorageEngine.Maintenance.cs`
   - Owns related `PageFile` extensions in `src/CBDD.Core/Storage/PageFile.cs`

4. Subagent D (Verification Assets)
   - Owns new/updated tests in `tests/CBDD.Tests/*Compression*`, `tests/CBDD.Tests/*Compaction*`
   - Owns benchmark additions in `tests/CBDD.Tests.Benchmark/*`

Rule: one file has exactly one active subagent owner at a time.

### 10.2 Safe collaboration rules for subagents

1. Do not edit files outside assigned ownership scope.
2. Do not revert or reformat unrelated existing changes.
3. Do not change public contracts outside assigned workstream without explicit handoff.
4. If overlap is discovered mid-task, stop and reassign ownership before continuing.
5. Keep changes atomic and reviewable (small PR-sized batches per workstream phase).

### 10.3 Required handoff payload from each subagent

Each completion handoff MUST include:

1. Summary of implemented requirements and non-implemented items.
2. Exact touched file list.
3. Risk list (behavioral, compatibility, recovery, perf).
4. Test commands executed and pass/fail results.
5. Mapping to plan acceptance criteria sections.

### 10.4 Mandatory verification on subagent completion

Every subagent completion is verified before merge/mark-done:

1. Scope verification
   - Confirm touched files are in owned scope only.
   - Confirm required plan items for that phase are implemented.

2. Correctness verification
   - Run targeted tests for touched behavior.
   - Run related regression suites (CRUD, overflow, WAL/recovery, index consistency where applicable).

3. Safety verification
   - Validate corruption/safety guard behavior (checksum, size limits, crash-state handling where applicable).
   - Validate backward compatibility with old uncompressed files when relevant.

4. Performance verification
   - Run benchmark smoke checks for modified hot paths.
   - Verify no obvious regressions against baseline thresholds.

5. Integration verification
   - Rebuild solution and run full test pass before final phase closure.

### 10.5 Verification commands (minimum gate)

Run these gates after each subagent completion (adjust filters to scope):

1. `dotnet build /Users/dohertj2/Desktop/CBDD/CBDD.slnx`
2. `dotnet test /Users/dohertj2/Desktop/CBDD/tests/CBDD.Tests/ZB.MOM.WW.CBDD.Tests.csproj`
3. Targeted suites for the changed area, for example:
   - `dotnet test /Users/dohertj2/Desktop/CBDD/tests/CBDD.Tests/ZB.MOM.WW.CBDD.Tests.csproj --filter "FullyQualifiedName~Compression"`
   - `dotnet test /Users/dohertj2/Desktop/CBDD/tests/CBDD.Tests/ZB.MOM.WW.CBDD.Tests.csproj --filter "FullyQualifiedName~Compaction"`
4. Benchmark smoke run when hot paths changed:
   - `dotnet run -c Release --project /Users/dohertj2/Desktop/CBDD/tests/CBDD.Tests.Benchmark/ZB.MOM.WW.CBDD.Tests.Benchmark.csproj`

If any gate fails, the subagent task is not complete and must be returned for rework with failure notes.

## 11. Definition of Done (Release Gates)

1. Correctness
   - all new compression/compaction test suites green
   - no regressions in existing test suite

2. Compatibility
   - old DB files readable with no migration required
   - mixed-format operation validated

3. Safety
   - decompression guardrails enforced
   - corruption checks and deterministic failure behavior
   - crash recovery scenarios covered

4. Performance
   - documented benchmark deltas for write/read overhead and compaction throughput
   - no pathological GC spikes under compression-enabled workloads

5. Operability
   - telemetry counters exposed
   - admin diagnostics available for page usage/compression/fragmentation