40 KiB
Executable File
RFC-CBDD: High-Performance Embedded Document Database for .NET
Status: Draft
Version: 0.1.0
Date: February 2026
Authors: CBDD Development Team
Abstract
This document specifies CBDD, a high-performance embedded document-oriented database engine for .NET 10. CBDD is designed from the ground up for zero-allocation performance, leveraging modern .NET features including Span<T>, Memory-Mapped Files, and Source Generators. The database implements a custom C-BSON (Compressed BSON) format that achieves 30-60% storage reduction compared to standard BSON while maintaining full type compatibility.
Key innovations include:
- Zero-allocation I/O via
Span<byte>andstackalloc - C-BSON format with field name compression (2-byte IDs vs. variable-length strings)
- Page-based storage with memory-mapped file I/O
- Multiple index types: B+Tree, R-Tree (geospatial), and HNSW (vector similarity)
- ACID transactions with Write-Ahead Logging and Snapshot Isolation
- Compile-time code generation for zero-reflection serialization
1. Introduction
1.1 Motivation
Most embedded databases for .NET fall into two categories:
- Native wrappers (SQLite, RocksDB): High performance but with interop overhead and GC pressure from marshalling
- Managed implementations (LiteDB, others): Pure C# but burdened by reflection, excessive allocations, and legacy design
CBDD bridges this gap by leveraging .NET 10's modern performance features while remaining a pure managed implementation:
Span<T>andMemory<T>for zero-copy I/O- Source Generators for zero-reflection serialization
- Memory-Mapped Files for OS-level page caching
- Stack allocation (
stackalloc) for ephemeral buffers
1.2 Scope
This RFC specifies:
- Storage Engine: Page file format, WAL protocol, transaction semantics
- C-BSON Format: Wire format, schema management, key compression
- Indexing: B+Tree, R-Tree, and HNSW implementations
- Query Engine: LINQ provider and hybrid execution model
- Code Generation: Mapper generation rules and attribute support
1.3 Terminology
MUST, SHOULD, MAY: As defined in RFC 2119
- Page: Fixed-size block of data (default 16KB)
- Slot: Variable-size entry within a slotted page
- C-BSON: Compressed BSON with field ID compression
- WAL: Write-Ahead Log for durability
- MBR: Minimum Bounding Rectangle (for R-Tree)
- HNSW: Hierarchical Navigable Small World (vector search algorithm)
2. Architecture Overview
┌─────────────────────────────────────────────────────────┐
│ CBDD Architecture │
├─────────────────────────────────────────────────────────┤
│ ┌───────────────┐ ┌───────────────┐ ┌─────────────┐ │
│ │ LINQ Provider │ │ Source Gen │ │ Collections │ │
│ │ (Queryable) │ │ (Mappers) │ │ (DbContext)│ │
│ └───────┬───────┘ └───────┬───────┘ └──────┬──────┘ │
│ │ │ │ │
│ ┌───────▼─────────────────────────────────────▼──────┐ │
│ │ Index Layer (B-Tree, R-Tree, HNSW) │ │
│ └───────┬─────────────────────────────────────┬──────┘ │
│ │ │ │
│ ┌───────▼──────────────────────────────────────▼─────┐ │
│ │ Storage Engine (Pages, Transactions) │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ │ │
│ │ │ PageFile │ │ WAL │ │ FreeList │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────┘ │ │
│ └────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ C-BSON (Span-based Reader/Writer) │ │
│ └────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ OS Memory-Mapped Files (Kernel Page Cache) │ │
│ └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
2.1 Storage Layer
The storage layer manages page-based I/O using memory-mapped files:
- PageFile: Fixed-size pages (8KB, 16KB, or 32KB)
- Page Types: Header, Data, Index, Vector, Spatial, Dictionary, Schema, Overflow, Free
- Free List: Linked list of reusable pages
2.2 C-BSON Layer
Provides zero-allocation BSON serialization/deserialization:
- BsonSpanWriter: Writes C-BSON to
Span<byte> - BsonSpanReader: Reads C-BSON from
ReadOnlySpan<byte> - Schema Management: Field name → ID mapping
Full specification: See C-BSON.md
2.3 Index Layer
Three specialized index types:
- B+Tree: General-purpose sorted index (range queries, equality)
- R-Tree: Geospatial index (proximity, bounding box queries)
- HNSW: Vector similarity search (k-NN, ANN)
2.4 Query Layer
LINQ-to-CBDD provider:
- Translates LINQ expressions to index operations
- Hybrid execution: Index-based filtering + in-memory LINQ to Objects
- Supports
Where,OrderBy,Skip,Take,GroupBy,Join, aggregations
2.5 Transaction Layer
ACID guarantees via:
- Atomicity: All-or-nothing commits
- Consistency: Schema validation
- Isolation: Snapshot isolation (MVCC-like)
- Durability: Write-Ahead Logging (WAL)
3. Storage Engine Specification
3.1 Page File Format
3.1.1 Page Sizes
CBDD supports 3 predefined page sizes:
| Configuration | Page Size | Use Case |
|---|---|---|
| Small | 8 KB | Embedded, tiny documents |
| Default | 16 KB | General purpose (InnoDB-like) |
| Large | 32 KB | Big documents (MongoDB-like) |
Rationale: 16KB aligns with Linux page cache and InnoDB defaults, balancing fragmentation vs. overhead.
3.1.2 File Layout
┌─────────────────────────────────────────────────────────┐
│ Page 0: File Header │
│ [PageHeader (32)] │
│ [Database Version (4)] │
│ [Page Size (4)] │
│ [First Free Page ID (4)] ← Free list head │
│ [Dictionary Root Page ID (4)] │
│ [Reserved (remaining)] │
├─────────────────────────────────────────────────────────┤
│ Page 1: Collection Metadata │
│ [SlottedPageHeader (24)] │
│ [Collection Schemas...] │
├─────────────────────────────────────────────────────────┤
│ Page 2+: Data, Index, Vector, Spatial, Dictionary... │
└─────────────────────────────────────────────────────────┘
3.2 Page Header Format
All pages start with a 32-byte header:
Offset Size Field Description
------ ---- ------------------ --------------------------
0 4 PageId Page number (0-indexed)
4 1 PageType Enum (see §3.3)
5 2 FreeBytes Unused space in page
7 4 NextPageId Linked list pointer
11 8 TransactionId Last modifying transaction
19 4 Checksum CRC32 of page data
23 4 DictionaryRootPageId (Page 0 only)
27 5 Reserved Future use
Total: 32 bytes
Implementation:
[StructLayout(LayoutKind.Explicit, Size = 32)]
public struct PageHeader
{
[FieldOffset(0)] public uint PageId;
[FieldOffset(4)] public PageType PageType;
[FieldOffset(5)] public ushort FreeBytes;
[FieldOffset(7)] public uint NextPageId;
[FieldOffset(11)] public ulong TransactionId;
[FieldOffset(19)] public uint Checksum;
[FieldOffset(23)] public uint DictionaryRootPageId;
}
3.3 Page Types
| Value | Type | Purpose |
|---|---|---|
| 0 | Empty | Uninitialized |
| 1 | Header | Page 0 (file header) |
| 2 | Collection | Schema and collection metadata |
| 3 | Data | Document storage (slotted page) |
| 4 | Index | B+Tree node |
| 5 | FreeList | Deprecated (unused) |
| 6 | Overflow | Continuation of large documents |
| 7 | Dictionary | String interning for C-BSON keys |
| 8 | Schema | Schema versioning |
| 9 | Vector | HNSW index node |
| 10 | Free | Reusable page (linked via NextPageId) |
| 11 | Spatial | R-Tree node |
3.4 Slotted Page Structure
Data pages use a slotted page design for variable-size documents:
┌─────────────────────────────────────────────────────────┐
│ [SlottedPageHeader (24)] │
│ PageId, PageType, SlotCount, FreeSpaceStart, │
│ FreeSpaceEnd, NextOverflowPage, TransactionId │
├─────────────────────────────────────────────────────────┤
│ [Slot Array (grows down)] │
│ ┌──────────────────────────────────┐ │
│ │ Slot 0: [Offset, Length, Flags] │ (8 bytes) │
│ │ Slot 1: [Offset, Length, Flags] │ │
│ │ Slot 2: [Offset, Length, Flags] │ │
│ │ ... │ │
│ └──────────────────────────────────┘ │
│ ↓ │
│ [Free Space] │
│ ↑ │
│ [Data Area (grows up)] │
│ ┌──────────────────────────────────┐ │
│ │ ... Document N C-BSON bytes ... │ │
│ │ ... Document 2 C-BSON bytes ... │ │
│ │ ... Document 1 C-BSON bytes ... │ │
│ │ ... Document 0 C-BSON bytes ... │ │
│ └──────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
SlottedPageHeader (24 bytes)
Offset Size Field Description
------ ---- ------------------ --------------------------
0 4 PageId
4 1 PageType (= 3 for Data pages)
8 2 SlotCount Number of slots
10 2 FreeSpaceStart Offset where slots end
12 2 FreeSpaceEnd Offset where data begins
14 4 NextOverflowPage For large documents
18 4 TransactionId
22 2 Reserved
SlotEntry (8 bytes)
Offset Size Field Description
------ ---- ------------------ --------------------------
0 2 Offset Byte offset to document data
2 2 Length Document length in bytes
4 4 Flags SlotFlags enum
SlotFlags:
None = 0: Active slotDeleted = 1: Slot marked for reuseHasOverflow = 2: Document continues in overflow pagesCompressed = 4: Reserved for future compression
3.5 DocumentLocation
Documents are addressed by (PageId, SlotIndex) tuple:
public readonly struct DocumentLocation
{
public uint PageId { get; init; } // 4 bytes
public ushort SlotIndex { get; init; } // 2 bytes
}
// Total: 6 bytes when serialized
Used in:
- Index entries (key → location mapping)
- Overflow page chains
- Internal references
3.6 Free Page Management
Deleted pages form a linked list for reuse:
PageHeader.PageType = FreePageHeader.NextPageIdpoints to next free page- Page 0's
NextPageIdpoints to head of free list
Allocation:
if (freeListHead != 0):
pageId = freeListHead
freeListHead = ReadPage(pageId).NextPageId
UpdatePage0(freeListHead)
else:
pageId = nextPageId++
ExpandFile()
Deallocation:
WritePage(pageId, Free with next=freeListHead)
freeListHead = pageId
UpdatePage0(freeListHead)
4. C-BSON Format Specification
C-BSON (Compressed BSON) is CBDD's wire format. See the dedicated C-BSON.md document for complete specification.
Key points:
- Element header:
[1 byte type][2 byte field ID](vs. BSON's[1 byte type][N byte name\0]) - Schema-based: Field names mapped to
ushortIDs viaConcurrentDictionary - Storage savings: 30-60% reduction for typical schemas
- Type compatible: Uses standard BSON type codes and value encoding
Example:
Standard BSON element: [0x02]['em','ai','l','\0'][value] = 6 bytes overhead
C-BSON element: [0x02][0x03, 0x00][value] = 3 bytes overhead
Savings: 50%
5. Indexing Specifications
5.1 B+Tree Index
5.1.1 Node Structure
B+Tree nodes are stored in Index pages (PageType = 4):
┌─────────────────────────────────────────────────────────┐
│ [PageHeader (32)] │
├─────────────────────────────────────────────────────────┤
│ [BTreeNodeHeader (20)] │
│ PageId, IsLeaf, EntryCount, ParentPageId, │
│ NextLeafPageId, PrevLeafPageId │
├─────────────────────────────────────────────────────────┤
│ [Entries...] │
│ For Leaf Nodes: │
│ ┌────────────────────────────────────┐ │
│ │ IndexKey (variable) │ │
│ │ DocumentLocation (6 bytes) │ │
│ └────────────────────────────────────┘ │
│ For Internal Nodes: │
│ ┌────────────────────────────────────┐ │
│ │ IndexKey (variable) │ │
│ │ ChildPageId (4 bytes) │ │
│ └────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
5.1.2 BTreeNodeHeader (20 bytes)
public struct BTreeNodeHeader
{
public uint PageId; // 4 bytes
public bool IsLeaf; // 1 byte
public ushort EntryCount; // 2 bytes
public uint ParentPageId; // 4 bytes
public uint NextLeafPageId; // 4 bytes (leaf only)
public uint PrevLeafPageId; // 4 bytes (leaf only)
}
5.1.3 IndexKey
Supports composite keys with multiple types:
public struct IndexKey : IComparable<IndexKey>
{
public object[] Values { get; set; } // Multi-column support
public int CompareTo(IndexKey other) { ... }
}
Serialization:
- Each value serialized as C-BSON element
- Supports: String, Int32, Int64, Double, ObjectId, DateTime, Guid
5.1.4 Operations
Insert: O(log n)
- Traverse to leaf
- Insert key-location pair
- Split if full (B+Tree standard split)
Search: O(log n)
- Binary search in nodes
- Equality: Single lookup
- Range: Scan linked leaf nodes
Delete: O(log n)
- Mark entry as deleted
- Lazy compaction on split
5.2 R-Tree Index (Geospatial)
5.2.1 Node Structure
R-Tree nodes use Spatial pages (PageType = 11):
┌─────────────────────────────────────────────────────────┐
│ [PageHeader (32)] │
├─────────────────────────────────────────────────────────┤
│ [SpatialPageHeader (16)] │
│ IsLeaf, Level, EntryCount, ParentPageId │
├─────────────────────────────────────────────────────────┤
│ [Entries...] (38 bytes each) │
│ ┌──────────────────────────────────────┐ │
│ │ MBR (GeoBox): 4 × double = 32 bytes │ │
│ │ MinLat, MinLon, MaxLat, MaxLon │ │
│ │ Pointer: DocumentLocation = 6 bytes │ │
│ └──────────────────────────────────────┘ │
│ (For internal nodes: Pointer = ChildPageId) │
└─────────────────────────────────────────────────────────┘
5.2.2 GeoBox (Minimum Bounding Rectangle)
public struct GeoBox
{
public double MinLat { get; set; }
public double MinLon { get; set; }
public double MaxLat { get; set; }
public double MaxLon { get; set; }
public bool Intersects(GeoBox other) { ... }
public bool Contains((double, double) point) { ... }
public GeoBox ExpandTo(GeoBox other) { ... }
}
5.2.3 Operations
Insert: O(log n)
- Choose subtree with minimal MBR expansion
- Insert and update MBRs up the tree
- Split using quadratic algorithm
Search (Proximity):
Query: Find points within radius R of (lat, lon)
1. Convert to GeoBox: (lat-R, lon-R) to (lat+R, lon+R)
2. Traverse R-Tree, pruning non-intersecting branches
3. For leaf entries: Calculate exact distance
4. Return sorted by distance
Search (Bounding Box):
Query: Find points within box (minLat, minLon, maxLat, maxLon)
1. Create GeoBox
2. Traverse R-Tree, returning intersecting leaf entries
5.3 HNSW Index (Vector Similarity)
5.3.1 Node Structure
HNSW nodes use Vector pages (PageType = 9):
┌─────────────────────────────────────────────────────────┐
│ [PageHeader (32)] │
├─────────────────────────────────────────────────────────┤
│ [VectorPageHeader (16)] │
│ Dimensions, MaxM, NodeSize, NodeCount │
├─────────────────────────────────────────────────────────┤
│ [Nodes...] (variable size) │
│ ┌──────────────────────────────────────┐ │
│ │ DocumentLocation (6 bytes) │ │
│ │ MaxLevel (1 byte) │ │
│ │ Vector (dimensions × 4 bytes) │ │
│ │ Links Level 0 (2M × 6 bytes) │ │
│ │ Links Level 1-15 (M × 6 bytes each) │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
5.3.2 HNSW Parameters
- M: Max bidirectional links per level (typically 16)
- Dimensions: Vector dimensionality (e.g., 1536 for OpenAI embeddings)
- ef_construction: Quality parameter during build (typically 200)
- ef_search: Quality parameter during search (typically 50)
5.3.3 Vector Similarity Metrics
public enum VectorMetric
{
Cosine, // cos(θ) = dot(a,b) / (||a|| × ||b||)
Euclidean, // √Σ(ai - bi)²
DotProduct // Σ(ai × bi)
}
5.3.4 Operations
Insert: O(log n) expected
- Assign random level (exponential distribution)
- Starting from top level, greedily descend
- At each level, add bidirectional links to nearest M neighbors
Search (k-NN):
Query: Find k nearest neighbors to query vector
1. Start at entry point (top level)
2. Greedily search to local minimum at each level
3. At level 0, maintain priority queue of k candidates
4. Expand candidate set with ef_search parameter
5. Return top k by similarity
6. Transaction and WAL Specification
6.1 Transaction Model
CBDD implements Snapshot Isolation (SI):
- Read transactions: See consistent snapshot as of transaction start
- Write transactions: Accumulate changes in-memory, commit atomically
- Conflict detection: Last-write-wins (optimistic concurrency)
6.2 Write-Ahead Log (WAL)
CBDD implements a full WAL for durability and crash recovery:
WAL Entry Format
┌─────────────────────────────────────────────────────────┐
│ WAL Entry Format │
├─────────────────────────────────────────────────────────┤
│ [Record Type: 1 byte] │
│ 0x01 = Begin │
│ 0x02 = Write │
│ 0x03 = Commit │
│ 0x04 = Abort │
│ 0x05 = Checkpoint │
├─────────────────────────────────────────────────────────┤
│ For Begin/Commit/Abort: │
│ [Transaction ID: 8 bytes] │
│ [Timestamp: 8 bytes (Unix ms)] │
│ Total: 17 bytes │
├─────────────────────────────────────────────────────────┤
│ For Write: │
│ [Transaction ID: 8 bytes] │
│ [PageId: 4 bytes] │
│ [After Image Length: 4 bytes] │
│ [After Image: variable bytes] │
│ Total: 17 + AfterImage.Length │
└─────────────────────────────────────────────────────────┘
WAL Protocol
Write Path:
1. Begin Transaction → WriteBeginRecord(txnId)
2. Modify Data → WriteDataRecord(txnId, pageId, afterImage)
3. Commit → WriteCommitRecord(txnId) + Flush()
Recovery Path:
var records = wal.ReadAll();
foreach (var record in records)
{
if (record.Type == WalRecordType.Write && IsCommitted(record.TransactionId))
{
pageFile.WritePage(record.PageId, record.AfterImage);
}
}
Implementation Details
Zero-Allocation Writes:
// Synchronous (stack allocated)
Span<byte> buffer = stackalloc byte[17];
buffer[0] = (byte)WalRecordType.Begin;
BitConverter.TryWriteBytes(buffer[1..9], transactionId);
BitConverter.TryWriteBytes(buffer[9..17], timestamp);
_walStream.Write(buffer);
// Asynchronous (pooled)
var buffer = ArrayPool<byte>.Shared.Rent(totalSize);
try
{
// ... write to buffer
await _walStream.WriteAsync(buffer.AsMemory(0, totalSize), ct);
}
finally
{
ArrayPool<byte>.Shared.Return(buffer);
}
Durability Guarantee:
public void Flush()
{
_walStream?.Flush(flushToDisk: true); // Force OS fsync
}
Checkpoint and Truncate:
// After applying WAL to pages:
pageFile.Flush(); // Ensure pages on disk
wal.Truncate(); // Remove committed WAL records
wal.Flush(); // Sync truncation
6.3 ACID Guarantees
- Atomicity: WAL ensures all-or-nothing commits
- Consistency: Schema validation before commit
- Isolation: Snapshot isolation (MVCC-like)
- Durability: WAL flush on commit
7. Query Processing
7.1 LINQ Provider
CBDD implements IQueryable<T> via custom query provider:
var results = db.Users.AsQueryable()
.Where(u => u.Age > 25 && u.City == "NYC")
.OrderBy(u => u.Name)
.Take(10)
.ToList();
Translation:
- Expression tree → BTree query plan
- Index selection: Choose index on
AgeorCity - Index scan: Retrieve candidate DocumentLocations
- Post-filter: Apply remaining predicates in-memory
- Materialize: Read documents, deserialize to objects
7.2 Index Selection Algorithm
For WHERE clause:
1. Extract equality and range predicates
2. Score each index by predicate coverage
3. Select index with highest score
4. Use index scan if selective, else full scan
Example:
WHERE Age > 25 AND City = "NYC"
Index options:
- Index on
Age→ Range scan (potentially many results) - Index on
City→ Equality lookup (likely fewer results) - Choose:
Cityindex, post-filterAge > 25
7.3 Hybrid Execution Model
CBDD combines index-based and in-memory execution:
- Index Phase: Use B-Tree/R-Tree to filter candidates
- Materialization: Read documents from pages
- LINQ to Objects: Apply complex predicates, projections, aggregations
Benefits:
- Leverage index selectivity
- Support full LINQ semantics
- Avoid building complex query execution engine
8. Source Generation Protocol
8.1 Mapper Generation
CBDD uses Roslyn Source Generators to produce zero-reflection mappers:
Input:
public class User
{
public ObjectId Id { get; set; }
public string Name { get; set; }
public int Age { get; set; }
}
public partial class MyDbContext : DocumentDbContext
{
public DocumentCollection<ObjectId, User> Users { get; set; }
}
Generated:
namespace MyApp.Mappers
{
public class UserMapper : ObjectIdMapperBase<User>
{
public override string CollectionName => "users";
public override int Serialize(User entity, BsonSpanWriter writer)
{
var start = writer.BeginDocument();
writer.WriteObjectId("_id", entity.Id);
writer.WriteString("name", entity.Name);
writer.WriteInt32("age", entity.Age);
writer.EndDocument(start);
return writer.Position;
}
public override User Deserialize(BsonSpanReader reader)
{
var user = new User();
reader.ReadDocumentSize();
while (reader.Remaining > 0)
{
var type = reader.ReadBsonType();
if (type == BsonType.EndOfDocument) break;
var name = reader.ReadElementHeader();
switch (name)
{
case "_id": user.Id = reader.ReadObjectId(); break;
case "name": user.Name = reader.ReadString(); break;
case "age": user.Age = reader.ReadInt32(); break;
default: reader.SkipValue(type); break;
}
}
return user;
}
}
}
8.2 Attribute Support
See Data Annotations Support section in README and related documentation.
Supported attributes:
[Table(Name, Schema)]→ Collection name mapping[Column(Name, TypeName)]→ Field name and special type mapping[Key]→ Primary key identification[NotMapped]→ Exclusion from serialization[Required],[StringLength],[Range]→ Validation
8.3 Code Generation Rules
- Lowercase policy: BSON field names are lowercase by default
- Attribute override:
[BsonProperty],[JsonPropertyName],[Column]override default names - Nested objects: Recursively analyzed and mapped
- Collections: Arrays and
IEnumerable<T>mapped to BSON arrays - Value types: Primitives, enums,
DateTime,ObjectId,Guidhandled natively
9. Security Considerations
9.1 Data Integrity
- Checksums: CRC32 on page headers (planned: extend to full pages)
- WAL: Ensures consistency even on crash
- Schema validation: Prevents type mismatches
9.2 Concurrent Access
Multi-Threading: ✅ Fully Supported
- Thread-safe writes: Multiple threads can write concurrently within the same process
- Internal synchronization:
SemaphoreSlimfor critical sections,ConcurrentDictionaryfor shared state - WAL coordination: Commit lock ensures serializable WAL writes
- Page cache: Thread-safe access to memory-mapped pages
Multi-Process: ❌ Not Supported
- Exclusive file lock:
FileShare.Noneprevents multiple processes from opening the same database - Rationale: Simplifies consistency guarantees, avoids complex inter-process coordination
- Future: May support via cooperative locking or server mode (HTTP API)
9.3 Injection Attacks
- No SQL injection: No query language, only type-safe LINQ
- Schema-validated: All operations type-checked at compile-time
10. Performance Considerations
10.1 Zero-Allocation Design
Stack allocation:
Span<byte> buffer = stackalloc byte[16384]; // Page buffer on stack
var writer = new BsonSpanWriter(buffer, keyMap);
No boxing:
- Value types remain unboxed throughout serialization
- No reflection or dynamic invocation
Pooling:
var buffer = ArrayPool<byte>.Shared.Rent(pageSize);
try { /* use buffer */ }
finally { ArrayPool<byte>.Shared.Return(buffer); }
10.2 Memory-Mapped Files
Benefits:
- OS kernel manages page cache
- Zero-copy reads (map file → process memory)
- Prefetching and read-ahead by OS
Tradeoffs:
- Limited to single process (file lock)
- Windows vs. Linux differences in
mmapbehavior
10.3 Cache Efficiency
Compact C-BSON:
- More documents per 16KB page
- Better CPU cache utilization
- Reduced TLB misses
11. Implementation Notes
11.1 .NET 10 Requirements
CBDD targets .NET 10 to leverage:
Span<T>andref structfor zero-copy I/O- Source Generators (Roslyn)
MemoryMarshalfor efficient struct serialization- Improved JIT optimizations
Future: Evaluate .NET Standard 2.1 for broader compatibility.
11.2 Platform Considerations
- Windows: Full support via MemoryMappedFile
- Linux/macOS: Supported, potential differences in
mmapbehavior - Endianness: Little-endian assumed (matches x86/x64/ARM)
11.3 Future Compatibility
Planned enhancements:
- Compression: Page-level or document-level LZ4
- Encryption: AES-256 for data-at-rest
- Multi-process: Via cooperative locking or server mode
12. References
12.1 BSON and Formats
12.2 Database Internals
- Database Internals by Alex Petrov (O'Reilly, 2019)
- B+ Trees (Wikipedia)
- R-Trees (Guttman, 1984)
12.3 Vector Search
12.4 Standards
Appendix A: Page Format Diagrams
A.1 Slotted Page Visual
Offset: 0 16383
┌────────────────────────────────────────────────┐
24 │ [Header: 24 bytes] │
├────────────────────────────────────────────────┤
│ [Slot 0: Offset=16320, Len=64, Flags=0] │
│ [Slot 1: Offset=16256, Len=64, Flags=0] │
│ [Slot 2: Offset=16192, Len=64, Flags=0] │
56 │ ← FreeSpaceStart │
│ │
│ ~~~~~~~~~~~~~~~~ Free Space ~~~~~~~~~~~~~~~~~~~ │
│ │
16192 │ ← FreeSpaceEnd │
│ [Document 2 data: 64 bytes] │
│ [Document 1 data: 64 bytes] │
│ [Document 0 data: 64 bytes] │
16384 └────────────────────────────────────────────────┘
A.2 B+Tree Node Visual
Internal Node:
┌──────────────────────────────────────────────────────┐
│ [PageHeader 32] [BTreeNodeHeader 20] │
├──────────────────────────────────────────────────────┤
│ Key1 | ChildPtr1 │
│ Key2 | ChildPtr2 │
│ Key3 | ChildPtr3 │
│ ... │
└──────────────────────────────────────────────────────┘
Leaf Node:
┌──────────────────────────────────────────────────────┐
│ [PageHeader 32] [BTreeNodeHeader 20] │
├──────────────────────────────────────────────────────┤
│ Key1 | DocumentLocation1 │
│ Key2 | DocumentLocation2 │
│ Key3 | DocumentLocation3 │
│ ... │
│ NextLeafPageId → [next leaf] │
└──────────────────────────────────────────────────────┘
Appendix B: C-BSON Hex Dump
See C-BSON.md, Section "Hex Dump Examples" for detailed wire format examples.
Appendix C: Transaction State Machine
┌─────────────┐
│ Idle │
└──────┬──────┘
│ BeginTransaction()
▼
┌─────────────┐
│ Active │ ← Accumulate writes in memory
└──┬───────┬──┘
│ │ Rollback()
│ └────────────┐
│ Commit() │
▼ ▼
┌─────────────┐ ┌─────────────┐
│ Committing │ │ Aborted │
│ (WAL flush) │ └─────────────┘
└──────┬──────┘
│ Flush complete
▼
┌─────────────┐
│ Committed │
└─────────────┘
End of RFC-CBDD Specification