Add LMDB oplog migration path with dual-write cutover support
All checks were successful
NuGet Package Publish / nuget (push) Successful in 1m16s
Introduce LMDB oplog store, migration flags, telemetry/backfill tooling, and parity tests to enable staged Surreal-to-LMDB rollout with rollback coverage.
55
docs/adr/0002-lmdb-oplog-migration.md
Normal file
@@ -0,0 +1,55 @@
# ADR 0002: LMDB Oplog Migration

## Status

Accepted

## Context

The existing oplog persistence layer is Surreal-backed and tightly coupled to local CDC transaction boundaries. We need an LMDB-backed oplog path to improve prune efficiency and reduce query latency while preserving existing `IOplogStore` semantics and keeping rollback low-risk.

Key constraints:
- Dedupe by hash.
- Hash lookup, node/time scans, and chain reconstruction.
- Cutoff-based prune (not strict FIFO) with interleaved late arrivals.
- Dataset isolation in all indexes.
- Explicit handling for cross-engine atomicity when document metadata remains Surreal-backed.

## Decision

Adopt an LMDB oplog provider (`LmdbOplogStore`) with feature-flag controlled migration:
- `UseLmdbOplog`
- `DualWriteOplog`
- `PreferLmdbReads`
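
How the three flags combine is not spelled out in this ADR. The Python sketch below shows one conservative reading, under these assumptions: `UseLmdbOplog` acts as a master switch, Surreal is always written while migration is in progress, and LMDB serves reads only when both `UseLmdbOplog` and `PreferLmdbReads` are set. `MigrationFlags`, `write_targets`, and `read_source` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationFlags:
    """Hypothetical holder for the three migration flags."""
    use_lmdb_oplog: bool = False
    dual_write_oplog: bool = False
    prefer_lmdb_reads: bool = False


def write_targets(flags: MigrationFlags) -> list[str]:
    """Stores that receive a sync-path oplog write (assumed semantics)."""
    targets = ["surreal"]  # assumption: Surreal stays the source of truth
    if flags.use_lmdb_oplog and flags.dual_write_oplog:
        targets.append("lmdb")
    return targets


def read_source(flags: MigrationFlags) -> str:
    """Store that serves oplog reads (assumed semantics)."""
    if flags.use_lmdb_oplog and flags.prefer_lmdb_reads:
        return "lmdb"
    return "surreal"
```

Under this reading, rollback amounts to clearing `PreferLmdbReads` (and optionally `DualWriteOplog`), since Surreal never stops receiving writes.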

### LMDB index schema

Single environment with named DBIs:
- `oplog_by_hash`: `{datasetId}|{hash}` -> serialized `OplogEntry`
- `oplog_by_hlc`: `{datasetId}|{wall}|{logic}|{nodeId}|{hash}` -> marker
- `oplog_by_node_hlc`: `{datasetId}|{nodeId}|{wall}|{logic}|{hash}` -> marker
- `oplog_prev_to_hash` (`DUPSORT`): `{datasetId}|{previousHash}` -> `{hash}`
- `oplog_node_head`: `{datasetId}|{nodeId}` -> `{wall,logic,hash}`
- `oplog_meta`: schema/version markers + dataset prune watermark

Composite keys use deterministic byte encodings with dataset prefixes on every index.
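
To illustrate why the encoding matters, here is a minimal Python sketch of an `oplog_by_hlc`-style key. It assumes `wall` fits in an unsigned 64-bit integer, `logic` in an unsigned 32-bit integer, and that dataset IDs never contain `|`. These widths and the function names are assumptions, not the actual encoding. Fixed-width big-endian integers make lexicographic byte order (the default key order in LMDB) agree with numeric order, which a plain decimal string would not.

```python
import struct

SEP = b"|"


def dataset_prefix(dataset_id: str) -> bytes:
    """Prefix shared by every key of one dataset (assumes no '|' in the ID)."""
    return dataset_id.encode("utf-8") + SEP


def encode_hlc_key(dataset_id: str, wall: int, logic: int,
                   node_id: str, hash_hex: str) -> bytes:
    """Build a {datasetId}|{wall}|{logic}|{nodeId}|{hash} key.

    wall and logic are packed big-endian at fixed width, so comparing keys
    byte by byte orders them by (wall, logic) within a dataset.
    """
    return dataset_prefix(dataset_id) + SEP.join([
        struct.pack(">Q", wall),    # 8 bytes, big-endian (assumed width)
        struct.pack(">I", logic),   # 4 bytes, big-endian (assumed width)
        node_id.encode("utf-8"),
        hash_hex.encode("ascii"),
    ])
```

A range scan for one dataset can then start at `dataset_prefix(dataset_id)` and stop at the first key that no longer starts with it, which is what keeps datasets isolated within a shared DBI.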

### Prune algorithm

Prune scans `oplog_by_hlc` up to cutoff and removes each candidate from all indexes, then recomputes touched node-head entries. Deletes run in bounded batches (`PruneBatchSize`) inside write transactions.
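
The following Python sketch models that loop with in-memory structures in place of LMDB DBIs. It is an illustration under assumptions: the cutoff is compared against `wall` only and is inclusive, each loop iteration stands in for one write transaction, and a node head is dropped when no entries remain for that node. The class and method names are hypothetical.

```python
from bisect import insort


class InMemoryOplog:
    """Toy stand-in for the LMDB indexes of a single dataset."""

    def __init__(self):
        self.by_hash = {}     # hash -> (wall, logic, node_id)
        self.by_hlc = []      # sorted list of (wall, logic, node_id, hash)
        self.node_head = {}   # node_id -> (wall, logic, hash)

    def append(self, wall, logic, node_id, hash_):
        if hash_ in self.by_hash:          # dedupe by hash
            return False
        self.by_hash[hash_] = (wall, logic, node_id)
        insort(self.by_hlc, (wall, logic, node_id, hash_))
        head = self.node_head.get(node_id)
        if head is None or (wall, logic) > head[:2]:
            self.node_head[node_id] = (wall, logic, hash_)
        return True

    def _recompute_head(self, node_id):
        remaining = [k for k in self.by_hlc if k[2] == node_id]
        if remaining:
            wall, logic, _, hash_ = remaining[-1]
            self.node_head[node_id] = (wall, logic, hash_)
        else:
            self.node_head.pop(node_id, None)  # assumption: drop empty heads

    def prune(self, cutoff_wall, batch_size):
        """Remove entries with wall <= cutoff_wall in bounded batches."""
        removed = 0
        while True:
            batch = []
            for key in self.by_hlc[:batch_size]:
                if key[0] > cutoff_wall:   # sorted order: nothing older follows
                    break
                batch.append(key)
            if not batch:
                return removed
            touched = set()
            for _, _, node_id, hash_ in batch:
                del self.by_hash[hash_]
                touched.add(node_id)
            del self.by_hlc[:len(batch)]
            for node_id in touched:
                self._recompute_head(node_id)
            removed += len(batch)
```

Because the scan follows HLC order and not insertion order, a late arrival with an old timestamp is pruned together with its contemporaries, which is the difference from a strict FIFO prune.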

### Consistency model

Phase-1 consistency model is Option A (eventual cross-engine atomicity):
- Surreal local CDC writes remain authoritative for atomic document+metadata+checkpoint transactions.
- LMDB is backfilled/reconciled from Surreal when LMDB reads are preferred and gaps are detected.
- Dual-write is available for sync-path writes to accelerate cutover confidence.
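
A reconciliation pass under this model might look like the Python sketch below. It assumes both stores can be viewed as hash-keyed mappings and that Surreal wins on any disagreement. The `reconcile` function and its report fields are hypothetical and do not describe the real backfill tooling.

```python
def reconcile(surreal_entries: dict, lmdb_entries: dict) -> dict:
    """Fill LMDB gaps from Surreal and count what had to be fixed.

    Both arguments map entry hash -> entry payload. lmdb_entries is
    updated in place; Surreal is treated as authoritative.
    """
    report = {"backfilled": 0, "mismatched": 0}
    for hash_, entry in surreal_entries.items():
        existing = lmdb_entries.get(hash_)
        if existing is None:
            lmdb_entries[hash_] = entry        # gap: copy from Surreal
            report["backfilled"] += 1
        elif existing != entry:
            lmdb_entries[hash_] = entry        # divergence: Surreal wins
            report["mismatched"] += 1
    return report
```

After a clean backfill, a second pass over the same data should report zero for both counters, which gives a simple parity check before switching reads to LMDB.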

## Consequences

- Enables staged rollout (dual-write and read shadow validation before cutover).
- Improves prune/query performance characteristics via ordered LMDB indexes.
- Keeps rollback low-risk by retaining Surreal source-of-truth during migration windows.
- Requires reconciliation logic and operational monitoring of mismatch counters/logs during migration.
- Includes a dedicated backfill utility (`LmdbOplogBackfillTool`) with parity report output.
- Exposes migration telemetry counters (`OplogMigrationTelemetry`) for mismatch/reconciliation tracking.