Introduce LMDB oplog store, migration flags, telemetry/backfill tooling, and parity tests to enable staged Surreal-to-LMDB rollout with rollback coverage.
ADR 0002: LMDB Oplog Migration
Status
Accepted
Context
The existing oplog persistence layer is Surreal-backed and tightly coupled to local CDC transaction boundaries. We need an LMDB-backed oplog path that improves prune efficiency and reduces query latency, while preserving existing IOplogStore semantics and keeping rollback low-risk.
Key constraints:
- Dedupe by hash.
- Hash lookup, node/time scans, and chain reconstruction.
- Cutoff-based prune (not strict FIFO) with interleaved late arrivals.
- Dataset isolation in all indexes.
- Explicit handling for cross-engine atomicity when document metadata remains Surreal-backed.
Decision
Adopt an LMDB oplog provider (LmdbOplogStore) with a migration controlled by feature flags:
- UseLmdbOplog
- DualWriteOplog
- PreferLmdbReads
LMDB index schema
Single environment with named DBIs:
- oplog_by_hash: {datasetId}|{hash} -> serialized OplogEntry
- oplog_by_hlc: {datasetId}|{wall}|{logic}|{nodeId}|{hash} -> marker
- oplog_by_node_hlc: {datasetId}|{nodeId}|{wall}|{logic}|{hash} -> marker
- oplog_prev_to_hash (DUPSORT): {datasetId}|{previousHash} -> {hash}
- oplog_node_head: {datasetId}|{nodeId} -> {wall,logic,hash}
- oplog_meta: schema/version markers + dataset prune watermark
Composite keys use deterministic byte encodings with dataset prefixes on every index.
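As an illustration of why the byte encoding matters, the following Python sketch builds an oplog_by_hlc-style key. It is not the store's actual encoding: the helper names, the 64-bit wall and 32-bit logic widths, and the rule that identifiers must not contain the separator are all assumptions made for the example. The point it shows is that fixed-width big-endian integers make byte-wise key order match HLC order, and that a dataset prefix confines a range scan to one dataset.

```python
import struct

SEP = b"|"  # separator taken from the key layout above


def _text(part: str) -> bytes:
    """Encode an identifier; reject the separator so prefixes stay unambiguous (assumed rule)."""
    data = part.encode("utf-8")
    if SEP in data:
        raise ValueError("identifier must not contain '|'")
    return data


def dataset_prefix(dataset_id: str) -> bytes:
    """Prefix shared by every key of one dataset; used to bound range scans."""
    return _text(dataset_id) + SEP


def hlc_key(dataset_id: str, wall: int, logic: int, node_id: str, entry_hash: str) -> bytes:
    """Hypothetical encoding of {datasetId}|{wall}|{logic}|{nodeId}|{hash}.

    wall is packed as an unsigned 64-bit and logic as an unsigned 32-bit
    big-endian integer (assumed widths), so comparing keys byte by byte
    orders them by (wall, logic) within a dataset.
    """
    return SEP.join([
        _text(dataset_id),
        struct.pack(">Q", wall),
        struct.pack(">I", logic),
        _text(node_id),
        _text(entry_hash),
    ])
```

A decimal-string encoding would sort wall=10 before wall=9. The fixed-width form avoids that, which is what a cutoff-bounded cursor scan relies on.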
Prune algorithm
Prune scans oplog_by_hlc up to cutoff and removes each candidate from all indexes, then recomputes touched node-head entries. Deletes run in bounded batches (PruneBatchSize) inside write transactions.
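The prune flow can be sketched with an in-memory model. This is a simplified illustration, not LmdbOplogStore itself: it covers a single dataset, uses sorted lists in place of LMDB cursors, treats each loop iteration as one write transaction, and assumes the cutoff is inclusive and that a node head is dropped when the node has no entries left. The class and method names are hypothetical.

```python
import bisect
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    entry_hash: str
    node_id: str
    wall: int
    logic: int
    previous_hash: Optional[str] = None


class InMemoryOplog:
    """Toy stand-in for the LMDB indexes (single dataset, no persistence)."""

    def __init__(self):
        self.by_hash = {}       # hash -> Entry
        self.by_hlc = []        # sorted (wall, logic, node_id, hash)
        self.by_node_hlc = []   # sorted (node_id, wall, logic, hash)
        self.prev_to_hash = {}  # previous_hash -> set of child hashes
        self.node_head = {}     # node_id -> (wall, logic, hash)

    def append(self, e: Entry) -> bool:
        if e.entry_hash in self.by_hash:  # dedupe by hash
            return False
        self.by_hash[e.entry_hash] = e
        bisect.insort(self.by_hlc, (e.wall, e.logic, e.node_id, e.entry_hash))
        bisect.insort(self.by_node_hlc, (e.node_id, e.wall, e.logic, e.entry_hash))
        if e.previous_hash is not None:
            self.prev_to_hash.setdefault(e.previous_hash, set()).add(e.entry_hash)
        head = self.node_head.get(e.node_id)
        if head is None or (e.wall, e.logic) > head[:2]:
            self.node_head[e.node_id] = (e.wall, e.logic, e.entry_hash)
        return True

    def prune(self, cutoff: tuple, batch_size: int) -> int:
        """Remove entries with (wall, logic) <= cutoff in bounded batches."""
        removed, touched = 0, set()
        while True:
            # by_hlc is sorted, so qualifying keys form a prefix of the list.
            batch = [k for k in self.by_hlc[:batch_size] if k[:2] <= cutoff]
            if not batch:
                break
            for wall, logic, node_id, h in batch:
                e = self.by_hash.pop(h)
                self.by_node_hlc.remove((node_id, wall, logic, h))
                children = self.prev_to_hash.get(e.previous_hash)
                if children is not None:
                    children.discard(h)
                    if not children:
                        del self.prev_to_hash[e.previous_hash]
                touched.add(node_id)
            del self.by_hlc[:len(batch)]
            removed += len(batch)
        for node_id in touched:  # recompute heads of touched nodes
            remaining = [k for k in self.by_node_hlc if k[0] == node_id]
            if remaining:
                _, wall, logic, h = remaining[-1]
                self.node_head[node_id] = (wall, logic, h)
            else:
                self.node_head.pop(node_id, None)
        return removed
```

Because candidates are selected by HLC order rather than insertion order, an entry that arrives late with an old timestamp is still pruned by the next pass whose cutoff covers it.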
Consistency model
The phase-1 consistency model is Option A (eventual cross-engine atomicity):
- Surreal local CDC writes remain authoritative for atomic document+metadata+checkpoint transactions.
- LMDB is backfilled/reconciled from Surreal when LMDB reads are preferred and gaps are detected.
- Dual-write is available for sync-path writes to accelerate cutover confidence.
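One possible reading of this model is sketched below in Python, with plain dictionaries standing in for both engines. The flag semantics, the class name MigratingOplogStore, and the counter names are assumptions for illustration; the ADR names the flags but does not define their exact behaviour, and the real reconciliation path may work differently.

```python
from dataclasses import dataclass


@dataclass
class MigrationFlags:
    # Field names mirror the ADR's flags; their exact semantics are assumed here.
    use_lmdb_oplog: bool = False
    dual_write_oplog: bool = False
    prefer_lmdb_reads: bool = False


class MigratingOplogStore:
    """Hypothetical router: Surreal stays authoritative, LMDB is filled lazily."""

    def __init__(self, surreal: dict, lmdb: dict, flags: MigrationFlags):
        self.surreal = surreal
        self.lmdb = lmdb
        self.flags = flags
        self.mismatches = 0  # gaps detected on LMDB reads
        self.reconciled = 0  # entries copied from Surreal into LMDB

    def append(self, entry_hash: str, payload: bytes) -> None:
        # The authoritative write always goes to the Surreal-backed store.
        self.surreal.setdefault(entry_hash, payload)
        if self.flags.use_lmdb_oplog and self.flags.dual_write_oplog:
            self.lmdb.setdefault(entry_hash, payload)

    def get(self, entry_hash: str):
        if not (self.flags.use_lmdb_oplog and self.flags.prefer_lmdb_reads):
            return self.surreal.get(entry_hash)
        value = self.lmdb.get(entry_hash)
        if value is not None:
            return value
        # Gap in LMDB: fall back to the source of truth and reconcile.
        source = self.surreal.get(entry_hash)
        if source is not None:
            self.mismatches += 1
            self.lmdb[entry_hash] = source
            self.reconciled += 1
        return source
```

In this sketch, turning off prefer_lmdb_reads routes every read back to Surreal without any data movement, which is the property that keeps rollback cheap.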
Consequences
- Enables staged rollout (dual-write and read shadow validation before cutover).
- Improves prune/query performance characteristics via ordered LMDB indexes.
- Keeps rollback low-risk by retaining Surreal source-of-truth during migration windows.
- Requires reconciliation logic and operational monitoring of mismatch counters/logs during migration.
- Includes a dedicated backfill utility (LmdbOplogBackfillTool) with parity report output.
- Exposes migration telemetry counters (OplogMigrationTelemetry) for mismatch/reconciliation tracking.