Implement in-process multi-dataset sync isolation across core, network, persistence, and tests
All checks were successful
NuGet Package Publish / nuget (push) Successful in 1m14s

Joseph Doherty
2026-02-22 11:58:34 -05:00
parent c06b56172a
commit 8e97061ab8
60 changed files with 4519 additions and 559 deletions

@@ -30,6 +30,13 @@ To optimize reconnection, each node maintains a **Snapshot** of the last known s
- If the chain hash matches, they only exchange the delta.
- This avoids re-processing the entire operation history and ensures efficient gap recovery.
### Multi-Dataset Sync
CBDDC supports per-dataset sync pipelines in one process.
- Dataset identity (`datasetId`) is propagated in protocol and persistence records.
- Each dataset has independent oplog reads, confirmation state, and maintenance cadence.
- Legacy peers without dataset fields interoperate on `primary`.
### Peer-Confirmed Oplog Pruning
CBDDC maintenance pruning now uses a two-cutoff model:

@@ -8,6 +8,7 @@ This index tracks CBDDC major functionality. Each feature has one canonical docu
- [Peer-to-Peer Gossip Sync](peer-to-peer-gossip-sync.md)
- [Secure Peer Transport](secure-peer-transport.md)
- [Peer-Confirmed Pruning](peer-confirmed-pruning.md)
- [Multi-Dataset Sync](multi-dataset-sync.md)
## Maintenance Rules

@@ -0,0 +1,67 @@
# Multi-Dataset Sync
## Summary
CBDDC can run multiple sync pipelines inside one process by assigning each pipeline a `datasetId` (for example `primary`, `logs`, `timeseries`).
Each dataset pipeline has independent oplog state, vector-clock reads, peer confirmation watermarks, and maintenance scheduling.
## Why Use It
- Keep primary business data sync latency stable during high telemetry volume.
- Isolate append-only streams (`logs`, `timeseries`) from CRUD-heavy collections.
- Roll out incrementally using runtime flags and per-dataset enablement.
## Configuration
Register dataset options and enable the runtime coordinator:
```csharp
services.AddCBDDCSurrealEmbedded<SampleDocumentStore>(sp => options)
    .AddCBDDCSurrealEmbeddedDataset("primary", o =>
    {
        o.InterestingCollections = ["Users", "TodoLists"];
    })
    .AddCBDDCSurrealEmbeddedDataset("logs", o =>
    {
        o.InterestingCollections = ["Logs"];
        o.SyncLoopDelay = TimeSpan.FromMilliseconds(500);
    })
    .AddCBDDCSurrealEmbeddedDataset("timeseries", o =>
    {
        o.InterestingCollections = ["Timeseries"];
        o.SyncLoopDelay = TimeSpan.FromMilliseconds(500);
    })
    .AddCBDDCNetwork<StaticPeerNodeConfigurationProvider>();

services.AddCBDDCMultiDataset(options =>
{
    options.EnableMultiDatasetSync = true;
    options.EnableDatasetPrimary = true;
    options.EnableDatasetLogs = true;
    options.EnableDatasetTimeseries = true;
});
```
## Wire and Storage Compatibility
- Protocol messages include optional `dataset_id` fields.
- Missing `dataset_id` is treated as `primary`.
- Surreal persistence records include `datasetId`; legacy rows without `datasetId` are read as `primary`.
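The fallback rule above can be sketched as a small normalization step applied when a message or row is read. This is an illustrative sketch only: `DatasetIds` and `Normalize` are hypothetical names, not part of the CBDDC API, and treating an empty or whitespace-only value the same as a missing one is an assumption.

```csharp
#nullable enable
using System;

// Hypothetical helper (not a CBDDC type): resolves the dataset a record belongs to.
public static class DatasetIds
{
    public const string Primary = "primary";

    // A missing id falls back to "primary", as described for legacy peers and rows.
    // Assumption: empty or whitespace-only ids are treated as missing too.
    public static string Normalize(string? datasetId) =>
        string.IsNullOrWhiteSpace(datasetId) ? Primary : datasetId;
}
```

With a helper like this, `DatasetIds.Normalize(null)` yields `primary`, while `DatasetIds.Normalize("logs")` is returned unchanged.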
## Operational Notes
- Each dataset runs its own `SyncOrchestrator` instance.
- Maintenance pruning is dataset-scoped (`datasetId` + cutoff).
- Snapshot APIs support dataset-scoped operations (`CreateSnapshotAsync(stream, datasetId)`).
## Migration
1. Deploy with `EnableMultiDatasetSync = false`.
2. Enable multi-dataset mode with only `primary` enabled.
3. Enable `logs` and verify that primary sync still meets its SLO.
4. Enable `timeseries` and verify the primary sync SLO again.
## Rollback
- Set `EnableDatasetLogs = false` and `EnableDatasetTimeseries = false` first.
- If needed, set `EnableMultiDatasetSync = false` to return to the single `primary` sync path.
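As a sketch of the first rollback step, assuming the same `AddCBDDCMultiDataset` registration shown under Configuration, the telemetry datasets are switched off while multi-dataset mode and `primary` stay enabled:

```csharp
// First rollback step: keep multi-dataset mode on, but run only the primary dataset.
services.AddCBDDCMultiDataset(options =>
{
    options.EnableMultiDatasetSync = true;
    options.EnableDatasetPrimary = true;
    options.EnableDatasetLogs = false;
    options.EnableDatasetTimeseries = false;
});
```

If that is not enough, setting `EnableMultiDatasetSync = false` in the same block returns the node to the single `primary` sync path.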

@@ -221,6 +221,14 @@ services.AddCBDDCCore()
});
```
### Multi-Dataset Partitioning
Surreal persistence now stores `datasetId` on oplog, metadata, snapshot metadata, confirmation, and CDC checkpoint records.
- Composite indexes include `datasetId` to prevent cross-dataset reads.
- Legacy rows missing `datasetId` are interpreted as `primary` during reads.
- Dataset-scoped store APIs (`ExportAsync(datasetId)`, `GetOplogAfterAsync(..., datasetId, ...)`) enforce isolation.
### CDC Durability Notes
1. **Checkpoint semantics**: each consumer id has an independent durable cursor (`timestamp + hash`).

@@ -27,6 +27,15 @@ Capture these artifacts before remediation:
- Current runtime configuration (excluding secrets).
- Most recent deployment identifier and change window.
## Multi-Dataset Gates
Before enabling telemetry datasets in production:
1. Enable `primary` only and record baseline primary sync lag.
2. Enable `logs`; confirm primary lag remains within SLO.
3. Enable `timeseries`; confirm primary lag remains within SLO.
4. If the primary SLO regresses, disable the telemetry datasets before attempting a broader rollback.
## Recovery Plays
### Peer unreachable or lagging